Let’s be honest: if you’re running a self-hosted LLM stack, you’ve probably wrestled with the “where do I actually put my custom system prompts, tools, and context-aware code agents?” problem. You’ve got Ollama or LM Studio serving models, maybe Llama.cpp for CPU inference, and a handful of Python scripts glued together with FastAPI endpoints—but there’s no clean abstraction for code-centric orchestration. That’s why I dropped everything and tested nexus4cc last week. It’s not another LLM chat UI. It’s not a model runner. It’s a type-safe, TypeScript-first runtime for Claude-inspired code agents — and at 60+ GitHub stars (as of May 2024), it’s flying under the radar while solving real pain points I’ve had for months.

What Is Nexus4CC? A TypeScript Runtime for Code-Centric Agents

nexus4cc stands for “Nexus for Claude Code” — but don’t let the name mislead you. It’s not a Claude wrapper, nor does it require Anthropic’s API. Instead, it’s a lightweight, extensible framework that implements the semantic contract of Claude’s code interpreter mode — but locally, with your own models and tools. Think of it as a minimal execution layer (Rust-like in its strictness, but written in TypeScript) that sits between your LLM and your filesystem, shell, or custom SDKs.

The core idea is elegant: you define tool schemas (using Zod for validation), write tool handlers (TypeScript functions with typed inputs/outputs), and let nexus4cc manage the LLM ↔ tool ↔ output loop — including sandboxed code execution, stdout/stderr capture, and safe file I/O (opt-in, with strict path allowlists). Unlike llama.cpp plugins or text-generation-webui extensions, nexus4cc enforces type safety end-to-end: your tool schema, your handler signature, and the LLM’s tool-calling JSON all flow through the same Zod definitions.
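To make that loop concrete, here is a minimal, hand-rolled sketch of the validate-then-dispatch step. It deliberately avoids Zod so it stands alone; `Tool`, `validate`, and `dispatchToolCall` are illustrative names, not nexus4cc’s actual API:

```typescript
// A tool pairs a runtime validator with a typed handler, so the LLM's
// tool-call JSON is checked before any code runs.
type Tool<T> = {
  name: string;
  validate: (input: unknown) => T; // throws (or defaults) on bad input
  handler: (args: T) => string;
};

const listFiles: Tool<{ path: string }> = {
  name: "list_files",
  validate: (input) => {
    const o = input as Record<string, unknown>;
    const path = typeof o?.path === "string" ? o.path : ".";
    return { path };
  },
  handler: ({ path }) => `would list: ${path}`, // stub: no real I/O here
};

// The runtime's job: look up the tool the model asked for, validate the
// raw JSON arguments, and only then invoke the handler.
function dispatchToolCall(
  tools: Tool<any>[],
  call: { name: string; arguments: unknown },
): string {
  const tool = tools.find((t) => t.name === call.name);
  if (!tool) throw new Error(`unknown tool: ${call.name}`);
  return tool.handler(tool.validate(call.arguments));
}
```

The point of routing everything through one validator is that a schema change breaks the build (and the dispatch) immediately, instead of failing at runtime with a malformed call.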

It’s built on Bun (not Node), which means cold starts are ~300ms on my Ryzen 5 5600G — not blazing, but usable for local dev tooling. And yes: it runs bash, python3, and even node subprocesses with timeouts, memory limits, and working-directory isolation. That alone makes it more practical than half the “agent frameworks” I’ve tried.
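nexus4cc’s sandbox internals aren’t shown here, but the general pattern (hard timeout, pinned working directory, captured output) looks roughly like this, expressed with Node’s `child_process` since `Bun.spawn` specifics differ; `runSandboxed` is a made-up name:

```typescript
import { execFileSync } from "node:child_process";

// The general pattern: run the command with a hard timeout and a pinned
// working directory, and capture output for the LLM to reflect on.
// Illustrative only — nexus4cc itself uses Bun.spawn.
function runSandboxed(cmd: string, args: string[], cwd = "/tmp"): string {
  return execFileSync(cmd, args, {
    cwd,                    // working-directory isolation
    timeout: 5_000,         // kill the process after 5s
    encoding: "utf8",       // return stdout as a string
    maxBuffer: 1024 * 1024, // cap captured output at 1 MiB
  });
}
```

Usage: `runSandboxed("echo", ["hello"])` returns `"hello\n"`; a command that hangs past 5 seconds throws instead of blocking the agent loop.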

How Nexus4CC Compares to Alternatives

If you’ve been using langchain or llamaindex, you’ll notice an immediate shift: nexus4cc doesn’t do RAG out of the box, doesn’t manage vector stores, and doesn’t abstract over 15 LLM providers. It does one thing well: turning LLM tool calls into typed, auditable, sandboxed code execution — and it does it without Python dependencies.

  • vs. LangChain + LlamaIndex: LangChain’s tool abstraction is powerful but Python-heavy, verbose, and often breaks when you upgrade a dependency. Nexus4cc is 300 lines of core logic. You define a tool in ~10 lines:
import { z } from 'zod';

export const listFilesTool = {
  name: 'list_files',
  description: 'List files in a directory',
  schema: z.object({
    path: z.string().default('.'),
  }),
  handler: async ({ path }) => {
    // Bun.spawn's stdout is a ReadableStream, not a string — read it
    // to completion via Response before returning.
    const proc = Bun.spawn(['ls', '-la', path]);
    return await new Response(proc.stdout).text();
  },
};

That’s it. No @tool decorators, no Tool class inheritance, no AsyncCallbackHandler boilerplate.

  • vs. Ollama + devtools modelfile: Ollama’s devtools is great for quick demos, but it’s model-specific, not portable, and doesn’t let you inject custom logic between tool call and response. Nexus4cc gives you full control — you can log every tool call, mutate inputs, or even inject mock responses for testing.

  • vs. code-llama fine-tuned inference servers: Those give you raw code generation — not safe, auditable execution. Nexus4cc adds the missing layer: intent → validation → sandbox → result → LLM reflection. It’s the difference between “generate a script” and “run this script safely and tell me what happened”.
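The “log every tool call, mutate inputs, or inject mock responses” control from the Ollama comparison above boils down to wrapping handlers. A sketch of that pattern, with `withHooks` as a hypothetical helper rather than nexus4cc’s real hook API:

```typescript
type Handler = (args: Record<string, unknown>) => string;

// Wrap a tool handler so every call is observable, inputs can be
// rewritten, and tests can short-circuit with a mock response.
function withHooks(
  handler: Handler,
  hooks: {
    onCall?: (args: Record<string, unknown>) => Record<string, unknown>;
    mock?: (args: Record<string, unknown>) => string | undefined;
  },
): Handler {
  return (args) => {
    const input = hooks.onCall ? hooks.onCall(args) : args; // mutate inputs
    const mocked = hooks.mock?.(input);                     // or inject a mock
    return mocked !== undefined ? mocked : handler(input);
  };
}

const echo: Handler = (a) => `ran with ${JSON.stringify(a)}`;
const wrapped = withHooks(echo, {
  onCall: (a) => ({ ...a, path: "/tmp" }),                 // force a safe path
  mock: (a) => (a.dryRun ? "mocked result" : undefined),   // dry-run support
});
```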

Here’s the kicker: nexus4cc ships with a --dev mode that auto-generates TypeScript tool stubs from your OpenAPI spec. I pointed it at my homegrown Home Assistant REST API and got working get_light_state, turn_on_light tools in under 90 seconds. Try doing that in LangChain without writing a custom RequestsToolkit.
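The generator itself isn’t reproduced here, but the core of any OpenAPI-to-tool-stub pass is a walk over the spec’s `paths` object. A sketch under the assumption of a minimal spec shape; `toolStubsFromSpec` and the Home Assistant-style routes are illustrative:

```typescript
// Given an OpenAPI-style paths object, derive one tool stub per operation.
// This mirrors what a stub generator must do; the real --dev mode's
// output format may differ.
type OpenApiOp = { operationId: string; summary?: string };
type Paths = Record<string, Record<string, OpenApiOp>>;

function toolStubsFromSpec(paths: Paths): { name: string; description: string }[] {
  const stubs: { name: string; description: string }[] = [];
  for (const [route, methods] of Object.entries(paths)) {
    for (const [method, op] of Object.entries(methods)) {
      stubs.push({
        name: op.operationId,                        // e.g. get_light_state
        description: op.summary ?? `${method.toUpperCase()} ${route}`,
      });
    }
  }
  return stubs;
}

// Hypothetical slice of a Home Assistant-style spec:
const stubs = toolStubsFromSpec({
  "/api/lights/{id}": {
    get: { operationId: "get_light_state", summary: "Read a light's state" },
    post: { operationId: "turn_on_light" },
  },
});
```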

Installation & Docker Setup (Yes, It Runs in Docker)

Nexus4cc is Bun-native, so the fastest local install is:

bun install
bun run build
bun run start --model llama3:8b-instruct-q8_0 --port 3001

That’s using Ollama as the backend LLM (via its /api/chat endpoint). You must have Ollama running — nexus4cc doesn’t bundle a model. It expects a compatible Ollama model with tool-calling support (llama3:8b-instruct-q8_0, phi3:mini, and qwen2:0.5b all work fine — I tested all three).
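For reference, a tool-calling request to Ollama’s `/api/chat` carries a `tools` array in the OpenAI-style function schema. A sketch of the request body a client like nexus4cc would send (the actual POST is elided):

```typescript
// Build the JSON body a client would POST to
// http://localhost:11434/api/chat. The "tools" array uses Ollama's
// OpenAI-style function schema.
function buildChatRequest(model: string, prompt: string) {
  return {
    model,
    stream: false, // nexus4cc doesn't stream yet, so request a single blob
    messages: [{ role: "user", content: prompt }],
    tools: [
      {
        type: "function",
        function: {
          name: "list_files",
          description: "List files in a directory",
          parameters: {
            type: "object",
            properties: { path: { type: "string" } },
            required: [],
          },
        },
      },
    ],
  };
}

const body = buildChatRequest("llama3:8b-instruct-q8_0", "What's in /tmp?");
// Send with: fetch("http://localhost:11434/api/chat",
//   { method: "POST", body: JSON.stringify(body) })
```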

For production or reproducible deployment, use Docker. The project doesn’t ship an official image yet (as of v0.2.1), but here’s a working docker-compose.yml I’ve run for 5 days straight:

version: '3.8'
services:
  nexus4cc:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3001:3001"
    environment:
      - MODEL_NAME=llama3:8b-instruct-q8_0
      - OLLAMA_HOST=http://ollama:11434
      - NODE_ENV=production
    depends_on:
      - ollama
    restart: unless-stopped

  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ./ollama-models:/root/.ollama
    restart: unless-stopped

You’ll need a Dockerfile (I added this to my fork — it’s not in upstream yet):

FROM oven/bun:1.1.17

WORKDIR /app
COPY package.json bun.lockb ./
RUN bun install --production

COPY . .
RUN bun run build

EXPOSE 3001
CMD ["bun", "run", "start", "--model", "llama3:8b-instruct-q8_0", "--port", "3001"]

Note: bun 1.1.17 is pinned because v1.1.18 broke Bun.spawn timeouts on Alpine — a rough edge I hit hard. Use exactly that version.

Configuring Tools and Runtime Safety

Nexus4cc’s config is intentionally minimal. You drop .ts files into src/tools/, export tool objects like the listFilesTool above, and it auto-registers them. But the real power — and risk — is in the nexus.config.ts:

import { defineConfig } from 'nexus4cc';

export default defineConfig({
  model: {
    provider: 'ollama',
    name: 'llama3:8b-instruct-q8_0',
  },
  sandbox: {
    timeout: 5000,
    memoryLimitMB: 128,
    allowedPaths: ['/tmp', '/home/nexus/workspace'],
  },
  logging: {
    level: 'debug',
    includeToolInputs: false, // set true only for debugging — reveals PII
  },
});

The sandbox block is critical. By default, no filesystem access is allowed. You must explicitly list allowedPaths. I learned this the hard way when my git status tool failed with EACCES — because . wasn’t in allowedPaths. It’s strict, and that’s good.
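A strict allowlist of this kind typically resolves the requested path first and then prefix-checks it against each root, which is exactly why a bare `.` has to be listed explicitly. A sketch, with `isPathAllowed` as an illustrative helper rather than nexus4cc’s actual code:

```typescript
import { resolve, sep } from "node:path";

// Resolve the requested path first, then require it to sit under one of
// the allowlisted roots — so "../" tricks can't escape the sandbox.
function isPathAllowed(requested: string, allowedPaths: string[]): boolean {
  const target = resolve(requested);
  return allowedPaths.some((root) => {
    const base = resolve(root);
    // Exact match, or strictly inside the root (sep prevents /tmpfoo
    // from matching an allowlisted /tmp).
    return target === base || target.startsWith(base + sep);
  });
}
```

Resolving before checking is the important design choice: `/tmp/../etc/passwd` normalizes to `/etc/passwd` and gets rejected, where a naive string check would wave it through.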

Also worth noting: nexus4cc runs tools as the nexus user inside Docker (UID 1001), not root. That’s a huge win for security — no chmod 777 workarounds needed.

Why Self-Host Nexus4CC? Who Is This Actually For?

Let’s cut the fluff: nexus4cc isn’t for production AI SaaS. It’s not for non-technical users. It’s for you if:

  • You run a homelab with >8GB RAM and want to build real, repeatable, code-driven automation (e.g., “generate Terraform from my network diagram”, “audit my Docker Compose files for CVEs”, “parse 1000 logs and email me summaries”).
  • You write TypeScript for a living and hate Python glue code.
  • You want auditability: every tool call is logged with full input/output, timestamps, and exit codes — no black-box langchain trace IDs.
  • You’re tired of Ollama’s devtools being a dead end for custom logic.
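What such an audit record might look like as a JSON line, with field names that are illustrative rather than nexus4cc’s actual schema:

```typescript
// One audit record per tool call — the kind of data the runtime logs
// (inputs/outputs, timestamps, exit codes). Field names are illustrative.
type ToolCallRecord = {
  tool: string;
  startedAt: string;   // ISO-8601 timestamp
  durationMs: number;
  exitCode: number;
  input: unknown;      // omitted when includeToolInputs is false
  output: string;
};

function formatRecord(r: ToolCallRecord): string {
  return JSON.stringify(r); // one JSON line per call, grep/jq friendly
}

const line = formatRecord({
  tool: "list_files",
  startedAt: "2024-05-01T12:00:00Z",
  durationMs: 84,
  exitCode: 0,
  input: { path: "/tmp" },
  output: "total 0",
});
```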

It’s also perfect for DevOps folks who already manage Bun or Node services and want to add LLM-powered tooling without adding Python, Conda, or a new language runtime. My instance runs at ~280MB RAM idle, peaks at ~520MB during tool-heavy sessions — lighter than text-generation-webui (which I killed after hitting 1.2GB on the same hardware).

That said: it’s not for you if you need RAG, multimodal inputs, or fine-tuned models. It’s a tool execution layer, not an AI platform.

Honest Verdict: Is It Worth Deploying in 2024?

I’ve run nexus4cc for 12 days — across 3 different models, 17 custom tools (including one that deploys static sites to Cloudflare Pages via their API), and ~200 tool calls. Here’s my take:

The wins

  • Type safety is real. My tool schema changes broke the build before I ever sent a malformed JSON call. That saved hours.
  • Bun startup is fast. bun run start takes 412ms on my Ryzen. Node-based alternatives take 2.3s.
  • The Docker setup just works — once you pin Bun and set allowedPaths.
  • It’s trivial to add auth (I slapped basic-auth middleware on the /api/chat endpoint in <10 lines).
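The auth check itself really is tiny. A sketch of validating a `Basic` Authorization header in TypeScript; `checkBasicAuth` is an illustrative name, and wiring it into the server is elided:

```typescript
import { Buffer } from "node:buffer";

// Check an incoming "Authorization: Basic <base64>" header against fixed
// credentials — the gist of guarding the /api/chat endpoint.
function checkBasicAuth(
  header: string | undefined,
  user: string,
  pass: string,
): boolean {
  if (!header?.startsWith("Basic ")) return false;
  const decoded = Buffer.from(header.slice(6), "base64").toString("utf8");
  return decoded === `${user}:${pass}`;
}

const ok = checkBasicAuth(
  "Basic " + Buffer.from("admin:s3cret").toString("base64"),
  "admin",
  "s3cret",
);
```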

⚠️ The rough edges

  • Documentation is sparse. The README has one example. You’ll spend time reading src/core/ — but that’s fine if you’re comfortable with TS.
  • No built-in web UI. You talk to it via curl, httpx, or any other HTTP client — not a shiny dashboard. (I built a minimal HTML form in public/ — 47 lines.)
  • Bun.spawn doesn’t yet support ulimit on Linux. So memoryLimitMB is advisory, not enforced. I mitigated this with docker run --memory=512m.
  • No streaming support yet — responses are JSON blobs, not SSE. That means no “typing” indicators in frontend clients.

Dealbreakers (for some)

  • Requires Ollama. No native GGUF or HuggingFace integration — yet.
  • No Windows support (Bun’s spawn behavior differs; the author says “PRs welcome”).
  • Zero community plugins. You write all tools yourself. No nexus4cc-aws or nexus4cc-github registry.

The TL;DR? If you’re building internal dev tooling — and you already run Ollama — deploy nexus4cc. It’s lean, opinionated, and solves the “how do I safely run code the LLM asked for?” problem better than anything else I’ve found. It’s not production-ready for enterprise, but for a homelab, a startup’s internal ops bot, or a solo dev automating their workflow? It’s the missing piece.

And here’s the best part: it’s 60 stars on GitHub, but the maintainer (librae8226) is responsive. I filed an issue about the Bun timeout bug and got a fix PR within 9 hours. That kind of energy? That’s the self-hosted ecosystem at its best.

So go fork it. Tweak src/tools/. Add a curl script that calls it from your CI. And when your LLM actually ships working code instead of “here’s a script — good luck”, you’ll remember this post.