Let’s cut the hype: you’re drowning in alerts, juggling 137 devices across three cloud regions and on-prem racks, and your “ChatOps” still means copy-pasting curl commands into Slack while praying the jq filter doesn’t eat your JSON. Enter agentic-chatops — a real, working, shell-first implementation of all 21 Agentic Design Patterns (yes, the full taxonomy from the 2024 “Agentic Design Patterns” paper) — built not in LangChain or LlamaIndex, but in n8n + GPT-4o + Claude Code, glued together with 1,842 lines of lean, annotated Shell.
It’s not a PoC. It’s not a demo repo with one echo "hello". It’s what happens when a solo SRE (Kyriakos P., who actually runs 137 devices day-to-day) gets fed up with brittle Python microservices and builds a 3-tier agentic pipeline that acts, observes, reflects, and replans — all triggered from Slack or Mattermost. And yes — it has 93 GitHub stars (as of May 2024) and zero marketing site, zero Patreon, zero “Join our Discord” banner.
Here’s why you should care right now: this is the first ChatOps system I’ve seen that treats LLMs like operators, not oracles. It doesn’t just answer questions — it executes remediation, validates outcomes, rolls back on failure, and logs its own reasoning trace. And it runs on 4GB RAM.
What Is Agentic ChatOps — and Why “3-Tier” Matters
“Agentic ChatOps” isn’t just ChatOps + LLMs slapped together. It’s a deliberate architecture where the LLM isn’t the brain — it’s one agent in a coordinated stack. The agentic-chatops repo breaks this into three explicit tiers:
- Tier 1 — Orchestration (n8n): HTTP/webhook triggers, stateful workflow execution, credential management, retry logic, Slack/Mattermost parsing, and agent routing. n8n handles auth, timeouts, and concurrency — not Python scripts.
- Tier 2 — Reasoning (GPT-4o + Claude Code): Two LLMs, specialized. GPT-4o handles high-level intent parsing, plan decomposition, and natural language summarization. Claude Code (Sonnet 3.5, not Haiku) handles code generation, diff analysis, config validation, and Bash/Ansible/YAML linting. They don’t talk to each other — n8n feeds context between them with strict schema boundaries.
- Tier 3 — Execution (Shell + SSH + REST): All real work happens in POSIX-compliant Shell — `ssh -o ConnectTimeout=3`, `curl -sSf`, `jq -e '.status == "ok"'`, `diff -u old.conf new.conf`. No Python subprocess spawning. No `eval`. Just `set -euo pipefail`, `case` statements, and 100% traceable exit codes.
That “3-tier” separation is the killer feature. Unlike llmops-chatops (abandoned, 2022, 12 stars), or chatops-llm (Python-heavy, requires pip install -r requirements.txt --force-reinstall every Tuesday), agentic-chatops isolates volatility. If Claude Code hallucinates a bad sed command, n8n catches the non-zero exit and triggers rollback — without touching GPT-4o’s plan or the SSH host.
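That isolation boils down to one pattern at the shell level: run the generated step, and on any non-zero exit trigger a rollback without re-consulting the planner. A minimal sketch of that guard, with illustrative stand-in functions (not code from the repo):

```shell
#!/bin/sh
# Sketch of the Tier-3 guard: execute a generated step; on failure,
# roll back and propagate the exit code. run_step and rollback_step
# are stand-ins for real ./agents/ scripts.

run_step() {
  # Simulate a command that Tier 2 generated; it may fail.
  "$@"
}

rollback_step() {
  # In the real pipeline this would restore the last known-good config.
  echo "rollback: restoring previous config"
}

guarded_exec() {
  if run_step "$@"; then
    echo "step ok: $*"
  else
    rc=$?
    echo "step failed (exit $rc): $*" >&2
    rollback_step
    return "$rc"
  fi
}

guarded_exec true           # succeeds, no rollback
guarded_exec false || true  # fails, triggers rollback, pipeline continues
```

The plan (GPT-4o's output) never changes here; only the failed step is unwound, which is exactly the volatility isolation the tiering buys you.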
How to Install and Run It (No Kubernetes Required)
You don’t need a K8s cluster. You don’t even need Docker — but Docker Compose is the blessed path. I deployed this on a t3.xlarge (4 vCPU / 16 GiB) EC2 instance running Ubuntu 22.04, but it runs fine on a Raspberry Pi 5 (8GB) for lab-scale testing.
Prerequisites
- Docker 24.0.7+ and Docker Compose v2.23.0+
- `openssl`, `jq`, `curl`, `ssh`, `rsync` (all standard on Ubuntu/Debian)
- API keys: OpenAI (`OPENAI_API_KEY`), Anthropic (`ANTHROPIC_API_KEY`)
- A Slack app with `chat:write`, `commands`, and `incoming-webhook` scopes, or Mattermost with a webhook URL
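Before pulling images, it's worth confirming those CLI tools are actually installed, since the agents shell out to them directly. A quick hedged check (my addition, not part of the repo):

```shell
#!/bin/sh
# Verify the CLI tools the agents rely on before bringing up n8n.
# Returns non-zero and lists anything missing.

check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "all prerequisites present"
}

check_tools openssl jq curl ssh rsync || true
```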
docker-compose.yml (minimal)
```yaml
version: '3.8'
services:
  n8n:
    image: n8nio/n8n:1.52.0
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=chatops
      - N8N_BASIC_AUTH_PASSWORD=supersecret123
      - NODE_ENV=production
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - N8N_WEBHOOK_TUNNEL_URL=https://your-domain.com
    volumes:
      - ./n8n-data:/home/node/.n8n
      - ./agents:/opt/agents   # ← your Shell agent scripts go here
    networks:
      - chatops-net
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./certs:/etc/nginx/certs
    depends_on:
      - n8n
    networks:
      - chatops-net
networks:
  chatops-net:
    driver: bridge
```
Then launch:
```sh
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
docker compose up -d
```
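Rather than tailing logs to see when n8n is ready, you can poll it. A small polling helper I use (my addition; I point it at n8n's health endpoint, which I believe is `/healthz` on recent versions, so treat that path as an assumption):

```shell
#!/bin/sh
# Poll an HTTP endpoint until it responds, or give up after N tries.
# Intended use (assumed endpoint, verify against your n8n version):
#   wait_for_http "http://localhost:5678/healthz" 30

wait_for_http() {
  url="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -sSf "$url" >/dev/null 2>&1; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```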
Your First Agent: reboot-device.sh
Drop this into ./agents/reboot-device.sh:
```sh
#!/usr/bin/env bash
# Note: set -o pipefail is a bashism, so this script needs a bash shebang.
set -euo pipefail

DEVICE_IP="$1"
DEVICE_ENV="$2"   # prod/staging

echo "[INFO] Rebooting $DEVICE_IP ($DEVICE_ENV)..."
ssh -o ConnectTimeout=3 -o BatchMode=yes ubuntu@"$DEVICE_IP" 'sudo reboot &'

# Wait 90s, then verify
sleep 90
if ! ssh -o ConnectTimeout=5 -o BatchMode=yes ubuntu@"$DEVICE_IP" 'echo "up"' 2>/dev/null; then
  echo "[FAIL] Device $DEVICE_IP did not come back online"
  exit 1
fi
echo "[OK] Device $DEVICE_IP rebooted and responsive"
```
n8n’s Slack trigger parses /reboot 10.10.20.42 prod, validates the IP against your devices.json (yes, it ships with a JSON inventory schema), and executes that script with full stdout/stderr capture. No eval. No Python shell injection. Just POSIX.
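The inventory check is the interesting part: no agent runs unless the target IP and environment pair exists in `devices.json`. The repo's actual schema isn't documented here, so the shape below is hypothetical, but the `jq -e` gate is the pattern:

```shell
#!/bin/sh
# Hypothetical inventory gate: refuse to act unless the requested
# ip/env pair exists in devices.json. Schema below is illustrative;
# the repo's real inventory format may differ.

INVENTORY="${INVENTORY:-devices.json}"

cat > "$INVENTORY" <<'EOF'
[
  {"ip": "10.10.20.42", "env": "prod",    "role": "gateway"},
  {"ip": "10.10.30.7",  "env": "staging", "role": "db"}
]
EOF

validate_device() {
  # jq -e exits non-zero when the filter produces no match,
  # so this works directly as a shell condition.
  jq -e --arg ip "$1" --arg env "$2" \
    '.[] | select(.ip == $ip and .env == $env)' "$INVENTORY" >/dev/null
}

if validate_device "10.10.20.42" "prod"; then
  echo "device known: proceeding"
else
  echo "unknown device: refusing to act" >&2
fi
```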
Why Self-Host This? (Spoiler: It’s Not for Everyone)
This isn’t for teams using Datadog + PagerDuty + Terraform Cloud + Slack + Jira + GitHub Actions. Those workflows already have orchestration. This is for:
- Solo infrastructure engineers managing >50 devices without a platform team
- Embedded SREs in hardware startups running custom edge clusters (think: 120x Raspberry Pi gateways + 17x x86 gateways)
- Red teams / blue teams needing audit-trail-first, reproducible, non-LLM-black-box remediation
- Compliance-bound orgs (HIPAA, ISO 27001) that require immutable execution logs, no “model-generated” ambiguity
It’s not for you if:
- You expect a UI dashboard (there is none; n8n's UI is your dashboard)
- You want automatic model switching (GPT-4o and Claude Code are hardcoded; you must use both)
- You need RBAC beyond n8n's basic auth (no SSO, no LDAP, no SCIM)
- You run Windows servers (it assumes SSH + Bash; no WinRM, no PowerShell Core)
RAM usage? Idle: ~650MB (n8n) + ~120MB (nginx). Peak during agent execution: ~1.1GB (mostly n8n buffering stdout). CPU spikes are brief — under 2s — since LLM calls are async and non-blocking.
Comparison: n8n + Shell vs. Alternatives You’ve Tried
If you’ve wrestled with hubot, errbot, or mattermost-plugin-ai, here’s the real talk:
| Tool | LLM Integration | Execution Model | Auditability | Learning Curve | Last Commit |
|---|---|---|---|---|---|
| `agentic-chatops` | Dual-model (GPT-4o + Claude Code), schema-validated prompts | Shell-first, `ssh`/`curl`/`jq` only | Full CLI trace + n8n execution log + Git history of `./agents/` | Medium (Shell + n8n UI) | Apr 2024 |
| `hubot-llm` | GPT-3.5 only, prompt injection risk | Node.js `child_process.exec`, no sandboxing | Logs stdout, but no diff of before/after config | Low (if you know CoffeeScript) | Dec 2022 |
| `chatops-ai` (Python) | Mix of Ollama + OpenAI, no fallback logic | `subprocess.run()` with `shell=True`, `capture_output=True` (dangerous) | Partial logs, no rollback state | High (Poetry, Flask, async mess) | Jan 2023 |
| `n8n-ai-chatops` (community) | Single LLM, no agent patterns | REST-only; can't SSH or run local scripts | Webhook payloads only, no agent trace | Medium (n8n + Python) | Mar 2024 |
The kicker? agentic-chatops implements all 21 patterns, including Tool Calling with Validation, Self-Correction Loop, Multi-Agent Handoff, and Failure-Driven Replanning. Example: when reboot-device.sh fails, n8n doesn’t just alert — it triggers check-logs.sh, then rollback-config.sh, then notify-oncall.sh, all with the original Slack user context preserved. No other ChatOps repo does that.
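Stripped of the n8n wiring, that Failure-Driven Replanning chain is a simple cascade that carries the original Slack user context through every fallback step. A sketch under my own assumptions (the script names come from the article; the stand-in functions and wiring are mine):

```shell
#!/bin/sh
# Sketch of the failure-driven replanning chain: if the primary action
# fails, run diagnostics, roll back, then notify, preserving the
# initiating Slack user throughout. Fallback steps are stand-ins.

SLACK_USER="@kyriakos"   # illustrative preserved context

check_logs()      { echo "check-logs.sh for $SLACK_USER"; }
rollback_config() { echo "rollback-config.sh for $SLACK_USER"; }
notify_oncall()   { echo "notify-oncall.sh for $SLACK_USER"; }

remediate() {
  if "$@"; then
    echo "primary action ok"
  else
    check_logs
    rollback_config
    notify_oncall
  fi
}

remediate false   # simulate reboot-device.sh failing
```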
The Rough Edges — And Why I Still Deployed It
Let’s be honest: this isn’t polished. I’ve been running it for 2 weeks across 42 devices (my lab + staging), and here’s what’s rough:
- No built-in secrets manager: API keys live in `docker-compose.yml` env vars. You must use `dotenv` or HashiCorp Vault + n8n's HTTP node to inject at runtime. I use `vault kv get -field=anthropic_api_key secret/chatops`.
- SSH key management is manual: no `ssh-agent` forwarding or `~/.ssh/config` auto-load. You must `ssh-add` keys on the host before `docker compose up`. I added a `pre-start.sh` that runs `ssh-add -D && ssh-add ~/.ssh/id_ed25519_chatops`.
- No metrics exporter: no Prometheus `/metrics` scrape configured out of the box. n8n can expose `/metrics`, but the repo doesn't set it up. I added a `prometheus.yml` scrape config for `n8n:5678/metrics`.
- Claude Code timeouts are aggressive: the default timeout is 8s. On a slow API call it fails silently and falls back to GPT-4o. I patched `agents/llm/call-claude.sh` to retry 2x with `sleep 2`.
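That timeout patch amounts to a generic retry wrapper. A sketch of what mine looks like (the real `call-claude.sh` interface isn't shown in the repo excerpt, so the wrapped command here is a stand-in):

```shell
#!/bin/sh
# Retry a flaky command up to N extra times with a fixed 2s pause,
# mirroring the "retry 2x with sleep 2" patch described above.
# Usage: retry 2 some_command arg1 arg2

retry() {
  retries="$1"; shift
  attempt=0
  until "$@"; do
    attempt=$((attempt + 1))
    if [ "$attempt" -gt "$retries" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "retry $attempt/$retries after failure" >&2
    sleep 2
  done
}
```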
But here’s why I kept it: it works. My /status all command returns a live, colorized, Markdown-formatted table with uptime, last config hash, and TLS cert expiry — all rendered by Claude Code parsing curl -I output and GPT-4o summarizing. My /deploy app-v2.4.1 triggers git pull, ansible-playbook, curl -X POST /health, and grafana-snapshot — all in one flow.
And the devices.json inventory? It’s versioned in Git. Every agent run creates a commit: git commit -m "reboot-device.sh: 10.10.20.42 (prod) — initiated by @kyriakos". That’s not “observability” — that’s forensic-grade infrastructure provenance.
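The provenance mechanism is just a commit per agent run. A sketch of such a wrapper (the commit-message format is quoted from the article; the wrapper function itself is my assumption about how the repo wires it):

```shell
#!/bin/sh
# Forensic-grade provenance sketch: after each agent run, commit the
# inventory with agent, target, environment, and initiating user.
# record_run is a hypothetical helper, not a script from the repo.

record_run() {
  agent="$1"; target="$2"; env="$3"; user="$4"
  git add devices.json
  git commit -q -m "$agent: $target ($env) — initiated by $user"
}
```

Every run then shows up in `git log` as an attributable, timestamped event, which is what makes the audit trail reconstructible after the fact.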
Final Verdict: Deploy It — But Not for Your CIO’s Dashboard
Is agentic-chatops production-ready? For a solo operator managing 137 devices — absolutely. For a team of 12? Only if you own the n8n instance, version your ./agents/ directory, and treat every .sh file like infrastructure-as-code.
It’s not a replacement for your existing observability stack. It’s the glue that lets you act on that stack — safely, auditably, and reproducibly.
The GitHub repo has 93 stars, 7 open issues (all about docs or edge cases — zero “crash on startup” bugs), and the README.md is 90% working curl examples and n8n webhook setup screenshots. No fluff. No roadmap PDF. Just Shell, n8n, and two LLMs doing what they’re good at.
TL;DR: If you’ve ever typed ssh prod-db-01 && sudo systemctl restart nginx && exit and then forgotten to check the logs, this is your upgrade path. It won’t replace your CI/CD. But it will replace 37 Slack messages, 4 tmux panes, and one very tired human.
Go clone it. Run ./dev-setup.sh. Break something. Then fix it — and push the fix. Because that’s how agentic systems get better: one Shell script, one n8n node, one git commit at a time.
```sh
git clone https://github.com/papadopouloskyriakos/agentic-chatops.git
cd agentic-chatops
cp env.example .env
nano .env   # set keys, domain, etc.
docker compose up -d
```
Then /reboot 10.10.20.42 staging — and watch the logs in real time, with full traceability.
You’ll feel like a wizard. Or at least, slightly less like a sysadmin who Googles “bash check if ssh host is up” for the 4,382nd time.