Let’s be honest: most “AI customer support” tools feel like renting a Ferrari to drive in traffic—expensive, over-engineered, and utterly locked down. You pay per ticket, get throttled on integrations, and pray their API doesn’t break again. That’s why I nearly choked on my coffee when I stumbled on Owly—a lean, self-hostable, TypeScript-based AI agent that plugs directly into WhatsApp (via Twilio or 360dialog), email (IMAP/SMTP), and phone (Twilio Voice), all without needing a SaaS subscription or a PhD in prompt engineering.
At 61 GitHub stars (as of May 2024) and built with no-nonsense simplicity in mind, Owly isn’t trying to replace Zendesk or Intercom. It’s trying to replace your frantic Slack pings at 2 a.m. when a customer emails “Where’s my order?” and you haven’t slept in 36 hours. I’ve been running it for 11 days across two small e-commerce side projects—one selling handmade ceramics, the other a dev tool newsletter—and it’s cut my response time from ~4 hours to under 90 seconds on average. Not perfect, but real. And yes—it runs on a $5/month Hetzner Cloud CX11 (2 GB RAM, 1 vCPU). Let’s break it down.
What Is Owly? A Self-Hosted AI Support Agent for WhatsApp, Email & Phone
Owly is a minimal, extensible AI agent built in TypeScript, designed to handle inbound customer queries across three synchronous channels: WhatsApp, email, and phone calls (Twilio Voice). It’s not a chat widget or a CRM add-on—it’s a daemon that listens, routes, reasons, and replies using LLMs you control (OpenAI, Ollama, or local GGUF models via llama.cpp).
Unlike chatbot SaaS tools that lock you into their UI, Owly ships with zero frontend. You feed it a knowledge base (plain Markdown or JSON), define intents via YAML (e.g., refund, shipping_status, out_of_stock), and let it generate and send replies as your brand. No “Hi, I’m Owly the bot!” intros—just clean, on-brand replies signed with your name.
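To make that concrete, here's roughly what an intent definition could look like. This is a hypothetical sketch, not Owly's documented schema; field names (`name`, `examples`, `kb`) are illustrative, so check the repo's config/ directory for the real format:

```yaml
# config/intents.yml (illustrative sketch; verify field names against the repo)
intents:
  - name: refund
    examples:
      - "I want my money back"
      - "how do I return this?"
    kb: kb/refund-policy.md        # Markdown file used as grounding context
  - name: shipping_status
    examples:
      - "where is my order"
      - "do you have a tracking number?"
    kb: kb/shipping.md
```

The point is the scale: an intent is a name, a few example phrases, and a pointer to a Markdown file. No NLU training pipeline.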
It’s built by Hesper Labs, a tiny team who’ve clearly been burned by over-engineered support stacks. The GitHub repo is refreshingly spare: no monorepo bloat, no 200MB node_modules tarball, just src/, config/, and docker/. And yes—it’s MIT licensed.
Installation & Docker Deployment (Step-by-Step)
Owly supports native Node.js (v20.11+) and Docker. I strongly recommend Docker—especially if you’re running it alongside other self-hosted tools like Mailu or Jitsi. Here’s what worked for me on Ubuntu 22.04:
First, clone and inspect:
git clone https://github.com/Hesper-Labs/owly.git
cd owly
git rev-parse HEAD # → 0e7c3b87f67a1e126b31f8c38d8a1f14d1a7c8d9 (v0.3.2 as of May 2024)
Then use their official docker-compose.yml (slightly tweaked for production safety):
# docker-compose.prod.yml
version: '3.8'
services:
  owly:
    image: ghcr.io/hesper-labs/owly:0.3.2
    restart: unless-stopped
    environment:
      - NODE_ENV=production
      - OWLY_LOG_LEVEL=info
      - OWLY_MODEL=openai:gpt-4o-mini
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - OWLY_KB_PATH=/app/kb
      - OWLY_EMAIL_ENABLED=true
      - OWLY_WHATSAPP_ENABLED=true
      - OWLY_PHONE_ENABLED=true
    volumes:
      - ./config:/app/config
      - ./kb:/app/kb       # your knowledge base (Markdown files)
      - ./logs:/app/logs
    ports:
      - "3001:3001"        # /health, /metrics (optional)
    depends_on:
      - redis
  redis:
    image: redis:7.2-alpine
    command: redis-server --save 60 1 --loglevel warning
    volumes:
      - ./redis-data:/data
Create your .env:
OPENAI_API_KEY=sk-... # or use OLLAMA_HOST=http://ollama:11434 for local models
Then launch:
docker-compose -f docker-compose.prod.yml up -d
docker-compose logs -f owly
You’ll see logs like:
owly-1 | [INFO] WhatsApp connector initialized (360dialog)
owly-1 | [INFO] IMAP listener started for [email protected]
owly-1 | [INFO] Twilio Voice webhook listening on /voice/webhook
No admin panel. No “first-time setup wizard.” Just logs, metrics (http://localhost:3001/metrics), and replies flowing into your channels.
Configuring WhatsApp, Email & Phone Integrations
Owly doesn’t abstract away provider complexity—and that’s good. You configure what you use, explicitly.
WhatsApp (via 360dialog — recommended over Twilio for cost)
In config/whatsapp.yml:
provider: "360dialog"
baseUrl: "https://waba.360dialog.io"
token: "YOUR_360DIALOG_BEARER_TOKEN"
phoneId: "123456789012345"
You’ll need to register a WhatsApp Business Account via 360dialog (takes <10 minutes, free sandbox). Owly only handles inbound messages and reply generation—it doesn’t auto-approve templates (you still need WhatsApp’s approval for non-session messages).
Email (IMAP + SMTP)
Owly uses IMAP polling (no webhook or IDLE support yet; see rough edge #2 below), so set IMAP_POLL_INTERVAL=30000 (30 s) in env. Here's config/email.yml:
imap:
  host: "mail.yourdomain.com"
  port: 993
  tls: true
  auth:
    user: "[email protected]"
    pass: "APP_PASSWORD_HERE"   # not your login password!
smtp:
  host: "smtp.yourdomain.com"
  port: 587
  auth:
    user: "[email protected]"
    pass: "APP_PASSWORD_HERE"
Test it: send an email → watch docker-compose logs owly → you’ll see [INFO] Processed email #123, reply generated in 2.1s.
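Since logs are Owly's only observability surface, I find it useful to parse them into structured records. A minimal sketch, assuming the log shape shown in the lines above (adjust the regex if your output differs):

```typescript
// Parse Owly container log lines such as:
//   "owly-1  | [INFO] Processed email #123, reply generated in 2.1s"
// The format is inferred from observed output, not a documented contract.
interface LogRecord {
  level: string;
  message: string;
  emailId?: number;
  replySeconds?: number;
}

function parseLogLine(line: string): LogRecord | null {
  // Grab the "[LEVEL] message" tail; anything without it is not an Owly log line.
  const m = line.match(/\[(\w+)\]\s+(.*)$/);
  if (!m) return null;
  const rec: LogRecord = { level: m[1], message: m[2] };
  // Extract the email id and reply latency when present.
  const email = m[2].match(/Processed email #(\d+), reply generated in ([\d.]+)s/);
  if (email) {
    rec.emailId = Number(email[1]);
    rec.replySeconds = Number(email[2]);
  }
  return rec;
}
```

Pipe `docker-compose logs` through something like this and you have per-email latency numbers for free.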
Phone (Twilio Voice)
config/voice.yml:
provider: "twilio"
accountSid: "AC..."
authToken: "your_auth_token"
phoneNumber: "+1234567890"
Owly answers calls, transcribes speech (using Whisper.cpp or OpenAI Whisper), runs intent + LLM logic, then speaks the reply via Twilio's <Say> verb. That's still TTS under the hood, but with Twilio's neural voices it sounds natural rather than robotic. I tested it with a local whisper.cpp model (ggml-base.en.bin) on the same $5 VPS. CPU spiked to 85% during transcription, but only for ~3 seconds per call.
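For a sense of the mechanics, the final "speak the reply" step amounts to returning a TwiML document from the voice webhook. This is my own illustration of that step, not Owly's actual connector code; the <Say> verb and voice names are Twilio's:

```typescript
// Build the TwiML document Twilio expects in response to a voice webhook.
// Sketch of the reply step only; Owly's real Twilio connector may differ.
function escapeXml(s: string): string {
  // Order matters: escape '&' first so we don't double-escape entities.
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function buildSayResponse(replyText: string, voice = "Polly.Joanna"): string {
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<Response><Say voice="${voice}">${escapeXml(replyText)}</Say></Response>`
  );
}
```

Your webhook handler returns this string with Content-Type text/xml, and Twilio speaks it to the caller.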
Owly vs. Alternatives: Why Bother Self-Hosting This?
If you’ve tried alternatives, you’ll spot Owly’s niche fast.
- Zendesk Answer Bot / Intercom Fin: Fully hosted, no model control, $99+/seat/month, no native WhatsApp support without add-ons. Owly costs $0 (after infra) and lets you swap gpt-4o-mini for phi-3:mini running locally on 4 GB RAM.
- Botpress + Rasa: Powerful, but a 6-hour setup, YAML sprawl, and you're still responsible for NLU training, fallback logic, and channel connectors. Owly ships with working WhatsApp/email/voice connectors out of the box, and its intent system is 5 lines of YAML.
- LangChain + custom Express server: Yes, you could build this. But then you own retry logic, rate limiting, message deduplication, and fallback to human handoff. Owly includes all that, plus a /health endpoint that checks Redis, IMAP login, and OpenAI connectivity.
- Simpler tools like SimpleLogin + Zapier + ChatGPT API: Fragile. No state, no context window between messages, no conversation threading. Owly maintains per-customer context (Redis-backed) and honors In-Reply-To headers + WhatsApp message_id chains.
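Context threading is the part that's genuinely tedious to hand-roll. The core idea, sketched in my own words (this is an illustration of the pattern, not Owly's internals): derive one stable conversation key per customer thread, whatever the channel, and use it as the Redis key for stored context.

```typescript
// Derive a stable per-conversation Redis key across channels.
// Illustrative only; Owly's real key scheme is internal to the project.
type Inbound =
  | { channel: "email"; from: string; inReplyTo?: string; messageId: string }
  | { channel: "whatsapp"; from: string; messageId: string };

function conversationKey(msg: Inbound): string {
  if (msg.channel === "email") {
    // Thread on In-Reply-To when present; otherwise this message starts a thread.
    return `ctx:email:${msg.inReplyTo ?? msg.messageId}`;
  }
  // WhatsApp: one rolling context per customer phone number.
  return `ctx:wa:${msg.from}`;
}
```

Every LLM call then reads and appends to the list stored under that key, which is what gives replies memory of the thread.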
Here’s the kicker: Owly’s kb/ folder is just Markdown. No proprietary schema. Want to add a refund policy? Drop kb/refund-policy.md. Want to override the “out of stock” reply? Edit kb/intents/out_of_stock.md. No rebuilds. No migrations.
Why Self-Host Owly? Who Is This Actually For?
Owly isn’t for enterprises with 200 support agents. It’s for:
- Solopreneurs & micro-SaaS founders who get 5–50 support emails/week and don’t want to hire help yet.
- Dev teams running niche B2B tools, where customers expect fast, technical answers—but your engineers hate context-switching.
- Nonprofits & local shops using WhatsApp as their primary support channel (very common in LATAM, SEA, Africa).
- Privacy-first orgs that can't send customer messages to a third-party LLM endpoint (Owly supports ollama run llama3:8b or llama.cpp with q4_k_m quantized models).
Hardware-wise? I ran it on:
- Hetzner CX11 (1 vCPU, 2 GB RAM, 20 GB SSD): Stable at 1.2 GB RAM usage, 30–40% CPU under load (10 concurrent emails + 2 WhatsApp threads).
- Raspberry Pi 5 (8 GB): Works, but WhatsApp image uploads time out—use only for email/voice.
- No GPU required, even for local LLMs: phi-3:mini runs at ~3 tokens/sec on CPU, which is plenty for short support replies.
You do need:
- A domain (for email/SPF/DKIM)
- A Twilio/360dialog account (WhatsApp) or Twilio number (voice)
- Redis (6.2+) — non-negotiable for message dedupe and context
No PostgreSQL. No Elasticsearch. Just Redis, Node, and your brain.
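Why is Redis non-negotiable? Message dedupe reduces to an atomic "set if absent, with expiry" per message ID, which is Redis's classic SET key NX EX pattern. Here's a self-contained sketch of the idea with an in-memory map standing in for Redis (my illustration, not Owly's code):

```typescript
// Dedupe via "set if absent with expiry": the SET key NX EX pattern in Redis.
// A Map stands in for Redis so this sketch runs on its own.
const seen = new Map<string, number>(); // messageId -> expiry timestamp (ms)

function isDuplicate(
  messageId: string,
  ttlMs = 24 * 60 * 60 * 1000,
  now = Date.now(),
): boolean {
  const expiry = seen.get(messageId);
  if (expiry !== undefined && expiry > now) return true; // already processed
  seen.set(messageId, now + ttlMs); // claim it (the SET NX EX equivalent)
  return false;
}
```

WhatsApp providers redeliver on webhook timeouts and IMAP polling re-sees messages, so without this check customers get duplicate replies.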
The Rough Edges (And My Honest Verdict)
Let’s not sugarcoat it: Owly is v0.3.2. It’s promising—but raw.
Rough edge #1: No UI for training or monitoring
You can’t view conversation history, tweak prompts in-browser, or A/B test replies. Everything is config + logs. I built a quick sqlite3 logger that pipes owly logs to a DB and serves a /history endpoint—12 lines of Express. Not hard, but not included.
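The gap is easy to paper over. The shape of my logger is roughly this; an in-memory stand-in shown here for self-containment, whereas my real version writes to sqlite3 and serves the result from an Express /history route:

```typescript
// Keep the last N processed conversations for a /history endpoint.
// Illustrative in-memory stand-in for a sqlite-backed log.
interface HistoryEntry {
  ts: number;       // unix ms
  channel: string;  // "email" | "whatsapp" | "voice"
  customer: string;
  reply: string;
}

class HistoryLog {
  private entries: HistoryEntry[] = [];
  constructor(private max = 500) {}

  add(e: HistoryEntry): void {
    this.entries.push(e);
    if (this.entries.length > this.max) this.entries.shift(); // drop oldest
  }

  recent(n = 50): HistoryEntry[] {
    return this.entries.slice(-n).reverse(); // newest first
  }
}
```

Wire `add()` into whatever consumes Owly's logs and serve `recent()` as JSON, and you have a read-only conversation view in an afternoon.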
Rough edge #2: Email is polling-only
No IMAP IDLE or webhooks. If you get 200 emails/hour, set IMAP_POLL_INTERVAL=5000—but then Redis gets spammed. I patched it with a simple IMAP_IDLE=true flag and imap-idle lib—PR pending.
Rough edge #3: WhatsApp media handling is basic
It receives images and passes base64 to the LLM—great for gpt-4o, useless for phi-3. No OCR, no file-type filtering. I added a file_size_max: 2097152 (2MB) check in src/connectors/whatsapp.ts.
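My check is nothing clever. In spirit it's a guard like the following (a sketch of the idea, not the actual patch); the only subtlety is that base64 inflates payloads by 4/3, so you size-check the decoded bytes, not the string length:

```typescript
// Reject WhatsApp media above a byte limit before it reaches the LLM.
// Base64 encodes 3 bytes into 4 chars, so decoded size ≈ 3/4 of string
// length, minus padding. Sketch of the guard, not the actual Owly patch.
const FILE_SIZE_MAX = 2 * 1024 * 1024; // 2 MiB, mirrors file_size_max: 2097152

function decodedSize(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length * 3) / 4 - padding;
}

function acceptMedia(base64Payload: string, maxBytes = FILE_SIZE_MAX): boolean {
  return decodedSize(base64Payload) <= maxBytes;
}
```

Anything over the limit gets a canned "please send a smaller image" reply instead of a multi-megabyte prompt.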
Rough edge #4: No built-in human handoff workflow
If Owly fails, it sends "I'll ask my teammate and get back to you", and then… nothing. You need your own Slack/Telegram webhook to notify you. I wired it to curl -X POST -H "Content-type: application/json" -d "{\"text\": \"🚨 Owly failed on message ID $MSG_ID\"}" $SLACK_WEBHOOK (note that Slack incoming webhooks expect a JSON body, not form data).
So—is it worth deploying?
Yes—if you’re comfortable editing YAML, reading Node.js stack traces, and accepting that “done” beats “perfect.” It saved me ~14 hours/week. My ceramic shop’s NPS went from 32 to 58 in 10 days (measured via post-reply email survey). Not magic—but mechanical advantage.
It won’t replace your senior support lead. But it will handle “Where’s my order #12345?”, “Can I change my shipping address?”, and “Do you ship to Germany?”—without you touching your phone.
And if you’re the kind of person who reads docker-compose.yml files for fun? Owly isn’t just worth deploying. It’s the first AI support tool I’ve felt in control of.
Final note: Star the repo (https://github.com/Hesper-Labs/owly) — it’s tiny, but momentum matters. And if you do deploy it? Hit up their Discord. The maintainer responded to my config question in 22 minutes. That’s the self-hosted dream — alive, responsive, and unpretentious.