CognoDB
A Cypher-compatible context graph database for AI agents, written in Go
cognodb.com · GitHub · Deploy · Docs
CognoDB is an open-source graph database that speaks the Bolt protocol (v5.0–5.4) and understands Cypher. Connect with any official Neo4j driver — Python, Go, JavaScript, Java, .NET — no code changes required.
Quick Start
The fastest way to get running:
docker run -p 7687:7687 ghcr.io/wexaai/cognodb:latest
Connect at bolt://localhost:7687 with username cognodb, password password.
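Scripts that start the container and connect right away may want to wait until the Bolt port accepts connections. The following is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, not part of CognoDB, and it checks only TCP reachability, not Bolt authentication.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Only the TCP handshake is performed; no Bolt traffic is sent.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 7687)` before creating a driver avoids a connection error while the server is still starting.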
Other install options
One-line installer (Linux & macOS binary)
curl -fsSL https://raw.githubusercontent.com/wexaai/cognodb/main/install.sh | bash
cognodb # in-memory (data lost on exit)
cognodb --storage local # persistent on-disk storage
On Linux, run as root to also install and enable a systemd service automatically.
Docker Compose (quick-start)
git clone https://github.com/wexaai/cognodb.git && cd cognodb
docker compose up -d
Docker Compose + MongoDB (production)
git clone https://github.com/wexaai/cognodb.git && cd cognodb
docker compose -f deployments/docker/docker-compose.yml up -d
Starts CognoDB backed by a MongoDB replica set. Data persists across restarts.
Go install
go install github.com/wexaai/cognodb/cmd/cognodb@latest
cognodb
Build from source
git clone https://github.com/wexaai/cognodb.git
cd cognodb && make build
./bin/cognodb
See Deployment Guide for systemd, Kubernetes/Helm, and all configuration options.
MCP Server (optional)
CognoDB has a built-in Model Context Protocol server. It is completely optional — the database runs normally without it. When enabled, any MCP-compatible AI assistant (Claude, ChatGPT, Cursor, …) can query and write to your graph in natural language.
The MCP server exposes two tools:
| Tool | What it does |
|---|---|
| schema | Returns all node labels, relationship types, indexes, and constraints. Call this first so the AI understands the data model. |
| query | Runs any Cypher statement — MATCH, CREATE, MERGE, DELETE, SET, aggregations, shortestPath, schema DDL — and returns {columns, rows, count}. |
Start the MCP server
# The Bolt server is NOT started in this mode; the process is a pure MCP server.
cognodb --mcp
# With persistent storage (graph survives restarts):
cognodb --mcp --storage local --data-dir ~/.cognodb
# HTTP/SSE transport — for web clients and ChatGPT.
# Bolt runs on :7687 as normal; MCP HTTP runs alongside on the chosen port.
cognodb --mcp-port 8811
# Both Bolt AND MCP HTTP at the same time:
cognodb --storage local --data-dir ~/.cognodb --mcp-port 8811
Connect Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows) and restart Claude:
{
  "mcpServers": {
    "cognodb": {
      "command": "cognodb",
      "args": ["--mcp", "--storage", "local", "--data-dir", "/Users/you/.cognodb"]
    }
  }
}
No local install? Use Docker instead:
{
  "mcpServers": {
    "cognodb": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "/Users/you/.cognodb:/data",
        "ghcr.io/wexaai/cognodb:latest",
        "--mcp", "--storage", "local", "--data-dir", "/data"
      ]
    }
  }
}
Once connected, Claude can store and retrieve structured knowledge, build knowledge graphs from your conversations, and answer questions by running Cypher against the graph.
Verify it works
# Start the MCP HTTP server
cognodb --mcp-port 8811 &
# Health check
curl http://localhost:8811/health
# List available tools
curl -s -X POST http://localhost:8811/messages \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
# Run a Cypher query through MCP
curl -s -X POST http://localhost:8811/messages \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"query","arguments":{"cypher":"CREATE (n:Person {name:\"Alice\"}) RETURN n.name"}}}'
See doc/mcp.md for the full HTTP/SSE protocol, ChatGPT plugin setup, and LLM integration test.
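The same `tools/call` request can be built programmatically, which avoids escaping the quotes inside the Cypher string by hand. This is a sketch using only the Python standard library; `build_tool_call` and `post_mcp` are hypothetical helper names, and `post_mcp` assumes the endpoint answers with a plain JSON body, as the curl examples above suggest.

```python
import json
import urllib.request


def build_tool_call(request_id: int, cypher: str) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the `query` tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "query", "arguments": {"cypher": cypher}},
    }


def post_mcp(payload: dict, url: str = "http://localhost:8811/messages") -> dict:
    """POST the payload as JSON and decode the reply (needs a running MCP HTTP server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# json.dumps takes care of escaping the inner double quotes.
payload = build_tool_call(2, 'CREATE (n:Person {name:"Alice"}) RETURN n.name')
print(json.dumps(payload))
```

With the server from the previous step running, `post_mcp(payload)` sends the request.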
Benchmarks
All numbers are from bench/results.json and bench/token_results.json, run on a single 8-core Linux host. Databases are co-located (no network hop). Full methodology and raw data are in bench/.
Startup & footprint
| | CognoDB (in-memory) | CognoDB (local) | Neo4j 5 | MongoDB 7 |
|---|---|---|---|---|
| Cold start | 7 ms | 411 ms | 17,051 ms | 1,608 ms |
| Idle RAM | 15 MB | 18 MB | ~3,500 MB | 183 MB |
CognoDB starts cold in 7 ms and idles at 15 MB — critical for agent workloads that spin up per-request or per-session.
Throughput & latency (10k nodes, 50k edges, degree-5 graph)
| | Load (edges/s) | 3-hop p50 | 3-hop p95 | Writes/s | Aggregation |
|---|---|---|---|---|---|
| CognoDB in-memory | 29,638 | 27.6 ms | 33.4 ms | 5,588 | 77.9 ms |
| CognoDB local | 8,447 | 108.4 ms | 143.4 ms | 2,875 | 252.5 ms |
| CognoDB sharded (3×) | 27,685 | 30.8 ms | 35.4 ms | 4,146 | 103.0 ms |
| CognoDB + MongoDB | 1,362 | 177.6 ms | 187.4 ms | 342 | 199.4 ms |
| Neo4j 5 (community) | 30,193 | 3.0 ms | 3.7 ms | 803 | 15.9 ms |
| MongoDB ($graphLookup) | 181,159 | 1.7 ms | 2.0 ms | 346 | 33.6 ms |
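Neo4j and MongoDB are mature engines with years of query optimisation. CognoDB's in-memory mode roughly matches Neo4j on load throughput (29,638 vs. 30,193 edges/s) and leads both engines on writes (5,588/s vs. 803/s and 346/s), but its 3-hop traversal is about 9× slower than Neo4j's (27.6 ms vs. 3.0 ms at p50). Closing that traversal gap is the active development focus.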
Token cost for AI agents (graph-retrieval vs. full-context dump)
Agents that query CognoDB for relevant context instead of dumping the whole corpus into the prompt use 62% to 99% fewer tokens in the runs below, and the retrieved context stays at roughly 2,700 to 3,600 tokens as the graph grows from 185 to 3,700 nodes.
| Graph size | Full-context tokens | CognoDB graph tokens | Reduction |
|---|---|---|---|
| 185 nodes | 9,498 | 3,562 | 62% |
| 462 nodes | 23,779 | 3,014 | 87% |
| 925 nodes | 47,882 | 3,107 | 94% |
| 1,850 nodes | 95,528 | 2,671 | 97% |
| 3,700 nodes | 202,285 | 2,668 | 99% |
Tokens are counted with the GPT-4 BPE tokenizer (tiktoken cl100k_base) over identical rendered text; every figure is measured, none is estimated. The graph-retrieval path fetches the 2-hop neighborhood around the query entities; the baseline dumps every node and edge.
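To see why the retrieved context stays small, consider what a 2-hop neighborhood is: a breadth-first expansion that stops two edges away from the query entities, so its size depends on local connectivity rather than on the total graph size. The sketch below illustrates the idea on a hypothetical toy graph in plain Python; it is not the benchmark's code, and `k_hop_neighborhood` is an illustrative name.

```python
from collections import deque


def k_hop_neighborhood(adjacency: dict, seeds: list, max_hops: int = 2) -> set:
    """Return every node reachable from `seeds` in at most `max_hops` edges."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


# Hypothetical toy graph (undirected, stored as adjacency lists).
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol", "erin"],
    "erin": ["dave"],
}

context = k_hop_neighborhood(graph, ["alice"], max_hops=2)
print(sorted(context))  # ['alice', 'bob', 'carol']
```

Only three of the five nodes end up in the context; adding more nodes beyond `erin` would not change the result.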
Real LLM agent benchmark (tool-use loop)
Measured with actual LLM calls via AWS Bedrock. The agent uses CognoDB as a run_cypher tool and issues real Cypher queries per question; the baseline dumps the full corpus into the prompt.
| Scenario | Entities | CognoDB tokens | Full-context tokens | Reduction | Cost saved |
|---|---|---|---|---|---|
| Single-agent Q&A | 100 | 2,600 in / 282 out | 11,744 in / 39 out | 76% | — |
| Single-agent Q&A | 500 | 2,657 in / 307 out | 59,003 in / 61 out | 95% | — |
| Single-agent Q&A | 1,000 | 2,538 in / 252 out | 117,684 in / 43 out | 98% | — |
| 100-agent workflow | 150 | 329k total | 2,025k total | 84% | $1.60 / run |
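The Reduction column can be reproduced from the token counts in the table. The sketch below assumes the percentage is taken over input plus output tokens combined, which is the reading that matches the published figures; `reduction_percent` is an illustrative helper name.

```python
def reduction_percent(graph_tokens: int, full_tokens: int) -> int:
    """Percentage of tokens saved relative to the full-context baseline, rounded."""
    return round(100 * (1 - graph_tokens / full_tokens))


# (entities, CognoDB in, CognoDB out, full-context in, full-context out) from the table above
rows = [
    (100, 2_600, 282, 11_744, 39),
    (500, 2_657, 307, 59_003, 61),
    (1_000, 2_538, 252, 117_684, 43),
]

for entities, g_in, g_out, f_in, f_out in rows:
    print(entities, reduction_percent(g_in + g_out, f_in + f_out))
# 100 76
# 500 95
# 1000 98

# 100-agent workflow: 329k vs. 2,025k total tokens
print(reduction_percent(329_000, 2_025_000))  # 84
```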
Run the benchmarks yourself:
# Token benchmark (no LLM calls, uses tiktoken)
pip install tiktoken neo4j
bin/cognodb &
python bench/token_benchmark.py
# Performance benchmark (requires running Neo4j and MongoDB for comparison)
go run ./bench --help
Features
- Full Bolt protocol — works with all official Neo4j drivers out of the box
- Cypher query language — CREATE, MATCH, MERGE, DELETE, SET, REMOVE, WITH, UNWIND, ORDER BY, SKIP, LIMIT, DISTINCT, and more
- Variable-length path traversal — [*1..5], shortestPath, allShortestPaths
- Schema operations — indexes (property, composite), constraints (UNIQUE, EXISTS, NODE KEY)
- ACID transactions — BEGIN, COMMIT, ROLLBACK with full isolation
- EXPLAIN / PROFILE — query plan introspection
- 40+ built-in functions — scalar, string, math, list, temporal, type coercion
- RBAC authentication — admin, architect, editor, reader, publisher roles
- Pluggable storage — in-memory (default), local on-disk, MongoDB
- Backup & restore — cognodb-dump / cognodb-restore CLI tools
- Neo4j migration — import from Neo4j CSV exports with neo2cognodb
Connect with Any Neo4j Driver
Python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("cognodb", "password"))

with driver.session() as session:
    # Create two people and a relationship between them.
    session.run("""
        CREATE (a:Person {name: 'Alice', age: 30})
        CREATE (b:Person {name: 'Bob', age: 25})
        CREATE (a)-[:KNOWS {since: 2020}]->(b)
    """)
    # Read the relationship back.
    result = session.run("""
        MATCH (a:Person)-[:KNOWS]->(b:Person)
        RETURN a.name AS from, b.name AS to
    """)
    for record in result:
        print(f"{record['from']} knows {record['to']}")

driver.close()
Go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

func main() {
	ctx := context.Background()
	driver, err := neo4j.NewDriverWithContext("bolt://localhost:7687",
		neo4j.BasicAuth("cognodb", "password", ""))
	if err != nil {
		log.Fatal(err)
	}
	defer driver.Close(ctx)
	session := driver.NewSession(ctx, neo4j.SessionConfig{})
	defer session.Close(ctx)
	// Create two nodes and a relationship.
	if _, err := session.Run(ctx, `CREATE (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'})`, nil); err != nil {
		log.Fatal(err)
	}
	// Read the names back.
	result, err := session.Run(ctx, `MATCH (n:Person) RETURN n.name AS name`, nil)
	if err != nil {
		log.Fatal(err)
	}
	for result.Next(ctx) {
		fmt.Println(result.Record().Values[0])
	}
	if err := result.Err(); err != nil {
		log.Fatal(err)
	}
}
JavaScript
const neo4j = require("neo4j-driver");

// Top-level await is not available in CommonJS modules, so wrap the calls.
async function main() {
  const driver = neo4j.driver(
    "bolt://localhost:7687",
    neo4j.auth.basic("cognodb", "password")
  );
  const session = driver.session();
  try {
    await session.run(
      "CREATE (a:Person {name: $name})-[:KNOWS]->(b:Person {name: $friend})",
      { name: "Alice", friend: "Bob" }
    );
    const result = await session.run("MATCH (n:Person) RETURN n.name AS name");
    result.records.forEach((r) => console.log(r.get("name")));
  } finally {
    await session.close();
    await driver.close();
  }
}

main().catch(console.error);
Java
import org.neo4j.driver.*;

try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
         AuthTokens.basic("cognodb", "password"));
     Session session = driver.session()) {
    session.run("CREATE (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'})");
    Result result = session.run("MATCH (n:Person) RETURN n.name AS name");
    while (result.hasNext()) {
        System.out.println(result.next().get("name").asString());
    }
}
Tip: CognoDB also accepts neo4j as the username for drop-in compatibility with existing tooling and connection strings.
Cypher Support
CognoDB supports a broad subset of the Cypher query language. See Cypher Compatibility for the full matrix.
| Category | Features |
|---|---|
| Read | MATCH, OPTIONAL MATCH, WHERE, RETURN, WITH, UNWIND, ORDER BY, SKIP, LIMIT, DISTINCT |
| Write | CREATE, MERGE (ON CREATE/ON MATCH SET), DELETE, DETACH DELETE, SET, REMOVE |
| Paths | Variable-length [*1..5], shortestPath, allShortestPaths |
| Aggregation | COUNT, SUM, AVG, MIN, MAX, COLLECT (with DISTINCT) |
| Schema | CREATE/DROP INDEX, CREATE/DROP CONSTRAINT, SHOW INDEXES, SHOW CONSTRAINTS |
| Transactions | BEGIN, COMMIT, ROLLBACK |
| Analysis | EXPLAIN, PROFILE |
| Expressions | CASE, IN, CONTAINS, STARTS WITH, ENDS WITH, IS NULL, list/map literals, parameters |
Architecture
Neo4j Driver ──► Bolt Server (PackStream codec, auth, Bolt 5.0–5.4)
                      │
                      ▼
                 Cypher Parser (hand-written recursive descent, ~14.5 µs/op)
                      │
                      ▼
                 AST + Semantic Analysis
                      │
                      ▼
                 Optimizer (rule-based + cost-based)
                      │
                      ▼
                 Execution Engine (Volcano pull-based iterators)
                      │
                      ▼
                 Storage (in-memory │ local on-disk │ MongoDB │ sharded)
See Architecture for design decisions and storage schema details.
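For readers unfamiliar with the Volcano model named in the diagram: each operator exposes a `next()` method and pulls rows from its child on demand, so a LIMIT can stop the whole pipeline early. The following is a generic illustration in Python, not CognoDB's Go implementation; the operator classes and the sample rows are hypothetical.

```python
class Scan:
    """Leaf operator: yields rows from an in-memory list one at a time."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def next(self):
        return next(self._rows, None)  # None signals end of stream


class Filter:
    """Pulls from its child until a row satisfies the predicate."""

    def __init__(self, child, predicate):
        self._child = child
        self._predicate = predicate

    def next(self):
        while (row := self._child.next()) is not None:
            if self._predicate(row):
                return row
        return None


class Limit:
    """Stops pulling from its child after `count` rows."""

    def __init__(self, child, count):
        self._child = child
        self._remaining = count

    def next(self):
        if self._remaining <= 0:
            return None
        self._remaining -= 1
        return self._child.next()


# Roughly: MATCH (p:Person) WHERE p.age > 26 RETURN p.name LIMIT 2
people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Carol", "age": 41},
    {"name": "Dave", "age": 35},
]
plan = Limit(Filter(Scan(people), lambda row: row["age"] > 26), 2)

names = []
while (row := plan.next()) is not None:
    names.append(row["name"])
print(names)  # ['Alice', 'Carol']
```

Because rows are pulled one at a time, the scan never reaches `Dave` once the limit of two is satisfied.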
Docker Image
The official image is published to GitHub Container Registry and updated automatically on every release:
# Latest release
docker run -p 7687:7687 ghcr.io/wexaai/cognodb:latest
# Specific version
docker run -p 7687:7687 ghcr.io/wexaai/cognodb:0.1.0
# With persistent local storage
docker run -p 7687:7687 -v cognodb_data:/data \
ghcr.io/wexaai/cognodb:latest --storage local --data-dir /data
Available tags: latest, main, and semver tags (0.1.0, 0.1, 0).
Platforms: linux/amd64, linux/arm64.
CLI Tools
| Tool | Description |
|---|---|
| cognodb | Main database server |
| cognodb-dump | Export graph data to JSON Lines |
| cognodb-restore | Import graph data from a dump |
| neo2cognodb | Migrate from Neo4j CSV exports |
# Export
cognodb-dump --out ./backup/
# Import
cognodb-restore --in ./backup/
# Migrate from Neo4j
neo2cognodb --nodes nodes.csv --rels rels.csv --out ./backup/
Configuration
| Option | Default | Description |
|---|---|---|
| --host | 0.0.0.0 | Bolt listen address |
| --port | 7687 | Bolt listen port |
| --storage | inmemory | Backend: inmemory, local, or mongo |
| --data-dir | ./data/cognodb | Data directory for local backend |
| --mongo-uri | — | MongoDB connection string |
| --sharded | false | Enable sharded mode |
| --num-shards | 3 | Number of shards |
| --version | — | Print version and exit |
See Configuration Reference for query limits, cache tuning, and security settings.
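When CognoDB is launched from a script or test harness, the options above can be assembled into an argument list. This is a sketch only: `build_argv` is a hypothetical helper, the flag names come from the table, and the handling of boolean options (flag present when true, omitted when false) is an assumption about the CLI rather than documented behaviour.

```python
def build_argv(options: dict) -> list:
    """Turn an options mapping into a `cognodb` argument list."""
    argv = ["cognodb"]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if isinstance(value, bool):
            if value:
                argv.append(flag)  # assumed: boolean flags take no value
        else:
            argv.extend([flag, str(value)])
    return argv


argv = build_argv({
    "host": "127.0.0.1",
    "port": 7687,
    "storage": "local",
    "data_dir": "./data/cognodb",
    "sharded": False,  # omitted from the command line
})
print(argv)
# ['cognodb', '--host', '127.0.0.1', '--port', '7687', '--storage', 'local', '--data-dir', './data/cognodb']
```

The resulting list can be passed to `subprocess.Popen(argv)` to start the server.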
Documentation
- Getting Started — installation, first queries, driver examples
- Deployment Guide — installer, Docker, Docker Compose, systemd, Kubernetes/Helm
- MCP Server — connect Claude Desktop, ChatGPT, and any MCP-compatible AI tool
- Developer Guide — complete Cypher tutorial, agent integration, operations
- Architecture — pipeline, design decisions, storage schema
- Configuration — server, storage, query limits, cache, security
- MongoDB Backend — production persistence backend, config, local replica set, operations
- Cypher Compatibility — supported features matrix
- Architecture Decision Records — rationale behind key design choices
Used in Production
CognoDB powers the personal and enterprise knowledge graphs at Wexa.ai — an AI workspace that builds a persistent, queryable graph of your work across tools, people, and projects. Every relationship, document reference, and workflow dependency is stored as a graph node and traversed in real time as agents answer questions and take actions.
If you're using CognoDB in production, feel free to open a PR to add yourself here.
Security
To report a security vulnerability, please see SECURITY.md.