CognoDB
A Cypher-compatible context graph database for AI agents, written in Go

cognodb.com · GitHub · Deploy · Docs


CognoDB is an open-source graph database that speaks the Bolt protocol (v5.0–5.4) and understands Cypher. Connect with any official Neo4j driver — Python, Go, JavaScript, Java, .NET — no code changes required.

Quick Start

The fastest way to get running:

docker run -p 7687:7687 ghcr.io/wexaai/cognodb:latest

Connect at bolt://localhost:7687 with username cognodb, password password.

Other install options

One-line installer (Linux & macOS binary)
curl -fsSL https://raw.githubusercontent.com/wexaai/cognodb/main/install.sh | bash

cognodb                   # in-memory (data lost on exit)
cognodb --storage local   # persistent on-disk storage

On Linux, run as root to also install and enable a systemd service automatically.

Docker Compose (quick-start)
git clone https://github.com/wexaai/cognodb.git && cd cognodb
docker compose up -d

Docker Compose + MongoDB (production)
git clone https://github.com/wexaai/cognodb.git && cd cognodb
docker compose -f deployments/docker/docker-compose.yml up -d

Starts CognoDB backed by a MongoDB replica set. Data persists across restarts.

Go install
go install github.com/wexaai/cognodb/cmd/cognodb@latest
cognodb

Build from source
git clone https://github.com/wexaai/cognodb.git
cd cognodb && make build
./bin/cognodb

See Deployment Guide for systemd, Kubernetes/Helm, and all configuration options.


MCP Server (optional)

CognoDB has a built-in Model Context Protocol server. It is completely optional — the database runs normally without it. When enabled, any MCP-compatible AI assistant (Claude, ChatGPT, Cursor, …) can query and write to your graph in natural language.

The MCP server exposes two tools:

| Tool | What it does |
|------|--------------|
| schema | Returns all node labels, relationship types, indexes, and constraints. Call this first so the AI understands the data model. |
| query | Runs any Cypher statement (MATCH, CREATE, MERGE, DELETE, SET, aggregations, shortestPath, schema DDL) and returns {columns, rows, count}. |
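The {columns, rows, count} shape returned by the query tool is easier to consume once each row is paired with its column names. The sketch below shows one way to do that. It assumes that rows is a list of positional lists aligned with columns, which the table does not spell out; rows_to_dicts and the example payload are hypothetical, made up for illustration.

```python
from typing import Any


def rows_to_dicts(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Pair each row with the column names so values can be read by name."""
    columns = result["columns"]
    return [dict(zip(columns, row)) for row in result["rows"]]


# Hypothetical payload shaped like {columns, rows, count}; not real server output.
example = {
    "columns": ["name", "age"],
    "rows": [["Alice", 30], ["Bob", 25]],
    "count": 2,
}

records = rows_to_dicts(example)
print(records[0]["name"])  # Alice
```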

Start the MCP server

# The Bolt server is NOT started in this mode; the process is a pure MCP server.
cognodb --mcp

# With persistent storage (graph survives restarts):
cognodb --mcp --storage local --data-dir ~/.cognodb

# HTTP/SSE transport — for web clients and ChatGPT.
# Bolt runs on :7687 as normal; MCP HTTP runs alongside on the chosen port.
cognodb --mcp-port 8811

# Bolt and MCP HTTP together, with persistent storage:
cognodb --storage local --data-dir ~/.cognodb --mcp-port 8811

Connect Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows) and restart Claude:

{
  "mcpServers": {
    "cognodb": {
      "command": "cognodb",
      "args": ["--mcp", "--storage", "local", "--data-dir", "/Users/you/.cognodb"]
    }
  }
}

No local install? Use Docker instead:

{
  "mcpServers": {
    "cognodb": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "/Users/you/.cognodb:/data",
        "ghcr.io/wexaai/cognodb:latest",
        "--mcp", "--storage", "local", "--data-dir", "/data"
      ]
    }
  }
}

Once connected, Claude can store and retrieve structured knowledge, build knowledge graphs from your conversations, and answer questions by running Cypher against the graph.
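If the config file already lists other MCP servers, the cognodb entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge for both variants shown above. The helper names (cognodb_entry, add_cognodb, update_config_file) are hypothetical and not part of CognoDB; the commands and arguments are copied from the two JSON snippets.

```python
import json
from pathlib import Path


def cognodb_entry(data_dir: str, use_docker: bool = False) -> dict:
    """Build the "cognodb" server entry in the native or the Docker form."""
    if use_docker:
        return {
            "command": "docker",
            "args": [
                "run", "--rm", "-i",
                "-v", f"{data_dir}:/data",
                "ghcr.io/wexaai/cognodb:latest",
                "--mcp", "--storage", "local", "--data-dir", "/data",
            ],
        }
    return {
        "command": "cognodb",
        "args": ["--mcp", "--storage", "local", "--data-dir", data_dir],
    }


def add_cognodb(config: dict, data_dir: str, use_docker: bool = False) -> dict:
    """Return a copy of config with the cognodb entry added; other servers are kept."""
    servers = dict(config.get("mcpServers", {}))
    servers["cognodb"] = cognodb_entry(data_dir, use_docker)
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, data_dir: str, use_docker: bool = False) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_cognodb(config, data_dir, use_docker), indent=2))
```

Calling update_config_file(Path(...), "/Users/you/.cognodb") with the platform-specific path given above would rewrite the file in place; Claude still needs a restart afterwards.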

Verify it works

# Start the MCP HTTP server
cognodb --mcp-port 8811 &

# Health check
curl http://localhost:8811/health

# List available tools
curl -s -X POST http://localhost:8811/messages \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# Run a Cypher query through MCP
curl -s -X POST http://localhost:8811/messages \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"query","arguments":{"cypher":"CREATE (n:Person {name:\"Alice\"}) RETURN n.name"}}}'
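The same two requests can be issued from a script using only the Python standard library. This is a sketch rather than an official client: the helper names are hypothetical, and post assumes the endpoint answers with a single JSON body, which may not hold if the server replies over SSE.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8811/messages"  # endpoint used by the curl commands above


def jsonrpc(method: str, params: dict, request_id: int) -> dict:
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def tool_call(name: str, arguments: dict, request_id: int) -> dict:
    """Build a tools/call request for one MCP tool."""
    return jsonrpc("tools/call", {"name": name, "arguments": arguments}, request_id)


def post(body: dict, url: str = MCP_URL) -> dict:
    """POST a request as JSON and decode the reply (needs a running server)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)  # assumption: the reply is plain JSON, not an SSE stream


list_tools = jsonrpc("tools/list", {}, 1)
create_alice = tool_call(
    "query", {"cypher": 'CREATE (n:Person {name:"Alice"}) RETURN n.name'}, 2
)
print(json.dumps(create_alice))
```

With the server from the first command running, post(list_tools) and post(create_alice) send the requests; building the bodies needs no server at all.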

See doc/mcp.md for the full HTTP/SSE protocol, ChatGPT plugin setup, and LLM integration test.


Benchmarks

All numbers are from bench/results.json and bench/token_results.json, run on a single 8-core Linux host. Databases are co-located (no network hop). Full methodology and raw data are in bench/.

Startup & footprint

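| Metric | CognoDB (in-memory) | CognoDB (local) | Neo4j 5 | MongoDB 7 |
|--------|---------------------|-----------------|---------|-----------|
| Cold start | 7 ms | 411 ms | 17,051 ms | 1,608 ms |
| Idle RAM | 15 MB | 18 MB | ~3,500 MB | 183 MB |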

CognoDB starts cold in 7 ms and idles at 15 MB — critical for agent workloads that spin up per-request or per-session.

Throughput & latency (10k nodes, 50k edges, degree-5 graph)

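| Configuration | Load (edges/s) | 3-hop p50 | 3-hop p95 | Writes/s | Aggregation |
|---------------|----------------|-----------|-----------|----------|-------------|
| CognoDB in-memory | 29,638 | 27.6 ms | 33.4 ms | 5,588 | 77.9 ms |
| CognoDB local | 8,447 | 108.4 ms | 143.4 ms | 2,875 | 252.5 ms |
| CognoDB sharded (3×) | 27,685 | 30.8 ms | 35.4 ms | 4,146 | 103.0 ms |
| CognoDB + MongoDB | 1,362 | 177.6 ms | 187.4 ms | 342 | 199.4 ms |
| Neo4j 5 (community) | 30,193 | 3.0 ms | 3.7 ms | 803 | 15.9 ms |
| MongoDB ($graphLookup) | 181,159 | 1.7 ms | 2.0 ms | 346 | 33.6 ms |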

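Neo4j and MongoDB are mature engines with years of query optimisation. Against them, CognoDB's in-memory mode roughly matches Neo4j on load throughput (29,638 vs. 30,193 edges/s), although MongoDB bulk-loads about six times faster, and it leads both on write throughput (5,588 writes/s vs. 803 and 346). It trails on traversal: 3-hop p50 latency is 27.6 ms, against 3.0 ms for Neo4j and 1.7 ms for MongoDB. Closing that gap is the active development focus.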

Token cost for AI agents (graph-retrieval vs. full-context dump)

Agents that query CognoDB for relevant context, instead of dumping the whole corpus into the prompt, used 62–99% fewer tokens in the measurements below, and the graph-retrieval cost stayed roughly flat (about 2,700 to 3,600 tokens) as the graph grew.

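| Graph size | Full-context tokens | CognoDB graph tokens | Reduction |
|------------|---------------------|----------------------|-----------|
| 185 nodes | 9,498 | 3,562 | 62% |
| 462 nodes | 23,779 | 3,014 | 87% |
| 925 nodes | 47,882 | 3,107 | 94% |
| 1,850 nodes | 95,528 | 2,671 | 97% |
| 3,700 nodes | 202,285 | 2,668 | 99% |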

Tokens are counted with the GPT-4 BPE tokenizer (tiktoken cl100k_base) over identical rendered text; every figure is measured, not estimated. The graph-retrieval path fetches the 2-hop neighborhood around the query entities; the baseline dumps every node and edge.
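To make the retrieval side concrete, here is a small breadth-first sketch of what a 2-hop neighborhood is. It is an illustration on an in-memory adjacency list, not the benchmark's own code and not how CognoDB executes the traversal; the function name and the toy graph are hypothetical.

```python
from collections import deque


def k_hop_neighborhood(
    adjacency: dict[str, list[str]], seeds: list[str], k: int = 2
) -> set[str]:
    """Return every node reachable from any seed in at most k hops (breadth-first)."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


# Toy undirected chain a - b - c - d - e, plus an unrelated pair x - y.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "e"], "e": ["d"], "x": ["y"], "y": ["x"],
}

context = k_hop_neighborhood(graph, ["a"])
print(sorted(context))  # ['a', 'b', 'c']
```

Only the nodes near the query entity end up in the prompt, so the rendered context depends on local graph density rather than total graph size, which is consistent with the flat token counts in the table.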

Real LLM agent benchmark (tool-use loop)

Measured with actual LLM calls via AWS Bedrock. The agent uses CognoDB as a run_cypher tool and issues real Cypher queries per question; the baseline dumps the full corpus into the prompt.

| Scenario | Entities | CognoDB tokens | Full-context tokens | Reduction | Cost saved |
|----------|----------|----------------|---------------------|-----------|------------|
| Single-agent Q&A | 100 | 2,600 in / 282 out | 11,744 in / 39 out | 76% | |
| Single-agent Q&A | 500 | 2,657 in / 307 out | 59,003 in / 61 out | 95% | |
| Single-agent Q&A | 1,000 | 2,538 in / 252 out | 117,684 in / 43 out | 98% | |
| 100-agent workflow | 150 | 329k total | 2,025k total | 84% | $1.60 / run |
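The Reduction column can be reproduced from the token counts. The sketch below assumes the percentage compares total tokens (input plus output) and rounds to the nearest whole percent; the README does not state the formula, but this reading matches every row. The function name is hypothetical.

```python
def reduction_pct(baseline_tokens: int, graph_tokens: int) -> int:
    """Percentage of baseline tokens saved, rounded to the nearest whole percent."""
    return round(100 * (1 - graph_tokens / baseline_tokens))


# Single-agent rows from the table above: (CognoDB in, out), (full-context in, out).
rows = {
    100: ((2_600, 282), (11_744, 39)),
    500: ((2_657, 307), (59_003, 61)),
    1_000: ((2_538, 252), (117_684, 43)),
}

for entities, (graph, full) in rows.items():
    # Assumption: input and output tokens are summed before comparing.
    print(entities, reduction_pct(sum(full), sum(graph)))
# 100 76
# 500 95
# 1000 98
```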

Run the benchmarks yourself:

# Token benchmark (no LLM calls, uses tiktoken)
pip install tiktoken neo4j
bin/cognodb &
python bench/token_benchmark.py

# Performance benchmark (requires running Neo4j and MongoDB for comparison)
go run ./bench --help

Features

  • Full Bolt protocol — works with all official Neo4j drivers out of the box
  • Cypher query language — CREATE, MATCH, MERGE, DELETE, SET, REMOVE, WITH, UNWIND, ORDER BY, SKIP, LIMIT, DISTINCT, and more
  • Variable-length path traversal: [*1..5], shortestPath, allShortestPaths
  • Schema operations — indexes (property, composite), constraints (UNIQUE, EXISTS, NODE KEY)
  • ACID transactions — BEGIN, COMMIT, ROLLBACK with full isolation
  • EXPLAIN / PROFILE — query plan introspection
  • 40+ built-in functions — scalar, string, math, list, temporal, type coercion
  • RBAC authentication — admin, architect, editor, reader, publisher roles
  • Pluggable storage — in-memory (default), local on-disk, MongoDB
  • Backup & restore: cognodb-dump / cognodb-restore CLI tools
  • Neo4j migration — import from Neo4j CSV exports with neo2cognodb

Connect with Any Neo4j Driver

Python

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("cognodb", "password"))

with driver.session() as session:
    session.run("""
        CREATE (a:Person {name: 'Alice', age: 30})
        CREATE (b:Person {name: 'Bob',   age: 25})
        CREATE (a)-[:KNOWS {since: 2020}]->(b)
    """)

    result = session.run("""
        MATCH (a:Person)-[:KNOWS]->(b:Person)
        RETURN a.name AS from, b.name AS to
    """)
    for record in result:
        print(f"{record['from']} knows {record['to']}")

driver.close()
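For values that come from user input or agent output, query parameters and a managed transaction are usually a better fit than inline literals. The following is a sketch using the official driver's session.execute_write; it has not been verified against CognoDB specifically, and ADD_FRIENDSHIP, add_friendship, and main are hypothetical names.

```python
ADD_FRIENDSHIP = """
MERGE (a:Person {name: $name})
MERGE (b:Person {name: $friend})
MERGE (a)-[:KNOWS]->(b)
"""


def add_friendship(tx, name: str, friend: str) -> None:
    """Transaction function: values travel as parameters, not as string concatenation."""
    tx.run(ADD_FRIENDSHIP, name=name, friend=friend)


def main() -> None:
    # Imported here so the helper above can be exercised without the driver installed.
    from neo4j import GraphDatabase

    uri, auth = "bolt://localhost:7687", ("cognodb", "password")
    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            # The driver opens the transaction, commits on success,
            # and rolls back if add_friendship raises.
            session.execute_write(add_friendship, "Alice", "Bob")
```

Calling main() against a running server creates the two people and the relationship once; because the statement uses MERGE, running it again should not create duplicates.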

Go

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

func main() {
    ctx := context.Background()
    driver, err := neo4j.NewDriverWithContext("bolt://localhost:7687",
        neo4j.BasicAuth("cognodb", "password", ""))
    if err != nil {
        log.Fatal(err)
    }
    defer driver.Close(ctx)

    session := driver.NewSession(ctx, neo4j.SessionConfig{})
    defer session.Close(ctx)

    if _, err := session.Run(ctx, `CREATE (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'})`, nil); err != nil {
        log.Fatal(err)
    }

    result, err := session.Run(ctx, `MATCH (n:Person) RETURN n.name AS name`, nil)
    if err != nil {
        log.Fatal(err)
    }
    for result.Next(ctx) {
        fmt.Println(result.Record().Values[0])
    }
    // Next returns false on both exhaustion and failure, so check which it was.
    if err := result.Err(); err != nil {
        log.Fatal(err)
    }
}

JavaScript

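const neo4j = require("neo4j-driver");

// require() means CommonJS, where top-level await is unavailable,
// so the calls are wrapped in an async function.
async function main() {
  const driver = neo4j.driver(
    "bolt://localhost:7687",
    neo4j.auth.basic("cognodb", "password")
  );
  const session = driver.session();

  try {
    await session.run(
      "CREATE (a:Person {name: $name})-[:KNOWS]->(b:Person {name: $friend})",
      { name: "Alice", friend: "Bob" }
    );

    const result = await session.run("MATCH (n:Person) RETURN n.name AS name");
    result.records.forEach((r) => console.log(r.get("name")));
  } finally {
    // Release the session and driver even if a query fails.
    await session.close();
    await driver.close();
  }
}

main().catch(console.error);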

Java

import org.neo4j.driver.*;

try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
        AuthTokens.basic("cognodb", "password"));
     Session session = driver.session()) {

    session.run("CREATE (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'})");

    Result result = session.run("MATCH (n:Person) RETURN n.name AS name");
    while (result.hasNext()) {
        System.out.println(result.next().get("name").asString());
    }
}

Tip: CognoDB also accepts neo4j as the username for drop-in compatibility with existing tooling and connection strings.

Cypher Support

CognoDB supports a broad subset of the Cypher query language. See Cypher Compatibility for the full matrix.

| Category | Features |
|----------|----------|
| Read | MATCH, OPTIONAL MATCH, WHERE, RETURN, WITH, UNWIND, ORDER BY, SKIP, LIMIT, DISTINCT |
| Write | CREATE, MERGE (ON CREATE/ON MATCH SET), DELETE, DETACH DELETE, SET, REMOVE |
| Paths | Variable-length [*1..5], shortestPath, allShortestPaths |
| Aggregation | COUNT, SUM, AVG, MIN, MAX, COLLECT (with DISTINCT) |
| Schema | CREATE/DROP INDEX, CREATE/DROP CONSTRAINT, SHOW INDEXES, SHOW CONSTRAINTS |
| Transactions | BEGIN, COMMIT, ROLLBACK |
| Analysis | EXPLAIN, PROFILE |
| Expressions | CASE, IN, CONTAINS, STARTS WITH, ENDS WITH, IS NULL, list/map literals, parameters |

Architecture

Neo4j Driver ──► Bolt Server (PackStream codec, auth, Bolt 5.0–5.4)
                      │
                      ▼
                Cypher Parser (hand-written recursive descent, ~14.5 µs/op)
                      │
                      ▼
                AST + Semantic Analysis
                      │
                      ▼
                Optimizer (rule-based + cost-based)
                      │
                      ▼
                Execution Engine (Volcano pull-based iterators)
                      │
                      ▼
                Storage (in-memory │ local on-disk │ MongoDB │ sharded)

See Architecture for design decisions and storage schema details.
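The execution stage in the diagram is described as Volcano-style pull-based iteration. The sketch below illustrates that general model with Python generators; it is not CognoDB's engine (which is written in Go), and the operator names and sample rows are hypothetical. Each operator pulls rows from its child only when its own consumer asks for one.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

Row = dict[str, object]
pulled: list[str] = []  # records which rows the scan actually produced


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: yields stored rows one at a time."""
    for row in rows:
        pulled.append(str(row["name"]))
        yield row


def where(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter operator: pulls from its child and passes on matching rows."""
    return (row for row in child if predicate(row))


def project(child: Iterator[Row], keys: list[str]) -> Iterator[Row]:
    """Projection operator: keeps only the requested keys."""
    return ({key: row[key] for key in keys} for row in child)


def limit(child: Iterator[Row], n: int) -> Iterator[Row]:
    """Stops pulling from the child once n rows have been produced."""
    return islice(child, n)


people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Carol", "age": 41},
    {"name": "Dave", "age": 35},
]

# Roughly: MATCH (n:Person) WHERE n.age > 28 RETURN n.name AS name LIMIT 2
plan = limit(project(where(scan(people), lambda r: r["age"] > 28), ["name"]), 2)
print(list(plan))  # [{'name': 'Alice'}, {'name': 'Carol'}]
print(pulled)      # ['Alice', 'Bob', 'Carol']
```

Dave is never scanned: once the limit is satisfied nothing pulls further, which is the property that lets a pull-based plan stop early without materializing intermediate results.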

Docker Image

The official image is published to GitHub Container Registry and updated automatically on every release:

# Latest release
docker run -p 7687:7687 ghcr.io/wexaai/cognodb:latest

# Specific version
docker run -p 7687:7687 ghcr.io/wexaai/cognodb:0.1.0

# With persistent local storage
docker run -p 7687:7687 -v cognodb_data:/data \
  ghcr.io/wexaai/cognodb:latest --storage local --data-dir /data

Available tags: latest, main, and semver tags (0.1.0, 0.1, 0). Platforms: linux/amd64, linux/arm64.
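The semver tag pattern (full version, minor, major) can be written down as a small function, which is handy when pinning images in scripts. This only reproduces the pattern listed above; how the release pipeline actually generates tags is not described here, and semver_tags is a hypothetical helper.

```python
def semver_tags(version: str) -> list[str]:
    """Expand "MAJOR.MINOR.PATCH" into the full, minor, and major tags."""
    parts = version.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return [".".join(parts[:n]) for n in (3, 2, 1)]


print(semver_tags("0.1.0"))  # ['0.1.0', '0.1', '0']
```

Pinning to the full tag gives reproducible deployments; the shorter tags move forward as new patch or minor releases are published.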

CLI Tools

| Tool | Description |
|------|-------------|
| cognodb | Main database server |
| cognodb-dump | Export graph data to JSON Lines |
| cognodb-restore | Import graph data from a dump |
| neo2cognodb | Migrate from Neo4j CSV exports |

# Export
cognodb-dump --out ./backup/

# Import
cognodb-restore --in ./backup/

# Migrate from Neo4j
neo2cognodb --nodes nodes.csv --rels rels.csv --out ./backup/
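Because the dump format is JSON Lines, a backup can be inspected with a few lines of standard-library Python. The sketch below is a generic JSON Lines reader; the record fields in the sample are hypothetical, since the dump schema is not described in this README, and read_jsonl is not a CognoDB API.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded object per non-empty line."""
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_no}: {err}") from err


# Hypothetical records for illustration; real dump files may look different.
sample = io.StringIO('{"kind": "node", "id": 1}\n\n{"kind": "rel", "id": 2}\n')
records = list(read_jsonl(sample))
print(len(records))  # 2
```

For a real file, pass an open handle instead of the StringIO, for example one opened on a file inside ./backup/.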

Configuration

| Option | Default | Description |
|--------|---------|-------------|
| --host | 0.0.0.0 | Bolt listen address |
| --port | 7687 | Bolt listen port |
| --storage | inmemory | Backend: inmemory, local, or mongo |
| --data-dir | ./data/cognodb | Data directory for local backend |
| --mongo-uri | | MongoDB connection string |
| --sharded | false | Enable sharded mode |
| --num-shards | 3 | Number of shards |
| --version | | Print version and exit |

See Configuration Reference for query limits, cache tuning, and security settings.
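When the server is launched from a script or test harness, it can help to assemble the command line from a dictionary and reject typos early. This is a sketch limited to the value-taking flags in the table above; build_argv is a hypothetical helper, and --sharded is left out because the table does not say whether it takes a value.

```python
import shlex

# Value-taking flags and storage backends copied from the table above.
VALUE_FLAGS = {"host", "port", "storage", "data-dir", "mongo-uri", "num-shards"}
BACKENDS = {"inmemory", "local", "mongo"}


def build_argv(**options: object) -> list[str]:
    """Turn keyword options into a cognodb command line, validating flag names."""
    argv = ["cognodb"]
    for key, value in options.items():
        flag = key.replace("_", "-")  # data_dir -> data-dir
        if flag not in VALUE_FLAGS:
            raise ValueError(f"unknown option: --{flag}")
        if flag == "storage" and value not in BACKENDS:
            raise ValueError(f"unknown storage backend: {value!r}")
        argv += [f"--{flag}", str(value)]
    return argv


argv = build_argv(storage="local", data_dir="/data")
print(shlex.join(argv))  # cognodb --storage local --data-dir /data
```

The resulting list can be passed directly to subprocess.Popen, which avoids shell quoting issues with paths that contain spaces.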

Used in Production

CognoDB powers the personal and enterprise knowledge graphs at Wexa.ai — an AI workspace that builds a persistent, queryable graph of your work across tools, people, and projects. Every relationship, document reference, and workflow dependency is stored as a graph node and traversed in real time as agents answer questions and take actions.

If you're using CognoDB in production, feel free to open a PR to add yourself here.

Security

To report a security vulnerability, please see SECURITY.md.