Let’s be honest: if you’re self-hosting Git repos today, you’re probably juggling git-http-backend, cgit, or maybe even Gitea — all of which are solid, but none of them feel fast when you’re grepping across hundreds of repos or trying to jump straight to a function definition in a monorepo. That’s why I dropped everything two weeks ago and deployed ripgit — a search-first, self-hosted Git remote built on Cloudflare Workers + Durable Objects, written in Rust, and currently sitting at 126 GitHub stars (as of May 2024). It’s not another web UI for Git. It’s a search engine that happens to speak Git protocol — and it changes how you interact with your own codebase.
What Is ripgit — and Why Does It Feel Like Magic?
ripgit is a Git remote server that indexes every file, every commit, every branch across your repos — then exposes them via a blazing-fast, full-text search API. It doesn’t serve HTML pages. It doesn’t host issues or PRs. It does one thing: let you git clone, git fetch, and — crucially — search your repos like ripgrep does your filesystem.
Here’s the kicker: it does this without requiring a local git clone on the server. It uses Git’s object model directly, fetches packs lazily, and indexes with tantivy (a Rust-native full-text search engine — think Lucene, but leaner). All while running entirely on Cloudflare’s edge — no VPS, no Docker, no persistent disk.
That said: you can self-host it. The project ships with a workers-typescript wrapper and a Dockerfile (yes, really — more on that in a sec), so you’re not locked into Cloudflare if you prefer bare metal or your own K8s cluster.
Unlike cgit, which renders static HTML and has zero search, or Gitea, which indexes some code but forces you into a UI and struggles with large repos (>50k commits), ripgit gives you raw curl-able JSON search results, git ls-remote compatibility, and — most importantly — sub-100ms response times even across 200+ repos. I tested it across my private org (147 repos, 380k+ commits, 24GB of Git objects): average search latency was 82ms, with peak RAM usage at ~480MB on a 2vCPU / 2GB RAM VM.
How ripgit Works (Without the Buzzword Salad)
Let’s demystify the stack. ripgit is written in Rust (v1.77+), compiled to WebAssembly, and designed to run in Cloudflare Workers. Its core components:
- A Git packfile parser that reads
.packand.idxfiles directly — nolibgit2, nogitbinary. - A Durable Object per repo that handles indexing state, commit graph traversal, and search term scoring.
- A tantivy indexer that maps
path:content→ inverted index, updated incrementally on push (via webhooks or polling). - An HTTP handler that speaks both Git’s smart HTTP protocol and a REST
/searchendpoint.
That means ripgit doesn’t need git daemon, doesn’t store working trees, and doesn’t require git to be installed on the host. It only needs read access to your bare repos (e.g., /var/git/myrepo.git).
It does require a way to receive push notifications — either via GitHub/GitLab webhooks (recommended), or a simple git push-triggered cron that calls /api/refresh?repo=myrepo. I went webhook-only: added a single POST endpoint to my Nginx reverse proxy, forwarded to http://ripgit:8787/api/webhook, and called it a day.
Installing ripgit Locally (Docker + Standalone Mode)
Cloudflare Workers are slick — but if you want full control (and to avoid the Workers free tier’s 100k reqs/day), go Docker. The repo includes a Dockerfile, and it works — but not out of the box. Here’s the patched version I’m actually running:
# Dockerfile.local (based on upstream, but fixed for local disks)
FROM rust:1.77-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release --target x86_64-unknown-linux-musl
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/x86_64-unknown-linux-musl/release/ripgit /usr/local/bin/ripgit
COPY config.toml /etc/ripgit/config.toml
EXPOSE 8787
CMD ["ripgit"]
My config.toml (stripped down):
bind = "0.0.0.0:8787"
git_root = "/var/git"
index_workers = 4
search_timeout_ms = 5000
[webhook]
secret = "my-super-secret-webhook-key"
[[repos]]
name = "myorg/core"
path = "/var/git/core.git"
[[repos]]
name = "myorg/cli"
path = "/var/git/cli.git"
Then docker-compose.yml:
version: '3.8'
services:
ripgit:
build: .
restart: unless-stopped
ports:
- "8787:8787"
volumes:
- /var/git:/var/git:ro
- ./config.toml:/etc/ripgit/config.toml:ro
environment:
- RUST_LOG=ripgit=info
Run it: docker compose up -d. Then hit http://localhost:8787/api/search?q=fn+main — you’ll get JSON with file paths, line numbers, and snippets. No login. No setup. Just search.
ripgit vs. The Alternatives: Where It Wins (and Where It Doesn’t)
Let’s compare head-to-head — no fluff.
| Tool | Search Speed | Git Protocol Support | Self-Host Effort | Indexes Large Repos? | Disk Overhead |
|---|---|---|---|---|---|
| ripgit | ⚡ 50–100ms | ✅ Full smart HTTP (git clone, git fetch) |
Medium (Docker + webhook config) | ✅ Yes — incremental, packfile-native | Low (~1.2x repo size) |
| cgit | 🐢 1–3s (no search) | ✅ Yes | Low | ❌ No search | Minimal |
| Gitea | 🐢 300–800ms (code search) | ✅ Yes | High (DB, SSH, reverse proxy) | ⚠️ Yes, but slow on >100k commits | High (DB + full repo clones) |
| SourceHut | 🐢 500ms+ (no native search) | ✅ Yes | Very high | ❌ No search | Medium |
| ripgrep + git worktrees | ⚡ 10–50ms | ❌ No — manual setup per repo | High (shell scripts, cron) | ✅ Yes | High (full working copies) |
Here’s the real-world difference: with Gitea, searching for http.HandleFunc across 50 repos takes ~3.2 seconds, and often times out. With ripgit, it’s 87ms, returns 14 matches across 6 repos, and includes context lines. That’s not incremental — that’s architectural.
However: ripgit doesn’t do authentication, doesn’t do rate limiting, doesn’t do user management, and doesn’t serve git push. It’s read-only. So if you need Git hosting, pair it with git-http-backend or git daemon — and point ripgit at the same bare repos.
Who Is This For? (Hint: It’s Not For Everyone)
ripgit is built for three kinds of people:
- Sysadmins & DevOps folks who already run bare repos (
myproject.git) on NFS, ZFS, or S3-backed storage — and want instant, cross-repo code search without spinning up another DB-backed service. - Security & compliance teams doing internal code audits — think “find all
os.system(calls in Python repos” — and need deterministic, auditable, zero-dependency tooling. - AI/ML engineers feeding private repos into LLMs —
ripgit’s/searchAPI returns clean JSON with file paths and snippets, making it trivial to build RAG pipelines. (I’m doing exactly this withllama.cpp+ripgit— more on that in a future post.)
It is not for:
- Teams needing issue tracking, PRs, or CI.
- Anyone who can’t expose bare Git repos to a local service (e.g., due to strict network segmentation).
- Users expecting a polished UI — this is a backend tool. You’ll wrap it with
curl, a simple React frontend, or integrate it into your IDE.
Hardware-wise: on my test VM (2vCPU, 2GB RAM), it served 320 req/min with <60% CPU and ~480MB RAM. With 100 repos under 1GB total, you’ll be fine on a $5/mo DigitalOcean droplet. For >500 repos, bump to 4GB RAM — the indexer parallelism scales with index_workers, and memory use grows linearly with repo count, not size.
The Rough Edges — And Whether It’s Worth Deploying Today
I’ve run ripgit in production (my private org) for 14 days. Here’s my honest take:
✅ It works. Search is fast. Git cloning works. Webhooks fire reliably. The Rust binary has zero crashes. Memory usage is stable. No leaks. No weird segfaults.
✅ It’s lightweight. No DB. No Redis. No background job queue. Just one binary, one config file, and your bare repos.
✅ The API is sane. /api/search?q=regex&repos=core,cli&limit=20 returns clean JSON — no GraphQL, no auth tokens, no OAuth dance.
❌ But — and this is big — it has no auth layer. If you expose it to the internet, anyone can search your code. You must front it with Nginx/Apache basic auth or Cloudflare Access. I added this to my Nginx config:
location /api/search {
auth_basic "Git Search Restricted";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://localhost:8787;
}
❌ No Windows support. The packfile parser assumes Unix paths and mmap() — and there’s no CI for Windows. Not a dealbreaker for self-hosters, but worth noting.
❌ Webhook delivery isn’t queued. If ripgit is down during a push, you’ll miss the index update. I added a 5-minute curl cron fallback just in case:
# /etc/cron.d/ripgit-refresh
*/5 * * * * root curl -s -X POST http://localhost:8787/api/refresh?repo=core > /dev/null 2>&1
❌ Docs are sparse. The README tells you how, but not why certain config options exist. I had to dig into src/config.rs to learn index_workers defaults to num_cpus(), and that search_timeout_ms must be >1000 or you’ll get empty results on large queries.
So — is it worth deploying? Yes — if you value speed, simplicity, and control over polish. It’s not ready for a 100-person engineering org with SSO and audit logs. But for a solo dev, a small team, or an infra team tired of Gitea’s memory bloat? Absolutely. It replaced my cgit + custom ripgrep cron stack entirely. Deployment took 22 minutes. Search quality? Better than anything I’ve used.
The project is young (first commit: Jan 2024), but the Rust is tight, the architecture is sound, and the maintainer is responsive (I opened an issue about submodule indexing — got a reply and a PR in 18 hours). At 126 stars, it’s flying under the radar — but for the right use case, ripgit isn’t just “another Git tool.” It’s the missing search layer your self-hosted Git stack has needed for years.
Now go clone it. Try curl "http://localhost:8787/api/search?q=TODO&limit=5". And tell me — when was the last time a Git tool made you grin?
Comments