  ██╗  ██╗███████╗██╗     ██╗██╗  ██╗
  ██║  ██║██╔════╝██║     ██║╚██╗██╔╝
  ███████║█████╗  ██║     ██║ ╚███╔╝
  ██╔══██║██╔══╝  ██║     ██║ ██╔██╗
  ██║  ██║███████╗███████╗██║██╔╝ ██╗
  ╚═╝  ╚═╝╚══════╝╚══════╝╚═╝╚═╝  ╚═╝
  Decode the digital DNA of any identity

Map a target's entire digital footprint from a single username or email — with API-validated results, near-zero false positives, and an interactive D3.js relational graph that shows how platforms link to each other.


Features

🔍 70+ builtin platforms    Social, dev, content, gaming, forums
🧩 400+ via Sherlock        --sherlock loads Sherlock's database at runtime
Near-zero false positives   API endpoints, OG <meta> tags, and text fingerprinting
🌐 WAF/Cloudflare bypass    curl_cffi TLS impersonation (optional)
🕸️ Relational D3.js graph   Cross-links drawn from actual bio link extraction
📧 Email mode               Gravatar + --holehe for 120+ platform email checks
🔀 Permutations             Scans johndoe1, john.doe, realjohndoe, etc.
📊 Reports                  HTML graph · JSON · CSV · TXT
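
Email mode's Gravatar check can be approximated with Gravatar's public avatar URL scheme: hash the trimmed, lowercased address and request it with d=404, so a missing avatar comes back as HTTP 404 instead of a default image. The sketch below only builds that URL. The function name gravatar_url is hypothetical, and whether Helix queries this exact endpoint is an assumption.

```python
import hashlib


def gravatar_url(email: str) -> str:
    """Build a Gravatar avatar URL that returns HTTP 404 when no avatar exists.

    Hypothetical helper for illustration; not taken from the Helix source.
    """
    # Gravatar hashes the address after trimming whitespace and lowercasing it.
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    # d=404 asks Gravatar for a 404 response instead of a placeholder image.
    return f"https://www.gravatar.com/avatar/{digest}?d=404"
```

A GET on the returned URL that answers 200 suggests the address has a Gravatar account; a 404 suggests it does not.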

Install

git clone https://github.com/thalha-a9/helix.git
cd helix
pip install -r requirements.txt

Optional — WAF bypass (Twitter, Instagram, TikTok, etc.):

pip install curl-cffi

Optional — Deep email scanning:

pip install holehe
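
Because curl-cffi and holehe are optional, a scanner can probe for them at startup and skip the features that need them. A minimal sketch of that probe, assuming the import names curl_cffi and holehe; the helper names is_installed and optional_features are hypothetical and not taken from helix.py.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def optional_features() -> dict:
    """Map each optional feature to whether its backing package is available."""
    return {
        "waf_bypass": is_installed("curl_cffi"),  # installed via `pip install curl-cffi`
        "deep_email": is_installed("holehe"),     # installed via `pip install holehe`
    }
```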

Usage

# Scan a username
python helix.py -u johndoe

# Username + email (two root nodes in graph)
python helix.py -u johndoe -e johndoe@example.com

# Deep email scan via holehe (120+ platforms)
python helix.py -u johndoe -e johndoe@example.com --holehe

# Load Sherlock's 400+ platform database
python helix.py -u johndoe --sherlock

# Scan username permutations (johndoe1, john.doe, etc.)
python helix.py -u johndoe --permutations

# Full output bundle, no browser
python helix.py -u johndoe --format all --no-browser

# Slow connection? Increase Sherlock fetch timeout
python helix.py -u johndoe --sherlock --sherlock-timeout 60
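
To illustrate what --permutations might expand a username into, here is a rough sketch of a variation generator. It is not the code in osint/permutations.py: the function name, the prefix and suffix lists, and the separator handling are all assumptions. Splitting an unseparated name such as johndoe into john.doe requires knowing the word boundary, so this sketch only swaps separators when the input already contains one.

```python
import re


def generate_permutations(username, prefixes=("real", "the"), suffixes=("1", "123", "_")):
    """Return candidate variations of a username, excluding the original.

    Hypothetical sketch; the affix lists are illustrative defaults.
    """
    base = username.strip().lower()
    candidates = []

    # Affix variants, e.g. johndoe -> johndoe1, realjohndoe.
    candidates.extend(base + suffix for suffix in suffixes)
    candidates.extend(prefix + base for prefix in prefixes)

    # Separator variants, only when the name already has a visible boundary,
    # e.g. john.doe -> johndoe, john_doe, john-doe.
    parts = [part for part in re.split(r"[._-]", base) if part]
    if len(parts) > 1:
        candidates.extend(sep.join(parts) for sep in ("", ".", "_", "-"))

    # Deduplicate while preserving order, and drop the original username.
    seen = {base}
    unique = []
    for candidate in candidates:
        if candidate not in seen:
            seen.add(candidate)
            unique.append(candidate)
    return unique
```

Each returned name would then be scanned like a normal username, so keeping the affix lists short limits the number of extra requests.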

False Positive Prevention

Helix picks a detection method per platform instead of treating any HTTP 200 response as a hit, since many sites return 200 even for profiles that do not exist:

Platform   Detection method                               Why
Reddit     reddit.com/user/{u}/about.json → HTTP 404/200  JSON API, zero ambiguity
Chess.com  api.chess.com/pub/player/{u} → HTTP 404/200    Official public API
Lichess    lichess.org/api/user/{u} → HTTP 404/200        Official public API
Bluesky    AT Protocol API → HTTP 400/200                 SPA; HTML is useless
GitHub     og:title parsed + validated                    Server-side rendered, reliable
Medium     og:title (rejects homepage redirect title)     Catches "Where good ideas find you"
Ko-fi      text_present (checks for profile URL in page)  Only appears on real profiles
PyPI       text_present (checks for published packages)   Empty accounts filtered out
Twitter/X  curl_cffi TLS impersonation                    Skipped gracefully without it
Replit     og:title contains @username                    Fixed {username} interpolation bug
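
The og:title rows above can be sketched with the standard library alone: pull the og:title meta tag out of the page, reject known generic titles (such as the Medium homepage title), and require the username to appear in what is left. This is an illustration under stated assumptions, not the logic in osint/checker.py; extract_og_title, profile_found, and the generic_titles parameter are hypothetical names.

```python
from html.parser import HTMLParser


class _OGTitleParser(HTMLParser):
    """Collects the content of the first <meta property="og:title"> tag."""

    def __init__(self):
        super().__init__()
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.title is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:title":
            self.title = attributes.get("content")


def extract_og_title(html):
    """Return the og:title value from an HTML document, or None if absent."""
    parser = _OGTitleParser()
    parser.feed(html)
    return parser.title


def profile_found(status, html, username, generic_titles=()):
    """Decide whether a page looks like a real profile for `username`."""
    if status != 200:
        return False
    title = extract_og_title(html)
    if not title:
        return False
    lowered = title.lower()
    # A generic title means the site served its homepage or a soft 404.
    if any(generic.lower() in lowered for generic in generic_titles):
        return False
    return username.lower() in lowered
```

Passing generic_titles=("Where good ideas find you",) reproduces the Medium rule from the table: a 200 response carrying the homepage title is treated as not found.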

Graph

The HTML output is a standalone, zero-dependency interactive network graph:

  • Drag nodes · Scroll to zoom · Click to open profile
  • Hover for confidence level, OG title, bio-linked partners
  • Bright green edges = bio-extracted cross-links (proven connections)
  • Amber edges = platform found via both username AND email
  • Green ring = high confidence (OG meta validated)
  • ⌕ Search · ◌ Not-found toggle · ☰ Label toggle
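
The bio-extracted cross-links described above boil down to matching the links found in one profile's bio against the URLs of other confirmed profiles. A minimal sketch of building D3-style nodes and links that way; the input shape (platform, url, bio_links), the kind labels, and both function names are assumptions, not the format osint/graph.py actually uses.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so that trivially different forms compare equal."""
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Lowercasing the path is a simplification; some sites are case-sensitive.
    return host + parts.path.rstrip("/").lower()


def build_graph(root, profiles):
    """Return {"nodes": [...], "links": [...]} for one root identity.

    `profiles` is a list of dicts with "platform", "url", and "bio_links".
    """
    nodes = [{"id": root, "kind": "root"}]
    links = []
    url_to_platform = {normalize(p["url"]): p["platform"] for p in profiles}

    # Every confirmed profile hangs off the root node.
    for profile in profiles:
        nodes.append({"id": profile["platform"], "kind": "platform", "url": profile["url"]})
        links.append({"source": root, "target": profile["platform"], "kind": "found"})

    # A bio link pointing at another confirmed profile becomes a cross-link.
    for profile in profiles:
        for link in profile.get("bio_links", []):
            target = url_to_platform.get(normalize(link))
            if target and target != profile["platform"]:
                links.append({"source": profile["platform"], "target": target, "kind": "bio"})

    return {"nodes": nodes, "links": links}
```

The resulting dictionary can be serialized with json.dumps and handed to a D3 force layout, which expects this nodes-and-links shape.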

Architecture

helix/
├── helix.py                        ← CLI entry point
├── osint/
│   ├── checker.py                  ← Async engine (aiohttp + curl_cffi)
│   ├── platforms.py                ← 70+ platform definitions
│   ├── graph.py                    ← D3.js relational graph generator
│   ├── report.py                   ← JSON / CSV / TXT exporters
│   ├── permutations.py             ← Username variation generator
│   └── adapters/
│       ├── sherlock_adapter.py     ← Runtime Sherlock data.json ingestion
│       └── holehe_adapter.py       ← holehe email scanner wrapper
└── results/                        ← Output (git-ignored)
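
As an illustration of what runtime ingestion in sherlock_adapter.py could involve, the sketch below converts Sherlock-style site entries into a flat list. It assumes Sherlock's data.json layout as commonly published (site name keys, a url field with {} as the username placeholder, an errorType field, and a top-level $schema entry); check those field names against the current Sherlock schema. The function names and the output shape are hypothetical.

```python
import json


def load_sherlock_sites(raw_json):
    """Parse Sherlock-style JSON text into a list of site definitions."""
    data = json.loads(raw_json)
    sites = []
    for name, entry in data.items():
        # Skip metadata such as "$schema" and any entry without a URL template.
        if not isinstance(entry, dict) or "url" not in entry:
            continue
        sites.append({
            "name": name,
            "url_template": entry["url"],
            "check": entry.get("errorType", "status_code"),
            "error_msg": entry.get("errorMsg"),
        })
    return sites


def profile_url(site, username):
    """Fill the {} placeholder in a site's URL template with the username."""
    return site["url_template"].replace("{}", username)
```

Using str.replace rather than str.format avoids errors when a template contains other brace characters.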

Requirements

  • Python 3.9+
  • aiohttp (required)
  • curl-cffi (optional — WAF bypass for Twitter, Instagram, TikTok, Patreon)
  • holehe (optional — deep email scanning across 120+ platforms)


Author: Thalha Ahmed · @thalha-a9
License: MIT