xFreed0m/ghosttype

Local forensic scanner that extracts and verifies credentials from AI tool conversation history. Detection + verification powered by TruffleHog.

Read the original blog post: ghosttype — finding secrets in AI conversation history

Authorized use only. For licensed penetration testers, red teams, and DLP/blue teams operating under explicit written authorization. See THREAT-MODEL.md.

What it does

ghosttype scans AI tool conversation files for exposed credentials, then asks TruffleHog whether each one is actually live by hitting the issuing provider's verification endpoint. Findings are emitted as JSON + CSV, each linked back to the source conversation.

Two complementary detection engines (since v0.4.0):

- TruffleHog: 800+ structural detectors with live API verification, entropy filtering, and known-example exclusion. The only engine that can prove a credential is live.
- In-tree pattern engine: 30 regex + 10 heuristic patterns. Offline and never verified, but it catches loose variable-name context signals (api_key=, password=, JWT_SECRET=) that TruffleHog's structural detectors don't match.
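
To make the "variable-name context signal" idea concrete, here is a minimal sketch of one such pattern. It is not ghosttype's actual rule set: the regex, the 8-character minimum, and the `find_context_hits` helper are all assumptions made for illustration.

```python
import re

# Hypothetical context pattern: a secret-looking variable name, then '=' or ':',
# then a value of at least 8 non-space characters (optionally quoted).
CONTEXT_PATTERN = re.compile(
    r"""(?ix)
    \b(?P<name>[A-Z0-9_]*(?:api_key|password|secret|token)[A-Z0-9_]*)
    \s*[=:]\s*
    ["']?(?P<value>[^\s"']{8,})["']?
    """
)


def find_context_hits(text: str) -> list[dict]:
    """Return offline, unverified hits based on the variable name alone."""
    hits = []
    for match in CONTEXT_PATTERN.finditer(text):
        hits.append(
            {
                "name": match.group("name"),
                "secret_value": match.group("value"),
                "source": "patterns",  # mirrors the per-finding source field
                "verified": False,     # pattern hits are never verified
            }
        )
    return hits
```

A pattern like this fires on the assignment context rather than on the shape of the value, which is why it can flag secrets that have no recognizable provider prefix.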

By default both run and results are merged (--engine both); on a (secret_value, file) overlap the TruffleHog finding wins because it carries verification. Choose one with --engine {both,trufflehog,patterns}. ghosttype always owns the discovery layer — where each AI tool stores conversations and how to decode them. Every finding carries a source field so you know which engine produced it.
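
The merge rule above can be sketched in a few lines. This is an illustration under assumptions, not the project's code: `merge_findings` is a hypothetical name, and findings are modeled as plain dicts using the `secret_value`, `file_path`, and `source` fields.

```python
def merge_findings(trufflehog: list[dict], patterns: list[dict]) -> list[dict]:
    """Merge both engines' findings, keyed on (secret_value, file_path).

    Pattern hits are inserted first, then TruffleHog findings overwrite any
    entry with the same key, so the finding that carries verification wins.
    """
    merged: dict[tuple[str, str], dict] = {}
    for finding in patterns:
        merged[(finding["secret_value"], finding["file_path"])] = finding
    for finding in trufflehog:
        merged[(finding["secret_value"], finding["file_path"])] = finding
    return list(merged.values())
```

Keying on the pair rather than on the secret alone keeps the same credential visible in every conversation file where it appears.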

Supported AI tools:

| Tool | Data source |
| --- | --- |
| Claude Code CLI | `~/.claude/projects/*/*.jsonl` + history |
| Cursor IDE | `state.vscdb` (SQLite, global + workspace) |
| Codex CLI | `~/.codex/state_5.sqlite` + logs |
| ChatGPT Desktop | Keychain-backed `.data` files (AES-128-CBC) |
| Claude Desktop | Stub (path detected; extraction in progress) |

Detected credential types: the full TruffleHog detector catalog (800+), including AWS, GitHub PATs, OpenAI / Anthropic, Stripe, Slack, HashiCorp Vault, Snowflake, Databricks, Linear, GCP service accounts, Azure, Twilio, Cloudflare, npm, Telegram, Hugging Face, DigitalOcean, Docker Hub, Pulumi, Doppler, PyPI, SendGrid, JWT, PEM private keys, and database connection strings. Each finding is marked verified: true if TruffleHog confirmed it live against the provider's API, or verified: false if the structure matched but verification was skipped, declined, or failed.

Requirements

- Python 3.11+
- TruffleHog 3.x installed and on PATH (or set GHOSTTYPE_TRUFFLEHOG_BIN)
  - macOS: brew install trufflehog
  - Linux: see installation docs
- macOS for full tool coverage (Linux/Windows paths: roadmap)

Check your install:

```bash
ghosttype doctor
```

Quick start

```bash
git clone https://github.com/p4gs/ghosttype
cd ghosttype
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Scan all detected AI tools (verifies every credential against its provider)
ghosttype scan

# Triage rotation work: only TruffleHog-verified live credentials
ghosttype scan --only-verified

# Fast offline pass: detect without hitting any provider APIs
ghosttype scan --no-verification

# Pipe to jq, filter by detector
ghosttype scan --no-verification --format json --output - --quiet \
  | jq '.[] | select(.detector_name == "Github")'

# Show which AI tools and TruffleHog are present
ghosttype doctor
```

Output

Default: ./ghosttype_report/findings.json + findings.csv

Each finding includes:

| Field | Description |
| --- | --- |
| `tool` | Source AI tool (e.g. `claude_code`) |
| `detector_name` | TruffleHog detector name (e.g. `Github`, `AWS`) |
| `secret_type` | Detector name, lowercased |
| `severity` | `critical` / `high` / `medium`, derived from detector + verification state |
| `verified` | `true` if TruffleHog confirmed live against the provider, else `false` |
| `verification_error` | Verifier error message if verification was attempted and errored |
| `secret_value` | Plaintext value (use `--redact` to mask) |
| `file_path` | Source conversation file |
| `position` | Chunk position + line within the chunk |
| `confidence` | `verified` or `unverified` |
| `context` | Window of surrounding text |
| `extra_data` | TruffleHog detector extras (e.g. `rotation_guide` URLs) |
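
As a rough illustration of consuming the report, the sketch below loads `findings.json` and builds a rotation queue of verified findings. It assumes the file is a JSON array of objects with the fields listed above (the jq example in Quick start suggests an array); the helper names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path

# Lower number sorts first; unknown severities go last.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2}


def load_findings(path: str) -> list[dict]:
    """Read a findings.json report (assumed to be a JSON array)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def rotation_queue(findings: list[dict]) -> list[dict]:
    """Keep verified findings only, most severe first."""
    live = [f for f in findings if f.get("verified") is True]
    return sorted(live, key=lambda f: SEVERITY_ORDER.get(f.get("severity"), 99))


def count_by_detector(findings: list[dict]) -> dict[str, int]:
    """Summarize how many findings each detector produced."""
    return dict(Counter(f["detector_name"] for f in findings))
```

For example, `rotation_queue(load_findings("ghosttype_report/findings.json"))` would give the credentials to rotate first, assuming the default output directory.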

All scan options

```
ghosttype scan [OPTIONS]

  --tool TEXT                  Scan one tool: cursor, chatgpt, codex, claude, claude_code
  --format [json|csv|both]     Output format (default: both)
  --output TEXT                Output dir, or - for stdout JSON (default: ./ghosttype_report)
  --engine [both|trufflehog|patterns]
                               Detection engine (default: both). 'patterns'
                               needs no TruffleHog binary.
  --redact                     Mask secret values in output
  --min-confidence             verified | unverified (default: unverified)
                               'verified' = TruffleHog-verified only;
                               'high' (legacy) also keeps regex pattern hits
  --only-verified              Pass --results=verified to TruffleHog
  --no-verification            Skip live verifier calls (fast, offline)
  --trufflehog-binary PATH     Override the TruffleHog binary
  --trufflehog-timeout SECONDS Outer timeout for the TruffleHog subprocess (default: 300)
  --max-age-days N             Only scan files modified within last N days
  --copy-sources               Copy source conversation files to output/sources/
  --allow-list PATH            Suppress known-safe values (one value per line)
  --stats-only                 Print summary statistics only
  --quiet / -q                 Suppress banner for scripting
  --context-window N           Context chars around match (default: 200)

ghosttype list-tools           Show detected AI tools on this machine
ghosttype doctor               Show TruffleHog binary, version, and detected tools
ghosttype version              Print version
```

Environment variables

GHOSTTYPE_TRUFFLEHOG_BIN — explicit path to TruffleHog binary (overridden by --trufflehog-binary)

Exit codes

- 0: no findings
- 1: at least one finding (enables CI/CD gating)
- 2: environment problem (TruffleHog missing, subprocess failed, etc.)
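
A CI wrapper could branch on these codes roughly as follows. This is a sketch, not a shipped script: the `classify_exit` and `gate` helpers are hypothetical, and it assumes that exit code 1 under `--only-verified` means at least one verified finding was reported.

```python
import subprocess
import sys

# Meanings taken from the exit-code list above.
EXIT_MEANINGS = {0: "clean", 1: "findings", 2: "environment"}


def classify_exit(code: int) -> str:
    """Translate a ghosttype exit code into a short status label."""
    return EXIT_MEANINGS.get(code, "unknown")


def gate(cmd: tuple[str, ...] = ("ghosttype", "scan", "--only-verified", "--quiet")) -> int:
    """Run the scan and return 0 only when it reports no findings."""
    status = classify_exit(subprocess.run(list(cmd)).returncode)
    if status == "clean":
        return 0
    if status == "findings":
        print("ghosttype reported findings; failing the build", file=sys.stderr)
    else:
        # Treat environment problems and unknown codes as failures too,
        # so a broken scanner never looks like a clean scan.
        print(f"ghosttype did not complete normally ({status})", file=sys.stderr)
    return 1
```

In a pipeline step this would be invoked as `sys.exit(gate())`, which keeps code 2 from being mistaken for a clean result.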

Detection design

ghosttype is two layers stitched together:

```
[AI tool storage] --(scanner module)--> TextChunks --(trufflehog filesystem)--> Findings
   .jsonl/SQLite/encrypted         (extracted text)        (verified or unverified)
```

The discovery layer is the per-tool code under ghosttype/scanners/, with one module each for Claude Code, Cursor, Codex, ChatGPT, and Claude Desktop. Each module encodes its tool's storage details: SQLite schemas, Electron safeStorage decryption, and JSONL message shapes.
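
For the JSONL case, a discovery module could be approximated as below. The real message shapes are not documented here, so this sketch sidesteps them by collecting every string value in each record; `iter_strings` and `extract_chunks` are hypothetical names, and a real scanner would likely select specific fields instead.

```python
import json
from typing import Any, Iterator


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere inside a decoded JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def extract_chunks(jsonl_text: str) -> list[tuple[int, str]]:
    """Turn JSONL content into (line_number, text) chunks, skipping bad lines."""
    chunks = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or corrupt history lines
        text = "\n".join(iter_strings(record))
        if text:
            chunks.append((lineno, text))
    return chunks
```

Keeping the line number with each chunk is what later allows a finding to be traced back to a specific conversation record.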

The detection + verification layer is a TruffleHog subprocess. ghosttype writes each extracted text chunk to a temp file with a deterministic name, runs trufflehog filesystem --json --no-update [...] <tmpdir>, parses NDJSON results, and maps each one back to the originating conversation record via the temp filename. The temp dir is deleted in a finally block; nothing persists.
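
The temp-file round trip can be sketched without invoking TruffleHog at all. Everything below is illustrative: the helper names and the hash-based naming scheme are assumptions, and the NDJSON field names (`SourceMetadata.Data.Filesystem.file`, `DetectorName`, `Verified`, `Raw`) reflect TruffleHog v3's JSON output as I understand it, so check them against your installed version.

```python
import hashlib
import json
from pathlib import Path


def chunk_filename(tool: str, source_path: str, index: int) -> str:
    """Deterministic temp name: same inputs always give the same file name."""
    digest = hashlib.sha256(f"{tool}\0{source_path}\0{index}".encode()).hexdigest()
    return f"chunk_{digest[:16]}.txt"


def stage_chunks(chunks: list[dict], tmpdir: str) -> dict[str, dict]:
    """Write each chunk's text to tmpdir; return temp file name -> chunk."""
    mapping = {}
    for index, chunk in enumerate(chunks):
        name = chunk_filename(chunk["tool"], chunk["file_path"], index)
        (Path(tmpdir) / name).write_text(chunk["text"], encoding="utf-8")
        mapping[name] = chunk
    return mapping


def map_results(ndjson: str, mapping: dict[str, dict]) -> list[dict]:
    """Parse scanner NDJSON and attach each result to its source conversation."""
    findings = []
    for line in ndjson.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip log lines or other non-JSON output
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        fs = record.get("SourceMetadata", {}).get("Data", {}).get("Filesystem", {})
        origin = mapping.get(Path(fs.get("file", "")).name)
        if origin is None:
            continue  # result does not belong to a staged chunk
        findings.append(
            {
                "tool": origin["tool"],
                "file_path": origin["file_path"],  # conversation file, not temp file
                "detector_name": record.get("DetectorName"),
                "verified": bool(record.get("Verified", False)),
                "secret_value": record.get("Raw"),
            }
        )
    return findings
```

Wrapping `stage_chunks` and the scan in `with tempfile.TemporaryDirectory() as tmpdir:` gives the same cleanup guarantee as the finally block described above.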

See ARCHITECTURE.md for the full pipeline diagram.

Security & threat model

ghosttype is forensic — it reads files you already have access to and runs detectors locally. The only network traffic is TruffleHog's own verification calls to credential issuers (AWS, GitHub, Stripe, etc.) and only when verification is enabled.

Use --no-verification if any of the following apply:

- You're operating in an air-gapped environment
- You don't want to risk lighting up provider audit logs on red-team engagements
- You just want fast triage

See THREAT-MODEL.md for intended-use and abuse considerations.
