RedForge AI is an evidence-first red-teaming framework for LLM applications, RAG systems, AI agents, tool use, memory, and model supply-chain surfaces.
It is built for teams that need more than a list of jailbreak prompts: RedForge runs scoped campaigns, records replayable evidence, evaluates findings, and generates reports that developers and security reviewers can act on.
RedForge is not a C2 framework, not a generic web scanner, and not a guarantee that a system is safe. It is a controlled evaluation harness for AI security work you are authorized to perform.
What it does
- Scoped campaigns: explicit target configuration, allow-listed hosts, attack budgets, and authorization metadata.
- Evidence-first traces: payloads, model responses, retrieved documents, tool calls, memory mutations, side effects, evaluator reasoning, and report references.
- AI-native attack surfaces: prompt injection, jailbreaks, RAG poisoning, tool abuse, memory poisoning, bounded leakage/extraction-resistance checks, configuration drift, and multi-agent trust boundaries.
- Local and service modes: run from the CLI or through a FastAPI control-plane service.
- Extensible architecture: public plugin interfaces, target adapters, attack packs, reports, and schemas.
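To make the allow-listed hosts idea concrete, here is a minimal sketch of an exact-match host check. It is an illustration only, not RedForge's scope guard: the function name `is_host_allowed` is hypothetical, and exact matching (no implied subdomains) is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def is_host_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only when the URL's hostname is exactly in the allow-list."""
    host = urlsplit(url).hostname  # lower-cased by urlsplit; None when absent
    return host is not None and host in {h.lower() for h in allowed_hosts}


allowed = {"example.test"}
print(is_host_allowed("https://example.test/chat", allowed))       # True
print(is_host_allowed("https://evil.example.test/chat", allowed))  # False
print(is_host_allowed("https://example.test@evil.io/", allowed))   # False
```

Comparing the parsed hostname rather than searching the raw URL string is what rejects the third case, where the allowed name appears only in the userinfo part of the URL.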
Quick start
3 steps to first value
make setup # checks Python, uv, and the project venv; prints fix commands if blocked
make demo # runs a scoped local campaign
uv run redforge doctor
open "$(uv run redforge latest-report --path-only --format html)"
First thing to open: the HTML report path printed by make demo. The demo is successful when:
- the terminal prints HTML_REPORT: artifacts/reports/<campaign_id>.html;
- the report shows at least the Results table plus finding cards with payload, trace, raw response, replay command, and fix guidance;
- the summary includes confirmed / suspected / blocked counts and a highest-risk finding;
- the trace exists at artifacts/traces/<campaign_id>.jsonl.
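The file-level success criteria can be checked with a short script. The sketch below assumes only the two artifact paths named above and that the trace is standard JSONL (one JSON document per line). The helper name `check_demo_outputs` is hypothetical, and no particular record fields are assumed.

```python
import json
from pathlib import Path


def check_demo_outputs(campaign_id: str, root: Path = Path("artifacts")) -> list[dict]:
    """Confirm the HTML report and trace exist; return the parsed trace records."""
    report = root / "reports" / f"{campaign_id}.html"
    trace = root / "traces" / f"{campaign_id}.jsonl"
    for path in (report, trace):
        if not path.is_file():
            raise FileNotFoundError(f"missing demo artifact: {path}")
    # JSONL: one JSON document per non-empty line.
    with trace.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Calling `len(check_demo_outputs("<campaign_id>"))` with the ID printed by the demo gives a quick count of recorded trace entries.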
Requirements
- Python 3.11+
- uv
Run the MVP demo
git clone https://github.com/Aimer-zero/redforge-ai.git
cd redforge-ai
make setup
make demo
Or run the same flow through the dedicated smoke script:
./scripts/demo_mvp.sh
The demo initializes local state, runs a scoped campaign against the built-in vulnerable demo agent, records trace evidence, and prints the generated report paths. Look for:
artifacts/reports/<campaign_id>.md
artifacts/reports/<campaign_id>.html
See docs/demo_walkthrough.md for what to inspect after the run.
For larger budgets or slower remote targets, add --progress to print live status while requests are in flight:
uv run redforge run-demo --attack-budget 20 --progress
uv run redforge run-http --url https://example.test/chat --allow-host example.test --attack-budget 20 --progress
Print the latest report path any time:
uv run redforge latest-report --path-only --format html
uv run redforge latest-campaign
uv run redforge campaign-status
Bundle a campaign for handoff or archival:
uv run redforge export-campaign # latest campaign
uv run redforge export-campaign <campaign_id> --out redforge-campaign.zip
Start the API
make api
# then open http://127.0.0.1:8000/health
API smoke test:
make smoke-api
# or
./scripts/smoke_api.sh
For the intentionally vulnerable local agent used by demos:
make agent
Useful local commands
make test # ruff + migration gates + pytest
make build # build all workspace packages
make clean # remove generated artifacts/caches
Campaign artifacts are written to artifacts/ by default:
artifacts/
  campaigns/   campaign summaries
  traces/      JSONL evidence traces
  payloads/    replayable payload files
  reports/     Markdown and HTML reports
These files may contain target responses and sensitive evidence from authorized tests. Review before sharing.
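As a starting point for that review, the sketch below scans an artifacts directory for strings that look like credentials. Everything in it is an assumption for illustration: the function name `find_possible_secrets` is hypothetical, the two patterns are examples rather than a complete list, and a clean result does not replace a manual read-through.

```python
import re
from pathlib import Path

# Hypothetical patterns; extend them for the credentials your targets actually use.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),              # API-key-shaped tokens
    re.compile(r"(?i)authorization:\s*bearer\s+\S+"),  # bearer headers captured in evidence
]


def find_possible_secrets(root: Path) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs whose text matches any pattern."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in SECRET_PATTERNS):
                hits.append((path, lineno))
    return hits
```

Running `find_possible_secrets(Path("artifacts"))` before exporting or attaching a campaign bundle lists the lines worth inspecting first.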
Troubleshooting setup
- uv is not installed: run curl -LsSf https://astral.sh/uv/install.sh | sh, restart your shell, then make setup.
- Python version is too old: install Python 3.11+ (for example, with Homebrew) and rerun make setup.
- Broken .venv: remove it with rm -rf .venv, then rerun make setup.
- Demo ran but you missed the report path: uv run redforge latest-report --path-only --format html.
Run against a scoped HTTP target
For a complete copy/paste FastAPI example, dry-run plan, target wizard, and adapter validation flow, see examples/custom_http_fastapi/ and docs/custom_http_target.md.
uv run redforge run-http \
--url https://example.test/chat \
--allow-host example.test \
--attack-budget 5
Run against an OpenAI-compatible API
Copy .env.example, then follow docs/openai_compatible_target.md for base URL, model, headers, evidence, and allow-host templates.
OPENAI_API_KEY=sk-... uv run redforge run-openai \
--base-url https://api.openai.com/v1 \
--model gpt-5 \
--allow-host api.openai.com \
--attack-catalog datasets/seed_attacks/default.md
Provider defaults can be configured without code changes:
export REDFORGE_OPENAI_MODEL="gpt-5"
export REDFORGE_ANTHROPIC_MODEL="claude-sonnet-4-6"
export REDFORGE_GEMINI_MODEL="gemini-3.1-pro-preview"
Service mode
uv run uvicorn redforge_api.main:app --host 127.0.0.1 --port 8000
Useful endpoints:
- GET /health
- POST /v1/campaigns/local-demo/run
- POST /v1/campaigns/openai-compatible/run
- POST /v1/campaigns/custom-http/run
- GET /campaigns
- GET /campaigns/{campaign_id}
- GET /campaigns/{campaign_id}/report
- GET /campaigns/{campaign_id}/trace?limit=20
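A small client for the read-only endpoints can be written with the standard library alone. The sketch below takes the host and port from the uvicorn command above and the trace path from the endpoint list. The helper names `trace_url` and `get_json` are hypothetical, and it assumes (without confirmation from this README) that the endpoints return JSON bodies.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # host and port from the uvicorn command


def trace_url(campaign_id: str, limit: int = 20, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /campaigns/{campaign_id}/trace?limit=N."""
    path = f"/campaigns/{quote(campaign_id, safe='')}/trace"
    return f"{base_url}{path}?{urlencode({'limit': limit})}"


def get_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the service running, `get_json(f"{BASE_URL}/health")` is a reasonable first call, followed by `get_json(trace_url("<campaign_id>"))` for a specific campaign.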
Architecture
flowchart LR
CLI["CLI"] --> Control["Control Plane"]
API["FastAPI Service"] --> Control
Worker["Worker"] --> Engine["Campaign Engine"]
Control --> Engine
Engine --> Scope["Scope Guard"]
Engine --> Attacks["Attack Packs"]
Engine --> Targets["Target Adapters"]
Engine --> Eval["Evaluators"]
Targets --> Obs["Observations"]
Obs --> Evidence["Evidence Trace"]
Eval --> Evidence
Evidence --> Reports["Reports"]
Plugins["External Plugins"] --> Attacks
Plugins --> Targets
Workspace packages:
packages/redforge-core core models, scope guard, evidence, metrics, reports
packages/redforge-plugins plugin SDK, registry, loader, capabilities
packages/redforge-attacks-basic community baseline attack pack
packages/redforge-targets target adapters and demo runtime surfaces
packages/redforge-engine campaign orchestration, evaluators, memory, planning
packages/redforge-control projects, findings, live sessions, CI/policy services
packages/redforge-api FastAPI service
packages/redforge-cli CLI entrypoints
packages/redforge-worker worker process
See docs/architecture.md and docs/modular_monolith.md for more detail.
Built-in public attack catalog
The public catalog lives under:
- datasets/seed_attacks/default.md
- datasets/seed_attacks/default.json
It contains community/basic payloads for authorized AI security evaluation, regression testing, and local demo targets. Real target runs should always use explicit scope settings such as target ID, allowed hosts, allowed suites, allowed tools, attack budget, and authorization metadata.
Project and target registry
Concept map:
flowchart LR
Project["Project (team/application boundary)"] --> Target["Target (chat API, RAG API, agent, or model endpoint)"]
Target --> Campaign["Campaign (scoped attack run)"]
Campaign --> Trace["Trace (payloads, target responses, tool/RAG/memory evidence)"]
Campaign --> Report["Report (executive summary, findings, replay, remediation)"]
Report --> Finding["Finding lifecycle (open, accepted-risk, fixed, false-positive)"]
Create reusable scope configuration:
uv run redforge create-project --project-id demo --name "Demo Project"
uv run redforge register-target \
--project-id demo \
--target-id local \
--target-type local_demo \
--attack-budget 3
uv run redforge run-target --project-id demo --target-id local
Register a multi-agent system:
uv run redforge register-target \
--project-id demo \
--target-id ma-system \
--target-type multi_agent_system \
--agents-json '[{"agent_id":"planner","role":"planner"},{"agent_id":"executor","role":"executor"}]'
Multi-agent runs can record handoffs, trust-boundary crossings, approval checkpoints, tool authorization decisions, blocked side effects, and replayable evidence traces.
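Hand-writing the --agents-json value becomes error-prone as the agent list grows. The sketch below generates it from Python data and prints a shell-quoted command. It uses only the flags and the `agent_id`/`role` keys shown in the command above; whether the CLI accepts further keys is not something this README states.

```python
import json
import shlex

agents = [
    {"agent_id": "planner", "role": "planner"},
    {"agent_id": "executor", "role": "executor"},
]

# Compact separators reproduce the single-line JSON form used on the command line.
agents_json = json.dumps(agents, separators=(",", ":"))

argv = [
    "uv", "run", "redforge", "register-target",
    "--project-id", "demo",
    "--target-id", "ma-system",
    "--target-type", "multi_agent_system",
    "--agents-json", agents_json,
]
print(shlex.join(argv))  # shell-quoted command line, ready to paste
```

Passing `argv` directly to `subprocess.run` avoids shell quoting altogether, which is the safer route when agent names come from another system.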
How RedForge compares
RedForge complements existing AI evaluation tools:
- use model scanners for broad provider/model probing;
- use prompt evaluation frameworks for CI gates over expected outputs;
- use model-evaluation frameworks for benchmark-style studies of base model capabilities;
- use RedForge for scoped, replayable, application-level campaigns across prompts, RAG, tools, memory, and agent side effects.
Roadmap and contributing
The MVP is runnable today. The next public milestones focus on easier adoption, better reports, richer safe community attack cases, target-adapter examples, and CI/policy workflows.
- ROADMAP.md describes the post-MVP direction.
- CONTRIBUTING.md explains public contribution boundaries and local validation.
- SECURITY.md explains vulnerability reporting and responsible-use expectations.
- docs/good_first_issues.md lists starter tasks for contributors.
- docs/public_launch_checklist.md is a pre-launch checklist for sharing the project.
- docs/extensible_skills_mcp.md shows how users can install their own skill and MCP manifests.
Project scope
This repository focuses on the reusable open-source runtime: scoped campaign execution, target adapters, evidence traces, reports, plugin interfaces, and community/basic attack packs.
The project is intentionally scoped for authorized AI security evaluation workflows. Destructive automation, unscoped offensive workflows, and environment-specific deployment overlays are not part of this repository.
CI enforces repository hygiene: scripts/check_public_private_split.py blocks local agent instructions, generated evidence, workstation paths, secrets, and out-of-scope implementation references from tracked files.
Appropriate use
| Use RedForge for | Do not use RedForge for |
|---|---|
| Authorized AI security evaluation of systems you own or operate | Unscoped scanning of third-party systems |
| Scoped campaigns with allow-listed targets and bounded budgets | C2, exploitation, persistence, or access-control bypass outside authorization |
| Evidence-first reports, replay, CI regression, and remediation validation | Claims that a system is absolutely safe |
| Team triage for prompt injection, RAG, tool, memory, and agent risks | Uploading sensitive target evidence without review |
Compliance-friendly pattern: define written authorization, register the target, set --allow-host, run a bounded campaign, review artifacts locally, triage findings, then rerun the same cases to verify fixes.
Development
uv run ruff check .
uv run python scripts/check_migration_gates.py
uv run pytest -q
Project status
RedForge is in early preview. Public APIs, schemas, and CLI commands may change before a stable release. The current repository is designed to be runnable from source and suitable for local experiments, demos, and authorized evaluation workflows.
Community
- LINUX DO — developer and open-source community.
Responsible use
RedForge AI is a dual-use security evaluation tool. Use it only for systems you own, operate, or have explicit written authorization to test. Do not use RedForge to attack third-party systems, bypass access controls, exfiltrate data, or perform destructive actions outside an approved scope.
RedForge never reports that a system is absolutely safe. When no confirmed finding is discovered, it uses this language:
No confirmed finding was discovered under the current scope, budget, attack strategy, and coverage.
Campaign presets and one-file config
Start with presets instead of selecting suites manually:
uv run redforge run-http --preset smoke # CI default, ~3 requests, first connectivity/risk check
uv run redforge run-http --preset standard # regular pre-release review, ~8 requests
uv run redforge run-http --preset deep # scheduled/manual broad coverage, ~20 requests
For repeatable team runs, use one file:
uv run redforge run --config redforge.example.yml
The config contains target settings, scope, attack budget, suites, CI gate, report options, and artifact retention hints.
Local report viewer
uv run redforge report serve --open
uv run redforge report open latest
The viewer supports latest campaign loading, finding filtering, trace viewing, and replay command copy buttons.