RedForge AI is an evidence-first red-teaming framework for LLM applications, RAG systems, AI agents, tool use, memory, and model supply-chain surfaces.

It is built for teams that need more than a list of jailbreak prompts: RedForge runs scoped campaigns, records replayable evidence, evaluates findings, and generates reports that developers and security reviewers can act on.

RedForge is not a C2 framework, not a generic web scanner, and not a guarantee that a system is safe. It is a controlled evaluation harness for AI security work you are authorized to perform.

What it does

  • Scoped campaigns: explicit target configuration, allow-listed hosts, attack budgets, and authorization metadata.
  • Evidence-first traces: payloads, model responses, retrieved documents, tool calls, memory mutations, side effects, evaluator reasoning, and report references.
  • AI-native attack surfaces: prompt injection, jailbreaks, RAG poisoning, tool abuse, memory poisoning, bounded leakage/extraction-resistance checks, configuration drift, and multi-agent trust boundaries.
  • Local and service modes: run from the CLI or through a FastAPI control-plane service.
  • Extensible architecture: public plugin interfaces, target adapters, attack packs, reports, and schemas.

Quick start

3 steps to first value

make setup   # checks Python, uv, and the project venv; prints fix commands if blocked
make demo    # runs a scoped local campaign
uv run redforge doctor
open "$(uv run redforge latest-report --path-only --format html)"

First thing to open: the HTML report at the path printed by make demo. The demo is successful when:

  • the terminal prints HTML_REPORT: artifacts/reports/<campaign_id>.html;
  • the report shows at least the Results table plus finding cards with payload, trace, raw response, replay command, and fix guidance;
  • the summary includes confirmed / suspected / blocked counts and a highest-risk finding;
  • the trace exists at artifacts/traces/<campaign_id>.jsonl.
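The file-based checks in this list can be scripted. The following is a minimal sketch, not part of RedForge: it assumes the default artifacts/ layout described in this README, and the helper name demo_artifacts is hypothetical.

```python
from pathlib import Path


def demo_artifacts(campaign_id: str, root: str = "artifacts") -> dict[str, bool]:
    """Report which expected demo outputs exist for a campaign.

    Hypothetical helper: the paths mirror the ones listed in this README.
    """
    base = Path(root)
    expected = {
        "html_report": base / "reports" / f"{campaign_id}.html",
        "trace": base / "traces" / f"{campaign_id}.jsonl",
    }
    # A file counts as present only if it exists and is non-empty.
    return {
        name: path.is_file() and path.stat().st_size > 0
        for name, path in expected.items()
    }
```

Call demo_artifacts("<campaign_id>") from the repository root after make demo. It only confirms that the files exist; the report content still needs a manual look.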

Requirements

  • Python 3.11+
  • uv

Run the MVP demo

git clone https://github.com/Aimer-zero/redforge-ai.git
cd redforge-ai
make setup
make demo

Or run the same flow through the dedicated smoke script:

./scripts/demo_mvp.sh

The demo initializes local state, runs a scoped campaign against the built-in vulnerable demo agent, records trace evidence, and prints the generated report paths. Look for:

artifacts/reports/<campaign_id>.md
artifacts/reports/<campaign_id>.html

See docs/demo_walkthrough.md for what to inspect after the run.

For larger budgets or slower remote targets, add --progress to print live status while requests are in flight:

uv run redforge run-demo --attack-budget 20 --progress
uv run redforge run-http --url https://example.test/chat --allow-host example.test --attack-budget 20 --progress

Look up the latest report path, the latest campaign, or campaign status at any time:

uv run redforge latest-report --path-only --format html
uv run redforge latest-campaign
uv run redforge campaign-status

Bundle a campaign for handoff or archival:

uv run redforge export-campaign            # latest campaign
uv run redforge export-campaign <campaign_id> --out redforge-campaign.zip

Start the API

make api
# then open http://127.0.0.1:8000/health

API smoke test:

make smoke-api
# or
./scripts/smoke_api.sh

For the intentionally vulnerable local agent used by demos:

make agent

Useful local commands

make test        # ruff + migration gates + pytest
make build       # build all workspace packages
make clean       # remove generated artifacts/caches

Campaign artifacts are written to artifacts/ by default:

artifacts/
  campaigns/   campaign summaries
  traces/      JSONL evidence traces
  payloads/    replayable payload files
  reports/     Markdown and HTML reports

These files may contain target responses and sensitive evidence from authorized tests. Review before sharing.
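One way to support that review is a quick scan for secret-like strings before sharing anything. This is a rough sketch under stated assumptions: it treats every artifact as plain text and makes no assumption about the trace record schema, the regular expressions are illustrative and far from exhaustive, and the names scan_text and scan_artifacts are hypothetical, not RedForge commands.

```python
import re
from pathlib import Path

# Illustrative patterns only; extend them for the secrets your targets use.
SECRET_PATTERNS = {
    "openai_style_key": re.compile(r"sk-[A-Za-z0-9_-]{16,}"),
    "bearer_token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/-]{16,}=*"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


def scan_artifacts(root: str = "artifacts") -> dict[str, list[tuple[int, str]]]:
    """Scan every file under the artifacts directory, keyed by path."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            hits = scan_text(path.read_text(encoding="utf-8", errors="replace"))
            if hits:
                results[str(path)] = hits
    return results
```

An empty result does not mean the artifacts are safe to share. Target responses can contain sensitive data that no pattern will catch.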

Troubleshooting setup

  • uv is not installed: run curl -LsSf https://astral.sh/uv/install.sh | sh, restart your shell, then make setup.
  • Python version is too old: install Python 3.11+ (for example brew install python@3.11) and rerun make setup.
  • Broken .venv: remove it with rm -rf .venv, then rerun make setup.
  • Demo ran but you missed the report path: uv run redforge latest-report --path-only --format html.

Run against a scoped HTTP target

For a complete copy/paste FastAPI example, dry-run plan, target wizard, and adapter validation flow, see examples/custom_http_fastapi/ and docs/custom_http_target.md.

uv run redforge run-http \
  --url https://example.test/chat \
  --allow-host example.test \
  --attack-budget 5

Run against an OpenAI-compatible API

Copy .env.example, then follow docs/openai_compatible_target.md for base URL, model, headers, evidence, and allow-host templates.

OPENAI_API_KEY=sk-... uv run redforge run-openai \
  --base-url https://api.openai.com/v1 \
  --model gpt-5 \
  --allow-host api.openai.com \
  --attack-catalog datasets/seed_attacks/default.md

Provider defaults can be configured without code changes:

export REDFORGE_OPENAI_MODEL="gpt-5"
export REDFORGE_ANTHROPIC_MODEL="claude-sonnet-4-6"
export REDFORGE_GEMINI_MODEL="gemini-3.1-pro-preview"

Service mode

uv run uvicorn redforge_api.main:app --host 127.0.0.1 --port 8000

Useful endpoints:

  • GET /health
  • POST /v1/campaigns/local-demo/run
  • POST /v1/campaigns/openai-compatible/run
  • POST /v1/campaigns/custom-http/run
  • GET /campaigns
  • GET /campaigns/{campaign_id}
  • GET /campaigns/{campaign_id}/report
  • GET /campaigns/{campaign_id}/trace?limit=20
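The read-only endpoints can be called from a short script. The sketch below uses only the Python standard library and assumes the service is listening on the address used by make api. The response format is not documented in this README, so decoding the body as JSON is an assumption, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # address used by `make api` in this README


def trace_url(campaign_id: str, limit: int = 20, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /campaigns/{campaign_id}/trace?limit=N."""
    if limit < 1:
        raise ValueError("limit must be positive")
    # Percent-encode the ID so it cannot add extra path segments.
    path = f"/campaigns/{quote(campaign_id, safe='')}/trace"
    return f"{base_url}{path}?{urlencode({'limit': limit})}"


def get_json(url: str, timeout: float = 5.0):
    """Fetch a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, get_json(f"{BASE_URL}/health") checks liveness and get_json(trace_url("<campaign_id>", limit=5)) fetches the first trace entries.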

Architecture

flowchart LR
  CLI["CLI"] --> Control["Control Plane"]
  API["FastAPI Service"] --> Control
  Worker["Worker"] --> Engine["Campaign Engine"]
  Control --> Engine
  Engine --> Scope["Scope Guard"]
  Engine --> Attacks["Attack Packs"]
  Engine --> Targets["Target Adapters"]
  Engine --> Eval["Evaluators"]
  Targets --> Obs["Observations"]
  Obs --> Evidence["Evidence Trace"]
  Eval --> Evidence
  Evidence --> Reports["Reports"]
  Plugins["External Plugins"] --> Attacks
  Plugins --> Targets

Workspace packages:

packages/redforge-core           core models, scope guard, evidence, metrics, reports
packages/redforge-plugins        plugin SDK, registry, loader, capabilities
packages/redforge-attacks-basic  community baseline attack pack
packages/redforge-targets        target adapters and demo runtime surfaces
packages/redforge-engine         campaign orchestration, evaluators, memory, planning
packages/redforge-control        projects, findings, live sessions, CI/policy services
packages/redforge-api            FastAPI service
packages/redforge-cli            CLI entrypoints
packages/redforge-worker         worker process

See docs/architecture.md and docs/modular_monolith.md for more detail.

Built-in public attack catalog

The public catalog lives under:

  • datasets/seed_attacks/default.md
  • datasets/seed_attacks/default.json

It contains community/basic payloads for authorized AI security evaluation, regression testing, and local demo targets. Real target runs should always use explicit scope settings such as target ID, allowed hosts, allowed suites, allowed tools, attack budget, and authorization metadata.
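To make the idea of explicit scope concrete, here is a minimal sketch of a host allow-list and budget check. It is not RedForge's scope guard: the Scope class, its field names, and both functions are hypothetical and only illustrate the kind of checks that explicit scope settings make possible.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope record; field names are illustrative."""

    target_id: str
    allowed_hosts: frozenset[str]
    attack_budget: int


def host_allowed(url: str, scope: Scope) -> bool:
    """Exact-match the URL's hostname against the allow-list."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
    return host is not None and host in scope.allowed_hosts


def within_budget(requests_sent: int, scope: Scope) -> bool:
    """True while another request still fits in the attack budget."""
    return requests_sent < scope.attack_budget
```

Exact hostname matching is a deliberate choice in this sketch. Suffix or substring matching would accept look-alike hosts such as example.test.evil.com.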

Project and target registry

Concept map:

flowchart LR
  Project["Project (team/application boundary)"] --> Target["Target (chat API, RAG API, agent, or model endpoint)"]
  Target --> Campaign["Campaign (scoped attack run)"]
  Campaign --> Trace["Trace (payloads, target responses, tool/RAG/memory evidence)"]
  Campaign --> Report["Report (executive summary, findings, replay, remediation)"]
  Report --> Finding["Finding lifecycle (open, accepted-risk, fixed, false-positive)"]

Create reusable scope configuration:

uv run redforge create-project --project-id demo --name "Demo Project"
uv run redforge register-target \
  --project-id demo \
  --target-id local \
  --target-type local_demo \
  --attack-budget 3
uv run redforge run-target --project-id demo --target-id local

Register a multi-agent system:

uv run redforge register-target \
  --project-id demo \
  --target-id ma-system \
  --target-type multi_agent_system \
  --agents-json '[{"agent_id":"planner","role":"planner"},{"agent_id":"executor","role":"executor"}]'

Multi-agent runs can record handoffs, trust-boundary crossings, approval checkpoints, tool authorization decisions, blocked side effects, and replayable evidence traces.
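Hand-writing the --agents-json value gets error-prone as the agent list grows. A small helper can generate it; this sketch assumes only the agent_id and role keys shown in the command above, and the duplicate-ID check is an assumption of the sketch, not a documented RedForge rule.

```python
import json
import shlex


def agents_json(agents: list[tuple[str, str]]) -> str:
    """Serialize (agent_id, role) pairs in the shape shown above."""
    seen = set()
    records = []
    for agent_id, role in agents:
        if not agent_id or not role:
            raise ValueError("agent_id and role must be non-empty")
        if agent_id in seen:
            raise ValueError(f"duplicate agent_id: {agent_id}")
        seen.add(agent_id)
        records.append({"agent_id": agent_id, "role": role})
    # Compact separators keep the value on one line without extra spaces.
    return json.dumps(records, separators=(",", ":"))


def shell_argument(agents: list[tuple[str, str]]) -> str:
    """Quote the JSON so it can be pasted after --agents-json in a POSIX shell."""
    return shlex.quote(agents_json(agents))
```

For example, shell_argument([("planner", "planner"), ("executor", "executor")]) yields the single-quoted value used in the register-target command above.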

How RedForge compares

RedForge complements existing AI evaluation tools:

  • use model scanners for broad provider/model probing;
  • use prompt evaluation frameworks for CI gates over expected outputs;
  • use model-evaluation frameworks for benchmark-style studies of base model capabilities;
  • use RedForge for scoped, replayable, application-level campaigns across prompts, RAG, tools, memory, and agent side effects.

Roadmap and contributing

The MVP is runnable today. The next public milestones focus on easier adoption, better reports, richer safe community attack cases, target-adapter examples, and CI/policy workflows.

Project scope

This repository focuses on the reusable open-source runtime: scoped campaign execution, target adapters, evidence traces, reports, plugin interfaces, and community/basic attack packs.

The project is intentionally scoped for authorized AI security evaluation workflows. Destructive automation, unscoped offensive workflows, and environment-specific deployment overlays are not part of this repository.

CI enforces repository hygiene: scripts/check_public_private_split.py blocks local agent instructions, generated evidence, workstation paths, secrets, and out-of-scope implementation references from tracked files.

Appropriate use

| Use RedForge for | Do not use RedForge for |
| --- | --- |
| Authorized AI security evaluation of systems you own or operate | Unscoped scanning of third-party systems |
| Scoped campaigns with allow-listed targets and bounded budgets | C2, exploitation, persistence, or access-control bypass outside authorization |
| Evidence-first reports, replay, CI regression, and remediation validation | Claims that a system is absolutely safe |
| Team triage for prompt injection, RAG, tool, memory, and agent risks | Uploading sensitive target evidence without review |

Compliance-friendly pattern: define written authorization, register the target, set --allow-host, run a bounded campaign, review artifacts locally, triage findings, then rerun the same cases to verify fixes.

Development

uv run ruff check .
uv run python scripts/check_migration_gates.py
uv run pytest -q

Project status

RedForge is in early preview. Public APIs, schemas, and CLI commands may change before a stable release. The current repository is designed to be runnable from source and suitable for local experiments, demos, and authorized evaluation workflows.

Community

  • LINUX DO — developer and open-source community.

Responsible use

RedForge AI is a dual-use security evaluation tool. Use it only for systems you own, operate, or have explicit written authorization to test. Do not use RedForge to attack third-party systems, bypass access controls, exfiltrate data, or perform destructive actions outside an approved scope.

RedForge never reports that a system is absolutely safe. When no confirmed finding is discovered, it uses this language:

No confirmed finding was discovered under the current scope, budget, attack strategy, and coverage.
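Downstream tooling that summarizes campaigns can follow the same convention. The sketch below is illustrative only: the function name, the tally format, and the wording for the case with confirmed findings are assumptions. Only the no-finding sentence is taken from this README.

```python
NO_CONFIRMED_FINDING = (
    "No confirmed finding was discovered under the current scope, "
    "budget, attack strategy, and coverage."
)


def summary_line(confirmed: int, suspected: int, blocked: int) -> str:
    """Summarize counts without ever claiming the target is safe."""
    counts = (confirmed, suspected, blocked)
    if any(n < 0 for n in counts):
        raise ValueError("counts must be non-negative")
    tally = f"confirmed={confirmed}, suspected={suspected}, blocked={blocked}"
    if confirmed == 0:
        # Scoped wording: absence of findings is tied to this run's limits.
        return f"{NO_CONFIRMED_FINDING} ({tally})"
    return f"Confirmed findings were discovered. ({tally})"
```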

Campaign presets and one-file config

Start with presets instead of selecting suites manually:

uv run redforge run-http --preset smoke    # CI default, ~3 requests, first connectivity/risk check
uv run redforge run-http --preset standard # regular pre-release review, ~8 requests
uv run redforge run-http --preset deep     # scheduled/manual broad coverage, ~20 requests
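When a pipeline has a fixed request budget, the preset can be chosen programmatically. This sketch uses the approximate request counts from the comments above; they are estimates, not guarantees, and the helper name is hypothetical.

```python
from typing import Optional

# Approximate request counts per preset, taken from the comments above.
PRESET_REQUESTS = {"smoke": 3, "standard": 8, "deep": 20}


def largest_preset_within(max_requests: int) -> Optional[str]:
    """Return the broadest preset whose approximate cost fits the budget."""
    fitting = [
        (cost, name)
        for name, cost in PRESET_REQUESTS.items()
        if cost <= max_requests
    ]
    # max() compares by cost first, so the most expensive fitting preset wins.
    return max(fitting)[1] if fitting else None
```

A budget of 10 requests selects standard, and a budget below 3 returns None, which a caller can treat as "do not run".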

For repeatable team runs, use one file:

uv run redforge run --config redforge.example.yml

The config contains target settings, scope, attack budget, suites, CI gate, report options, and artifact retention hints.

Local report viewer

uv run redforge report serve --open
uv run redforge report open latest

The viewer supports latest campaign loading, finding filtering, trace viewing, and replay command copy buttons.