Train. Break. Defend. AI Systems.

An open-source platform for AI security training, red/blue teaming, CTF, benchmarking, and research. Runs 100% locally. No cloud, no paid APIs, no data leaves your machine.


Prerequisites

  • Docker 24+ and Docker Compose v2
  • 16 GB RAM minimum (Ollama loads models into memory; 8 GB works for small models only)
  • 20 GB free disk space for base images and at least one model
  • x86-64 CPU; Apple Silicon and ARM64 are supported but untested

GPU is optional. Ollama runs on CPU but inference will be slow without one.


Quick Start

git clone https://github.com/sonuoffsec/DVAP
cd DVAP
cp .env.example .env
docker compose up -d

Open http://localhost:8080 once all containers are healthy. First run takes 30-60 seconds.


What is DVAP?

DVAP is an open-source AI security research, training, benchmarking, and red teaming platform designed to help security professionals, AI engineers, researchers, students, and organizations understand how modern AI systems fail and how to defend them.

Built for the AI era, DVAP provides intentionally vulnerable AI applications, agents, RAG systems, MCP integrations, and domain-specific environments that can be attacked, analyzed, benchmarked, and secured.

Unlike cloud-based AI playgrounds, DVAP runs entirely on your machine.

No cloud. No subscriptions. No API costs. No vendor lock-in.


Why DVAP?

Modern AI applications introduce entirely new attack surfaces:

  • Prompt Injection
  • Memory Poisoning
  • RAG Poisoning
  • Tool Abuse
  • MCP Exploitation
  • Multi-Agent Attacks
  • Autonomous Agent Manipulation
  • Data Exfiltration
  • Identity and Trust Failures
  • AI Supply Chain Risks

Yet there is no single platform that allows researchers to safely learn, practice, benchmark, and validate these attacks in one place.

DVAP aims to become the definitive open-source platform for AI security education, research, and experimentation.


How DVAP Differs

DVAP DVWA HackTheBox Gandalf (Lakera) Blog Posts / Papers
AI-specific vulnerabilities Yes No Partial Partial Yes (theory only)
Local, no cloud Yes Yes No No N/A
15 dedicated AI labs Yes No No No No
LLM benchmark engine Yes No No No No
CTF with flags Yes Yes Yes No No
OWASP LLM Top 10 coverage Full No Partial Partial Varies
MITRE ATLAS mapping Yes No No No Varies
Report generation Yes No No No No
Research workspace Yes No No No No
Agent and MCP security Yes No No No No
Free and open source Yes Yes Partial No Yes

DVAP is one of the first platforms to combine hands-on AI attack labs, local LLM benchmarking, CTF challenges, and professional reporting in a single self-hosted environment.


Key Features

AI Security Labs 15 intentionally vulnerable labs covering real-world AI attack techniques.

Research Workspace Inspect prompts, memory, tool calls, retrieved documents, agent actions, and attack chains.

Security Benchmarking Evaluate local and external models against AI security attack suites.

Capture The Flag (CTF) Learn AI security through guided challenges, flags, hints, and walkthroughs.

Reporting Engine Generate professional findings and benchmark reports mapped to OWASP LLM Top 10, MITRE ATLAS, CWE, and CVSS.

100% Local Run everything on your own machine. Your prompts, data, findings, and experiments never leave your environment.


Platform Overview

DVAP Platform Overview


๐ŸŽฌ Demo

DVAP Demo โ€” Train. Break. Defend. AI Systems.

โ–ถ Train. Break. Defend. AI Systems.  ยท  Click to watch the full attack walkthrough


๐Ÿ“ธ Screenshots


๐Ÿ–ฅ๏ธ Research Command Center

DVAP Research Command Center

Research Command Center  ยท  Live platform activity, model benchmark leaderboard, flag capture progress, and service health in a single operational view.


๐Ÿงฉ Platform Modules


AI Security Labs

โš—๏ธ AI Security Labs
15 containerized vulnerable AI environments across every major attack class
 
CTF Challenges

๐Ÿšฉ CTF Challenges
Flags, hints, and walkthroughs mapped to OWASP LLM Top 10 and MITRE ATLAS
 
Benchmark Center

๐Ÿ“Š Benchmark Center
Evaluate local LLMs against prompt injection, jailbreak, and data exfiltration suites
 
Research Workspace

๐Ÿ”ฌ Research Workspace
Full trace recording of prompts, memory, tool calls, and agent behavior
 


๐Ÿ›ก๏ธ Findings Dashboard

DVAP Findings Dashboard

Findings Dashboard  ยท  Track, triage, and document security findings with severity ratings, OWASP LLM Top 10 mapping, MITRE ATLAS techniques, and export-ready reports.

Labs

15 containerized labs, each with flags, hints, walkthrough, and OWASP LLM Top 10 + MITRE ATLAS mapping.

Lab Difficulty OWASP LLM MITRE ATLAS
Prompt Injection Beginner LLM01 AML.T0051, AML.T0054
Memory Poisoning Intermediate LLM02 AML.T0054
RAG Poisoning Intermediate LLM02, LLM03 AML.T0020, AML.T0043
Tool Output Injection Intermediate LLM07 AML.T0054, AML.T0068
MCP Security Advanced LLM07 AML.T0068
Browser Agent Security Advanced LLM07, LLM09 AML.T0054
Multi-Agent Security Advanced LLM08 AML.T0054
Autonomous Agent Security Advanced LLM08, LLM09 AML.T0054
Data Exfiltration Advanced LLM06 AML.T0057, AML.T0058
Agent Identity and Trust Abuse Advanced LLM08 AML.T0058
AI Banking Platform Intermediate LLM01, LLM06 AML.T0043
AI Healthcare Environment Advanced LLM01, LLM06 AML.T0043
Multi-Tenant AI SaaS Advanced LLM06 AML.T0043
AI Supply Chain Security Expert LLM03, LLM05 AML.T0010, AML.T0048
AI Developer Platform Expert LLM03, LLM07 AML.T0010, AML.T0068

Each lab runs in an isolated Docker container with its own Ollama-backed LLM endpoint.


OWASP LLM Top 10 Coverage

DVAP is aligned with the OWASP Top 10 for Large Language Model Applications. Each category maps to one or more dedicated labs.

Category Name Labs Status
LLM01 Prompt Injection Prompt Injection, AI Banking Platform, AI Healthcare Environment Covered
LLM02 Data and Memory Poisoning Memory Poisoning, RAG Poisoning Covered
LLM03 Supply Chain and Training Data Risks RAG Poisoning, AI Supply Chain Security, AI Developer Platform Covered
LLM04 Model Denial of Service Planned (v1.2)
LLM05 Insecure Supply Chain AI Supply Chain Security Covered
LLM06 Sensitive Information Disclosure Data Exfiltration, AI Banking Platform, AI Healthcare Environment, Multi-Tenant AI SaaS Covered
LLM07 Insecure Plugin Design Tool Output Injection, MCP Security, Browser Agent Security, AI Developer Platform Covered
LLM08 Excessive Agency Multi-Agent Security, Autonomous Agent Security, Agent Identity and Trust Abuse Covered
LLM09 Overreliance Browser Agent Security, Autonomous Agent Security Covered
LLM10 Model Theft Planned (v1.2)

8 of 10 categories covered across 15 labs. LLM04 and LLM10 are on the v1.2 roadmap.


Architecture

graph TB
    User([User Browser]) --> Nginx[Nginx :8080]
    Nginx --> Web[Next.js Frontend :3000]
    Nginx --> API[FastAPI Backend :8000]

    API --> PG[(PostgreSQL)]
    API --> Redis[(Redis)]
    API --> Qdrant[(Qdrant)]
    API --> Sock[Docker Socket]

    Sock --> L1[Lab Container]
    Sock --> L2[Lab Container]
    Sock --> LN[Lab Container ...]

    L1 --> Ollama[Ollama :11434]
    L2 --> Ollama
    LN --> Ollama

    subgraph dvap-internal network
        Web
        API
        PG
        Redis
        Qdrant
    end

    subgraph dvap-labs network
        L1
        L2
        LN
        Ollama
    end

Lab containers are isolated on a separate Docker network. They can reach Ollama for LLM inference but cannot reach the database, Redis, or Qdrant.


Security Architecture

Network Isolation

Two Docker networks keep lab traffic separate from platform infrastructure:

  • dvap-internal (172.20.0.0/24) - PostgreSQL, Redis, Qdrant, API, frontend, Nginx
  • dvap-labs (172.21.0.0/24) - lab containers and Ollama

Lab containers can reach Ollama and nothing else on the internal network. They cannot reach PostgreSQL, Redis, or Qdrant.

Docker Socket Access

Known tradeoff: The API container mounts /var/run/docker.sock to spawn and stop lab containers on demand (Docker-out-of-Docker). This grants the API process root-equivalent access to the host Docker daemon.

This is an intentional design decision. DVAP is a local single-user install for security research and training, not a multi-tenant service. The tradeoff is accepted because:

  • There is no network-accessible admin interface that could trigger arbitrary container operations
  • Lab container resource limits (512 MB RAM, 0.5 CPU) prevent resource exhaustion
  • Lab images are built from controlled Dockerfiles in this repository

If you are deploying DVAP in a shared or networked environment, replace the socket mount with a rootless Docker socket or Podman socket (/run/user/1000/podman/podman.sock) and restrict API network access accordingly.

Instance TTL

Lab instances stop automatically after 1 hour via Redis TTL keys. Call POST /api/v1/instances/cleanup to trigger early cleanup.

Rate Limiting

Flag submissions are rate-limited to 15 attempts per 60-second window per session token.


Services

Service Port (internal) Purpose
PostgreSQL 5432 Primary datastore
Redis 6379 Rate limiting, instance TTL
Qdrant 6333 Semantic search over findings
Ollama 11434 Local LLM inference
API 8000 FastAPI backend
Web 3000 Next.js frontend
Nginx 8080 (host) Reverse proxy

Environment Variables

See .env.example for all variables. Key ones to change before any networked deployment:

SECRET_KEY=          # strong random value for HMAC signing
POSTGRES_PASSWORD=   # change from the default
REDIS_PASSWORD=      # change from the default

Upgrading

One command brings your install up to date:

make upgrade

This runs git pull then rebuilds and restarts all containers. Database migrations run automatically on every API container start.

Without make:

git pull
docker compose up -d --build

What upgrades automatically

  • Backend code and API endpoints
  • Frontend dashboard
  • Database schema (Alembic migrations)
  • Lab definitions

What is never touched

  • Your data (findings, research sessions, benchmark results, campaigns)
  • Your .env file

Check for new environment variables

diff .env .env.example

Development vs Production

# Development (default) - auto-loads docker-compose.override.yml
# Hot reload for API and frontend, source mounted as volumes
docker compose up -d

# Production - baked images, no volume mounts, 4 uvicorn workers
docker compose -f docker-compose.yml up -d

Build images before the production run:

docker build -t dvap-api:latest --target production ./backend
docker build -t dvap-web:latest --target production ./frontend

Running Tests

Tests require a PostgreSQL instance. Start the stack first:

docker compose up -d postgres
export TEST_DATABASE_URL=postgresql+asyncpg://dvap:<your-postgres-password>@localhost:5432/dvap_test
cd backend
pip install -e ".[dev]"
pytest

Roadmap

v1.1 - Expanded Lab Coverage

  • LangChain agent security lab
  • CrewAI multi-agent security lab
  • LlamaIndex RAG security lab
  • AutoGPT-style autonomous agent lab

v1.2 - Enhanced Benchmarking

  • Support for OpenAI and Anthropic API models
  • Custom benchmark suite builder
  • Benchmark comparison reports across model versions
  • Automated regression testing for model security

v1.3 - Platform Improvements

  • Multi-user support with session isolation
  • VS Code extension for research workspace
  • GitHub Actions integration for CI/CD security testing
  • Lab difficulty progression system

v2.0 - AI Red Team Automation

  • Automated attack chain generation
  • AI-assisted vulnerability discovery
  • Red team campaign templates
  • Integration with popular security tools (Burp Suite, Metasploit)

Want to contribute to the roadmap? Open an issue or start a discussion.


Governance

DVAP is developed and maintained by Sonu Chaudhary.

Community contributions are welcome and governed by the Contributing guidelines. Long-term direction is driven through GitHub Issues and Discussions. Lab additions, feature proposals, and roadmap input are reviewed publicly.

There is no single-point-of-failure risk: the repository is open source under Apache 2.0 and forkable by the community at any time.