Controllable Agent is a Python framework for building AI agents that handle complex tasks through multi-agent collaboration, structured memory, and self-improvement mechanisms. Hosted at Yang1999code/controllable-agent with 25 GitHub stars, it addresses limitations in single-model agents, which often lose context or fail on multi-step projects like implementing a user authentication system from scratch. Instead of one AI grinding through everything sequentially, this framework organizes five specialized agent roles into a team that plans, codes, reviews, coordinates, and memorizes in parallel.
The project uses a three-layer architecture with strict one-way dependencies: the ai/ layer defines core types like Message, Tool, and Context with zero dependencies; agent/ implements 20 interfaces for logic like loops, tools, memory, and hooks; and app/ provides the CLI, 15 built-in tools, model adapters, configuration, and a TUI. This design puts interfaces before implementations (using ABC/Protocol), builds in safety measures like tool budgets and API retries, and improves over time by crystallizing successful tasks into reusable skills rather than by training a model.
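The interface-first idea and the tool budget can be illustrated with a minimal sketch. The names below (`ToolProtocol`, `BudgetedTool`, `BudgetExceeded`, `max_calls`) are hypothetical and not taken from the repository; the sketch only assumes that an interface is declared as a `typing.Protocol` and that a budget caps how many times a tool may be called.

```python
from typing import Callable, Protocol, runtime_checkable


@runtime_checkable
class ToolProtocol(Protocol):
    """Hypothetical stand-in for an interface such as ITool."""

    name: str

    def run(self, argument: str) -> str: ...


class BudgetExceeded(RuntimeError):
    """Raised when a tool is called more often than its budget allows."""


class BudgetedTool:
    """Wraps a plain function and refuses calls once the budget is spent."""

    def __init__(self, name: str, func: Callable[[str], str], max_calls: int) -> None:
        self.name = name
        self._func = func
        self._remaining = max_calls  # assumed budget unit: number of calls

    def run(self, argument: str) -> str:
        if self._remaining <= 0:
            raise BudgetExceeded(f"tool {self.name!r} has no calls left")
        self._remaining -= 1
        return self._func(argument)


def demo() -> list[str]:
    tool = BudgetedTool("upper", str.upper, max_calls=2)
    # Structural check: BudgetedTool never inherits from ToolProtocol.
    assert isinstance(tool, ToolProtocol)
    results = [tool.run("a"), tool.run("b")]
    try:
        tool.run("c")  # third call exceeds the budget of two
    except BudgetExceeded as exc:
        results.append(str(exc))
    return results
```

Because the check is structural, any class with a matching `name` and `run` satisfies the interface, which is one way a lower layer can stay free of dependencies on the layers above it.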
Multi-agent collaboration
The framework divides work among five roles, mimicking a software development team:
| Role | Responsibility |
|---|---|
| Coordinator | Oversees creation, permissions, and unblocks agents |
| Planner | Breaks tasks into subtasks and adjusts plans dynamically, retaining user intent |
| Coder | Writes code, runs tests per plan |
| Reviewer | Pairs with Coder for module-level checks and fixes during development |
| Memorizer | Extracts and stores reusable experiences |
Coder-Reviewer pairs handle small modules iteratively with unlimited intra-pair revisions, while overall integration tests are capped at three rounds before escalating to the user. The number of parallel pairs scales with task complexity, with concurrency controlled by semaphores. Users can interrupt at any time; the Coordinator triages inputs by urgency without halting the main flow. Nesting is limited to two levels to prevent infinite delegation.
For a task like "implement a user authentication module," the Planner splits it into JWT handling, routing, and middleware. Pairs work in parallel: one succeeds on JWT immediately, another debugs routing before passing, and middleware clears review. The Coordinator monitors, followed by integration tests and memorization.
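The scheduling described above can be sketched with `asyncio`. This is an illustration under stated assumptions, not the project's code: `run_pair`, `integrate`, and the semaphore size of 2 are made up, and each code/review cycle is replaced by a no-op.

```python
import asyncio

MAX_PARALLEL_PAIRS = 2      # assumed semaphore size, not from the repo
MAX_INTEGRATION_ROUNDS = 3  # cap stated in the text


async def run_pair(module: str, failures_before_pass: int,
                   gate: asyncio.Semaphore) -> tuple[str, int]:
    """Simulate one Coder-Reviewer pair; returns (module, revisions used)."""
    async with gate:  # at most MAX_PARALLEL_PAIRS pairs run at once
        revisions = 0
        while revisions < failures_before_pass:  # no cap inside a pair
            await asyncio.sleep(0)  # stand-in for one code/review cycle
            revisions += 1
        return module, revisions


def integrate(outcomes: list[bool]) -> str:
    """Walk integration rounds until one passes or the cap is reached."""
    capped = outcomes[:MAX_INTEGRATION_ROUNDS]
    for round_no, passed in enumerate(capped, start=1):
        if passed:
            return f"passed in round {round_no}"
    return "escalate to user"


async def main() -> tuple[list[tuple[str, int]], str]:
    gate = asyncio.Semaphore(MAX_PARALLEL_PAIRS)
    # Module name -> number of failed reviews before the pair passes.
    plan = {"jwt": 0, "routing": 2, "middleware": 0}
    pairs = await asyncio.gather(*(run_pair(m, n, gate) for m, n in plan.items()))
    return list(pairs), integrate([False, True])


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The semaphore bounds how many pairs are active at once without fixing the total number of pairs, while the slice in `integrate` is what enforces the three-round limit before escalation.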
Wiki-style memory system
Memory avoids chaotic chat logs by structuring data like Wikipedia entries. After each task:
- Extracts a digest: title plus 3-5 key points.
- Accumulates five similar digests into a wiki page.
- Queries prioritize wiki pages (broad coverage), then digests (details); if neither holds the information, the agent treats it as unknown rather than hallucinating an answer.
Storage uses Markdown with YAML frontmatter for readability. Four domains separate conversation history, user profiles, agent views, and task states. A lightweight model handles extraction post-task to avoid burdening the main agent. Each agent maintains isolated memory with a shared exchange area.
Interfaces and tools
Twenty interfaces form the backbone, with versions indicating maturity:
- V1: ITool (concurrent-safe with budgets), IModelProvider (OpenAI/Anthropic compatible streaming), IMemoryBackend (L0-L4 layers, jieba keyword search), IHook (22 events), ISkill/ISkillConfig (YAML-loaded matching).
- V2+: Upcoming interfaces such as IPluginAdapter, IPromptBuilder (token-aware), and IWebAutomation (httpx/Playwright).
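As an illustration of how an event-based hook interface like IHook might be used, here is a minimal registry sketch. The `HookRegistry` class, its `on`/`emit` methods, and the event names `before_tool_call` and `after_tool_call` are hypothetical; the source only states that the hook interface covers 22 events, not what they are called or how they are dispatched.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class HookRegistry:
    """Maps event names to handlers and calls them in registration order."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, **payload: Any) -> int:
        """Invoke every handler for the event; returns how many ran."""
        handlers = self._handlers.get(event, [])
        for handler in handlers:
            handler(dict(payload))  # copy so handlers cannot affect each other
        return len(handlers)


log: list[str] = []
hooks = HookRegistry()
# Both event names below are made up for this example.
hooks.on("before_tool_call", lambda p: log.append(f"start {p['tool']}"))
hooks.on("after_tool_call", lambda p: log.append(f"done {p['tool']}"))

hooks.emit("before_tool_call", tool="grep")
hooks.emit("after_tool_call", tool="grep")
```

Emitting an event with no registered handlers is a no-op that returns 0, so the agent loop can fire events unconditionally.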
Fifteen built-in tools cover file operations (read, write, edit, glob, grep), shell (bash), web (web_fetch, web_search, six web_browser_*), delegation (delegate_task), communication (agent_message, cross_agent_read), and memory/skills (memory_store, memory_search, skill_lookup).
Supported models include any provider with an OpenAI- or Anthropic-compatible API: DeepSeek, Qwen, Zhipu, OpenAI, Anthropic.
Getting it running
Requires Python 3.12+. Use conda or venv for isolation.
git clone https://github.com/Yang1999code/controllable-agent.git
cd controllable-agent
pip install -e .
Set up API keys by copying the config template:
cp app/
(The README notes this step; the full path is likely app/config.yaml.example or similar, so check the repo for details.) Edit the config for your model provider, then run the agent via the CLI or the TUI, which visualizes progress in the terminal. Detailed designs are in 多智能体设计.md (multi-agent design) and 我的记忆改进.md (memory improvements).
Use cases
This suits developers tackling intricate, multi-step coding projects where context retention and error correction matter, such as building authentication systems, web apps, or scripts requiring file I/O, web scraping, and testing. The real-time terminal view and interruption support aid oversight during experimentation. AI researchers can extend interfaces for custom agents, memory backends, or web automation.
Comparisons to other agents
Unlike single-agent frameworks that process tasks linearly and can lose track of earlier steps, Controllable Agent parallelizes work across Coder-Reviewer pairs and enforces reviews, which is meant to reduce bugs on modular work. Its memory structures experience into searchable wiki pages rather than raw chat history, with the goal of curbing hallucinations. Compared to heavier multi-agent systems, the 20-interface design stays lightweight, and parallelism has no fixed cap beyond the semaphore gate. Its early-stage V2+ features lag behind mature tools like AutoGen or CrewAI in plugin ecosystems, but the focus on controllability and evolution sets it apart for Python users needing fine-grained oversight.
At 25 stars, this remains an experimental framework best for tinkerers prototyping team-based agents rather than production deployments demanding battle-tested stability. Source code and docs at https://github.com/Yang1999code/controllable-agent.