CAP-X: The Robot Manipulation Benchmark for Coding Agents
CAP-X is a benchmark for robot manipulation that tests whether coding agents can reason about physics, sensors, and sequential planning, not just produce valid Python syntax. Built for real robots such as the Panda and UR5e, it bridges the gap between code generation and embodied action.