CAP-X: The Robot Manipulation Benchmark for Coding Agents
CAP-X is a benchmark for robot manipulation that tests whether coding agents understand physics, sensors, and sequential planning, not just Python syntax. Built for real robots like Panda an...