robotics

2 posts with this tag

DiT4DiT: This is the official code repo for DiT4DiT, a Vision-Action-Model (VAM)

DiT4DiT is an end-to-end Vision-Action-Model (VAM) that jointly processes visual dynamics and physical action sequences to predict robot behavior directly from video. It supports both tabletop manipulation and full humanoid robot control, achieving real-time whole-body control without per-task retra

Administrator 4/27/2026
CAP-X: The Robot Manipulation Benchmark for Coding Agents

CAP-X is a robot manipulation benchmark that tests whether coding agents truly understand physics, sensors, and sequential planning, not just Python syntax. Built for real robots such as the Panda and UR5e, it bridges the gap between code generation and embodied action.

Administrator 4/2/2026