DiT4DiT

1 post with this tag

DiT4DiT: This is the official code repo for DiT4DiT, a Vision-Action-Model (VAM

DiT4DiT is an end-to-end Vision-Action-Model (VAM) that jointly processes visual dynamics and physical action sequences to predict robot behavior directly from video. It supports both tabletop manipulation and full humanoid robot control, achieving real-time whole-body control without per-task retra

Administrator 4/27/2026