Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning Paper • 2601.16163 • Published 3 days ago • 11
Rethinking Video Generation Model for the Embodied World Paper • 2601.15282 • Published 4 days ago • 41
V-DPM: 4D Video Reconstruction with Dynamic Point Maps Paper • 2601.09499 • Published 11 days ago • 9
Inference-time Physics Alignment of Video Generative Models with Latent World Models Paper • 2601.10553 • Published 10 days ago • 12
Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published 11 days ago • 31
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices Paper • 2601.08303 • Published 12 days ago • 16
DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving Paper • 2601.01528 • Published 21 days ago • 19
Orient Anything V2: Unifying Orientation and Rotation Understanding Paper • 2601.05573 • Published 16 days ago • 9
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals Paper • 2601.05848 • Published 16 days ago • 16
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction Paper • 2601.05966 • Published 16 days ago • 23
Guiding a Diffusion Transformer with the Internal Dynamics of Itself Paper • 2512.24176 • Published 26 days ago • 8
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection Paper • 2512.23273 • Published 27 days ago • 14
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation Paper • 2512.23705 • Published 27 days ago • 45
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published 30 days ago • 60
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published Dec 23, 2025 • 61
Spatia: Video Generation with Updatable Spatial Memory Paper • 2512.15716 • Published Dec 17, 2025 • 33
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming Paper • 2512.21338 • Published Dec 24, 2025 • 22
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations Paper • 2512.21004 • Published Dec 24, 2025 • 13