Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation Paper • 2604.18168 • Published 4 days ago • 94
SWE-chat: Coding Agent Interactions From Real Users in the Wild Paper • 2604.20779 • Published 2 days ago • 7
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper • 2604.19636 • Published 3 days ago • 80
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents Paper • 2604.18543 • Published 4 days ago • 26
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published 4 days ago • 41
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published 4 days ago • 77
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 9 days ago • 109
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 9 days ago • 151
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 16 days ago • 94
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published 14 days ago • 48
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 15 days ago • 240
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web Paper • 2604.08516 • Published 15 days ago • 42
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 123
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 76
Rethinking Video Generation Model for the Embodied World Paper • 2601.15282 • Published Jan 21 • 45
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published Dec 16, 2025 • 73