Meta-CoT: Enhancing Granularity and Generalization in Image Editing Paper • 2604.24625 • Published 4 days ago • 25
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 4 days ago • 113
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper • 2604.19636 • Published 10 days ago • 86
OneHOI: Unifying Human-Object Interaction Generation and Editing Paper • 2604.14062 • Published 16 days ago • 8
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 16 days ago • 153
NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results Paper • 2506.02875 • Published Jun 3, 2025
Trade-offs in Image Generation: How Do Different Dimensions Interact? Paper • 2507.22100 • Published Jul 29, 2025
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published 18 days ago • 70
AURA: Always-On Understanding and Real-Time Assistance via Video Streams Paper • 2604.04184 • Published 26 days ago • 50
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 25 days ago • 203
Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development Paper • 2603.27460 • Published Mar 29 • 68
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published Mar 24 • 62
VTC-Bench: Evaluating Agentic Multimodal Models via Compositional Visual Tool Chaining Paper • 2603.15030 • Published Mar 16 • 21