LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 1 day ago • 182
ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 15 days ago • 259
Structured Distillation of Web Agent Capabilities Enables Generalization Paper • 2604.07776 • Published 15 days ago • 21
VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification Paper • 2604.01569 • Published 22 days ago • 13
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published 24 days ago • 340
AVControl: Efficient Framework for Training Audio-Visual Controls Paper • 2603.24793 • Published 29 days ago • 26
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 339