OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 4 days ago • 92
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 7 days ago • 145
Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics Paper • 2604.08503 • Published Apr 9 • 7
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 324
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 187
An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU Paper • 2603.16428 • Published Mar 17 • 51
MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model Paper • 2603.26357 • Published Mar 27 • 4
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 501
OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training Paper • 2603.28858 • Published Mar 30 • 9
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 350