Region-Constraint In-Context Generation for Instructional Video Editing Paper • 2512.17650 • Published Dec 19, 2025 • 53
Article EMO: Pretraining mixture of experts for emergent modularity allenai • 6 days ago • 32
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper • 2605.05204 • Published 9 days ago • 25
Sherry: Hardware-Efficient 1.25-Bit Ternary Quantization via Fine-grained Sparsification Paper • 2601.07892 • Published Jan 12 • 4
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression Paper • 2602.21233 • Published Feb 7 • 3
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper • 2604.27393 • Published 15 days ago • 68
MiniCPM-o & MiniCPM-V Collection Multimodal models with leading performance. • 32 items • Updated 1 day ago • 83
MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated 12 days ago • 55
MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks Paper • 2509.14638 • Published Sep 18, 2025 • 15
UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions Paper • 2506.13691 • Published Jun 16, 2025 • 4
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data Paper • 2410.01560 • Published Oct 2, 2024 • 6
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 50
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset Paper • 2402.14804 • Published Feb 22, 2024 • 5
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31, 2024 • 25
WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark Paper • 2604.10988 • Published Apr 13 • 3