Better & Faster Large Language Models via Multi-token Prediction • Paper • 2404.19737 • Published Apr 30, 2024 • 81
jonathanjordan21/mos-mamba-18x130m-trainer-dgx-lora-sft-merged • Text Generation • 0.2B • Updated Aug 23, 2024 • 1
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale • Paper • 2409.08264 • Published Sep 12, 2024 • 48
ibm-granite/granite-timeseries-ttm-r2 • Time Series Forecasting • 805k • Updated Feb 26, 2025 • 367k • 161
Differentiable Solver Search for Fast Diffusion Sampling • Paper • 2505.21114 • Published May 27, 2025 • 13
Code2World: A GUI World Model via Renderable Code Generation • Paper • 2602.09856 • Published 24 days ago • 198
Olaf-World: Orienting Latent Actions for Video World Modeling • Paper • 2602.10104 • Published 24 days ago • 27