Mistral Medium 3.5 Collection Our first flagship models handling instruction-following, reasoning, and coding in a single set of open weights. • 2 items • Updated 2 days ago • 10
talkie-13b Collection talkie-1930-13b is a vintage language model trained on pre-1931 English-language text. See https://github.com/talkie-lm/talkie to run talkie. • 3 items • Updated 11 days ago • 38
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 26 days ago • 112
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated Feb 26 • 96
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers Paper • 2602.03510 • Published Feb 3 • 27
Nemotron ColEmbed V2 Collection State-of-the-Art Late Interaction Vision-Language Embedding Models • 3 items • Updated 11 days ago • 13
PaddleOCR-VL-1.5 Collection Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing • 7 items • Updated Mar 6 • 19
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published Dec 31, 2025 • 154
VulnLLM-R Collection Data and model for VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection • 9 items • Updated Dec 17, 2025 • 9
Kandinsky 5.0 Video Pro Diffusers Collection Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 4 items • Updated Dec 14, 2025 • 13
DR Tulu Collection Models and data associated with DR Tulu, http://allenai-web/papers/drtulu • 6 items • Updated Feb 24 • 37