Article We Got Claude to Build CUDA Kernels and teach open models! +2 burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
bartowski/agentica-org_DeepScaleR-1.5B-Preview-GGUF Text Generation • 2B • Updated Feb 11, 2025 • 11.7k • 40
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • 2B • Updated Feb 24, 2025 • 587k • 1.51k
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 medmekk, marcsun13, lvwerra, pcuenq, osanseviero, thomwolf • Sep 18, 2024 • 280
gradientai/Llama-3-8B-Instruct-Gradient-1048k Text Generation • 8B • Updated Oct 29, 2024 • 9.86k • 681