DistilQwen
H100, BF16. 30B→1.7B/0.6B TKD. Three teachers. 15 models + DISC paper. 10K+ downloads. DOIs: 10.57967/hf/8165 and 10.57967/hf/8194
Text Generation • 2B • 762
Note: First in the DistilQwen chain. Foundation for all downstream models.
reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF
Text Generation • 2B • 1.9k
Note: Instruct teacher + SFT, quantized. F16/Q4/Q5/Q8 available.
reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT
2B • 227
Note: Instruct distillation + SFT. Full precision. BF16 on H100.
reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B
Text Generation • 0.8B • 775
Note: Base for the Thinking-SFT pipeline at 0.6B.
reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT
Text Generation • 0.8B • 804 • 2
Note: The smallest model with Thinking teacher signal. 0.8B params.
reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF
Text Generation • 0.8B • 1.83k
Note: mradermacher also auto-quantized this one (420+ shadow downloads).
reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT
Text Generation • 2B • 772 • 1
Note: Different capability profile: hierarchical problem solving.
reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF
Text Generation • 2B • 2.53k • 1
Note: Structured reasoning for edge deployment. Apache 2.0.
reaperdoesntknow/DistilQwen3-1.7B-uncensored
Text Generation • 2B • 697
Note: Uncensored base distillation. No alignment filtering.
reaperdoesntknow/TopologicalQwen
Text Generation • 2B • 798
Note: TKD flagship. BV decomposition → jump detection → curriculum.
reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored
2B • 216 • 1
Note: Named for the influence of Discrepancy Calculus on the training signal.
reaperdoesntknow/Disctil-Qwen3-1.7B
Text Generation • 2B • 684
Note: Structural refinement via the DISC operator before the TKD stage.
reaperdoesntknow/DistilQwen3-1.7B-uncensored-GGUF
2B • 2.03k • 1
Note: Community-validated: third-party quantizations exist.
reaperdoesntknow/Qwen3-1.7B-Thinking-Distil
Text Generation • 2B • 945 • 1
Note: Extended deliberation distilled from the 30B-Thinking teacher into a 1.7B student.
reaperdoesntknow/LFM2.5-1.2B-Distilled-SFT
Text Generation • 1B • 637
Note: Demonstrates that TKD works across architecture families, not just within Qwen.
reaperdoesntknow/Discrepancy_Calculus
Note: Continuous Thought Dynamics, the mathematical backbone of DualMind.
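
The collection describes distilling a 30B teacher into 1.7B and 0.6B students, but the TKD objective itself is not spelled out here. As a point of reference only, the sketch below shows the standard temperature-softened KL distillation loss, which is a generic technique and not the TKD method used for these models. The function names and the default temperature are hypothetical and chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Generic KD loss: KL(teacher || student) on temperature-softened outputs.

    Assumption: this is the textbook soft-target loss, NOT the TKD objective
    of the models listed above. The T^2 factor is the usual gradient rescaling.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]
    student = [2.5, 1.5, 0.2]
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge; a higher temperature exposes more of the teacher's probability mass on non-top tokens.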