crumb

cephaloform
aicrumb
crumb.bsky.social

AI & ML interests

For what I'm working on right now, check out https://hf.co/crumbs-playground (the mammoth image button on my profile)

Recent Activity

updated a model about 18 hours ago

crumbs-playground/clmr3-qwen3.5-2b-warm-start-merged

updated a collection 1 day ago

CLM_R1

crumb's collections (7)

embedding

crumb/essence-3b-v2

Feature Extraction • Updated 18 days ago • 3

MoLora-v2

First prototype of the second iteration of MoLora, which applies mixture-of-experts techniques to the Llama 2 model.

crumb/test-00-switchllama-i3b-f10b-e4-init

Text Generation • Updated Sep 13, 2023 • 8
crumb/test-00-qlora-wizmlpmix-c0

Updated Sep 4, 2023 • 2
crumb/test-00-qlora-wizmlpmix-c1

Updated Sep 4, 2023 • 3
crumb/test-00-qlora-wizmlpmix-c3

Updated Sep 4, 2023 • 4

Shrink Llama - V1

Parts of Meta's Llama 2 models, chopped up and trained. CoreX means the first X layers were kept.

crumb/core1-base-464m-c4

Text Generation • 0.5B • Updated Sep 12, 2023 • 4
crumb/core1-base-464m-redpajama

Text Generation • Updated Sep 12, 2023 • 3
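
The "CoreX" naming above can be illustrated with a minimal sketch. It treats a model's layer stack as a plain Python list and keeps only the first X entries. The function name `keep_first_layers` and the placeholder layer names are hypothetical, and the sketch says nothing about how the embeddings, output head, or subsequent training were handled, since the description does not state that.

```python
def keep_first_layers(layers, x):
    """Return the first x layers of a layer stack (hypothetical helper).

    `layers` is any sequence standing in for a model's transformer blocks.
    """
    if not 0 < x <= len(layers):
        raise ValueError(f"x must be between 1 and {len(layers)}, got {x}")
    # Slicing copies the prefix and leaves the original stack untouched.
    return list(layers[:x])


# Placeholder names standing in for the blocks of a larger model.
full_stack = [f"block_{i}" for i in range(32)]
core1 = keep_first_layers(full_stack, 1)  # "Core1": only the first block is kept
```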

MoAT (More Artificial Tokens)

Lets the LM learn a soft "multi-step program" for predicting future tokens, instead of learning to predict future tokens itself.

crumb/16xF-6m-init

Text Generation • Updated Oct 16, 2023 • 13
crumb/32xF-6m-init

Text Generation • Updated Oct 16, 2023 • 16

MoLora-v1

Model assets for the first Mixture-of-LoRA technique applied to Llama. https://bit.ly/48bqshl

crumb/llama2-7b-moe-text-exp0-4

Updated Jul 19, 2023 • 5
crumb/llama2-7b-moe-text-exp1-4

Updated Jul 19, 2023 • 4 • 2
crumb/llama2-7b-moe-text-exp2-4

Updated Jul 19, 2023 • 6
crumb/llama2-7b-moe-text-exp3-4

Updated Jul 19, 2023 • 5

GPT2-Linear

GPT-2 models using Linear layers instead of Conv layers, for convenience.

crumbly/gpt2-linear-xl

Text Generation • Updated Jul 18, 2023 • 10 • 1
crumbly/gpt2-linear-large

Text Generation • Updated Jul 17, 2023 • 11
crumbly/gpt2-linear-medium

Text Generation • Updated Jul 17, 2023 • 7
crumbly/gpt2-linear-small

Text Generation • Updated Jul 17, 2023 • 5
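
Why the swap is possible can be shown with a small pure-Python sketch. It assumes the usual conventions: a GPT-2 style Conv layer stores its weight as (in_features, out_features) and computes x·W + b, while a Linear layer stores (out_features, in_features) and computes x·Wᵀ + b. Under that assumption, converting one to the other is just a transpose. The function names here are hypothetical and this is not necessarily how these checkpoints were produced.

```python
def conv_forward(x, weight, bias):
    """GPT-2 style Conv layer: weight is (in_features, out_features)."""
    return [
        sum(x[i] * weight[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def linear_forward(x, weight, bias):
    """Linear layer: weight is (out_features, in_features)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def conv_to_linear_weight(weight):
    """Transpose an (in, out) Conv weight into an (out, in) Linear weight."""
    return [list(column) for column in zip(*weight)]


x = [1.0, 2.0, 3.0]
conv_weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 inputs, 2 outputs
bias = [0.5, -0.5]

linear_weight = conv_to_linear_weight(conv_weight)
conv_out = conv_forward(x, conv_weight, bias)
linear_out = linear_forward(x, linear_weight, bias)
```

Both forward passes produce the same output for the same input, so the layer type can change without changing what the model computes.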

Cramp(ed) Models

Smaller models trained locally on my 2xA6000 Lambda Vector

crumbly/cramp-25m

Text Generation • Updated Feb 15, 2024 • 4 • 8
crumb/cramped-94m-8btok

Text Generation • Updated Oct 11, 2023 • 6 • 1