#12: KV Cache per token and Doubling the context size (3 replies, opened about 2 months ago by HenkTenk)
#11: Running on 4 GPUs with TP=4 (3 replies, opened 3 months ago by nephepritou)
#10: Running on 6 GPUs (4 replies, 🤗 1, opened 5 months ago by 0xSero)
#9: Thank you and a couple QQs (2 replies, opened 6 months ago by Ewere)
#8: request for fp4 quants (opened 6 months ago by hareram241)
#7: Improved useability (2 replies, opened 6 months ago by HenkTenk)
#6: Model for 8 gpus (6 replies, opened 7 months ago by ilwoonam75)
#4: Any chance for more GLM quants? (2 replies, opened 7 months ago by koute)
#2: Please make one for the larger Non Air Variant (3 replies, opened 8 months ago by chriswritescode)
#1: Does this actually work with VLLM? (32 replies, opened 8 months ago by sirus)