Q4 quant
Hi,
I recently found your quants.
I would much appreciate a Q4_XXL of this particular model (or any other Qwen3.5-27B).
Do you have more than 24GB of VRAM available? A Q4_K_XXL made with the same settings as the Q6 would be 23GB.
I have exactly 24GB of VRAM and 64GB of RAM. Q6 is just too much; Q4 would be ideal, as I can tune --n-gpu-layers to utilize my VRAM exactly (rough example below).
I'm currently on the hunt for an improved Qwen3.5-27B at Q4, and I really like your XXL quants.
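For reference, this is roughly how I plan to run it (a sketch using llama.cpp's llama-server; the filename, context size, and layer count are placeholders, not the actual repo files):

```bash
# A sketch, not the exact command: the model filename and numbers are placeholders.
# Start with a conservative --n-gpu-layers and raise it until the model no longer
# fits in 24GB of VRAM; the layers that don't fit stay in system RAM.
./llama-server -m Qwen3.5-27B-Q4_K_XXL.gguf -c 16384 --n-gpu-layers 40
```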
Depending on what you need the LLM for, those quants may be hit or miss. I find they perform better when working with context, and they feel smarter, but I still have trouble finding the best recipe for a given model.
I made several quants of this model with different settings for the attention and ffn_down weights, and found out, in my limited testing, that a quant from one of my first tries, which omitted setting attn_output to higher precision, actually performs better than the quants I made later with more precision for that weight... also, setting all attention weights to bf16 did not necessarily improve model performance. This is so weird.
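To give an idea, my recipe experiments look roughly like this (a sketch using llama-quantize from llama.cpp; the patterns and types shown are just the knobs I keep turning, not the final settings, and the per-tensor --tensor-type override assumes a recent enough llama.cpp build):

```bash
# A sketch of the kind of recipe I test, not the exact settings of the upload.
# --tensor-type forces a different quant type for tensors whose name matches
# the pattern; everything else follows the base Q4_K_M mixture.
# (I also tried attn_output=q8_0 and bf16 for the attention tensors; mixed results.)
./llama-quantize \
  --tensor-type ffn_down=q6_k \
  --tensor-type attn_v=q8_0 \
  model-bf16.gguf Qwen3.5-27B-Q4_K_XXL.gguf Q4_K_M
```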
But again, my testing is a bit too limited. They sure work for me though :P and if I find a better recipe later, I'll eventually reupload a newer version, so don't be surprised if I remove this repo and reupload later :P
BTW, I'm uploading a Q4 now. It doesn't push the attention layers as hard as some of my other quants, and maybe bf16 isn't always the way to go when making smaller quants... it should be fine... or you tell me?
I have a slow upload, so it will take about a day.
BTW2: I don't normally make quants on request, but I still have the full-precision weights of this model and still plan to test some recipes to see if I can find better settings for the weights... so you're lucky with this request :P
Thank you very much, and sorry for the late response. I downloaded the weights and they do seem better than the "raw" Unsloth ones.