Performance and quality vs. GGUF Q8_0 (and FP8)?

#6
by raffetazarius - opened

Has anyone tested this out, and/or have comparisons to share?
Curious which performs best on a 5090...

Q8_0 = ~23 GB
https://huggingface.co/unsloth/LTX-2.3-GGUF/blob/main/ltx-2.3-22b-dev-Q8_0.gguf

FP8 = ~29 GB
https://huggingface.co/Lightricks/LTX-2.3-fp8/blob/main/ltx-2.3-22b-dev-fp8.safetensors

NVFP4 = ~22GB
https://huggingface.co/Lightricks/LTX-2.3-nvfp4/blob/main/ltx-2.3-22b-dev-nvfp4.safetensors

Preferably without Sage Attention, since it affects the model's "pure" output.
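For anyone who wants to run their own comparison: one simple way to quantify degradation between two quantized checkpoints is per-frame PSNR, generating with the same prompt and seed in each and comparing decoded frames. This is a minimal sketch (not tied to any LTX-specific API); the frame values below are made-up placeholders:

```python
import math

def psnr(frame_a, frame_b, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-sized frames.

    frame_a / frame_b: flattened lists of pixel values (0..max_val).
    Higher is better; identical frames give infinity.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_val ** 2 / mse)

# Hypothetical example: one frame from a reference run vs. a quantized run
ref = [100, 120, 130, 140]
quant = [101, 119, 131, 138]
print(f"PSNR: {psnr(ref, quant):.2f} dB")
```

Averaging PSNR over all frames of a clip gives a rough quality score for each quantization; roughly, values above ~40 dB mean the outputs are nearly indistinguishable. It won't capture perceptual quality as well as a metric like LPIPS, but it needs no extra dependencies.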
