Performance and quality vs. GGUF Q8_0 (and FP8)?
#6
by raffetazarius - opened
Has anyone tested this out, and/or have comparisons to share?
Curious which performs best on a 5090...
Q8_0 = ~23 GB
https://huggingface.co/unsloth/LTX-2.3-GGUF/blob/main/ltx-2.3-22b-dev-Q8_0.gguf
FP8 = ~29 GB
https://huggingface.co/Lightricks/LTX-2.3-fp8/blob/main/ltx-2.3-22b-dev-fp8.safetensors
NVFP4 = ~22 GB
https://huggingface.co/Lightricks/LTX-2.3-nvfp4/blob/main/ltx-2.3-22b-dev-nvfp4.safetensors
Preferably without Sage Attention, since it affects the model's "pure" output.
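For anyone running their own comparison: something like this minimal timing harness keeps the numbers apples-to-apples across checkpoints. `dummy_generate` is just a stand-in workload; you'd swap in the actual sampling call for each quant (same prompt, same seed, same step count), and read VRAM separately from `nvidia-smi` or your inference UI. This is only a sketch, not tied to any specific inference stack.

```python
import time

def benchmark(run_fn, warmup=1, iters=3):
    """Time a generation function: a few warmup runs first
    (to exclude compile/load overhead), then average the rest."""
    for _ in range(warmup):
        run_fn()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        run_fn()
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

# Stand-in workload; replace with the real sampling call for
# each checkpoint (Q8_0 / FP8 / NVFP4) under identical settings.
def dummy_generate():
    sum(i * i for i in range(100_000))

avg = benchmark(dummy_generate)
print(f"avg seconds per run: {avg:.4f}")
```

Quality is harder to pin down than speed, so side-by-side outputs from a fixed seed are probably the most useful thing to share alongside the timings.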