Z-Image SDNQ uint4 + SVD-r32
This is a 4-bit quantized version of Tongyi-MAI/Z-Image using SDNQ (Structured Decomposable Neural Quantization) with SVD decomposition.
Model Details
- Base Model: Tongyi-MAI/Z-Image
- Quantization Method: SDNQ uint4 + SVD
- SVD Rank: 32
- Model Size: ~2GB (reduced from ~8GB)
- Precision: 4-bit weights with int8 matmul
- Performance: Minimal quality loss compared to full precision
Quantization Specifications
weights_dtype = "uint4" # 4-bit weights
use_svd = True # SVD decomposition enabled
svd_rank = 32 # SVD rank for quality preservation
quantized_matmul_dtype = "int8" # Compute precision
group_size = 0 # Auto group size
Usage
Basic Text-to-Image Generation
import torch
from diffusers import DiffusionPipeline
from sdnq.loader import apply_sdnq_options_to_model
# Load the quantized model
pipe = DiffusionPipeline.from_pretrained(
"YOUR_USERNAME/Z-Image-SDNQ-uint4-svd-r32",
torch_dtype=torch.bfloat16,
trust_remote_code=True
)
# Apply SDNQ configuration
pipe.transformer = apply_sdnq_options_to_model(
pipe.transformer,
use_quantized_matmul=False # Set to False for Windows/No-Triton
)
pipe.text_encoder = apply_sdnq_options_to_model(
pipe.text_encoder,
use_quantized_matmul=False
)
pipe = pipe.to("cuda")
# Generate image
image = pipe(
prompt="a beautiful landscape with mountains and a lake at sunset",
num_inference_steps=30,
guidance_scale=1.0
).images[0]
image.save("output.png")
9:16 Portrait Generation (Maximum Resolution)
import torch
from diffusers import DiffusionPipeline
from sdnq.loader import apply_sdnq_options_to_model
# Configuration
WIDTH = 768
HEIGHT = 1344 # 9:16 aspect ratio
PROMPT = "a beautiful landscape with mountains and a lake at sunset, highly detailed, 8k, masterpiece"
STEPS = 30
GUIDANCE = 1.0
# Load model
pipe = DiffusionPipeline.from_pretrained(
"YOUR_USERNAME/Z-Image-SDNQ-uint4-svd-r32",
torch_dtype=torch.bfloat16,
trust_remote_code=True
)
# Configure SDNQ
pipe.transformer = apply_sdnq_options_to_model(
pipe.transformer,
use_quantized_matmul=False
)
pipe.text_encoder = apply_sdnq_options_to_model(
pipe.text_encoder,
use_quantized_matmul=False
)
pipe = pipe.to("cuda")
# Generate
image = pipe(
prompt=PROMPT,
num_inference_steps=STEPS,
guidance_scale=GUIDANCE,
width=WIDTH,
height=HEIGHT
).images[0]
image.save("portrait_9x16.png")
Recommended Settings
| Aspect Ratio | Resolution | Steps | Guidance Scale |
|---|---|---|---|
| 1:1 (Square) | 768x768 | 30 | 1.0 |
| 16:9 (Landscape) | 1344x768 | 30 | 1.0 |
| 9:16 (Portrait) | 768x1344 | 30 | 1.0 |
| 4:3 | 1024x768 | 30 | 1.0 |
Note: For maximum quality, use 30 steps. For faster generation (lower quality), you can use 4-6 steps.
Requirements
pip install torch diffusers transformers sdnq
System Requirements
- GPU: NVIDIA GPU with 8GB+ VRAM recommended
- CUDA: 11.8 or higher
- RAM: 16GB+ system RAM
- Disk: ~2GB for model storage
Performance
- VRAM Usage: ~4-6GB (depending on resolution)
- Generation Speed: ~5-10 seconds per image (30 steps, 768x1344)
- Quality: Near-identical to full precision model
Quantization Method
This model uses SDNQ (Structured Decomposable Neural Quantization) which:
- Reduces model size by ~75% (8GB โ 2GB)
- Maintains high image quality through SVD decomposition
- Enables faster inference on consumer GPUs
- Supports both int-based and float-based quantization schemes
How It Was Quantized
from diffusers import DiffusionPipeline
from sdnq.loader import sdnq_post_load_quant, save_sdnq_model
# Load base model
pipe = DiffusionPipeline.from_pretrained(
"Tongyi-MAI/Z-Image",
torch_dtype=torch.bfloat16,
trust_remote_code=True
)
# Apply quantization to transformer
quantized_transformer = sdnq_post_load_quant(
pipe.transformer,
weights_dtype="uint4",
use_svd=True,
svd_rank=32,
quantized_matmul_dtype="int8",
group_size=0
)
pipe.transformer = quantized_transformer
# Save
save_sdnq_model(pipe, "./Z-Image-SDNQ-uint4-svd-r32", is_pipeline=True)
Limitations
- Requires SDNQ library (
pip install sdnq) - Best results with NVIDIA GPUs (CPU inference not recommended)
- Some quality trade-off compared to full precision (minimal in most cases)
Citation
If you use this model, please cite the original Z-Image paper:
@article{zimage2024,
title={Z-Image: Efficient Text-to-Image Synthesis},
author={Tongyi-MAI Team},
year={2024}
}
License
This model inherits the Apache 2.0 license from the base Tongyi-MAI/Z-Image model.
Acknowledgments
- Base Model: Tongyi-MAI/Z-Image
- Quantization: SDNQ library
- Community support and testing
Additional Resources
- Downloads last month
- -
Model tree for Abrahamm3r/Z-Image-SDNQ-uint4-svd-r32
Base model
Tongyi-MAI/Z-Image