Qwen/Qwen2.5-VL-7B Nunchaku (Text Encoder)
This is a quantized text encoder (text_encoder) artifact for Qwen image generation / image editing workflows. It is exported in a format that can be loaded directly by the Nunchaku runtime, and is intended to replace the text_encoder inside a diffusers pipeline to reduce VRAM usage and improve inference efficiency.
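To make the VRAM claim concrete, here is a back-of-envelope sketch of the weight-memory difference between BF16 and INT4 for a 7B-parameter encoder. The numbers are illustrative arithmetic only, not measured figures: real usage also includes activations, quantization scales, and runtime overhead.

```python
# Rough weight-only VRAM estimate for a 7B-parameter text encoder.
# Illustrative arithmetic; actual memory use will be higher.
params = 7e9

bf16_gb = params * 2 / 1024**3    # BF16: 2 bytes per parameter
int4_gb = params * 0.5 / 1024**3  # INT4: 4 bits = 0.5 bytes per parameter

print(f"BF16 weights: ~{bf16_gb:.1f} GiB")  # ~13.0 GiB
print(f"INT4 weights: ~{int4_gb:.1f} GiB")  # ~3.3 GiB
```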
Quantization quality
- hidden_states_last
  - rel_l2: 0.2465697987902546
  - cosine: 0.969494104385376
- prompt_embeds_trimmed
  - rel_l2: 0.2465697987902546
  - cosine: 0.969494104385376
Notes
- `svdq-int4-Qwen2.5vl-Nunchaku.safetensors` is for image-editing models such as `QwenImageEditPipeline` / `QwenImageEditPlusPipeline`. The vision tower is not quantized.
- `svdq-int4-Qwen2.5vl-text-Nunchaku.safetensors` is for the `QwenImagePipeline` text-to-image model. It quantizes only the text encoder.
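The file-to-pipeline mapping above can be captured in a small helper. This is a hypothetical convenience function for your own scripts, not part of the nunchaku API; the mapping itself comes from the notes above.

```python
# Hypothetical helper: pick the right quantized checkpoint for a given
# diffusers pipeline class name (mapping taken from the notes above).
CHECKPOINT_FOR_PIPELINE = {
    "QwenImagePipeline": "svdq-int4-Qwen2.5vl-text-Nunchaku.safetensors",
    "QwenImageEditPipeline": "svdq-int4-Qwen2.5vl-Nunchaku.safetensors",
    "QwenImageEditPlusPipeline": "svdq-int4-Qwen2.5vl-Nunchaku.safetensors",
}

def checkpoint_for(pipeline_name: str) -> str:
    """Return the checkpoint filename for a pipeline, or raise if unknown."""
    try:
        return CHECKPOINT_FOR_PIPELINE[pipeline_name]
    except KeyError:
        raise ValueError(f"No quantized checkpoint known for {pipeline_name}")

print(checkpoint_for("QwenImagePipeline"))
```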
Install Nunchaku first
Note: As of 2026-04-09, the Nunchaku PR for this functionality has still not been merged into the official main branch. If you want to try it early, you can pull and merge the code from nunchaku-ai/nunchaku#927.
Official installation guide (recommended source of truth):
https://nunchaku.tech/docs/nunchaku/installation/installation.html
Recommended: install the official prebuilt wheel
- Prerequisite: install PyTorch >= 2.5 first (the exact requirement depends on the selected wheel).
- Install the Nunchaku wheel: choose the wheel that matches your environment from GitHub Releases / Hugging Face / ModelScope (`cp311` means Python 3.11): https://github.com/nunchaku-ai/nunchaku/releases
# Example: choose the correct wheel URL for your torch/cuda/python version
pip install https://github.com/nunchaku-ai/nunchaku/releases/download/vX.Y.Z/nunchaku-X.Y.Z+torch2.9-cp311-cp311-linux_x86_64.whl
- Tip: this model is INT4 quantized. Follow the official docs and the wheel compatibility matrix when choosing the package for your torch/cuda/python environment.
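To identify which wheel tag matches your environment, you can print the CPython ABI tag and platform directly. This sketch deliberately avoids importing torch so it runs even before PyTorch is installed; check `torch.__version__` and `torch.version.cuda` the same way once torch is in place.

```python
# Print the tags you need when choosing a Nunchaku wheel:
# the CPython ABI tag (cp311 = Python 3.11) and the OS/arch.
import sys
import platform

abi_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
print(abi_tag)                                        # e.g. cp311
print(platform.system().lower(), platform.machine())  # e.g. linux x86_64
```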
Usage
1. Text-to-image (QwenImagePipeline)
import torch
from diffusers import QwenImagePipeline
from nunchaku import NunchakuQwenEncoderModel
base_model_dir = "/path/to/qwen-image-model" # Your Qwen base model directory or HF model id
model_path = "/path/to/Qwen2.5vl-Nunchaku/svdq-int4-Qwen2.5vl-text-Nunchaku.safetensors" # Use the text variant for QwenImagePipeline
device = "cuda"
torch_dtype = torch.bfloat16 # torch.float16 also works
text_encoder = NunchakuQwenEncoderModel.from_pretrained(
    model_path,
    device=device,
    torch_dtype=torch_dtype,
)
pipe = QwenImagePipeline.from_pretrained(
    base_model_dir,
    text_encoder=text_encoder,
    torch_dtype=torch_dtype,
)
pipe.to(device)
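With the pipeline built, generation follows the usual diffusers call pattern. The snippet below continues from the variables defined above; the prompt and step count are illustrative, and `num_inference_steps` / `generator` are the standard diffusers pipeline parameters.

```python
# Generate an image with the pipeline assembled above (illustrative values).
prompt = "A cozy cabin in a snowy forest, warm light in the windows"
image = pipe(
    prompt=prompt,
    num_inference_steps=30,
    generator=torch.Generator(device=device).manual_seed(0),  # reproducible seed
).images[0]
image.save("qwen-image-t2i.png")
```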
2. Image editing (QwenImageEditPlusPipeline)
import torch
from diffusers import QwenImageEditPlusPipeline
from diffusers.utils import load_image
from nunchaku import NunchakuQwenEncoderModel
base_model_dir = "/path/to/qwen-image-edit-model"
model_path = "/path/to/Qwen2.5vl-Nunchaku/svdq-int4-Qwen2.5vl-Nunchaku.safetensors" # Use the non-text variant for editing models
device = "cuda"
torch_dtype = torch.bfloat16
text_encoder = NunchakuQwenEncoderModel.from_pretrained(
    model_path,
    device=device,
    torch_dtype=torch_dtype,
)
pipe = QwenImageEditPlusPipeline.from_pretrained(
    base_model_dir,
    text_encoder=text_encoder,
    torch_dtype=torch_dtype,
).to(device)
image = load_image("https://example.com/your_image.png").convert("RGB")
result = pipe(
    prompt="Turn the cat in the image into one wearing a wizard hat",
    image=image,
).images[0]
result.save("qwen-image-edit-plus.png")
Recommended environment
- Python: >= 3.11
- PyTorch: >= 2.9 (CUDA environment recommended)
- transformers: 5.3
- diffusers: 0.37
- nunchaku: runtime package providing `NunchakuQwenEncoderModel` (see installation notes above)