Qwen/Qwen2.5-VL-7B Nunchaku (Text Encoder)


This is a quantized text encoder (text_encoder) artifact for Qwen image generation / image editing workflows. It is exported in a format that the Nunchaku runtime can load directly, and is intended to replace the text_encoder inside a diffusers pipeline to reduce VRAM usage and improve inference efficiency.

Quantization quality

  • hidden_states_last
    • rel_l2: 0.2465697987902546
    • cosine: 0.969494104385376
  • prompt_embeds_trimmed
    • rel_l2: 0.2465697987902546
    • cosine: 0.969494104385376
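For reference, `rel_l2` is the L2 norm of the quantization error relative to the norm of the full-precision reference, and `cosine` is the cosine similarity between the flattened embeddings. A minimal sketch of how such metrics are computed (the tensors here are plain Python lists for illustration; the actual evaluation operates on the encoder's output tensors):

```python
import math

def rel_l2(quantized, reference):
    # Relative L2 error: ||q - r|| / ||r||
    err = math.sqrt(sum((q - r) ** 2 for q, r in zip(quantized, reference)))
    ref = math.sqrt(sum(r ** 2 for r in reference))
    return err / ref

def cosine(a, b):
    # Cosine similarity between two flattened embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

Identical values for hidden_states_last and prompt_embeds_trimmed are expected when the trimmed prompt embeddings are taken directly from the last hidden states.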

Notes

  • svdq-int4-Qwen2.5vl-Nunchaku.safetensors is for image editing models such as QwenImageEditPipeline / QwenImageEditPlusPipeline. The vision tower is not quantized.
  • svdq-int4-Qwen2.5vl-text-Nunchaku.safetensors is for the QwenImagePipeline text-to-image model. It only quantizes the text encoder.
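The file choice above can be encoded in a small helper. This is a hypothetical convenience function (the function name and mapping are illustrative, derived only from the two notes above):

```python
# Hypothetical helper: pick the checkpoint file for a given
# diffusers pipeline class name, following the notes above.
EDIT_PIPELINES = {"QwenImageEditPipeline", "QwenImageEditPlusPipeline"}

def checkpoint_for(pipeline_name: str) -> str:
    if pipeline_name in EDIT_PIPELINES:
        # Editing models: the vision tower is not quantized
        return "svdq-int4-Qwen2.5vl-Nunchaku.safetensors"
    if pipeline_name == "QwenImagePipeline":
        # Text-to-image: only the text encoder is quantized
        return "svdq-int4-Qwen2.5vl-text-Nunchaku.safetensors"
    raise ValueError(f"unsupported pipeline: {pipeline_name}")
```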

Install Nunchaku first

  • Note: As of 2026-04-09, the Nunchaku PR for this functionality has not yet been merged into the official main branch. To try it early, pull and merge the code from nunchaku-ai/nunchaku#927.

  • Official installation guide (recommended source of truth): https://nunchaku.tech/docs/nunchaku/installation/installation.html

Recommended: install the official prebuilt wheel

  • Prerequisite: install PyTorch first (>= 2.5 at minimum; the exact version requirement depends on the wheel you select)
  • Install the Nunchaku wheel: choose the wheel that matches your environment from GitHub Releases / Hugging Face / ModelScope (cp311 means Python 3.11):
    • https://github.com/nunchaku-ai/nunchaku/releases
# Example: choose the correct wheel URL for your torch/cuda/python version
pip install https://github.com/nunchaku-ai/nunchaku/releases/download/vX.Y.Z/nunchaku-X.Y.Z+torch2.9-cp311-cp311-linux_x86_64.whl
  • Tip: this model is INT4 quantized. Follow the official docs and the wheel compatibility matrix when choosing the package for your torch/cuda/python environment.
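When picking a wheel, the cpXY part of the filename must match your Python interpreter, and the torch part must match your installed PyTorch/CUDA build. A small sketch for reading those values off your environment (the tag format follows standard wheel naming):

```python
import sys

def python_wheel_tag() -> str:
    # cp311 == CPython 3.11, matching the cpXY part of the wheel filename
    return f"cp{sys.version_info.major}{sys.version_info.minor}"

print("Python wheel tag:", python_wheel_tag())
try:
    import torch  # reports the torch/CUDA build the wheel must match
    print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
except ImportError:
    print("torch is not installed; install PyTorch first")
```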

Usage

1. Text-to-image (QwenImagePipeline)

import torch
from diffusers import QwenImagePipeline
from nunchaku import NunchakuQwenEncoderModel

base_model_dir = "/path/to/qwen-image-model"   # Your Qwen base model directory or HF model id
model_path = "/path/to/Qwen2.5vl-Nunchaku/svdq-int4-Qwen2.5vl-text-Nunchaku.safetensors"  # Use the text variant for QwenImagePipeline
device = "cuda"
torch_dtype = torch.bfloat16  # torch.float16 also works
text_encoder = NunchakuQwenEncoderModel.from_pretrained(
    model_path,
    device=device,
    torch_dtype=torch_dtype,
)

pipe = QwenImagePipeline.from_pretrained(
    base_model_dir,
    text_encoder=text_encoder,
    torch_dtype=torch_dtype,
)
pipe.to(device)

# Generate an image from a text prompt
image = pipe(prompt="A cat wearing a wizard hat").images[0]
image.save("qwen-image.png")

2. Image editing (QwenImageEditPlusPipeline)

import torch
from diffusers import QwenImageEditPlusPipeline
from diffusers.utils import load_image
from nunchaku import NunchakuQwenEncoderModel

base_model_dir = "/path/to/qwen-image-edit-model"
model_path = "/path/to/Qwen2.5vl-Nunchaku/svdq-int4-Qwen2.5vl-Nunchaku.safetensors"  # Use the non-text variant for editing models
device = "cuda"
torch_dtype = torch.bfloat16

text_encoder = NunchakuQwenEncoderModel.from_pretrained(
    model_path,
    device=device,
    torch_dtype=torch_dtype,
)

pipe = QwenImageEditPlusPipeline.from_pretrained(
    base_model_dir,
    text_encoder=text_encoder,
    torch_dtype=torch_dtype,
).to(device)

image = load_image("https://example.com/your_image.png").convert("RGB")
result = pipe(
    prompt="Turn the cat in the image into one wearing a wizard hat",
    image=image,
).images[0]
result.save("qwen-image-edit-plus.png")

Recommended environment

  • Python: >= 3.11
  • PyTorch: >= 2.9 (a CUDA environment is recommended)
  • transformers: 5.3
  • diffusers: 0.37
  • nunchaku: runtime package providing NunchakuQwenEncoderModel (see installation notes above)
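The recommended versions above can be sanity-checked against what is actually installed. A minimal sketch using only the standard library (the version comparison here is simplified and ignores pre-release suffixes):

```python
from importlib import metadata

def version_tuple(v: str) -> tuple:
    # "2.9.1+cu121" -> (2, 9, 1); ignores local build suffixes
    return tuple(int(p) for p in v.split("+")[0].split(".") if p.isdigit())

def meets_minimum(installed: str, required: str) -> bool:
    return version_tuple(installed) >= version_tuple(required)

# Check the recommendations above against the current environment
for pkg, minimum in [("torch", "2.9"), ("diffusers", "0.37"), ("transformers", "5.3")]:
    try:
        ok = meets_minimum(metadata.version(pkg), minimum)
        print(pkg, "OK" if ok else f"older than {minimum}")
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```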