**Uni-MoE 2.0** is a fully open-source omnimodal model that substantially advances the capabilities of Lychee's Uni-MoE series in language-centric multimodal understanding, reasoning, and generation.
**Uni-MoE 2.0-Base** is the variant of the Uni-MoE 2.0 series that supports all-modality understanding only; it does not include the audio and image generation modules.
---
**If you find our work useful or want timely updates, please give us a star and follow us.**
## Open-source Plan
- [x] Model Checkpoint
- [x] [Uni-MoE 2.0-Omni](https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Omni)
- [x] [Uni-MoE 2.0-Base](https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Base)
- [x] [Uni-MoE 2.0-Thinking](https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Thinking)
- [x] [Uni-MoE 2.0-Image](https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Image)
- [x] [Uni-MoE 2.0-MoE-TTS](https://huggingface.co/HIT-TMG/Uni-MoE-TTS)
- [x] Inference Code: [HITsz-TMG/Uni-MoE-2.0](https://github.com/HITsz-TMG/Uni-MoE/tree/master/Uni-MoE-2)
- [x] Training Code: [HITsz-TMG/Uni-MoE-2.0](https://github.com/HITsz-TMG/Uni-MoE/tree/master/Uni-MoE-2)
- [x] Technical Report: [arxiv](https://arxiv.org/abs/2511.12609)
## Getting Started
### 1. Clone this repository and navigate to the Uni-MoE 2.0 folder
```bash
git clone https://github.com/HITsz-TMG/Uni-MoE.git
cd Uni-MoE-2
```
### 2. Set up environment
Create the conda environment and install the required dependencies:
```bash
conda create -n uni_moe_2 python=3.11
conda activate uni_moe_2
pip install torch==2.5.1 torchaudio==2.5.1 torchvision==0.20.1
pip install -r requirements.txt
pip install flash-attn==2.6.0.post1 --no-build-isolation
pip install "clip @ git+https://github.com/openai/CLIP.git@dcba3cb2e2827b402d2701e7e1c7d9fed8a20ef1"
```
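Before moving on, it can help to confirm that the pinned packages above actually resolved. The sketch below is a minimal, hypothetical sanity check (it is not part of the Uni-MoE repo); it assumes the `uni_moe_2` environment is active and compares installed versions against the pins from the setup commands.

```python
# Hypothetical sanity check for the pinned dependencies above.
# Assumes the `uni_moe_2` conda environment from the setup steps is active.
from importlib.metadata import version, PackageNotFoundError

# Version pins taken from the install commands in this README
EXPECTED = {"torch": "2.5.1", "torchaudio": "2.5.1", "torchvision": "0.20.1"}


def check_versions(expected, get_version=None):
    """Return {package: (expected, found)} for every mismatch or missing package."""
    if get_version is None:
        def get_version(pkg):
            try:
                return version(pkg)
            except PackageNotFoundError:
                return None  # package not installed
    return {
        pkg: (want, get_version(pkg))
        for pkg, want in expected.items()
        if get_version(pkg) != want
    }


if __name__ == "__main__":
    mismatches = check_versions(EXPECTED)
    if mismatches:
        for pkg, (want, found) in mismatches.items():
            print(f"{pkg}: expected {want}, found {found}")
    else:
        print("All pinned packages match.")
```

The `get_version` parameter is only there so the comparison logic can be exercised without the heavyweight packages installed; in normal use you would call `check_versions(EXPECTED)` directly.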
## Example Usage
We provide a simple example of how to use this repo. For detailed usage, please refer to the [cookbook](https://github.com/HITsz-TMG/Uni-MoE/tree/master/Uni-MoE-2/examples).
```python
import torch
from uni_moe.model.processing_qwen2_vl import Qwen2VLProcessor
from uni_moe.model.modeling_qwen_grin_moe import GrinQwen2VLForConditionalGeneration
from uni_moe.qwen_vl_utils import process_mm_info
from uni_moe.model import deepspeed_moe_inference_utils

# Load the processor and the model (bfloat16) onto the GPU
processor = Qwen2VLProcessor.from_pretrained("HIT-TMG/Uni-MoE-2.0-Base")
model = GrinQwen2VLForConditionalGeneration.from_pretrained(
    "HIT-TMG/Uni-MoE-2.0-Base", torch_dtype=torch.bfloat16
).cuda()
# Share the model config with the processor for multimodal preprocessing
processor.data_args = model.config
messages = [{
"role": "user",
"content": [
{"type": "text", "text": "