| --- |
| library_name: diffusers |
| license: apache-2.0 |
| --- |
|
|
| <p align="center"> |
| <img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.svg?raw=true" style="width: 40%; max-width: 1100px;"> |
| </p> |
|
|
|
|
| ## Update News |
| - **2026-03-05**: Official release of KORMo-Diffusion. |
| - **2026-03-02**: Official release of KORMo-VL. |
| - **2025-10-13**: Official release of KORMo-10B-sft. |
| --- |
| ## About KORMo-VL-Diffusion |
|
|
| **KORMo-VL** is a vision-language model developed **from scratch by the KAIST MLP Lab (https://sites.google.com/view/aailab)** and built on top of **KORMo-10B**. |
| It consists of two components: |
| |
| |
| * **Vision-Language Model (VLM)** |
| * **Image Generation Model** |
| |
| |
| **KORMo-VL-Diffusion**, the image generation model, was **trained from scratch**. To reflect everyday Korean environments and culture, images of Korean settings make up as large a share of the training data as possible. |
| <span style="color:red">During the research we were unable to secure additional GPU resources, so **we are sharing the intermediate results of the model at this stage.**</span> |
|
|
| * **LLM:** KORMo-VL |
| * **Model Structure:** Redeveloped with reference to the Qwen-Image architecture (the roughly 20B diffusion component was modified and trained from scratch) |
| * **Languages:** Korean / English |
| * **Training Data:** Synthetic data + public datasets (e.g., AI Hub, details to be released) |
|
|
| If an environment that allows sufficient training becomes available, **we aim to develop this into a completed model.** |
| Anyone who wants to do further tuning or research on top of the intermediate results is **welcome to use them freely.** |
|
|
|
|
|
|
| ## T2I Performance |
| ### English Prompt |
| | Prompt | Generated Image | |
| | :--- | :--- | |
| | **Prompt:** Dense forest | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/Dense%20forest.webp" width="400"> | |
| | **Prompt:** Black pattern mug | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/black%20pattern%20mug%20cpup.webp" width="400"> | |
|
|
| ### Korean Prompt |
| | Prompt | Generated Image | |
| | :--- | :--- | |
| | **Prompt:** 울창한 숲 (Dense forest) | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/Dense%20forest.webp" width="400"> | |
| | **Prompt:** 검은 무늬의 머그컵 (Black patterned mug) | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/%EA%B2%80%EC%9D%80%20%EB%AC%B4%EB%8A%AC%EC%9D%98%20%EB%A8%B8%EA%B7%B8%EC%BB%B5.webp" width="400"> | |
|
|
|
|
|
|
| ## KORMo-VL-Diffusion Demo |
|
|
| `prompt: 아름다운 정원의 꽃들` ("Flowers in a beautiful garden") |
|
|
| <video width="640" height="360" controls> |
| <source src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/kormo_diffusion_assets/kormo_t2i.mp4" type="video/mp4"> |
| </video> |
|
|
|
|
| ## Installation |
|
|
| ```bash |
| uv pip install transformers==4.57.1 pillow torchvision diffusers |
| ``` |
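
After installing, it can help to confirm that the environment matches the command above. The following is a minimal sketch using only the Python standard library; the helper name `check_environment` is hypothetical and not part of KORMo. It treats `transformers` as pinned to 4.57.1 (as in the install command) and only checks that the other packages are present.

```python
from importlib import metadata

# Packages named in the install command above; only transformers is pinned there.
REQUIRED = {
    "transformers": "4.57.1",
    "pillow": None,
    "torchvision": None,
    "diffusers": None,
}


def check_environment(required=REQUIRED):
    """Return {package: (installed_version_or_None, ok)} for each requirement.

    Hypothetical helper for illustration; not part of the KORMo codebase.
    """
    report = {}
    for name, pinned in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # ok means: installed, and equal to the pinned version if one is given.
        ok = installed is not None and (pinned is None or installed == pinned)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "missing or version mismatch"
        print(f"{name:12s} {installed or '-':12s} {status}")
```

Running the script prints one line per package; any line that is not `ok` points to a package to reinstall.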
|
|
| --- |
| ## Inference Example |
| ``` |
| Planned: the inference example will be provided through the GitHub repo. |
| ``` |
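
Until the official example is published, the sketch below shows what text-to-image inference might look like. It rests on assumptions that this card does not confirm: that the checkpoint can be loaded with the generic `DiffusionPipeline.from_pretrained` entry point of `diffusers` (suggested only by the `library_name: diffusers` metadata), and that the pipeline call accepts `prompt`, `num_inference_steps`, and `generator` as most diffusers pipelines do. The repo id is taken from the example-image URLs above; `pick_runtime` and `generate` are hypothetical helper names.

```python
# Repo id as it appears in the example-image URLs of this model card.
REPO_ID = "KORMo-VL/KORMo-VL-Diffusion"


def pick_runtime(cuda_available):
    """Choose (device, dtype name): bfloat16 on GPU, float32 on CPU."""
    return ("cuda", "bfloat16") if cuda_available else ("cpu", "float32")


def generate(prompt, out_path="output.png", steps=30, seed=0):
    """Generate one image for `prompt` and save it to `out_path`.

    ASSUMPTION: the checkpoint loads through the generic DiffusionPipeline
    API and its call signature follows the usual diffusers conventions.
    The official loading code may differ.
    """
    # Heavy imports are kept local so importing this module stays cheap.
    import torch
    from diffusers import DiffusionPipeline

    device, dtype_name = pick_runtime(torch.cuda.is_available())
    pipe = DiffusionPipeline.from_pretrained(
        REPO_ID, torch_dtype=getattr(torch, dtype_name)
    ).to(device)

    # A seeded generator makes the sampled image reproducible.
    generator = torch.Generator(device=device).manual_seed(seed)
    image = pipe(
        prompt=prompt, num_inference_steps=steps, generator=generator
    ).images[0]
    image.save(out_path)
    return out_path
```

For example, `generate("울창한 숲")` would attempt to reproduce the "Dense forest" sample. Given the roughly 20B diffusion component, expect a large download and substantial GPU memory use.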
|
|
| --- |
|
|
|
|
| ## Contact |
| - KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr` |
|
|
| ## Contributors (https://sites.google.com/view/aailab) |
| - Junghun Yuk |
| - Inho Won |
| - Hangyeol Yoo |
| - Junmyeong Lee |
| - KyungTae Lim |
|
|
| ## Citation |
|
|
| ```text |
| @misc{KORMo, |
|   author = {Minjun Kim and Hyeonseok Lim and Hangyeol Yoo and Inho Won and Seungwoo Song and Minkyung Cho and Junghun Yuk and Changsu Choi and Dongjae Shin and Huije Lee and Hoyun Song and Alice Oh and KyungTae Lim}, |
|   title = {KORMo: Korean Open Reasoning Model for Everyone}, |
|   year = {2025}, |
|   publisher = {GitHub}, |
|   journal = {Technical Report}, |
|   url = {https://arxiv.org/abs/2510.09426} |
| } |
| ``` |