---
library_name: transformers
license_name: "kanana"
license_link: LICENSE
pipeline_tag: text-generation
model_id: kakaocorp/kanana-2-30b-a3b-thinking-2601
repo: kakaocorp/kanana-2-30b-a3b-thinking-2601
developers: Kanana LLM
base_model:
- kakaocorp/kanana-2-30b-a3b-mid-2601
---

# Kanana

[πŸ€— HF Models](https://huggingface.co/collections/kakaocorp/kanana-2)   |   [πŸ“• Pre-Training Blog](https://tech.kakao.com/posts/807)   |   [πŸ“• Post-Training Blog](https://tech.kakao.com/posts/808)   |   [πŸ“• Teaser Blog](https://tech.kakao.com/posts/804)



## News πŸ”₯

- `2026/01/15`: πŸ€— Released `kanana-2-30b-a3b-2601` HF model weights.
- `2026/01/15`: πŸ“• Published blog posts ([pre-training](https://tech.kakao.com/posts/807), [post-training](https://tech.kakao.com/posts/808)) about the development of `Kanana-2` models.
- `2025/12/19`: πŸ€— Released `kanana-2-30b-a3b` HF model weights and published a [teaser blog](https://tech.kakao.com/posts/804).
# Kanana-2 Highlights

**Kanana-2**, the latest open-source evolution of the Kanana model family, is designed specifically for **Agentic AI**, presenting substantial enhancements in **tool calling, complex instruction following, and logical reasoning**.

This new version adopts a cutting-edge architecture featuring MLA (Multi-head Latent Attention) and MoE (Mixture of Experts). These innovations allow the model to utilize significantly fewer active parameters than the previous 32.5B model while delivering superior performance and ensuring high throughput. Furthermore, the model **natively supports context lengths of up to 32,768 tokens**, enabling it to maintain coherence when handling extensive documents or long-context interactions.

In addition, Kanana-2 now supports 6 languages, covering **Korean, English, Japanese, Chinese, Thai, and Vietnamese**. To support this expansion, Kanana-2 utilizes a newly trained tokenizer that demonstrates superior tokenization efficiency across these languages, including an improvement of over 30% specifically for Korean.

Finally, to address advanced problem-solving needs, Kanana-2 introduces **reasoning models** capable of deliberate thinking and reasoning, achieving significantly enhanced performance in downstream tasks, especially when tackling hard problems.

> [!NOTE]
> No Kakao user data was used for either pre-training or post-training.
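As a quick, unofficial way to see the tokenizer's segmentation of Korean text, you can count tokens with `transformers` (a minimal sketch; the sample sentence is illustrative only and implies no benchmark):

```python
# Sketch: inspect how the Kanana-2 tokenizer segments a Korean sentence.
# The sample text is illustrative; any Korean input works the same way.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("kakaocorp/kanana-2-30b-a3b-thinking-2601")
text = "μΉ΄μΉ΄μ˜€μ˜ μƒˆλ‘œμš΄ μ–Έμ–΄ λͺ¨λΈ μΉ΄λ‚˜λ‚˜λ₯Ό μ†Œκ°œν•©λ‹ˆλ‹€."
ids = tok(text)["input_ids"]
print(len(ids), tok.convert_ids_to_tokens(ids))
```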
## Model Overview

The **kanana-2-30b-a3b** series has the following features:

- Total Parameters: 30B
- Activated Parameters: 3B
- Number of Layers: 48
- Number of Dense Layers: 1
- Number of Experts: 128
- Number of Selected Experts: 6
- Number of Shared Experts: 2
- Attention Mechanism: MLA
- Vocabulary Size: 128,256
- Context Length: 32,768
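Several of these values can be cross-checked against the released configuration (a sketch; the exact config field names are assumptions and may differ in the shipped `config.json`):

```python
# Sketch: read the released config and print a few architecture fields.
# Field names are assumptions; verify against the repo's config.json.
# Add trust_remote_code=True if the repository ships custom model code.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("kakaocorp/kanana-2-30b-a3b-thinking-2601")
for field in ("num_hidden_layers", "vocab_size", "max_position_embeddings"):
    print(field, getattr(cfg, field, "n/a"))
```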
## Model Downloads
| **Model** | **Download** |
| :------------: | :------------: |
| kanana-2-30b-a3b-base-2601\* | [πŸ€— HuggingFace](https://huggingface.co/kakaocorp/kanana-2-30b-a3b-base-2601) |
| kanana-2-30b-a3b-mid-2601\* | [πŸ€— HuggingFace](https://huggingface.co/kakaocorp/kanana-2-30b-a3b-mid-2601) |
| kanana-2-30b-a3b-instruct-2601 | [πŸ€— HuggingFace](https://huggingface.co/kakaocorp/kanana-2-30b-a3b-instruct-2601) |
| kanana-2-30b-a3b-thinking-2601 | [πŸ€— HuggingFace](https://huggingface.co/kakaocorp/kanana-2-30b-a3b-thinking-2601) |

\* We are releasing the kanana-2-30b-a3b-base-2601 (prior to mid-training) checkpoint to contribute to the research community.
  Note: kanana-2-30b-a3b-mid-2601 is identical to kanana-2-30b-a3b-base.
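For local experimentation, any of the checkpoints above can be fetched ahead of time with `huggingface_hub` (a minimal sketch; `local_dir` is an example destination, adjust it to your environment):

```python
# Sketch: pre-download a Kanana-2 checkpoint from the Hugging Face Hub.
# The repo id comes from the table above; local_dir is an example path.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="kakaocorp/kanana-2-30b-a3b-thinking-2601",
    local_dir="./kanana-2-30b-a3b-thinking-2601",  # example destination
)
```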

## Performance

### Base model evaluation results
| Benchmark | Metric | Shot | kanana-2-30b-a3b-mid-2601 | kanana-2-30b-a3b-base-2601 | kanana-1.5-32.5b-base | Qwen3-30B-A3B-Base* |
| :-- | :--: | :--: | :--: | :--: | :--: | :--: |
| **General Tasks** | | | | | | |
| MMLU | acc | 5 | 75.44 | 74.83 | 76.76 | 81.14 |
| MMLU-Pro | acc | 5 | 56.14 | 52.61 | 52.40 | 61.83 |
| BBH | acc | 3 | 79.76 | 76.46 | 81.54 | 79.97 |
| SimpleQA† | acc | 5 | 29.70 | 29.13 | 26.95 | 26.47 |
| **Mathematics Tasks** | | | | | | |
| MATH | em | 4 | 54.40 | 48.86 | 47.68 | 62.58 |
| GSM8K | em | 8 | 82.71 | 76.57 | 85.14 | 88.10 |
| **Coding Tasks** | | | | | | |
| HumanEval | pass@1 | 0 | 75.29 | 71.34 | 75.59 | 53.32 |
| MBPP | pass@1 | 3 | 62.39 | 60.21 | 65.96 | 72.58 |
| **Korean Tasks** | | | | | | |
| KMMLU | acc | 5 | 62.15 | 61.98 | 61.56 | 62.25 |
| KoSimpleQA† | acc | 5 | 49.70 | 49.40 | 45.70 | 26.33 |
| HAE-RAE Bench (v1.0) | acc | 5 | 88.73 | 88.91 | 90.65 | 72.04 |
| MATH-Koβ€‘ | em | 4 | 54.07 | 45.58 | 47.42 | 58.20 |
| GSM8K-Koβ€‘ | em | 8 | 77.48 | 70.43 | 81.43 | 88.10 |
| MBPP-KoΒ§ | pass@1 | 3 | 61.55 | 57.29 | 65.41 | 66.84 |
| **Long Context Tasks** | | | | | | |
| RULER-4K | acc | 0 | 93.09 | 92.49 | 86.39 | 94.32 |
| RULER-8K | acc | 0 | 92.29 | 92.14 | 90.16 | 92.16 |
| RULER-16K | acc | 0 | 90.73 | 90.01 | 85.88 | 91.28 |
| RULER-32K | acc | 0 | 88.63 | 87.92 | 81.62 | 88.32 |

\* Evaluated using an internal evaluation toolkit.
† Evaluated in Multiple Choice Question Answering (MCQA) format with 10 options.
β€‘ Subsets from HRM8K (MATH, GSM8K).
Β§ Internally translated to Korean.

### Instruct model evaluation results
| Benchmark | Metric | kanana-2-30b-a3b-instruct-2601 | kanana-2-30b-a3b-instruct | kanana-1.5-32.5b-instruct | Qwen3-30B-A3B-Instruct-2507* | Qwen3-30B-A3B (non-thinking)* |
| :-- | :--: | :--: | :--: | :--: | :--: | :--: |
| **Chat** | | | | | | |
| MT-Bench | judge† | 8.30 | 8.42 | 8.23 | 8.71 | 8.38 |
| KoMT-Bench | judge† | 8.21 | 8.24 | 7.94 | 8.49 | 7.89 |
| **Instruction Following** | | | | | | |
| IFEval | prompt strict | 87.25 | 84.47 | 79.48 | 82.62 | 84.10 |
| IFBench | prompt strict | 48.30 | 41.84 | 38.78 | 30.27 | 29.25 |
| Multi-IF (EN) | acc | 77.88 | 75.81 | 68.51 | 77.93 | 81.03 |
| Multi-Challenge | acc | 35.16 | 34.80 | 19.05 | 41.76 | 27.84 |
| **Tool Calling** | | | | | | |
| BFCL-v3 (Liveβ€‘) | pass@1 | 76.66 | 74.30 | 68.74 | 73.93 | 69.14 |
| BFCL-v3 (Multi-Turnβ€‘) | pass@1 | 38.63 | 35.38 | 11.38 | 38.77 | 11.88 |
| **Code Generation** | | | | | | |
| HumanEval+ | pass@1 | 81.10 | 79.88 | 79.88 | 86.59 | 87.20 |
| MBPP+ | pass@1 | 73.02 | 73.81 | 71.96 | 75.13 | 75.13 |
| **Mathematics** | | | | | | |
| GSM8K | em | 93.10 | 91.89 | 91.58 | 93.56 | 93.33 |
| MATH | acc | 88.56 | 86.26 | 77.92 | 90.96 | 87.20 |
| **Reasoning & Knowledge** | | | | | | |
| MMLU | em | 81.61 | 80.80 | 82.75 | 87.13 | 85.60 |
| KMMLU | em | 68.26 | 67.32 | 65.75 | 67.56 | 63.49 |
| GPQA Diamond | pass@1 | 52.53 | 42.93 | 42.42 | 54.55 | 50.51 |
| HAERAE-Bench (v1.0) | em | 75.57 | 75.57 | 65.34 | 53.41 | 57.39 |

\* Evaluated using an internal evaluation toolkit.
† Evaluated using gpt-4o-2024-08-06 as the judge model.
β€‘ Live denotes the average score of 6 live benchmarks, and Multi-Turn denotes the average score of 4 multi-turn benchmarks.

### Reasoning model evaluation results
| Benchmark | Metric | kanana-2-30b-a3b-thinking-2601 | kanana-2-30b-a3b-thinking | Qwen3-30B-A3B-Thinking-2507* | Qwen3-30B-A3B (thinking)* |
| :-- | :--: | :--: | :--: | :--: | :--: |
| **Reasoning & Knowledge** | | | | | |
| MMLU-Pro | pass@1 | 74.2 | 75.3 | 80.8 | 78.5 |
| GPQA Diamond | pass@1 | 57.8 | 61.3 | 70.6 | 62.6 |
| **Competition Math** | | | | | |
| AIME 2025 | pass@1 | 74.0 | 72.7 | 82.3 | 70.7 |
| AIME 2024 | pass@1 | 79.0 | 78.3 | 91.0 | 82.7 |
| AIME 2024-Ko† | pass@1 | 75.0 | 25.3 | 80.3 | 72.3 |
| **Code Generation** | | | | | |
| LiveCodeBench | pass@1 | 58.8 | 60.8 | 68.3 | 62.3 |
| LiveCodeBench-Koβ€‘ | pass@1 | 51.2 | 9.4 | 66.3ΒΆ | 61.5ΒΆ |
| **Instruction Following** | | | | | |
| IFEval | prompt strict | 82.2 | 82.2 | 87.8 | 86.1 |
| IFBench | prompt strict | 47.8 | 42.3 | 47.6 | 36.7 |
| **Tool Calling** | | | | | |
| BFCL-v3 (LiveΒ§) | pass@1 | 75.9 | 75.6 | 82.9 | 80.3 |
| BFCL-v3 (Multi-TurnΒ§) | pass@1 | 43.7 | 34.3 | 53.6 | 35.6 |

\* Evaluated using an internal evaluation toolkit.
† Korean translation of AIME 2024 sourced from MCLM.
β€‘ Internally translated to Korean.
Β§ Live denotes the average score of 6 live benchmarks, and Multi-Turn denotes the average score of 4 multi-turn benchmarks.
ΒΆ Most responses were generated in English.

## Deployment

> [!NOTE]
> For optimal results with the reasoning model, please adhere to the default parameters: `temperature=0.6`, `top_p=0.95`, `top_k=20`. **We strongly advise against greedy decoding**, as it may lead to performance degradation and infinite repetition loops.

### vLLM

[vLLM](https://github.com/vllm-project/vllm) is a fast and memory-optimized engine designed for high-performance LLM inference and serving.

For kanana-2-30b-a3b-instruct-2601:

```shell
vllm serve kakaocorp/kanana-2-30b-a3b-instruct-2601 --enable-auto-tool-choice --tool-call-parser hermes
```

For kanana-2-30b-a3b-thinking-2601:

```shell
vllm serve kakaocorp/kanana-2-30b-a3b-thinking-2601 --reasoning-parser deepseek_r1 --enable-auto-tool-choice --tool-call-parser hermes
```

### SGLang

[SGLang](https://github.com/sgl-project/sglang) is a high-efficiency framework for serving LLMs and VLMs, enabling easy deployment of OpenAI-compatible API servers.

For kanana-2-30b-a3b-instruct-2601:

```shell
python3 -m sglang.launch_server --model-path kakaocorp/kanana-2-30b-a3b-instruct-2601 --tool-call-parser qwen
```

For kanana-2-30b-a3b-thinking-2601:

```shell
python3 -m sglang.launch_server --model-path kakaocorp/kanana-2-30b-a3b-thinking-2601 --reasoning-parser deepseek-r1 --tool-call-parser qwen
```
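Both servers expose an OpenAI-compatible API, so a deployed model can be queried with the `openai` client. A minimal sketch, assuming the vLLM default of `localhost:8000` (SGLang defaults to a different port) and using the recommended sampling parameters from the note above:

```python
# Sketch: query a locally served Kanana-2 model via the OpenAI-compatible API.
# Base URL assumes vLLM's default local port; adjust for your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="kakaocorp/kanana-2-30b-a3b-thinking-2601",
    messages=[{"role": "user", "content": "Explain MoE routing in one paragraph."}],
    temperature=0.6,  # recommended defaults from the note above
    top_p=0.95,
    extra_body={"top_k": 20},  # top_k is passed through extra_body in the OpenAI client
)
print(response.choices[0].message.content)
```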
## Processing 32K+ Length

Currently, the `config.json` uploaded to HuggingFace is configured for token lengths of 32,768 or less. To process tokens beyond this length, YaRN must be applied. By updating the `config.json` with the following parameters, you can apply YaRN to handle token sequences up to 128K in length:

```json
"rope_scaling": {
    "beta_fast": 32,
    "beta_slow": 1,
    "factor": 4.0,
    "mscale": 1.0,
    "mscale_all_dim": 1.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
},
```

Alternatively, pass the override as a command-line argument at deployment time:

- `vllm`

  ```shell
  vllm serve ... --hf-overrides '{"max_position_embeddings": 131072, "rope_scaling": {"rope_type":"deepseek_yarn","factor":4.0,"beta_fast":32,"beta_slow":1,"mscale":1.0,"mscale_all_dim":1.0,"original_max_position_embeddings":32768}}'
  ```

- `sglang`

  ```shell
  python3 -m sglang.launch_server ... --json-model-override-args '{"max_position_embeddings":131072, "rope_scaling":{"rope_type":"deepseek_yarn","factor":4.0,"beta_fast":32,"beta_slow":1,"mscale":1.0,"mscale_all_dim":1.0,"original_max_position_embeddings":32768}}'
  ```

> [!NOTE]
> Most leading open-source implementations of static YaRN apply a constant scaling factor, which can negatively impact performance on shorter texts. To ensure optimal performance:
> * **Enable `rope_scaling` only when necessary** for processing long contexts.
> * **Adjust the `factor` based on your specific needs** (e.g., set `factor` to 2.0 for a 65,536-token context).
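If you prefer editing a local checkpoint rather than passing overrides, the same YaRN block can be written into `config.json` with a few lines of Python (a minimal sketch; the path is an example and assumes the checkpoint was downloaded beforehand):

```python
# Sketch: apply the YaRN override to a locally downloaded config.json.
# The path is an example; point it at your local checkpoint directory.
import json
from pathlib import Path

cfg_path = Path("./kanana-2-30b-a3b-thinking-2601/config.json")
cfg = json.loads(cfg_path.read_text())

cfg["max_position_embeddings"] = 131072
cfg["rope_scaling"] = {
    "beta_fast": 32,
    "beta_slow": 1,
    "factor": 4.0,  # use 2.0 for a 65,536-token context, per the note above
    "mscale": 1.0,
    "mscale_all_dim": 1.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}

cfg_path.write_text(json.dumps(cfg, indent=2))
```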
## License

The model weights are released under the [Kanana License](./LICENSE).
## Citation

```
@article{,
    title={Kanana-2 LLM},
    author={Kanana LLM},
    year={2025},
    url={https://huggingface.co/collections/kakaocorp/kanana-2}
}
```
## Contact

- Kanana LLM Team Technical Support: kanana-llm@kakaocorp.com
- Business & Partnership Contact: alpha.k@kakaocorp.com