# Pathumma-ThaiLLM-Think-3.0.0
A post-trained Thai large language model built on the foundation model from ThaiLLM, the Thai national LLM initiative.
This release applies multi-stage Supervised Fine-Tuning (SFT) to enhance:
- Instruction following
- Structured tool / function calling
- Mathematical and coding competence
- Multi-step analytical capability
- Thai–English bilingual robustness
## Training Strategy
Post-training is organized into two stages:
- Stage 1: Instruction & Tool-Calling Alignment
- Stage 2: Reasoning Specialization
### Stage 1: Instruction & Tool-Calling Alignment
Focus areas:
- Instruction compliance
- Structured tool-call formatting
- General Thai task robustness
- STEM-oriented instruction alignment
#### Datasets
| Dataset | Training Subset Size | Full Dataset Size | Domain | License |
|---|---|---|---|---|
| beyoru/ToolCall_synthetic_qwen3 | 60,000 | 60,000 | Tool | Apache-2.0 |
| airesearch/WangchanX-FLAN-v6 | 2,000,000 | 13,619,450 | General | Mixed |
| nvidia/OpenMathInstruct-2 | 1,000,000 | 14,000,000 | STEM | CC-BY-4.0 |
| jdaddyalbs/playwright-mcp-toolcalling | 1,750 | 1,750 | Tool | MIT |
| BitAgent/tool_calling | 551,000 | 551,000 | Tool | MIT |
### Stage 2: Reasoning Specialization
Focus areas:
- Multi-step mathematical analysis
- Code understanding and synthesis
- Structured analytical responses
- Tool-calling with explicit reasoning traces
- Thai reasoning distillation
#### Datasets
| Dataset | Training Subset Size | Full Dataset Size | Domain | License |
|---|---|---|---|---|
| nvidia/OpenMathReasoning | 500,000 | 4,920,000 | STEM | CC-BY-4.0 |
| nvidia/OpenCodeReasoning | 585,000 | 585,000 | Coding | CC-BY-4.0 |
| natolambert/GeneralThought-430K-filtered | 337,579 | 337,579 | General | MIT |
| Jofthomas/hermes-function-calling-thinking-V1 | 3,570 | 3,570 | Tool | MIT |
| open-thoughts/OpenThoughts3-1.2M | 1,200,000 | 1,200,000 | STEM | Apache-2.0 |
| scb10x/typhoon-r1-sft-data | 23,851 | 23,851 | General | Custom |
| iapp/Thai-R1-Distill-SFT | 10,000 | 10,000 | General | Custom |
| nvidia/Nemotron-Post-Training-Dataset-v1 | 310,000 | 310,000 | Tool | CC-BY-4.0 |
Note: For selected datasets, curated subsets were employed to ensure balanced domain representation.
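As an illustration, curated subsetting of this kind can be sketched as capped per-domain sampling. The function and the toy corpora below are hypothetical stand-ins, not the actual curation pipeline:

```python
import random

def balanced_subset(examples_by_domain, per_domain_cap, seed=0):
    """Sample up to `per_domain_cap` examples from each domain so no
    single domain dominates the training mix."""
    rng = random.Random(seed)
    subset = []
    for domain, examples in sorted(examples_by_domain.items()):
        k = min(per_domain_cap, len(examples))
        subset.extend(rng.sample(examples, k))
    return subset

# Toy corpora standing in for the real datasets listed above.
corpora = {
    "Tool": [f"tool-{i}" for i in range(10)],
    "STEM": [f"stem-{i}" for i in range(50)],
    "General": [f"gen-{i}" for i in range(100)],
}
subset = balanced_subset(corpora, per_domain_cap=10)
print(len(subset))  # 30: 10 examples from each of the 3 domains
```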
## Methodology
- Base model: ThaiLLM foundation model
- Training objective: Supervised Fine-Tuning (SFT)
- Two-stage curriculum design
- Domain-balanced optimization
- Tool-call schema alignment
- Thai reasoning distillation
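Tool-call schema alignment targets outputs that a Hermes-style parser can consume (the vLLM deployment below uses `--tool-call-parser hermes`): a JSON object wrapped in `<tool_call>` tags. The sample output and function schema here are illustrative, not taken from the training data:

```python
import json
import re

# Hermes-style tool calls wrap a JSON object in <tool_call> tags.
sample_output = (
    "Let me check the weather.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Bangkok"}}\n</tool_call>'
)

def extract_tool_calls(text):
    """Return every JSON object found inside <tool_call>...</tool_call> blocks."""
    pattern = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
    return [json.loads(match) for match in pattern.findall(text)]

calls = extract_tool_calls(sample_output)
print(calls[0]["name"])       # get_weather
print(calls[0]["arguments"])  # {'city': 'Bangkok'}
```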
## Compute Infrastructure

Training was conducted on the LANTA high-performance computing cluster, using 16 nodes (64 × NVIDIA A100 40 GB GPUs in total) for distributed large-scale post-training.
## Capabilities
- Thai instruction compliance
- Structured JSON tool invocation
- Mathematical problem solving
- Code generation and analysis
- Multi-step analytical tasks
- Thai–English bilingual support
## Limitations
- May hallucinate if tool schema is incomplete
- Performance on long analytical chains may degrade without retrieval
- Domain coverage depends on included corpora
## Quickstart
Qwen3 support is included in recent releases of Hugging Face `transformers`, and we advise using the latest version.
With `transformers<4.51.0`, you will encounter the following error:

```
KeyError: 'qwen3'
```
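A quick guard against this can be sketched as a version check. For pre-release version strings, `packaging.version.parse` is more robust; this sketch assumes plain `X.Y.Z` versions:

```python
# Guard against transformers releases older than 4.51.0, which lack
# the qwen3 architecture and raise KeyError: 'qwen3'.
from importlib.metadata import PackageNotFoundError, version

def version_tuple(v):
    """'4.51.0' -> (4, 51, 0) for a simple numeric comparison."""
    return tuple(int(part) for part in v.split(".")[:3])

try:
    installed = version("transformers")
    if version_tuple(installed) < (4, 51, 0):
        print(f"transformers {installed} is too old; run: pip install -U transformers")
except PackageNotFoundError:
    print("transformers is not installed; run: pip install -U transformers")
```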
The following code snippet illustrates how to use the model to generate content from a given input.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nectec/pathumma-thaillm-8b-think-3.0.0"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the model input
prompt = "ทำไมวงกลมถึงมี 360 องศา"  # "Why does a circle have 360 degrees?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Parse out the thinking content
try:
    # rindex of token 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0
thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)  # no opening <think> tag
print("content:", content)
```
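The `</think>` parsing step above searches the token ids from the end. The same logic, isolated on a toy sequence (with 151668 standing in for the `</think>` token id), looks like this:

```python
# Demonstrates the reverse-search split used in the quickstart,
# on a synthetic id list instead of real model output.
THINK_END = 151668  # </think> token id in the Qwen3 tokenizer

def split_thinking(output_ids, think_end=THINK_END):
    """Return (thinking_ids, content_ids); thinking is empty when
    no </think> id is present."""
    try:
        # Index just past the LAST occurrence of the </think> id.
        index = len(output_ids) - output_ids[::-1].index(think_end)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]

thinking, content = split_thinking([11, 22, THINK_END, 33, 44])
print(thinking)  # [11, 22, 151668]
print(content)   # [33, 44]
```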
For deployment, you can use `vllm>=0.8.5` to create an OpenAI-compatible API endpoint:

```shell
vllm serve nectec/pathumma-thaillm-8b-think-3.0.0 \
  --enforce-eager \
  --no-enable-chunked-prefill \
  --tool-call-parser hermes
```
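Once the server is running, any OpenAI-compatible client can query it. A minimal sketch of the request body (`localhost:8000` is vLLM's default bind address; the payload keys follow the OpenAI chat-completions schema):

```python
import json

# Request body for vLLM's OpenAI-compatible /v1/chat/completions route,
# served by default at http://localhost:8000/v1/chat/completions.
payload = {
    "model": "nectec/pathumma-thaillm-8b-think-3.0.0",
    "messages": [
        # "Why does a circle have 360 degrees?" (same prompt as the quickstart)
        {"role": "user", "content": "ทำไมวงกลมถึงมี 360 องศา"},
    ],
    "max_tokens": 1024,
}
body = json.dumps(payload, ensure_ascii=False)
# POST `body` with any HTTP client, e.g.:
#   curl http://localhost:8000/v1/chat/completions \
#     -H "Content-Type: application/json" -d "$body"
print(body[:30])
```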
For local use, applications such as Ollama, LM Studio, and llama.cpp also support this model.
## About the Project
Pathumma-ThaiLLM-Think-3.0.0 is part of ongoing research toward sovereign Thai large language models optimized for analytical and tool-augmented intelligence.
## Contributors

LLM Team:

- Piyawat Chuangkrud (piyawat@it.kmitl.ac.th)
- Chanon Utupon (s6401001620165@email.kmutnb.ac.th)
- Jessada Pranee (jessada.pran@kmutt.ac.th)
- Arnon Saeoung (anon.saeoueng@gmail.com)
- Chaianun Damrongrat (chaianun.damrongrat@nectec.or.th)
- Sarawoot Kongyoung (sarawoot.kongyoung@nectec.or.th)
