Self-Training Elicits Concise Reasoning in Large Language Models

This model is fine-tuned with self-training to generate concise reasoning paths while maintaining accuracy.

Model Details

Developed by: Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun at KAIST AI
Model type: Fine-tuned Large Language Model for concise reasoning
Language(s) (NLP): English
License: MIT
Finetuned from model: deepseek-ai/deepseek-math-7b-instruct
Repository: https://github.com/TergelMunkhbat/concise-reasoning
Paper: Self-Training Elicits Concise Reasoning in Large Language Models (arXiv:2502.20122)
Demo: HuggingFace Space demo

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "tergel/deepseek-math-7b-instruct-gsm8k-fs-gpt4o-bon"
# Use the GPU when one is available; fall back to CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and the model weights in bfloat16.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map=device, torch_dtype=torch.bfloat16)

question = "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?"

# Tokenize the question and record the prompt length in tokens.
inputs = tokenizer(question, return_tensors="pt").to(device)
input_length = len(inputs['input_ids'][0])

# Generate up to 512 new tokens of reasoning.
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][input_length:], skip_special_tokens=True)
print(response)
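Because the model aims for shorter reasoning at the same accuracy, it can help to check both properties on your own outputs. The snippet below is a minimal sketch, not the evaluation code from the paper. The helper names `extract_final_answer` and `summarize_response` are hypothetical, and it assumes the final answer is the last number that appears in the response. The sample string is hand-written for illustration and is not actual model output.

```python
import re


def extract_final_answer(response: str):
    """Return the last number in the response as a float, or None if none is found.

    Assumption: the final answer is the last number in the text.
    """
    # Remove thousands separators such as the comma in "1,234".
    cleaned = re.sub(r"(?<=\d),(?=\d{3})", "", response)
    numbers = re.findall(r"-?\d+(?:\.\d+)?", cleaned)
    return float(numbers[-1]) if numbers else None


def summarize_response(response: str, reference: float, tokenizer=None):
    """Report the response length and whether the extracted answer matches the reference."""
    if tokenizer is not None:
        # Count tokens with the model's tokenizer, excluding special tokens.
        length = len(tokenizer.encode(response, add_special_tokens=False))
    else:
        # Without a tokenizer, use whitespace-separated words as a rough proxy.
        length = len(response.split())
    predicted = extract_final_answer(response)
    correct = predicted is not None and abs(predicted - reference) < 1e-6
    return {"length": length, "predicted": predicted, "correct": correct}


# Hand-written sample response (illustrative only, not generated by the model).
sample = "It takes 2 bolts of blue and 2 / 2 = 1 bolt of white, so 2 + 1 = 3 bolts in total."
summary = summarize_response(sample, reference=3)
print(summary)
```

To measure length in tokens rather than words, pass the `tokenizer` loaded above as the third argument, together with the `response` string from `model.generate`.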

For more detailed information about training methods, evaluation results, limitations, and technical specifications, please refer to our paper.

Citation

@article{munkhbat2025self,
  title={Self-Training Elicits Concise Reasoning in Large Language Models},
  author={Munkhbat, Tergel and Ho, Namgyu and Kim, Seohyun and Yang, Yongjin and Kim, Yujin and Yun, Se-Young},
  journal={arXiv preprint arXiv:2502.20122},
  year={2025}
}
