Distil-Qwen3-0.6B-SHELLper


A fine-tuned Qwen3-0.6B model for multi-turn bash function calling. Trained using knowledge distillation from Qwen3-235B, this 0.6B parameter model achieves 100% tool-call accuracy on our test set over multiple turns while being small enough to run locally on any machine.

Results

| Metric | Qwen3-235B (Teacher) | Qwen3-0.6B (Base) | This Model |
|---|---|---|---|
| Tool call accuracy | 99% | 84.16% | 100% |
| 5-turn accuracy | 95% | 42.22% | 100% |

Quick Start

You can follow the instructions in our demo repository on GitHub.

Using Ollama

# Download and create Ollama model
hf download distil-labs/distil-qwen3-0.6b-SHELLper model_fp16.gguf Modelfile --local-dir distil_model
cd distil_model && ollama create distil_model -f Modelfile
cd ..
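Once the Ollama model is created, you can call it over Ollama's local HTTP chat API. The sketch below only assembles the request body; it assumes the Ollama server is running on its default port (`http://localhost:11434`) and that the model was created as `distil_model`, as above:

```python
import json

def build_chat_payload(model, user_message, tools):
    """Assemble a request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant that executes bash commands."},
            {"role": "user", "content": user_message},
        ],
        "tools": tools,
        "stream": False,  # return a single JSON response instead of a stream
    }

# Same `ls` tool schema as in the Transformers example below
ls_tool = {
    "type": "function",
    "function": {
        "name": "ls",
        "description": "List directory contents",
        "parameters": {
            "type": "object",
            "properties": {"folder": {"type": "string", "description": "Path to the folder to list"}},
            "required": ["folder"],
        },
    },
}

payload = build_chat_payload("distil_model", "List all files in the current directory", [ls_tool])
body = json.dumps(payload)
# POST `body` to http://localhost:11434/api/chat, e.g. with urllib or requests.
```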

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-qwen3-0.6b-SHELLper")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-qwen3-0.6b-SHELLper")

tools = [
    {
        "type": "function",
        "function": {
            "name": "ls",
            "description": "List directory contents",
            "parameters": {
                "type": "object",
                "properties": {
                    "folder": {"type": "string", "description": "Path to the folder to list"}
                },
                "required": ["folder"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant that executes bash commands."},
    {"role": "user", "content": "List all files in the current directory"}
]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)  # greedy decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
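Qwen3's chat template wraps tool calls in `<tool_call>...</tool_call>` tags containing a JSON object. A minimal extraction helper (the sample output string below is illustrative, not a captured model response):

```python
import json
import re

def extract_tool_calls(generated_text):
    """Pull JSON tool calls out of <tool_call>...</tool_call> blocks."""
    blocks = re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", generated_text, re.DOTALL)
    return [json.loads(b) for b in blocks]

# Illustrative model output:
sample = '<tool_call>\n{"name": "ls", "arguments": {"folder": "."}}\n</tool_call>'
print(extract_tool_calls(sample))  # [{'name': 'ls', 'arguments': {'folder': '.'}}]
```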

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Parameters | 0.6 billion |
| Architecture | Qwen3ForCausalLM |
| Context Length | 40,960 tokens |
| Precision | bfloat16 |
| Training Data | ~200 synthetic examples (expanded from 20 seeds) |
| Teacher Model | Qwen3-235B-A22B-Instruct |

Training

This model was trained using the Distil Labs platform:

  • Seed Data: 20 hand-validated multi-turn bash command conversations
  • Synthetic Generation: Dataset expanded with the Qwen3-235B teacher, including conversation-turn expansion
  • Fine-tuning: 4 epochs with LoRA (r=128) on the synthetic dataset
  • Evaluation: Multi-turn accuracy testing across variable conversation lengths

Training Hyperparameters

  • Epochs: 4
  • Learning Rate: 2e-5 (linear schedule)
  • Batch Size: 1 (with gradient accumulation)
  • LoRA Rank: 128

Task Format

Input Format

Multi-turn conversation with tool definitions:

[
  {"role": "user", "content": "List all files in the current directory"},
  {"role": "assistant", "tool_calls": [{"function": {"name": "ls", "arguments": {"folder": "."}}}]},
  {"role": "user", "content": "Now go to the src folder"}
]

Output Format

A single tool call in JSON format:

{"name": "cd", "arguments": {"folder": "src"}}
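To execute a parsed tool call, map the JSON back onto a shell command string. A minimal sketch; it assumes every argument value is positional (true for the simple schemas here, e.g. `ls`/`cd` take a single `folder`), so adapt it if your schemas use flags:

```python
import shlex

def tool_call_to_command(call):
    """Render a parsed tool call as a bash command string.

    Assumes all argument values are positional; shlex.quote guards
    against shell metacharacters in paths.
    """
    parts = [call["name"]] + [shlex.quote(str(v)) for v in call["arguments"].values()]
    return " ".join(parts)

print(tool_call_to_command({"name": "cd", "arguments": {"folder": "src"}}))  # cd src
```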

Supported Tools

The model supports the following bash commands:

| Command | Description |
|---|---|
| cat | Display file contents |
| cd | Change directory |
| cp | Copy files or directories |
| diff | Compare files |
| du | Estimate file space usage |
| echo | Display text |
| find | Search for files |
| grep | Search file contents |
| head | Output first part of files |
| ls | List directory contents |
| mkdir | Create directories |
| mv | Move or rename files |
| pwd | Print working directory |
| rm | Remove files |
| rmdir | Remove empty directories |
| sort | Sort file contents |
| tail | Output last part of files |
| touch | Create empty files |
| wc | Word, line, character count |

Use Cases

  • Natural language interfaces to file systems
  • Command-line assistants and automation
  • Developer productivity tools
  • Educational tools for learning bash
  • Local, privacy-preserving AI assistants

Limitations

  • Optimized for a single tool call per turn
  • No support for pipes or combined commands
  • Best with up to 5 conversation turns
  • Trained on English requests only
  • Limited to the supported bash commands listed above

License

This model is released under the Apache 2.0 license.

Citation

@misc{distil-qwen3-0.6b-shellper,
  author = {Distil Labs},
  title = {Distil-Qwen3-0.6B-SHELLper: A Fine-tuned Model for Multi-turn Bash Function Calling},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/distil-labs/distil-qwen3-0.6b-SHELLper}
}