Distil-Qwen3-0.6B-SHELLper


A fine-tuned Qwen3-0.6B model for multi-turn bash function calling. Trained using knowledge distillation from Qwen3-235B, this 0.6B parameter model achieves 100% tool-call accuracy on our test set over multiple turns while being small enough to run locally on any machine.

Results

| Metric | Qwen3-235B (Teacher) | Qwen3-0.6B (Base) | This Model |
|---|---|---|---|
| Tool call accuracy | 99% | 84.16% | 100% |
| 5-turn accuracy | 95% | 42.22% | 100% |

Quick Start

You can follow the instructions in our demo repository on GitHub.

Using Ollama

# Download and create Ollama model
hf download distil-labs/distil-qwen3-0.6b-SHELLper model_fp16.gguf Modelfile --local-dir distil_model
cd distil_model && ollama create distil_model -f Modelfile
cd ..
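Once the Ollama model is created, you can call it over Ollama's local HTTP chat API. The sketch below only assembles the request body; it assumes the Ollama server is running on its default port (`http://localhost:11434`) and that the model was created as `distil_model`, as above:

```python
import json

def build_chat_payload(model, user_message, tools):
    """Assemble a request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant that executes bash commands."},
            {"role": "user", "content": user_message},
        ],
        "tools": tools,
        "stream": False,  # return a single JSON response instead of a stream
    }

# Same `ls` tool schema as in the Transformers example below
ls_tool = {
    "type": "function",
    "function": {
        "name": "ls",
        "description": "List directory contents",
        "parameters": {
            "type": "object",
            "properties": {"folder": {"type": "string", "description": "Path to the folder to list"}},
            "required": ["folder"],
        },
    },
}

payload = build_chat_payload("distil_model", "List all files in the current directory", [ls_tool])
body = json.dumps(payload)
# POST `body` to http://localhost:11434/api/chat, e.g. with urllib or requests.
```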

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-qwen3-0.6b-SHELLper")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-qwen3-0.6b-SHELLper")

tools = [
    {
        "type": "function",
        "function": {
            "name": "ls",
            "description": "List directory contents",
            "parameters": {
                "type": "object",
                "properties": {
                    "folder": {"type": "string", "description": "Path to the folder to list"}
                },
                "required": ["folder"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant that executes bash commands."},
    {"role": "user", "content": "List all files in the current directory"}
]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)  # greedy decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
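Qwen3's chat template wraps tool calls in `<tool_call>...</tool_call>` tags containing a JSON object. A minimal extraction helper (the sample output string below is illustrative, not a captured model response):

```python
import json
import re

def extract_tool_calls(generated_text):
    """Pull JSON tool calls out of <tool_call>...</tool_call> blocks."""
    blocks = re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", generated_text, re.DOTALL)
    return [json.loads(b) for b in blocks]

# Illustrative model output:
sample = '<tool_call>\n{"name": "ls", "arguments": {"folder": "."}}\n</tool_call>'
print(extract_tool_calls(sample))  # [{'name': 'ls', 'arguments': {'folder': '.'}}]
```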

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Parameters | 0.6 billion |
| Architecture | Qwen3ForCausalLM |
| Context Length | 40,960 tokens |
| Precision | bfloat16 |
| Training Data | ~200 synthetic examples (expanded from 20 seeds) |
| Teacher Model | Qwen3-235B-A22B-Instruct |

Training

This model was trained using the Distil Labs platform:

  • Seed Data: 20 hand-validated multi-turn bash command conversations
  • Synthetic Generation: Dataset expanded with the Qwen3-235B teacher, including conversation-turn expansion
  • Fine-tuning: 4 epochs with LoRA (r=128) on the synthetic dataset
  • Evaluation: Multi-turn accuracy testing across variable conversation lengths

Training Hyperparameters

  • Epochs: 4
  • Learning Rate: 2e-5 (linear schedule)
  • Batch Size: 1 (with gradient accumulation)
  • LoRA Rank: 128

Task Format

Input Format

Multi-turn conversation with tool definitions:

[
  {"role": "user", "content": "List all files in the current directory"},
  {"role": "assistant", "tool_calls": [{"function": {"name": "ls", "arguments": {"folder": "."}}}]},
  {"role": "user", "content": "Now go to the src folder"}
]

Output Format

A single tool call in JSON format:

{"name": "cd", "arguments": {"folder": "src"}}
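To execute a parsed tool call, map the JSON back onto a shell command string. A minimal sketch; it assumes every argument value is positional (true for the simple schemas here, e.g. `ls`/`cd` take a single `folder`), so adapt it if your schemas use flags:

```python
import shlex

def tool_call_to_command(call):
    """Render a parsed tool call as a bash command string.

    Assumes all argument values are positional; shlex.quote guards
    against shell metacharacters in paths.
    """
    parts = [call["name"]] + [shlex.quote(str(v)) for v in call["arguments"].values()]
    return " ".join(parts)

print(tool_call_to_command({"name": "cd", "arguments": {"folder": "src"}}))  # cd src
```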

Supported Tools

The model supports the following bash commands:

| Command | Description |
|---|---|
| cat | Display file contents |
| cd | Change directory |
| cp | Copy files or directories |
| diff | Compare files |
| du | Estimate file space usage |
| echo | Display text |
| find | Search for files |
| grep | Search file contents |
| head | Output first part of files |
| ls | List directory contents |
| mkdir | Create directories |
| mv | Move or rename files |
| pwd | Print working directory |
| rm | Remove files |
| rmdir | Remove empty directories |
| sort | Sort file contents |
| tail | Output last part of files |
| touch | Create empty files |
| wc | Word, line, character count |

Use Cases

  • Natural language interfaces to file systems
  • Command-line assistants and automation
  • Developer productivity tools
  • Educational tools for learning bash
  • Local, privacy-preserving AI assistants

Limitations

  • Optimized for a single tool call per turn
  • No support for pipes or combined commands
  • Best with up to 5 conversation turns
  • Trained on English requests only
  • Limited to the supported bash commands listed above

License

This model is released under the Apache 2.0 license.

Citation

@misc{distil-qwen3-0.6b-shellper,
  author = {Distil Labs},
  title = {Distil-Qwen3-0.6B-SHELLper: A Fine-tuned Model for Multi-turn Bash Function Calling},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/distil-labs/distil-qwen3-0.6b-SHELLper}
}