# tCHu Qwen3-VL-2B LoRA Adapter
A LoRA adapter for Qwen3-VL-2B-Instruct finetuned to play tCHu (Swiss Ticket to Ride) from game screenshots.
## Model Description
This adapter was trained via supervised finetuning (SFT) to distill the knowledge of an MLP policy network into a vision-language model. The VLM sees a 640x410 screenshot of the game board and outputs a tool-call action.
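To make the distillation setup concrete, a single SFT sample pairs a board screenshot with the teacher's chosen action as the target completion. The sketch below is illustrative only: the field names, file path, and prompt text are assumptions, not the repo's actual schema.

```python
# Illustrative sketch of one SFT distillation sample (field names, path, and
# prompt text are assumptions, not the repo's actual schema): the board
# screenshot is the input, and the teacher's action string is the target.
sample = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": "screenshots/game_017_turn_042.png"},
                {"type": "text", "text": "You are playing tCHu. Choose your next action."},
            ],
        },
        # Supervision target: the MLP teacher's action, serialized as a tool call
        {"role": "assistant", "content": 'claim_route(route_id="BER_LUC_1")'},
    ]
}

# As in standard SFT, the loss is computed only on the assistant turn.
target = sample["messages"][-1]["content"]
```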
## Training Details
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen3-VL-2B-Instruct |
| Method | LoRA SFT (r=16, alpha=32, dropout=0.05) |
| Quantization | 4-bit NF4 (bitsandbytes) |
| Target modules | q/k/v/o_proj, gate/up/down_proj |
| Trainable params | 17.4M / 2.1B (0.8%) |
| Training data | 3,864 decision points from 50 self-play games |
| Image resolution | 640x410 |
| Epochs | 1 |
| Final loss | 0.627 |
| Training regime | bf16 mixed precision |
| GPU | NVIDIA RTX 4060 (8GB) |
| Training time | ~90 minutes |
| PEFT version | 0.17.1 |
### Teacher Model
The training data was generated by a TicketHybridPlayer, an MLP policy network (BC Round 4, 512 hidden units) with ticket-aware logit boosting (boost_weight=120). Both players in the self-play games used the same model, averaging 72.4 points per game.
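The boosting step can be sketched as follows. This is a pure-Python sketch of the idea, not the player's actual implementation; only `boost_weight=120` comes from the description above, and the function and variable names are illustrative.

```python
def boost_ticket_logits(logits, action_route_ids, ticket_route_ids, boost_weight=120.0):
    """Add a constant bonus to the logits of claim-route actions whose route
    lies on one of the player's destination-ticket paths.

    Sketch only: names and data layout are assumptions, not tCHuPy's code."""
    boosted = list(logits)
    for i, route_id in enumerate(action_route_ids):
        if route_id is not None and route_id in ticket_route_ids:
            boosted[i] += boost_weight
    return boosted

# Example: three actions; only the second is a claim on a ticket path.
logits = [1.2, 0.4, 2.0]
route_ids = [None, "BER_LUC_1", "GEN_LAU_1"]   # None = non-claim action
boosted = boost_ticket_logits(logits, route_ids, {"BER_LUC_1"})
# boosted[1] is now ~120.4, so the ticket-relevant claim dominates the argmax
```

A large additive bonus like this effectively overrides the raw policy whenever a ticket-path route is claimable, which matches the "hybrid" naming.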
## Action Format
The model outputs MCP tool-call strings:
- `draw_card(slot=-1)` → draw from the deck
- `draw_card(slot=2)` → draw the face-up card at position 2
- `claim_route(route_id="BER_LUC_1")` → claim the Berne-Lucerne route
- `draw_tickets()` → draw new destination tickets
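Strings in this shape can be parsed with a small regex. The sketch below is one way to do it; tCHuPy's actual parser may be stricter or differ entirely.

```python
import re

# Minimal parser for the action strings above (a sketch, not tCHuPy's parser).
# Returns (tool_name, kwargs) for a well-formed call, or None otherwise.
_CALL_RE = re.compile(r'^(\w+)\((.*)\)$')
_ARG_RE = re.compile(r'(\w+)\s*=\s*("([^"]*)"|-?\d+)')

def parse_action(text):
    m = _CALL_RE.match(text.strip())
    if not m:
        return None
    name, argstr = m.group(1), m.group(2)
    kwargs = {}
    for am in _ARG_RE.finditer(argstr):
        key, raw, quoted = am.group(1), am.group(2), am.group(3)
        # Quoted values stay strings; bare values are integers (e.g. slot=-1)
        kwargs[key] = quoted if quoted is not None else int(raw)
    return name, kwargs

parse_action('draw_card(slot=-1)')                  # ('draw_card', {'slot': -1})
parse_action('claim_route(route_id="BER_LUC_1")')   # ('claim_route', {'route_id': 'BER_LUC_1'})
parse_action('draw_tickets()')                      # ('draw_tickets', {})
```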
## Usage

### With tCHuPy (game GUI)
```shell
git clone https://github.com/DavidLacour/tCHuPy
cd tCHuPy

# Install dependencies first (huggingface_hub comes in with transformers)
pip install pygame torch transformers peft bitsandbytes accelerate

# Download this adapter
python -c "from huggingface_hub import snapshot_download; snapshot_download('DavidLacour/tchu-qwen3-vl-2b-lora', local_dir='models/distilled_bc4_vlm/checkpoint-distilled')"

# Play against the VLM
python -m tchu.gui --opponent=qwen
```
### Standalone Loading
```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization, matching the training configuration above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3-VL-2B-Instruct",
    quantization_config=bnb_config,
    device_map={"": 0},
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base_model, "DavidLacour/tchu-qwen3-vl-2b-lora")
processor = AutoProcessor.from_pretrained("DavidLacour/tchu-qwen3-vl-2b-lora")
```
## Limitations
- Trained on only 3,864 samples (1 epoch), so generalization is limited
- Falls back to a heuristic strategy when the VLM output is unparseable or illegal
- Requires ~4GB VRAM with 4-bit quantization
- Action distribution in training data is skewed (80% draw, 20% claim, 0% tickets)
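The parse-and-fall-back behaviour mentioned above can be sketched like this. The legality check and the heuristic are illustrative placeholders, not tCHuPy's actual code.

```python
def choose_action(vlm_output, legal_actions, heuristic_action):
    """Use the VLM's action if it is well-formed and legal; otherwise fall
    back to a heuristic move (sketch; names are illustrative)."""
    action = vlm_output.strip()
    if action in legal_actions:
        return action
    return heuristic_action

legal = {"draw_card(slot=-1)", "draw_card(slot=0)", "draw_tickets()"}
ok = choose_action("draw_card(slot=-1)", legal, "draw_tickets()")       # VLM action used
bad = choose_action("claim_route(route_id=???)", legal, "draw_tickets()")  # falls back
```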
## About tCHu
tCHu is the Swiss version of Ticket to Ride: a two-player railway board game set on a map of Switzerland with 51 stations, 88 routes, and 46 destination tickets. See the tCHuPy repository for the full game implementation.