# EgoNormia-Cosmos-Reason2-2B-v6b-shortcot

Multi-task SFT fine-tune of nvidia/Cosmos-Reason2-2B on the EgoNormia social-norm benchmark. This v6b run extends the MCQ setup with generation tasks: 3 MCQ tasks plus action, justification, and sensibility generation, all trained with short CoT traces.

## Training

| Parameter | Value |
|---|---|
| Base model | nvidia/Cosmos-Reason2-2B (Qwen3-VL-2B) |
| Tasks | 6 tasks = 3 MCQ + action_gen + justification_gen + sensibility_gen |
| Train samples | 9,780 |
| Training file | data/egonormia_llava_v6b_train.json |
| CoT style | short CoT in `<think>` blocks |
| CoT length | mean 30.9 words, median 30 |
| Epochs | 3 |
| Global batch size | 64 (8 replicas × 8 per replica) |
| Learning rate | 1e-5, cosine decay, 3% warmup |
| Context length | 8192 |
| Video input | 8 frames |
| Hardware | 8× A100-SXM4-80GB |
| Run dir | outputs/egonormia_sft_v6b/20260302131747/ |
| Uploaded checkpoint | step_160 (of 456 total steps) |
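The 8-frame video input can be produced by simple uniform sampling. A minimal sketch follows; the card only states that 8 frames are used, not how they are chosen, so the function name and midpoint-of-segment strategy here are assumptions:

```python
def uniform_frame_indices(total_frames: int, num_frames: int = 8) -> list:
    """Pick `num_frames` indices spread evenly across a video.

    Hypothetical helper: uniform sampling is a common default, but the
    exact sampler used for this model's training is not documented here.
    """
    if total_frames <= num_frames:
        return list(range(total_frames))
    step = total_frames / num_frames
    # Take the midpoint of each of the num_frames equal segments.
    return [int(step * i + step / 2) for i in range(num_frames)]
```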

## MCQ Evaluation (200 verified test samples)

### No-think

| Checkpoint | Action | Justification | Both | S-IoU | Parse |
|---|---|---|---|---|---|
| v6b step_160 | 78.5% | 88.5% | 70.5% | 0.6450 | 100.0% |

### Think mode

| Checkpoint | Action | Justification | Both | S-IoU | Parse |
|---|---|---|---|---|---|
| v6b step_160 + think | 81.5% | 95.0% | 77.5% | 0.6292 | 100.0% |
| v6b step_320 + think | 80.5% | 96.0% | 77.5% | 0.6254 | 100.0% |
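The Parse and Both columns can be computed along these lines. The sketch assumes an output format of an optional `<think>…</think>` block followed by labeled answer lines; the card reports the metrics but not the exact answer schema, so the `Action:`/`Justification:` regexes are assumptions:

```python
import re

def parse_mcq_answer(output):
    """Extract action/justification choices from a model completion.

    Hypothetical format: an optional <think>...</think> block followed by
    lines like 'Action: C' and 'Justification: B'.
    """
    body = re.sub(r"<think>.*?</think>", "", output, flags=re.DOTALL)
    action = re.search(r"Action:\s*([A-E])", body)
    just = re.search(r"Justification:\s*([A-E])", body)
    if not (action and just):
        return None  # counts against the Parse rate
    return {"action": action.group(1), "justification": just.group(1)}

def both_accuracy(preds, golds):
    """'Both' accuracy: action AND justification must match the gold answer."""
    hits = sum(p is not None and p == g for p, g in zip(preds, golds))
    return hits / len(golds)
```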

## Open-ended generation evaluation

The best checkpoint, step_160, was also evaluated on 50 open-ended test samples with greedy decoding.

| Comparison | v3 avg | v6b avg | v3 wins | v6b wins | Ties |
|---|---|---|---|---|---|
| v3 vs v6b | 2.99 | 3.38 | 18 (38%) | 29 (60%) | 1 |
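The averages and win/tie tallies in this table can be derived from paired per-sample judge scores. A sketch, assuming a numeric judge scale (the judging protocol behind the v3-vs-v6b comparison is not described in this card):

```python
def compare_runs(scores_a, scores_b):
    """Tally average score plus head-to-head wins/ties for two runs.

    Assumes paired per-sample judge scores (e.g. 1-5 from an LLM judge);
    the actual scoring setup for this evaluation is not documented here.
    """
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    ties = len(scores_a) - wins_a - wins_b
    return {
        "avg_a": sum(scores_a) / len(scores_a),
        "avg_b": sum(scores_b) / len(scores_b),
        "wins_a": wins_a, "wins_b": wins_b, "ties": ties,
    }
```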

## Notes

- The repo name keeps the historical `shortcot` suffix, but this model is actually the 6-task v6b generation variant.
- In no-think mode, v6b does not clear the 77% MCQ gate, but think mode does: both step_160 and step_320 reach 77.5% "both" accuracy.
- Compared with v6, adding the sixth generation task stabilizes parsing instead of hurting it: the parse rate stays at 100% on nearly all checkpoints.
- v6b is mainly interesting when you care about open-ended generation quality or think-mode MCQ performance.

## Usage

```python
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

model = Qwen3VLForConditionalGeneration.from_pretrained(
    "robertzty/EgoNormia-Cosmos-Reason2-2B-v6b-shortcot",
    torch_dtype="bfloat16",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("robertzty/EgoNormia-Cosmos-Reason2-2B-v6b-shortcot")
```
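Inference on a clip then follows the usual Qwen-VL chat flow. A sketch of the message payload is below; the content-key layout follows the common Qwen-VL convention, and the frame paths and question text are placeholders, since the card does not show a full inference example:

```python
# Hypothetical message layout for an EgoNormia query; pass the result to
# processor.apply_chat_template(...) and then model.generate().
def build_messages(frame_paths, question):
    return [
        {
            "role": "user",
            "content": [
                {"type": "video", "video": frame_paths},  # e.g. 8 sampled frames
                {"type": "text", "text": question},
            ],
        }
    ]
```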