zipaltrivedi committed on
Commit 2c65a8e · verified · 1 Parent(s): 12c11eb

Upload README.md with huggingface_hub

Files changed (1): README.md +3 -2
README.md CHANGED
@@ -29,7 +29,8 @@ Designed for coding agents and experienced .NET developers who need compilable,
 |---|---|
 | **Parameters** | 14.7B |
 | **Base Model** | Qwen2.5-Coder-14B-Instruct |
-| **Context Length** | 32,768 tokens (trained on 2,048) |
+| **Max Context** | 32,768 tokens (base model) |
+| **Trained Sequence Length** | 2,048 tokens |
 | **Training Method** | QLoRA SFT + Iterative DPO |
 | **Training Data** | 107K C# records |
 | **License** | Apache 2.0 |
@@ -208,7 +209,7 @@ The model handles both code generation and interactive debugging — it can diag
 
 **Inference parameters**: temperature=0.2, top_p=0.9, max_new_tokens=2048
 
-The model was fine-tuned on sequences up to 2048 tokens (prompt + response). It will stop generating when done (EOS token), so setting a higher limit won't cause unnecessary output. For tasks requiring longer output, the base model's 32K context still applies, but quality may vary beyond the training distribution.
+The base model supports up to 32,768 tokens, so you can use the full 32K context window. The fine-tuning was done on sequences up to 2,048 tokens; the model performs best within this range but still works beyond it thanks to the base model's capabilities. The model will stop generating when done (EOS token), so setting a higher limit won't cause unnecessary output.
 
 ## Usage
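The context arithmetic that this change documents can be sketched as follows. This is an illustrative snippet, not code from the model card; the helper names `prompt_budget` and `in_training_distribution` are invented for this example, and the constants come from the numbers stated in the diff (32,768-token base context, 2,048-token trained sequence length, max_new_tokens=2048).

```python
# Numbers stated in the model card diff above.
MAX_CONTEXT = 32_768      # base model context window (tokens)
MAX_NEW_TOKENS = 2_048    # max_new_tokens from the recommended inference parameters
TRAINED_SEQ_LEN = 2_048   # fine-tuning sequence length (prompt + response)

def prompt_budget(max_context: int = MAX_CONTEXT,
                  max_new_tokens: int = MAX_NEW_TOKENS) -> int:
    """Tokens left for the prompt once generation head-room is reserved."""
    return max_context - max_new_tokens

def in_training_distribution(prompt_tokens: int, response_tokens: int) -> bool:
    """True when prompt + response stays within the trained sequence length,
    i.e. the range where the card says the model performs best."""
    return prompt_tokens + response_tokens <= TRAINED_SEQ_LEN

print(prompt_budget())                      # 30720 tokens available for the prompt
print(in_training_distribution(1500, 400))  # True: inside the 2,048-token regime
```

Prompts that exceed the trained sequence length still fit in the 32K window, but, as the updated paragraph notes, output quality there relies on the base model's capabilities rather than the fine-tuning distribution.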