prakod/codemix-indicBART_L1_to_CM_candidates_acc4.7

text2text-generation

Generated from Trainer

prakod committed on Jun 18, 2025

Commit 2355abb · verified · 1 Parent(s): 72e6e71

Model save

Files changed (1)

README.md +11 -11

README.md

@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [ai4bharat/IndicBART](https://huggingface.co/ai4bharat/IndicBART) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.0987
-- Bleu: 13.9044
 - Gen Len: 21.0
 ## Model description
@@ -38,12 +38,12 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 24
-- eval_batch_size: 24
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 96
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 5
@@ -53,11 +53,11 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
-| 1.5602        | 1.0    | 5031  | 1.2958          | 13.8343 | 21.0    |
-| 1.3896        | 2.0    | 10062 | 1.1714          | 13.8416 | 21.0    |
-| 1.3129        | 3.0    | 15093 | 1.1251          | 13.9963 | 21.0    |
-| 1.2768        | 4.0    | 20124 | 1.1051          | 13.8623 | 21.0    |
-| 1.2599        | 4.9991 | 25150 | 1.0987          | 13.9044 | 21.0    |
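The step counts in the removed and added table rows can be cross-checked against the two `total_train_batch_size` values in this diff. A rough sketch of that arithmetic follows; the dictionary keys and the helper name are illustrative, and the result is only approximate because the last step of an epoch may be partial or dropped:

```python
# (optimizer steps at the end of epoch 1, total_train_batch_size),
# copied from the removed ("before") and added ("after") lines of this diff.
runs = {
    "before": (5031, 96),
    "after": (7546, 64),
}


def examples_per_epoch(steps: int, total_batch: int) -> int:
    """Approximate number of training examples seen in one epoch."""
    return steps * total_batch


estimates = {name: examples_per_epoch(*cfg) for name, cfg in runs.items()}
print(estimates)  # {'before': 482976, 'after': 482944}
```

Both estimates land near 483k examples, which suggests (but does not prove) that the two runs used the same training set and differ only in batch size and learning rate.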
 ### Framework versions

 This model is a fine-tuned version of [ai4bharat/IndicBART](https://huggingface.co/ai4bharat/IndicBART) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.2986
+- Bleu: 11.9231
 - Gen Len: 21.0
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-06
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 64
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 5
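The `total_train_batch_size` values in this diff follow from the per-device batch size and `gradient_accumulation_steps`. A minimal sketch of that arithmetic, assuming a single training device (the card does not state the device count; `num_devices` and the function name are illustrative, not part of any library API):

```python
def effective_batch_size(
    per_device_batch: int, accumulation_steps: int, num_devices: int = 1
) -> int:
    """Examples consumed per optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


# Values taken from the diff; num_devices=1 is an assumption.
old_total = effective_batch_size(24, 4)
new_total = effective_batch_size(16, 4)
print(old_total, new_total)  # 96 64
```

With one device, the results match the reported totals of 96 (before) and 64 (after).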
 | Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
+| 3.7106        | 1.0    | 7546  | 3.3985          | 13.2137 | 21.0    |
+| 3.2584        | 2.0    | 15092 | 2.8989          | 12.9778 | 20.992  |
+| 2.9447        | 3.0    | 22638 | 2.5509          | 14.0866 | 21.0    |
+| 2.7786        | 4.0    | 30184 | 2.3583          | 12.4674 | 21.0    |
+| 2.7111        | 4.9994 | 37725 | 2.2986          | 11.9231 | 21.0    |
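In the added rows, validation loss falls every epoch while BLEU peaks earlier and then drops, so the checkpoint with the lowest loss is not the one with the highest BLEU. A small sketch that makes this explicit, using only the values from the table above (the variable names are illustrative):

```python
# (epoch, validation_loss, bleu) rows copied from the added table lines.
rows = [
    (1.0, 3.3985, 13.2137),
    (2.0, 2.8989, 12.9778),
    (3.0, 2.5509, 14.0866),
    (4.0, 2.3583, 12.4674),
    (4.9994, 2.2986, 11.9231),
]

best_by_bleu = max(rows, key=lambda r: r[2])  # highest BLEU
best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
print(best_by_bleu[0], best_by_loss[0])  # 3.0 4.9994
```

The card reports the final-epoch numbers (loss 2.2986, BLEU 11.9231), which correspond to the lowest-loss row rather than the highest-BLEU one.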
 ### Framework versions