Adapter-Tuned NLLB Model for Bidirectional Odia ↔ German Translation

This is an adapter-tuned version of facebook/nllb-200-distilled-600M specialized for bidirectional translation between Odia (ory_Orya) and German (deu_Latn).

This model was developed as part of a thesis project / research paper publication focused on effective fine-tuning strategies for low-resource language pairs within the journalistic domain. It was fine-tuned on a carefully constructed hybrid dataset, combining a larger set of high-quality, human-validated translations with a smaller set of machine-translated sentences to expand lexical, contextual and grammatical coverage.

Live Demo:

Model Details

  • Base Model: facebook/nllb-200-distilled-600M
  • Languages: Odia (or), German (de)
  • Fine-tuning Domain: Journalistic text sourced from contemporary Odia newspapers (Dharitri & Sambad).
  • Developed by: Abhinandan Samal
  • Thesis: Enhancing Contextual Understanding in Low-Resource Languages Using Multilingual Transformers
  • University: IU International University of Applied Sciences
  • Date: Aug 26, 2025

Fine-tuning Details

Training and Evaluation Data

The model was fine-tuned on a meticulously prepared parallel corpus. Initially, 3,676 unique parallel line pairs were collected. Each "line" in the corpus was designed to provide contextual information for the model, typically containing 2-3 sentences, although some lines consist of a single sentence.

The data originates from two specific Odia newspapers and encompasses a diverse range of news domains, including National, International, Lifestyle, Sports, Trade, Environmental, Science and Technology, Leisure, Commerce, Metro, State, and Editorial.

The curation process involved distinct quality control steps for each language:

  • Odia Corpus Validation: All 3,676 lines on the Odia side of the parallel corpus underwent thorough evaluation and validation by a native Odia speaker (the author), ensuring high linguistic fidelity.
  • German Corpus Curation:
    • A high-quality subset of 2,000 German lines (corresponding to 2,000 of the original parallel pairs) was meticulously human-evaluated and corrected by a native German speaker. This segment forms a core, highly accurate dataset.
    • The remaining 1,676 German lines (corresponding to the other original parallel pairs) were generated using Google Translate. These lines were utilized to broaden the model's exposure to a wider range of vocabulary and grammatical structures.

Following this rigorous curation, the corpus was transformed into a final bidirectional training dataset, resulting in 7,352 distinct training instances. This was achieved by creating two training examples from each parallel pair, utilizing task-specific prefixes (translate Odia to German: and translate German to Odia:). The overall size of this dataset was carefully managed and selected as a practical upper limit dictated by the memory and computational constraints of the available single-GPU training environment (NVIDIA A100 on Google Colab Pro).
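As a minimal illustration of this construction (the field names odia and german are assumptions made for this sketch; see the linked dataset for the actual schema), each parallel pair is expanded into two prefixed instances:

def make_bidirectional(pairs):
    """Expand each parallel pair into two prefixed training instances.

    pairs: iterable of dicts with 'odia' and 'german' keys (assumed field names).
    """
    examples = []
    for pair in pairs:
        # Odia -> German direction
        examples.append({
            "input": "translate Odia to German: " + pair["odia"],
            "target": pair["german"],
        })
        # German -> Odia direction
        examples.append({
            "input": "translate German to Odia: " + pair["german"],
            "target": pair["odia"],
        })
    return examples

# 3,676 parallel pairs -> 2 x 3,676 = 7,352 bidirectional training instances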

Here, you can check the dataset.

Training Procedure

The model was fine-tuned using PyTorch and the Hugging Face Seq2SeqTrainer; a configuration sketch reflecting these settings follows the hyperparameter list below.

Key Hyperparameters:

  • Learning Rate: 1e-3
  • Number of Epochs: 5
  • Effective Batch Size: 16 (per_device_train_batch_size=8 with gradient_accumulation_steps=2)
  • Optimizer: adafactor
  • Precision: Mixed Precision (fp16=True)
  • Memory Optimization: gradient_checkpointing=True
  • Number of beams: 5
  • Length penalty: 1.0
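The following configuration sketch mirrors the hyperparameters above, using the Hugging Face Seq2SeqTrainer setup together with a PEFT LoRA adapter. It is illustrative rather than the exact training script: the LoRA rank, alpha, dropout, and target modules are assumptions (they are not listed in this card), and dataset preparation is omitted.

from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)
from peft import LoraConfig, TaskType, get_peft_model

base_id = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

# LoRA settings (r, alpha, dropout, target modules) are illustrative assumptions.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-odia-german-lora",
    learning_rate=1e-3,
    num_train_epochs=5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,   # effective batch size 16
    optim="adafactor",
    fp16=True,
    gradient_checkpointing=True,
    predict_with_generate=True,
    generation_num_beams=5,
)

# A Seq2SeqTrainer is then built with tokenized train/eval datasets
# (preparation not shown) and trained:
# trainer = Seq2SeqTrainer(model=model, args=training_args,
#                          train_dataset=train_dataset,
#                          eval_dataset=eval_dataset,
#                          tokenizer=tokenizer)
# trainer.train()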

Evaluation Results

The fine-tuned model's performance was rigorously evaluated against the original facebook/nllb-200-distilled-600M baseline and a fully fine-tuned facebook/nllb-200-distilled-600M model on a held-out test set composed partially (77%) of human-validated sentence pairs. I report scores across four standard machine translation metrics: BLEU (higher is better), chrF++ (higher is better), TER (Translation Edit Rate; lower is better), and COMET (higher is better). A sketch of how such metrics can be computed follows the results table.

| Metric | Odia → German (Baseline) | Odia → German (Fully Fine-Tuned) | Odia → German (Adapter-Based Fine-Tuned) | German → Odia (Baseline) | German → Odia (Fully Fine-Tuned) | German → Odia (Adapter-Based Fine-Tuned) |
|--------|--------------------------|----------------------------------|------------------------------------------|--------------------------|----------------------------------|------------------------------------------|
| BLEU   | 22.0355 | 27.1802 | 29.6874 | 9.3467  | 14.8624 | 17.3023 |
| chrF++ | 43.3357 | 54.3083 | 55.8151 | 38.3720 | 43.4127 | 46.4467 |
| TER    | 82.7669 | 64.5270 | 61.6111 | 97.6340 | 74.4360 | 68.7356 |
| COMET  | -0.0285 | 0.5479  | 0.5310  | 0.1876  | 0.8167  | 0.8607  |
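As referenced above, metrics of this kind can be computed with standard open-source tooling. The following is a minimal sketch using the sacrebleu and unbabel-comet packages, not necessarily the exact evaluation pipeline used in the thesis (the COMET checkpoint name is an assumption):

import sacrebleu
from comet import download_model, load_from_checkpoint

# Toy parallel lists; in practice these hold the full held-out test set.
srcs = ["ଆଜି ପାଗ ବହୁତ ଭଲ ଅଛି।"]          # source lines
hyps = ["Das Wetter ist heute sehr gut."]   # system translations
refs = ["Das Wetter ist heute sehr gut."]   # reference translations

bleu = sacrebleu.corpus_bleu(hyps, [refs]).score
chrf_pp = sacrebleu.corpus_chrf(hyps, [refs], word_order=2).score  # word_order=2 -> chrF++
ter = sacrebleu.corpus_ter(hyps, [refs]).score

# Reference-based COMET; the checkpoint name is an assumption.
comet_model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
comet = comet_model.predict(
    [{"src": s, "mt": h, "ref": r} for s, h, r in zip(srcs, hyps, refs)],
    batch_size=8,
    gpus=0,  # set gpus=1 if a GPU is available
).system_score

print(f"BLEU={bleu:.2f}  chrF++={chrf_pp:.2f}  TER={ter:.2f}  COMET={comet:.4f}")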

Interpretation of Results

The evaluation compares three systems: the original facebook/nllb-200-distilled-600M baseline, a fully fine-tuned NLLB-200 model, and the LoRA adapter–based fine-tuned model. Results are reported on a held-out test set and measured using BLEU, chrF++, TER, and COMET.

1. Odia → German (Generating the High-Resource Language): Competitive & Efficient

  • For Odia-to-German translation, the LoRA fine-tuned model demonstrates the strongest overall performance among the three systems. It substantially improves over the baseline across all metrics, with large gains in BLEU, chrF++, and TER, indicating better lexical choice, character-level accuracy, and reduced post-editing effort.
  • When compared to the fully fine-tuned model, the LoRA-based model achieves higher BLEU and chrF++ scores and a lower TER, suggesting improved surface-level accuracy and fluency. The fully fine-tuned model attains a slightly higher COMET score, indicating marginally better semantic alignment in some cases. However, the LoRA model's COMET score remains strongly positive and close to that of the fully fine-tuned model, showing that these quality gains are achieved without sacrificing semantic adequacy.
  • Overall, the results indicate that LoRA fine-tuning is not only competitive with, but in several key metrics superior to full fine-tuning for Odia→German translation.

2. German → Odia (Generating the Low-Resource Language): LoRA is the Superior Strategy

In the challenging direction of generating the morphologically rich, low-resource language (Odia), the Adapter-Based (LoRA) model emerged as the clear winner, outperforming the Fully Fine-Tuned model across all four metrics.

  • Compared to the fully fine-tuned model, the LoRA-based approach delivers consistent improvements across all reported metrics, including BLEU, chrF++, TER, and COMET. Notably, the LoRA model achieves the highest COMET score in this direction, suggesting superior semantic adequacy and meaning preservation when translating into the low-resource and morphologically rich Odia language.
  • These results demonstrate that adapter-based fine-tuning can be particularly effective for improving target-side generation quality in low-resource translation scenarios.

Summary

The results show that the LoRA-based fine-tuned model offers a highly efficient and competitive alternative to full fine-tuning, achieving equal or better translation quality than a fully fine-tuned NLLB-200 model while being significantly more parameter- and compute-efficient.

Comparison with State-of-the-Art Systems

The LoRA fine-tuned model was additionally evaluated against several contemporary state-of-the-art (SOTA) large language models, including Gemini 2.5 Pro, Claude Sonnet 4.5, and GPT-5, in both German→Odia and Odia→German translation directions. Results are reported using BLEU, chrF++, TER, and COMET on the same test set.

| Direction | Model | BLEU | chrF++ | TER | COMET |
|-----------|-------|------|--------|-----|-------|
| German → Odia | LoRA Fine-tuned NLLB | 17.504817 | 46.465875 | 68.416419 | 82.312126 |
| German → Odia | Gemini 2.5 Pro       | 16.070632 | 46.295084 | 69.725982 | 83.606876 |
| German → Odia | Claude Sonnet 4.5    | 14.871904 | 44.830348 | 73.445582 | 82.608897 |
| German → Odia | GPT-5                | 8.073554  | 37.729930 | 80.103444 | 80.266220 |
| Odia → German | LoRA Fine-tuned NLLB | 30.063955 | 56.019427 | 61.202014 | 81.950075 |
| Odia → German | Gemini 2.5 Pro       | 33.654729 | 62.205780 | 58.810573 | 86.796387 |
| Odia → German | Claude Sonnet 4.5    | 32.025087 | 61.365033 | 60.394378 | 86.328290 |
| Odia → German | GPT-5                | 29.923673 | 59.730667 | 64.086428 | 86.501465 |

1. German → Odia (Generating the Low-Resource Language):

In the German-to-Odia direction, the LoRA fine-tuned model demonstrates competitive and, in several metrics, superior performance relative to much larger proprietary SOTA models.

  • The LoRA model achieves the highest BLEU and chrF++ scores among all compared systems, indicating strong lexical accuracy and character-level fluency.
  • TER scores are also lowest for the LoRA model, reflecting reduced post-editing effort compared to Gemini, Claude, and GPT-5.
  • While Gemini 2.5 Pro attains a slightly higher COMET score, the LoRA model’s COMET remains comparable, suggesting that its gains in surface-level accuracy do not come at the expense of semantic adequacy.

These results highlight the effectiveness of LoRA fine-tuning for low-resource target-language generation, where the adapter-based model matches or exceeds the performance of significantly larger, closed-source models.

2. Odia → German (Generating the High-Resource Language):

For Odia-to-German translation, proprietary SOTA models—particularly Gemini 2.5 Pro and Claude Sonnet 4.5—achieve the highest overall scores across most metrics, reflecting their strong general-purpose multilingual capabilities.

  • The LoRA fine-tuned model performs competitively, clearly outperforming GPT-5 in BLEU and TER, and producing strong chrF++ scores.
  • Although it trails Gemini and Claude in absolute performance, the LoRA model remains within a relatively narrow margin, especially considering its substantially smaller parameter footprint and domain-specific fine-tuning.
  • In additional experiments, optimized inference slightly reduced BLEU and chrF++ compared to standard inference, suggesting a modest trade-off between efficiency and translation quality.

Summary

The results demonstrate that parameter-efficient fine-tuning with LoRA can narrow—and in some cases close—the gap between open multilingual models and large proprietary SOTA systems, while remaining transparent, reproducible, and cost-effective.

How to Use

The easiest way to use this model is with the translation pipeline from the transformers library. The model was trained to be bidirectional, and you can control the translation direction by specifying the src_lang and tgt_lang during the call.

from transformers import pipeline

# Load the translation pipeline with your fine-tuned model
model_id = "abhinandansamal/nllb-200-distilled-600M-LoRA-finetuned-odia-german-bidirectional"
translator = pipeline("translation", model=model_id, device_map="auto")

# --- Example 1: Translate Odia to German ---
odia_text = "ଆଜି ପାଗ ବହୁତ ଭଲ ଅଛି।"

german_translation = translator(
    odia_text,
    src_lang="ory_Orya",
    tgt_lang="deu_Latn"
)
print(f"Odia Input: {odia_text}")
print(f"German Output: {german_translation[0]['translation_text']}")
# Expected Output: Das Wetter ist heute sehr gut.

# --- Example 2: Translate German to Odia ---
german_text = "Wie ist deine Gesundheit?"

odia_translation = translator(
    german_text,
    src_lang="deu_Latn",
    tgt_lang="ory_Orya"
)
print(f"\nGerman Input: {german_text}")
print(f"Odia Output: {odia_translation[0]['translation_text']}")
# Expected Output: ତୁମର ସ୍ବାସ୍ଥ୍ୟ ଅବସ୍ଥା କଣ?

Note: While the model was trained with task prefixes (translate Odia to German:), using the translation pipeline with src_lang and tgt_lang arguments is the cleaner, recommended method for inference, as it abstracts this detail away.
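If you need explicit control over decoding (for example the num_beams=5 and length_penalty=1.0 settings listed under the training details), the base model and the LoRA adapter can also be loaded separately. A minimal sketch, assuming the peft library is installed:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "facebook/nllb-200-distilled-600M"
adapter_id = "abhinandansamal/nllb-200-distilled-600M-LoRA-finetuned-odia-german-bidirectional"

# Tokenizer comes from the base model; set the source language here
tokenizer = AutoTokenizer.from_pretrained(base_id, src_lang="ory_Orya")
model = PeftModel.from_pretrained(AutoModelForSeq2SeqLM.from_pretrained(base_id), adapter_id)

inputs = tokenizer("ଆଜି ପାଗ ବହୁତ ଭଲ ଅଛି।", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),  # target language token
    num_beams=5,
    length_penalty=1.0,
    max_new_tokens=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])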

Intended Use

This model is primarily intended for translating journalistic text between Odia and German. Given its training on articles from various news domains (e.g., National, International, Lifestyle, Sports, Science and Technology), it is suitable for academic research, cross-lingual information retrieval from news sources, and as a supportive tool for language learners focusing on news-related content in this specific language pair.

Limitations & Bias

  • Domain Specificity: While the training data spans several news domains, the model is not optimized for substantially different fields such as legal, medical, literary, or informal conversational text. Its performance is expected to be significantly lower on content outside the journalistic domain.
  • Data-Inherited Bias: The model inherits stylistic and topical biases from its training data sources. Despite covering multiple news domains, the primary sources are two specific Odia newspapers. Furthermore, the inclusion of Google Translate-generated German lines in a portion of the training data may introduce or reinforce specific stylistic patterns inherent to machine translation outputs.

Achievements with Current Data Constraints

Despite the constraints in computational resources (single-GPU training on an NVIDIA A100 via Google Colab Pro) and the specialized dataset size (7,352 bidirectional instances), this research has achieved significant positive outcomes, demonstrating the viability of adapting large models for low-resource pairs.

  • Decisive Performance Gains: Both fine-tuning methodologies yielded substantial improvements over the zero-shot baseline. The Adapter-Tuned (LoRA) model emerged as the top performer for the Odia → German direction, achieving the highest BLEU and chrF++ scores and the lowest TER among the compared fine-tuning approaches. For the more challenging German → Odia task, the LoRA model also delivered the best accuracy and fluency, with the highest BLEU, chrF++, and COMET scores and the lowest TER (see the evaluation tables above).
  • Demonstrated Practical Viability: The successful fine-tuning and subsequent deployment of two functional web applications prove that it is practically feasible to create high-quality, specialized translation tools for low-resource languages. The results show that even with a limited, hybrid-quality corpus, significant improvements in accuracy, fluency, and character-level fidelity can be achieved, with the parameter-efficient LoRA method proving to be a particularly effective and compelling strategy.

Areas for Future Improvement

To further enhance the model's performance, generalizability, and address existing limitations, the following factors are key considerations for future development:

  • Expanded High-Quality Data: Increasing the size and diversity of the human-validated parallel corpus, particularly from domains beyond journalism, would be crucial for improving robustness and reducing reliance on machine-translated data.
  • Refined German Corpus Curation: Exploring strategies to further reduce the dependency on machine-translated content for the German side, potentially through more extensive human validation or alternative data acquisition methods.
  • Addressing Directional Nuances: Further investigation into the specific performance characteristics of each translation direction (e.g., the BLEU score behavior in Odia → German) could lead to targeted optimizations for balanced bidirectional performance.
  • Advanced Data Augmentation: Exploring more sophisticated data augmentation techniques could effectively expand the training data's diversity without necessarily requiring more manual collection.
  • Model Architecture & Hyperparameter Optimization: Continued experimentation with different model architectures, fine-tuning strategies, and hyperparameter configurations could yield additional performance gains.
  • Bias Mitigation: Proactive strategies to identify and mitigate potential biases inherited from the training data sources could improve fairness and broader applicability.

Citation

If you use this model or the associated methodology in your research, please cite the following thesis:

@mastersthesis{SamalThesis2025,
  author = {Abhinandan Samal},
  title  = {Enhancing Contextual Understanding in Low-Resource Languages Using Multilingual Transformers},
  school = {IU International University of Applied Sciences},
  year   = {2025}
}