---
base_model: CohereForAI/aya-expanse-8b
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:CohereForAI/aya-expanse-8b
- dpo
- lora
- transformers
- trl
---
# Gaokerena-R

This is Gaokerena-R, a model trained with a limited-data approach to enhance the Persian medical reasoning capabilities of the [aya-expanse-8b](https://huggingface.co/CohereForAI/aya-expanse-8b) model. Despite using less data, Gaokerena-R outperforms our previous model, [Gaokerena-V](https://huggingface.co/gaokerena/gaokerena-v1.0), which was trained on a much larger dataset. This demonstrates the effectiveness of our reasoning-focused training strategy under data-constrained conditions.

- **Developed by:** [Mehrdad Ghassabi](mailto:m.ghassabi@eng.ui.ac.ir), [Sadra Hakim](mailto:uwindsor.ca), [Hamidreza Baradaran Kashani](mailto:hrb.kashani@eng.ui.ac.ir), [Pedram Rostami](mailto:pedram.rostami@ut.ac.ir), [Zahra Kazemi](mailto:zhrakazemi@mehr.ui.ac.ir)
- **Model type:** Medical Language Model
- **Funded by:** All researchers worked voluntarily
- **Language:** Persian
- **License:** [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) (non-commercial use only)
- **Finetuned from model:** [Aya Expanse 8B](https://huggingface.co/CohereForAI/aya-expanse-8b)

## Model Sources

- [GitHub Repository](https://github.com/Mehrdadghassabi/Gaokerena-R)
- [Paper](https://arxiv.org/pdf/2510.20059)

## Risks and Limitations

While Gaokerena-R aims to provide accurate information, it is not a substitute for professional medical advice. The model may have limitations in:

- Handling medical emergencies.
- Addressing highly specialized or rare medical conditions.
- Offering region-specific guidance, as the training data does not include localized Persian medical practices.

## How to Get Started with the Model

Since the model is built upon Aya, you can use it in either a single-modal or a multimodal configuration.
### Single-modal inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft.peft_model import PeftModel

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16

model = AutoModelForCausalLM.from_pretrained(
    "CohereForAI/aya-expanse-8b",
    torch_dtype=dtype,
    device_map=device,
)
tokenizer = AutoTokenizer.from_pretrained("CohereForAI/aya-expanse-8b")

# Load the LoRA adapter and merge it into the base model
model = PeftModel.from_pretrained(model=model, model_id="gaokerena/gaokerena-r1.0")
model = model.merge_and_unload()

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe_output = pipe(
    # "How can stress cause mouth ulcers?"
    [{"role": "user", "content": "چگونه استرس می‌تواند باعث ایجاد آفت دهان شود؟"}],
    max_new_tokens=1024,
    eos_token_id=[tokenizer.eos_token_id],
    do_sample=False,
)
output = pipe_output[0]["generated_text"][-1]["content"]
print(output)
```

### Multimodal inference

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft.peft_model import PeftModel

model_id = "CohereForAI/aya-vision-8b"
processor = AutoProcessor.from_pretrained(model_id)

model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# Load the LoRA adapter and merge it into the vision-language base model
model = PeftModel.from_pretrained(model=model, model_id="gaokerena/gaokerena-v1.0")
model = model.merge_and_unload()

messages = [
    {"role": "user", "content": [
        {"type": "image", "url": "./chest-pic.jpeg"},
        # "Describe this image"
        {"type": "text", "text": "در مورد این تصویر توضیح بده"},
    ]},
]

inputs = processor.apply_chat_template(
    messages,
    padding=True,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

gen_tokens = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.3,
)
print(processor.tokenizer.decode(gen_tokens[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

## Training Details

Gaokerena-R was post-trained with DPO on only 11,000 preferred/rejected pairs, which were synthesized by another, larger
AI model.

## Environmental Impact

- **Hardware Type:** NVIDIA H100 PCIe 80 GB GPU
- **Hours used:** 1
- **Carbon Emitted:** 0.0301 kg CO₂ eq.

## BibTeX

If you found our model useful, feel free to cite us!

```
@misc{Gaokerena-r1.0,
  title={Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training},
  author={Ghassabi, Mehrdad and Hakim, Sadra and Baradaran Kashani, Hamidreza and Rostami, Pedram and Kazemi, Zahra},
  year={2025},
  eprint={2510.20059},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
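As background, the DPO objective used in this kind of preference post-training can be illustrated with a minimal, self-contained sketch. The function name, the β value, and the example log-probabilities below are illustrative assumptions, not the actual training code (which used the TRL library on the preference pairs described above):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)))."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable -log sigmoid(x) = log(1 + exp(-x))
    return math.log1p(math.exp(-logits)) if logits > -30 else -logits

# Before training the policy equals the reference, so the loss starts at log(2)
print(dpo_loss(-2.0, -2.5, -2.0, -2.5))  # ≈ 0.6931
# A policy that favors the chosen answer more than the reference does gets a lower loss
print(dpo_loss(-1.5, -3.0, -2.0, -2.5))
```

Minimizing this loss pushes the policy to rank the preferred answer above the rejected one while the β-scaled reference term keeps it close to the base model.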
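The carbon figure above is consistent with a simple power-times-time estimate. Note that the 350 W draw (the H100 PCIe board TDP) and the grid intensity of 0.086 kg CO₂ eq./kWh below are back-calculated assumptions, not numbers reported in this card:

```python
# Hypothetical reconstruction of the reported 0.0301 kg CO2 eq. figure
gpu_power_kw = 0.350    # assumed: H100 PCIe TDP of 350 W
hours = 1               # reported training time
grid_intensity = 0.086  # assumed kg CO2 eq. per kWh (back-calculated)

energy_kwh = gpu_power_kw * hours
emissions_kg = energy_kwh * grid_intensity
print(round(emissions_kg, 4))  # 0.0301
```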