---
base_model: CohereForAI/aya-expanse-8b
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:CohereForAI/aya-expanse-8b
- dpo
- lora
- transformers
- trl
---
# Gaokerena-R

This is Gaokerena-R, a model trained with a limited-data approach to enhance the Persian medical reasoning capabilities of the [aya-expanse-8b](https://huggingface.co/CohereForAI/aya-expanse-8b) model. Despite using less data, Gaokerena-R outperforms our previous model, [Gaokerena-V](https://huggingface.co/gaokerena/gaokerena-v1.0), which was trained on a much larger dataset. This demonstrates the effectiveness of our reasoning-focused training strategy under data-constrained conditions.

- **Developed by:** [Mehrdad Ghassabi](mailto:m.ghassabi@eng.ui.ac.ir), [Sadra Hakim](mailto:uwindsor.ca), [Hamidreza Baradaran Kashani](mailto:hrb.kashani@eng.ui.ac.ir), [Pedram Rostami](mailto:pedram.rostami@ut.ac.ir), [Zahra Kazemi](mailto:zhrakazemi@mehr.ui.ac.ir)
- **Model type:** Medical Language Model
- **Funded by:** All researchers worked voluntarily
- **Language:** Persian
- **License:** [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) (non-commercial use only)
- **Finetuned from model:** [Aya Expanse 8B](https://huggingface.co/CohereForAI/aya-expanse-8b)

## Model Sources

- [GitHub Repository](https://github.com/Mehrdadghassabi/Gaokerena-R)
- [Paper](https://arxiv.org/pdf/2510.20059)

## Risks and Limitations

While Gaokerena-R aims to provide accurate information, it is not a substitute for professional medical advice. The model may have limitations in:

- Handling medical emergencies.
- Addressing highly specialized or rare medical conditions.
- Offering region-specific guidance, as the training data does not include localized Persian medical practices.

## How to Get Started with the Model

Since the model is built upon Aya, you can use it in either a single-modal or a multimodal configuration.
### Single-modal inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft.peft_model import PeftModel

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16

model = AutoModelForCausalLM.from_pretrained(
    "CohereForAI/aya-expanse-8b",
    torch_dtype=dtype,
    device_map=device,
)
tokenizer = AutoTokenizer.from_pretrained("CohereForAI/aya-expanse-8b")

# Load the LoRA adapter and merge it into the base model
model = PeftModel.from_pretrained(model=model, model_id="gaokerena/gaokerena-r1.0")
model = model.merge_and_unload()

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe_output = pipe(
    # "How can stress cause mouth ulcers?"
    [{"role": "user", "content": "چگونه استرس می‌تواند باعث ایجاد آفت دهان شود؟"}],
    max_new_tokens=1024,
    eos_token_id=[tokenizer.eos_token_id],
    do_sample=False,
)
output = pipe_output[0]["generated_text"][-1]["content"]
print(output)
```

### Multimodal inference

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft.peft_model import PeftModel

model_id = "CohereForAI/aya-vision-8b"
processor = AutoProcessor.from_pretrained(model_id)

model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# Load the LoRA adapter and merge it into the vision-language base model
model = PeftModel.from_pretrained(model=model, model_id="gaokerena/gaokerena-v1.0")
model = model.merge_and_unload()

messages = [
    {"role": "user", "content": [
        {"type": "image", "url": "./chest-pic.jpeg"},
        # "Describe this image"
        {"type": "text", "text": "در مورد این تصویر توضیح بده"},
    ]},
]

inputs = processor.apply_chat_template(
    messages,
    padding=True,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

gen_tokens = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.3,
)
print(processor.tokenizer.decode(gen_tokens[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

## Training Details

Gaokerena-R was post-trained with DPO on only 11,000 preferred/rejected pairs, which were synthesized by another, larger
AI model.

## Environmental Impact

- **Hardware Type:** NVIDIA H100 PCIe 80 GB GPU
- **Hours used:** 1
- **Carbon Emitted:** 0.0301 kg CO₂ eq.

## BibTeX

If you found our model useful, feel free to cite us!

```
@misc{Gaokerena-r1.0,
  title={Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training},
  author={Ghassabi, Mehrdad and Hakim, Sadra and Baradaran Kashani, Hamidreza and Rostami, Pedram and Kazemi, Zahra},
  year={2025},
  eprint={2510.20059},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
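As background, the DPO objective used in this kind of preference post-training can be illustrated with a minimal, self-contained sketch. The function name, the β value, and the example log-probabilities below are illustrative assumptions, not the actual training code (which used the TRL library on the preference pairs described above):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)))."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable -log sigmoid(x) = log(1 + exp(-x))
    return math.log1p(math.exp(-logits)) if logits > -30 else -logits

# Before training the policy equals the reference, so the loss starts at log(2)
print(dpo_loss(-2.0, -2.5, -2.0, -2.5))  # ≈ 0.6931
# A policy that favors the chosen answer more than the reference does gets a lower loss
print(dpo_loss(-1.5, -3.0, -2.0, -2.5))
```

Minimizing this loss pushes the policy to rank the preferred answer above the rejected one while the β-scaled reference term keeps it close to the base model.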
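The carbon figure above is consistent with a simple power-times-time estimate. Note that the 350 W draw (the H100 PCIe board TDP) and the grid intensity of 0.086 kg CO₂ eq./kWh below are back-calculated assumptions, not numbers reported in this card:

```python
# Hypothetical reconstruction of the reported 0.0301 kg CO2 eq. figure
gpu_power_kw = 0.350    # assumed: H100 PCIe TDP of 350 W
hours = 1               # reported training time
grid_intensity = 0.086  # assumed kg CO2 eq. per kWh (back-calculated)

energy_kwh = gpu_power_kw * hours
emissions_kg = energy_kwh * grid_intensity
print(round(emissions_kg, 4))  # 0.0301
```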