PhilBerta
The paper Exploring Large Language Models for Classical Philology is the first effort to systematically provide state-of-the-art language models for Classical Philology. PhilBerta is a RoBERTa-base-sized, multilingual, encoder-only variant.
This model was trained using data from the Open Greek & Latin Project, the CLARIN corpus Greek Medieval Texts, the Patrologia Graeca, the Corpus Corporum, and Project Gutenberg.
Further information can be found in our paper or in our GitHub repository.
Usage
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained('bowphs/PhilBerta')
model = AutoModelForMaskedLM.from_pretrained('bowphs/PhilBerta')
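Since PhilBerta is loaded as a masked language model, a quick way to try it is the Transformers `fill-mask` pipeline. The following is a minimal sketch rather than an official recipe: the helper names `build_masked_text`, `top_predictions`, and `demo`, the `[MASK]` placeholder, and the Latin example sentence are illustrative assumptions. The mask token is read from the tokenizer instead of being hard-coded.

```python
def build_masked_text(template: str, mask_token: str, placeholder: str = "[MASK]") -> str:
    """Swap a neutral placeholder for the model's own mask token."""
    if template.count(placeholder) != 1:
        raise ValueError("template must contain exactly one placeholder")
    return template.replace(placeholder, mask_token)


def top_predictions(fill_mask, text: str, k: int = 5):
    """Return (token, score) pairs from any fill-mask style callable."""
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), float(r["score"])) for r in results]


def demo() -> None:
    """Download the model and print its top predictions (needs network access)."""
    # Imported here so the helpers above stay usable without transformers installed.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="bowphs/PhilBerta")
    # Illustrative input only; any sentence with a single placeholder works.
    text = build_masked_text(
        "Gallia est omnis divisa in partes [MASK].",
        pipe.tokenizer.mask_token,
    )
    for token, score in top_predictions(pipe, text):
        print(f"{token}\t{score:.3f}")
```

Calling `demo()` downloads the weights on first use and prints one candidate token per line with its probability.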
Please check out the awesome Hugging Face tutorials on how to fine-tune our models.
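One step that commonly comes up when fine-tuning for PoS tagging is aligning word-level labels with subword tokens. The sketch below follows a widespread convention (label the first subword of each word, ignore the rest); it is not necessarily the procedure used in the paper. The function name `align_labels` is hypothetical, and `word_ids` is assumed to come from a fast tokenizer's `BatchEncoding.word_ids()` after tokenizing with `is_split_into_words=True`.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # value skipped by PyTorch's cross-entropy loss by default


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Map word-level labels onto subword tokens.

    word_ids has one entry per token: the index of the word the token
    belongs to, or None for special tokens.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens carry no label.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a word receives the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later subwords of the same word are ignored by the loss.
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned
```

The resulting list can be stored as the `labels` field of a tokenized example before it is passed to a token-classification model.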
Evaluation Results
When fine-tuned on data from Universal Dependencies 2.10, PhilBerta achieves the following results on the Ancient Greek Perseus dataset:
| Task | XPoS | UPoS | UAS | LAS |
|---|---|---|---|---|
| Score | 95.60 | 90.41 | 86.99 | 82.69 |
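For readers unfamiliar with the two parsing metrics in the table above: UAS (unlabeled attachment score) is the percentage of tokens whose predicted head is correct, and LAS (labeled attachment score) additionally requires the correct dependency relation. The sketch below illustrates these standard definitions on toy data; it is not the evaluation script used for the reported numbers, and the name `attachment_scores` is hypothetical.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head index, dependency relation) for one token


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over aligned token sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS counts tokens with the correct head, regardless of relation.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS counts tokens where both head and relation match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * uas_hits / n, 100.0 * las_hits / n


# Toy example: four tokens, one wrong relation and one wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl"), (2, "amod")]
print(attachment_scores(gold, pred))  # (75.0, 50.0)
```

By construction LAS can never exceed UAS, which matches the ordering of the two values in the table.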
When fine-tuned on PoS data from EvaLatin 2022, it achieves the following results:
| Task | Classical | Cross-genre | Cross-time |
|---|---|---|---|
| Score | 98.23 | 96.59 | 93.25 |
Contact
If you have any questions or problems, feel free to reach out.
Citation
@incollection{riemenschneiderfrank:2023,
address = "Toronto, Canada",
author = "Riemenschneider, Frederick and Frank, Anette",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23)",
note = "to appear",
pubType = "incollection",
publisher = "Association for Computational Linguistics",
title = "Exploring Large Language Models for Classical Philology",
url = "https://arxiv.org/abs/2305.13698",
year = "2023",
key = "riemenschneiderfrank:2023"
}