# ModernBERT-base Fine-tuned on CoNLL-2003 for NER
This model is a fine-tuned version of answerdotai/ModernBERT-base on the CoNLL-2003 dataset for Named Entity Recognition (NER).
ModernBERT's architecture allows for efficient processing of long sequences and features optimized attention mechanisms, making it an excellent backbone for dense token-classification tasks like NER.
## Model Description
- Developed by: Rúben Garrido
- Model type: ModernBERT (Encoder-only Transformer)
- Task: Named Entity Recognition (NER)
- Labels: O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
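The labels follow the BIO scheme: `B-` marks the first token of an entity, `I-` a continuation token, and `O` a non-entity token. A small helper for mapping between indices and tag strings (the index order shown here is an assumption for illustration; the authoritative mapping is `id2label` in the model's `config.json`):

```python
# BIO label set of this model as index <-> tag mappings.
# NOTE: the ordering below is assumed; verify against config.json.
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
id2label = {i: tag for i, tag in enumerate(labels)}
label2id = {tag: i for i, tag in id2label.items()}
```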
## Intended Uses & Limitations
This model is intended for identifying entities (Persons, Organizations, Locations, and Miscellaneous) in English text. Because it was fine-tuned on Reuters newswire, performance may degrade on other domains (e.g. social media or biomedical text) and on non-English input.
## How to use
```python
from transformers import pipeline

ner_pipeline = pipeline(
    "ner",
    model="RGarrido03/modernbert-conll2003-ner-base",
    aggregation_strategy="simple",
)

text = "The CERN headquarters are located in Geneva, Switzerland."
results = ner_pipeline(text)

for entity in results:
    print(f"Entity: {entity['word']}, Label: {entity['entity_group']}, Score: {entity['score']:.4f}")
```
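With `aggregation_strategy="simple"`, the pipeline merges consecutive per-token BIO predictions into entity spans. A rough pure-Python sketch of that grouping logic (illustrative only — the real pipeline also merges subword pieces, scores, and character offsets):

```python
def group_entities(tokens, tags):
    """Merge BIO-tagged tokens into entity spans, roughly mirroring
    what aggregation_strategy="simple" does (illustrative sketch)."""
    entities, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag == "O":
            if current:
                entities.append(current)
                current = None
        elif tag.startswith("B-") or current is None or tag[2:] != current["entity_group"]:
            # A B- tag, or a stray/mismatched I- tag, starts a new span.
            if current:
                entities.append(current)
            current = {"entity_group": tag[2:], "word": tok}
        else:
            # Same-type I- tag continues the current span.
            current["word"] += " " + tok
    if current:
        entities.append(current)
    return entities

print(group_entities(
    ["The", "CERN", "headquarters", "are", "in", "Geneva", ",", "Switzerland", "."],
    ["O", "B-ORG", "O", "O", "O", "B-LOC", "O", "B-LOC", "O"],
))
# -> [{'entity_group': 'ORG', 'word': 'CERN'},
#     {'entity_group': 'LOC', 'word': 'Geneva'},
#     {'entity_group': 'LOC', 'word': 'Switzerland'}]
```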
## Training Data
The model was trained on the CoNLL-2003 dataset, which consists of Reuters news stories from 1996 and 1997.
- Train samples: 14,041
- Validation samples: 3,250
- Test samples: 3,453
## Training Procedure
### Training Hyperparameters
The following hyperparameters were used during training:
- Learning rate: 5e-5 (with AdamW optimizer)
- Batch size: 8
- Epochs: 3.0
- Weight decay: 0.01
- Warmup ratio: 0.1
- Max sequence length: 256
- Label all tokens: True (subword pieces inherit parent labels)
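The "label all tokens" setting concerns how word-level NER labels are aligned with subword tokens: special tokens get the ignore index `-100`, and continuation subwords either inherit their parent word's label (as configured here) or are also masked out. A minimal sketch of that alignment, assuming the `word_ids()` output of a fast tokenizer:

```python
def align_labels(word_labels, word_ids, label_all_tokens=True):
    """Map word-level labels onto subword tokens.

    word_ids is the per-token word index from a fast tokenizer's
    BatchEncoding.word_ids(); None marks special tokens ([CLS]/[SEP]),
    which receive -100 so the loss ignores them. With
    label_all_tokens=True, continuation subwords inherit their parent
    word's label; otherwise they are masked with -100 as well.
    """
    aligned, prev = [], None
    for wid in word_ids:
        if wid is None:
            aligned.append(-100)          # special token
        elif wid != prev:
            aligned.append(word_labels[wid])  # first subword of a word
        else:
            aligned.append(word_labels[wid] if label_all_tokens else -100)
        prev = wid
    return aligned
```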
### Training Results (Evaluation on Test Split)
| Metric | Value |
|---|---|
| Accuracy | 0.9711 |
| F1 Score | 0.8851 |
| Precision | 0.8721 |
| Recall | 0.8985 |
| Loss | 0.1873 |
### Evaluation on Validation Split
| Metric | Value |
|---|---|
| Accuracy | 0.9871 |
| F1 Score | 0.9416 |
| Precision | 0.9357 |
| Recall | 0.9475 |
| Loss | 0.0625 |
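The precision, recall, and F1 figures above are entity-level in the CoNLL style (a prediction counts only if both the span and the type match exactly), typically computed with the seqeval library. A minimal self-contained sketch of that metric:

```python
def extract_spans(tags):
    """Return the set of (type, start, end) entity spans in a BIO sequence."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel flushes the last span
        if tag.startswith("I-") and etype == tag[2:]:
            continue                         # same-type continuation
        if etype is not None:
            spans.add((etype, start, i))     # close the open span
        if tag.startswith(("B-", "I-")):     # stray I- treated as a new span
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans

def f1_score(gold_seqs, pred_seqs):
    """Micro-averaged entity-level F1 over a list of tag sequences."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_spans(gold), extract_spans(pred)
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```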
## Environmental Impact
- Runtime: ~11.5 minutes (694 seconds)
- Hardware: MacBook Pro, M5 Pro 24GB (Training speed: ~62 samples/sec)
## Citation
If you use this model, please cite the original CoNLL-2003 paper and the ModernBERT work.
```bibtex
@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,
    title = "Introduction to the {CoNLL}-2003 Shared Task: Language-Independent Named Entity Recognition",
    author = "Tjong Kim Sang, Erik F. and De Meulder, Fien",
    booktitle = "Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003",
    year = "2003",
    url = "https://aclanthology.org/W03-0419",
    pages = "142--147",
}
```