ModernBERT-base Fine-tuned on CoNLL-2003 for NER

This model is a fine-tuned version of answerdotai/ModernBERT-base on the CoNLL-2003 dataset for Named Entity Recognition (NER).

ModernBERT supports long input sequences (up to 8,192 tokens) and uses an efficient alternating global/local attention design, making it a strong backbone for dense token-classification tasks such as NER.

Model Description

  • Developed by: Rúben Garrido
  • Model type: ModernBERT (Encoder-only Transformer)
  • Task: Named Entity Recognition (NER)
  • Labels: O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
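
The labels follow the standard BIO tagging scheme. As a minimal sketch, the corresponding id-to-label mapping would look like the snippet below; the numeric ordering shown is an assumption based on the CoNLL-2003 convention, and the model's config.json is the authoritative source.

# Assumed id2label mapping (BIO scheme, CoNLL-2003 label order); verify against the model's config.json
id2label = {
    0: "O",
    1: "B-PER", 2: "I-PER",
    3: "B-ORG", 4: "I-ORG",
    5: "B-LOC", 6: "I-LOC",
    7: "B-MISC", 8: "I-MISC",
}
label2id = {label: idx for idx, label in id2label.items()}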

Intended Uses & Limitations

This model is intended for identifying entities (Persons, Organizations, Locations, and Miscellaneous) in English text. Because the training data consists of 1996 and 1997 Reuters newswire, performance may be lower on other domains, informal text, or more recent entity mentions.

How to use

from transformers import pipeline

# aggregation_strategy="simple" merges subword pieces into whole-entity spans
ner_pipeline = pipeline("ner", model="RGarrido03/modernbert-conll2003-ner-base", aggregation_strategy="simple")
text = "The CERN headquarters are located in Geneva, Switzerland."
results = ner_pipeline(text)

for entity in results:
    print(f"Entity: {entity['word']}, Label: {entity['entity_group']}, Score: {entity['score']:.4f}")

Training Data

The model was trained on the CoNLL-2003 dataset, which consists of Reuters news stories from 1996 and 1997.

  • Train samples: 14,041
  • Validation samples: 3,250
  • Test samples: 3,453
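
The splits can be loaded directly from the Hugging Face Hub with the datasets library. A minimal sketch follows; depending on your datasets version, the identifier may resolve through the eriktks/conll2003 repository.

from datasets import load_dataset

# CoNLL-2003 as distributed on the Hugging Face Hub
dataset = load_dataset("conll2003")
print(dataset)  # DatasetDict with train / validation / test splits

example = dataset["train"][0]
print(example["tokens"])    # list of whitespace-tokenized words
print(example["ner_tags"])  # integer NER labels in the BIO scheme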

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training (a corresponding TrainingArguments sketch appears after the list):

  • Learning rate: 5e-5 (with AdamW optimizer)
  • Batch size: 8
  • Epochs: 3.0
  • Weight decay: 0.01
  • Warmup ratio: 0.1
  • Max sequence length: 256
  • Label all tokens: True (subword pieces inherit parent labels)
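
A TrainingArguments configuration mirroring these values might look like the sketch below; the output directory, evaluation batch size, and any settings not listed above are placeholders rather than values taken from the original run.

from transformers import TrainingArguments

# Sketch of the reported hyperparameters; AdamW is the Trainer's default optimizer.
# output_dir and per_device_eval_batch_size are placeholders, not reported values.
training_args = TrainingArguments(
    output_dir="modernbert-conll2003-ner-base",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3.0,
    weight_decay=0.01,
    warmup_ratio=0.1,
)

# The 256-token limit is applied at tokenization time, e.g.
# tokenizer(examples["tokens"], truncation=True, max_length=256, is_split_into_words=True)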

Training Results (Evaluation on Test Split)

Metric      Value
Accuracy    0.9711
F1 Score    0.8851
Precision   0.8721
Recall      0.8985
Loss        0.1873

Evaluation on Validation Split

Metric      Value
Accuracy    0.9871
F1 Score    0.9416
Precision   0.9357
Recall      0.9475
Loss        0.0625
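
The span-level precision, recall, and F1 reported above are conventionally computed with seqeval over the BIO labels, after filtering out ignored sub-token positions. Below is a minimal sketch using the evaluate library; the label sequences shown are illustrative placeholders, not model output.

import evaluate

seqeval = evaluate.load("seqeval")

# Illustrative BIO-labeled sequences; in practice these come from the model's
# predictions and the gold labels, with ignored positions (label id -100) removed.
predictions = [["B-ORG", "O", "O", "O", "O", "B-LOC", "O", "B-LOC", "O"]]
references  = [["B-ORG", "O", "O", "O", "O", "B-LOC", "O", "B-LOC", "O"]]

results = seqeval.compute(predictions=predictions, references=references)
print(results["overall_precision"], results["overall_recall"], results["overall_f1"])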

Environmental Impact

  • Runtime: ~11.5 minutes (694 seconds)
  • Hardware: MacBook Pro, M5 Pro 24GB (Training speed: ~62 samples/sec)

Citation

If you use this model, please cite the original CoNLL-2003 paper and the ModernBERT work.

@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,
    title = "Introduction to the {CoNLL}-2003 Shared Task: Language-Independent Named Entity Recognition",
    author = "Tjong Kim Sang, Erik F.  and De Meulder, Fien",
    booktitle = "Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003",
    year = "2003",
    url = "https://aclanthology.org/W03-0419",
    pages = "142--147",
}