GLiNER ContractNER Multi - Fine-Grained Legal Entity Extraction
Model Name: gliner-contractner-multi-v2.1 (Agile Lab Fine-tune)
Base Architecture: GLiNER Multi v2.1 (Backbone: microsoft/mdeberta-v3-base)
Model Description
GLiNER ContractNER Multi is a multilingual span-based Named Entity Recognition (NER) model fine-tuned by Agile Lab on the ContractNER dataset. It is designed to extract fine-grained entities from legal contracts with high precision.
Built on the GLiNER Multi v2.1 architecture, this model achieves 80%+ F1 score on contract-specific entity extraction, significantly outperforming general-purpose LLMs and domain-specific legal models in our benchmarks.
Key Highlights
- Contract-Specialized: Fine-tuned on 3,240+ annotated contract chunks from SEC EDGAR filings.
- Granular Extraction: Capable of identifying 18 specific entity types including parties, dates, financial terms (salaries, shares), and regulatory references.
- Open-Vocabulary NER: Supports promptable entity extraction: you can provide custom label names at inference time without retraining.
- Multilingual Capability: Inherits multilingual behavior from GLiNER Multi v2.1 and mDeBERTa-v3-base, though optimized primarily for English contracts (performance may degrade on low-resource languages).
- Production-Ready: A recommended threshold of 0.8–0.9 balances high precision with acceptable recall, minimizing costly false positives in legal review workflows.
How to Use
To use this model, you need to install the gliner library.
Installation
pip install gliner
Inference Code
from gliner import GLiNER
# Load the model (replace 'AgileLab/your-model-repo' with your actual HF repo ID)
model = GLiNER.from_pretrained("AgileLab/gliner-contractner-multi-v2.1")
# Example contract text
text = """
This EMPLOYMENT AGREEMENT is made effective as of January 1, 2026,
by and between Tech Solutions Inc. ("Company") and Jane Doe ("Executive").
The Executive shall serve as Chief Technology Officer.
The Company agrees to pay the Executive an annual base salary of $250,000.00.
"""
# Define the entities you want to extract (Open Vocabulary)
labels = [
    "Parties", "EffectiveDate", "Role", "Salary", "TerminationDate"
]
# Predict (0.5 shown for demonstration; 0.8–0.9 is recommended in production)
entities = model.predict_entities(text, labels, threshold=0.5)
# Print results
for entity in entities:
    print(f"{entity['text']} => {entity['label']} (Score: {entity['score']:.2f})")
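Because precision varies sharply across entity types (see the per-entity metrics below), a common production pattern is to apply stricter score cutoffs for noisier labels. The helper and threshold values below are an illustrative sketch, not part of the model or the gliner API:

```python
# Hypothetical post-processing: stricter per-label score cutoffs trade
# recall for precision on labels the model finds harder.
DEFAULT_THRESHOLD = 0.8
PER_LABEL_THRESHOLD = {"TerminationDate": 0.9, "Salary": 0.85}  # illustrative values

def filter_entities(entities):
    """Keep only predictions whose score clears the label's cutoff."""
    return [
        e for e in entities
        if e["score"] >= PER_LABEL_THRESHOLD.get(e["label"], DEFAULT_THRESHOLD)
    ]

sample = [
    {"text": "Jane Doe", "label": "Parties", "score": 0.95},
    {"text": "March 1, 2027", "label": "TerminationDate", "score": 0.82},
]
print(filter_entities(sample))  # only the high-confidence Parties span survives
```

The input dicts mirror the shape returned by predict_entities (text, label, score keys), so a filter like this can be dropped in directly after prediction.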
Evaluation & Benchmarks
In our comprehensive evaluation on the validation set, this model achieved an overall F1 score of 80.0%, demonstrating a strong balance between precision and recall.
Performance vs. Other Models
- GLiNER ContractNER (This Model): 80.0% F1
- General Purpose LLMs (Qwen, Gemma): < 35% F1
- Standalone DeBERTa Models: 46%–78% F1
- Legal-Specific Models (LegalBERT, ContractBERT): < 10% F1
In our benchmarks, the GLiNER span-plus-query architecture makes this model best-in-class for contract entity extraction.
Detailed Metrics by Entity (Validation Split)
| Entity | Precision | Recall | F1 Score | Support |
|---|---|---|---|---|
| Act | 81.16 | 74.67 | 77.78 | 75 |
| Address | 68.00 | 77.27 | 72.34 | 22 |
| Court | 80.00 | 80.00 | 80.00 | 20 |
| EffectiveDate | 62.50 | 96.15 | 75.76 | 26 |
| PII_Ref | 77.27 | 100.00 | 87.18 | 17 |
| Parties | 70.13 | 85.71 | 77.14 | 63 |
| Percentage | 59.46 | 91.67 | 72.13 | 24 |
| Price | 42.50 | 94.44 | 58.62 | 18 |
| Principal | 36.25 | 90.62 | 51.79 | 32 |
| Ratio | 29.79 | 73.68 | 42.42 | 19 |
| Regulation | 66.67 | 88.37 | 76.00 | 43 |
| RenewalTerm | 42.86 | 75.00 | 54.55 | 12 |
| Rent | 30.00 | 75.00 | 42.86 | 8 |
| Role | 90.32 | 80.00 | 84.85 | 35 |
| Salary | 42.86 | 100.00 | 60.00 | 18 |
| Shares | 40.48 | 89.47 | 55.74 | 19 |
| TerminationDate | 23.64 | 72.22 | 35.62 | 18 |
| Title | 70.11 | 73.49 | 71.76 | 83 |
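The F1 column above is the harmonic mean of precision and recall. A quick sanity check against a few table rows:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)

# Spot-check a few rows from the table above (values in percent).
rows = {
    "Act": (81.16, 74.67),              # table F1: 77.78
    "Salary": (42.86, 100.00),          # table F1: 60.00
    "TerminationDate": (23.64, 72.22),  # table F1: 35.62
}
for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1(p, r):.2f}")
```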
Supported Entity Schema
The model was trained on the ContractNER schema. While you can use custom labels, performance is best with categories semantically similar to:
Document Metadata
- EffectiveDate: Contract start date (e.g., "January 1, 2026").
- TerminationDate: Contract end or expiration date.
- RenewalTerm: Renewal periods or conditions.
- Title: Official document title.
Actors & Roles
- Parties: Legal entities entering the agreement (companies, individuals).
- Role: Professional titles and positions (e.g., "Chief Executive Officer").
Contact Information
- Address: Physical addresses.
- PII_Ref: Personal identifiable information references (phone, email, fax).
Financial Values
- Salary: Compensation amounts (always with currency symbol, e.g., "$225,000.00").
- Price: Goods/services prices.
- Principal: Loan principal amounts.
- Shares: Stock or equity quantities.
- Percentage: Percentage values (e.g., "50%").
- Ratio: Financial ratios.
- Rent: Lease or rental amounts.
Legal and Regulatory
- Court: Judicial bodies and tribunals (e.g., "State of Texas").
- Act: Legislative acts and laws.
- Regulation: Regulatory references (e.g., "Rule 10b5-1").
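Extracted financial spans (Salary, Price, Percentage, etc.) are raw strings; downstream systems typically normalize them to numbers. A hypothetical normalizer, not part of the model:

```python
import re

def parse_amount(span):
    """Parse a numeric contract span like '$250,000.00' or '50%' into a
    float. Illustrative downstream normalization only; the model itself
    returns raw text spans."""
    m = re.search(r"[\d,]+(?:\.\d+)?", span)
    return float(m.group().replace(",", "")) if m else None

print(parse_amount("$250,000.00"))  # 250000.0
print(parse_amount("50%"))          # 50.0
```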
Training Details
Data Source & Preprocessing
- Dataset: ContractNER corpus (Adibhatla et al., 2023), built from real contracts in SEC EDGAR (U.S. Securities and Exchange Commission) filings.
- Original Size: ~5,000+ annotated contract segments.
- Consolidated Dataset: ~3,240 chunks after stratified reduction and class consolidation.
- Adjustments:
  - Removed the RevolvingCredit class (too rare and ambiguous).
  - Rebalanced the dataset to ensure minimum representation per class.
- Split: 80% training / 20% validation (random split).
- Methodology: Human-in-the-loop iterative labeling.
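The 80/20 random split can be sketched as a minimal, self-contained example. The seed and tooling actually used are not documented, so these are illustrative:

```python
import random

def train_val_split(chunks, val_frac=0.2, seed=42):
    """Shuffle indices and carve off a validation fraction.
    Illustrative only; the original split procedure is unspecified
    beyond '80% training / 20% validation (random split)'."""
    rng = random.Random(seed)
    idx = list(range(len(chunks)))
    rng.shuffle(idx)
    n_val = int(len(chunks) * val_frac)
    val = [chunks[i] for i in idx[:n_val]]
    train = [chunks[i] for i in idx[n_val:]]
    return train, val

# ~3,240 chunks -> 2,592 train / 648 validation
train, val = train_val_split(list(range(3240)))
print(len(train), len(val))  # 2592 648
```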
Architecture & Configuration
- Base Model: GLiNER Multi v2.1 (209M parameters).
- Encoder Backbone: microsoft/mdeberta-v3-base (86M backbone + 190M embedding parameters).
- Architecture Type: Span-based NER with entity-query matching.
- Hardware: NVIDIA L4 GPU.
- Training Time: ~30 minutes per fine-tuning run.
Visualizations
Loss Curves
Model Comparison
License
Apache 2.0
Model Tree
- Repository: lucasorrentino/Contractner
- Base model: urchade/gliner_multi-v2.1
