GLiNER ContractNER Multi - Fine-Grained Legal Entity Extraction
Model Name: gliner-contractner-multi-v2.1 (Agile Lab Fine-tune)
Base Architecture: GLiNER Multi v2.1 (Backbone: microsoft/mdeberta-v3-base)
Model Description
GLiNER ContractNER Multi is a multilingual span-based Named Entity Recognition (NER) model fine-tuned by Agile Lab on the ContractNER dataset. It is designed to extract fine-grained entities from legal contracts with high precision.
Built on the GLiNER Multi v2.1 architecture, this model achieves 80%+ F1 score on contract-specific entity extraction, significantly outperforming general-purpose LLMs and domain-specific legal models in our benchmarks.
Key Highlights
- Contract-Specialized: Fine-tuned on 3,240+ annotated contract chunks from SEC EDGAR filings.
- Granular Extraction: Capable of identifying 18 specific entity types including parties, dates, financial terms (salaries, shares), and regulatory references.
- Open-Vocabulary NER: Supports promptable entity extraction: you can provide custom label names at inference time without retraining.
- Multilingual Capability: Inherits multilingual behavior from GLiNER Multi v2.1 and mDeBERTa-v3-base, though optimized primarily for English contracts (performance may degrade on low-resource languages).
- Production-Ready: A recommended threshold of 0.8–0.9 balances high precision with acceptable recall, minimizing costly false positives in legal review workflows.
How to Use
To use this model, you need to install the gliner library.
Installation
pip install gliner
Inference Code
from gliner import GLiNER
# Load the model (replace 'AgileLab/your-model-repo' with your actual HF repo ID)
model = GLiNER.from_pretrained("AgileLab/gliner-contractner-multi-v2.1")
# Example contract text
text = """
This EMPLOYMENT AGREEMENT is made effective as of January 1, 2026,
by and between Tech Solutions Inc. ("Company") and Jane Doe ("Executive").
The Executive shall serve as Chief Technology Officer.
The Company agrees to pay the Executive an annual base salary of $250,000.00.
"""
# Define the entities you want to extract (Open Vocabulary)
labels = [
    "Parties", "EffectiveDate", "Role", "Salary", "TerminationDate"
]
# Predict (0.5 shown for demonstration; 0.8–0.9 is recommended in production)
entities = model.predict_entities(text, labels, threshold=0.5)
# Print results
for entity in entities:
    print(f"{entity['text']} => {entity['label']} (Score: {entity['score']:.2f})")
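Because precision varies sharply across entity types (see the per-entity metrics below), a common production pattern is to apply stricter score cutoffs for noisier labels. The helper and threshold values below are an illustrative sketch, not part of the model or the gliner API:

```python
# Hypothetical post-processing: stricter per-label score cutoffs trade
# recall for precision on labels the model finds harder.
DEFAULT_THRESHOLD = 0.8
PER_LABEL_THRESHOLD = {"TerminationDate": 0.9, "Salary": 0.85}  # illustrative values

def filter_entities(entities):
    """Keep only predictions whose score clears the label's cutoff."""
    return [
        e for e in entities
        if e["score"] >= PER_LABEL_THRESHOLD.get(e["label"], DEFAULT_THRESHOLD)
    ]

sample = [
    {"text": "Jane Doe", "label": "Parties", "score": 0.95},
    {"text": "March 1, 2027", "label": "TerminationDate", "score": 0.82},
]
print(filter_entities(sample))  # only the high-confidence Parties span survives
```

The input dicts mirror the shape returned by predict_entities (text, label, score keys), so a filter like this can be dropped in directly after prediction.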
Evaluation & Benchmarks
In our comprehensive evaluation on the validation set, this model achieved an overall F1 score of 80.0%, demonstrating a strong balance between precision and recall.
Performance vs. Other Models
- GLiNER ContractNER (This Model): 80.0% F1
- General Purpose LLMs (Qwen, Gemma): < 35% F1
- Standalone DeBERTa Models: 46%–78% F1
- Legal-Specific Models (LegalBERT, ContractBERT): < 10% F1
In our benchmarks, the GLiNER span-plus-query architecture makes this model best-in-class for contract entity extraction.
Detailed Metrics by Entity (Validation Split)
| Entity | Precision | Recall | F1 Score | Support |
|---|---|---|---|---|
| Act | 81.16 | 74.67 | 77.78 | 75 |
| Address | 68.00 | 77.27 | 72.34 | 22 |
| Court | 80.00 | 80.00 | 80.00 | 20 |
| EffectiveDate | 62.50 | 96.15 | 75.76 | 26 |
| PII_Ref | 77.27 | 100.00 | 87.18 | 17 |
| Parties | 70.13 | 85.71 | 77.14 | 63 |
| Percentage | 59.46 | 91.67 | 72.13 | 24 |
| Price | 42.50 | 94.44 | 58.62 | 18 |
| Principal | 36.25 | 90.62 | 51.79 | 32 |
| Ratio | 29.79 | 73.68 | 42.42 | 19 |
| Regulation | 66.67 | 88.37 | 76.00 | 43 |
| RenewalTerm | 42.86 | 75.00 | 54.55 | 12 |
| Rent | 30.00 | 75.00 | 42.86 | 8 |
| Role | 90.32 | 80.00 | 84.85 | 35 |
| Salary | 42.86 | 100.00 | 60.00 | 18 |
| Shares | 40.48 | 89.47 | 55.74 | 19 |
| TerminationDate | 23.64 | 72.22 | 35.62 | 18 |
| Title | 70.11 | 73.49 | 71.76 | 83 |
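The F1 column above is the harmonic mean of precision and recall. A quick sanity check against a few table rows:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)

# Spot-check a few rows from the table above (values in percent).
rows = {
    "Act": (81.16, 74.67),              # table F1: 77.78
    "Salary": (42.86, 100.00),          # table F1: 60.00
    "TerminationDate": (23.64, 72.22),  # table F1: 35.62
}
for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1(p, r):.2f}")
```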
Supported Entity Schema
The model was trained on the ContractNER schema. While you can use custom labels, performance is best with categories semantically similar to:
Document Metadata
- EffectiveDate: Contract start date (e.g., "January 1, 2026").
- TerminationDate: Contract end or expiration date.
- RenewalTerm: Renewal periods or conditions.
- Title: Official document title.
Actors & Roles
- Parties: Legal entities entering the agreement (companies, individuals).
- Role: Professional titles and positions (e.g., "Chief Executive Officer").
Contact Information
- Address: Physical addresses.
- PII_Ref: Personal identifiable information references (phone, email, fax).
Financial Values
- Salary: Compensation amounts (always with currency symbol, e.g., "$225,000.00").
- Price: Goods/services prices.
- Principal: Loan principal amounts.
- Shares: Stock or equity quantities.
- Percentage: Percentage values (e.g., "50%").
- Ratio: Financial ratios.
- Rent: Lease or rental amounts.
Legal and Regulatory
- Court: Judicial bodies and tribunals (e.g., "State of Texas").
- Act: Legislative acts and laws.
- Regulation: Regulatory references (e.g., "Rule 10b5-1").
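Extracted financial spans (Salary, Price, Percentage, etc.) are raw strings; downstream systems typically normalize them to numbers. A hypothetical normalizer, not part of the model:

```python
import re

def parse_amount(span):
    """Parse a numeric contract span like '$250,000.00' or '50%' into a
    float. Illustrative downstream normalization only; the model itself
    returns raw text spans."""
    m = re.search(r"[\d,]+(?:\.\d+)?", span)
    return float(m.group().replace(",", "")) if m else None

print(parse_amount("$250,000.00"))  # 250000.0
print(parse_amount("50%"))          # 50.0
```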
Training Details
Data Source & Preprocessing
- Dataset: ContractNER corpus (Adibhatla et al., 2023), built from real contracts in SEC EDGAR (U.S. Securities and Exchange Commission) filings.
- Original Size: ~5,000+ annotated contract segments.
- Consolidated Dataset: ~3,240 chunks after stratified reduction and class consolidation.
- Adjustments:
  - Removed the RevolvingCredit class (too rare and ambiguous).
  - Rebalanced the dataset to ensure minimum representation per class.
- Split: 80% training / 20% validation (random split).
- Methodology: Human-in-the-loop iterative labeling.
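The 80/20 random split can be sketched as a minimal, self-contained example. The seed and tooling actually used are not documented, so these are illustrative:

```python
import random

def train_val_split(chunks, val_frac=0.2, seed=42):
    """Shuffle indices and carve off a validation fraction.
    Illustrative only; the original split procedure is unspecified
    beyond '80% training / 20% validation (random split)'."""
    rng = random.Random(seed)
    idx = list(range(len(chunks)))
    rng.shuffle(idx)
    n_val = int(len(chunks) * val_frac)
    val = [chunks[i] for i in idx[:n_val]]
    train = [chunks[i] for i in idx[n_val:]]
    return train, val

# ~3,240 chunks -> 2,592 train / 648 validation
train, val = train_val_split(list(range(3240)))
print(len(train), len(val))  # 2592 648
```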
Architecture & Configuration
- Base Model: GLiNER Multi v2.1 (209M parameters).
- Encoder Backbone: microsoft/mdeberta-v3-base (86M backbone + 190M embedding parameters).
- Architecture Type: Span-based NER with entity-query matching.
- Hardware: NVIDIA L4 GPU.
- Training Time: ~30 minutes per fine-tuning run.
Visualizations
Loss Curves
Model Comparison
License
Apache 2.0
Model Tree
- Repository: lucasorrentino/Contractner
- Base model: urchade/gliner_multi-v2.1
