modernbert-unfair-tos

ModernBERT fine-tuned for UNFAIR-ToS classification

Model Description

This model is fine-tuned on the LexGLUE UNFAIR-ToS dataset to detect unfair clauses in Terms of Service documents.

Base Model: answerdotai/ModernBERT-base

Performance

Metric                 Score
Exact Match Accuracy   70.6%
Micro-F1               0.79
Precision              0.98
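
For reference, the two headline metrics can be computed as in the sketch below. It assumes predictions and gold labels are multi-hot vectors with one position per category. The function names `exact_match` and `micro_f1` and the toy data are illustrative; this is not the evaluation script used to produce the scores above.

```python
from typing import Sequence


def exact_match(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of samples whose full label vector is predicted correctly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return hits / len(y_true)


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """F1 from true/false positives pooled over all samples and labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data with 3 labels for brevity (the model itself has 8).
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

print(f"exact match: {exact_match(y_true, y_pred):.3f}")
print(f"micro-F1:    {micro_f1(y_true, y_pred):.3f}")
```

Exact match is strict: a sample counts as correct only when every label is right, which is why it tends to sit below micro-F1.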

Risk Categories

The model is a multi-label classifier over 8 risk categories; a clause can match several categories or none:

ID   Category
0    Limitation of liability
1    Unilateral termination
2    Unilateral change
3    Content removal
4    Contract by using
5    Choice of law
6    Jurisdiction
7    Arbitration
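
The ID-to-category table can be kept in code as a pair of lookup dictionaries. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `names_from_multi_hot` are illustrative and are not guaranteed to match what is stored in the model's config.

```python
from typing import Dict, List, Sequence

# Mapping copied from the category table; the variable names are illustrative.
ID2LABEL: Dict[int, str] = {
    0: "Limitation of liability",
    1: "Unilateral termination",
    2: "Unilateral change",
    3: "Content removal",
    4: "Contract by using",
    5: "Choice of law",
    6: "Jurisdiction",
    7: "Arbitration",
}
LABEL2ID: Dict[str, int] = {name: idx for idx, name in ID2LABEL.items()}


def names_from_multi_hot(vector: Sequence[int]) -> List[str]:
    """Return the category names for every position set to 1."""
    return [ID2LABEL[i] for i, flag in enumerate(vector) if flag]


print(names_from_multi_hot([0, 1, 0, 0, 0, 0, 0, 1]))
# → ['Unilateral termination', 'Arbitration']
```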

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Agreemind/modernbert-unfair-tos"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "We reserve the right to terminate your account at any time."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

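# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)
    # Multi-label setup: an independent sigmoid per category, not a softmax
    probs = torch.sigmoid(outputs.logits)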

# Category names, ordered by class ID
labels = ["Limitation of liability", "Unilateral termination", "Unilateral change",
          "Content removal", "Contract by using", "Choice of law", "Jurisdiction", "Arbitration"]

# Print every category whose probability exceeds the 0.5 threshold
for label, prob in zip(labels, probs[0].tolist()):
    if prob > 0.5:
        print(f"{label}: {prob:.2%}")
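
The example above scores a single sentence. To scan a whole Terms of Service document, one option is to split it into sentences and score each one. The sketch below assumes a hypothetical `predict_fn` callable that maps a string to a list of 8 probabilities (for instance, a thin wrapper around the tokenizer and model calls above). The regex-based splitter and the names `split_sentences`, `flag_clauses`, and `fake_predict` are likewise illustrative.

```python
import re
from typing import Callable, Dict, List, Sequence, Tuple

LABELS = ["Limitation of liability", "Unilateral termination", "Unilateral change",
          "Content removal", "Contract by using", "Choice of law", "Jurisdiction", "Arbitration"]


def split_sentences(document: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def flag_clauses(
    document: str,
    predict_fn: Callable[[str], Sequence[float]],
    threshold: float = 0.5,
) -> List[Tuple[str, Dict[str, float]]]:
    """Return (sentence, {label: probability}) for each sentence with a hit."""
    flagged = []
    for sentence in split_sentences(document):
        probs = predict_fn(sentence)
        hits = {label: p for label, p in zip(LABELS, probs) if p > threshold}
        if hits:
            flagged.append((sentence, hits))
    return flagged


# Stand-in scorer for demonstration only; real use would call the model.
def fake_predict(sentence: str) -> List[float]:
    probs = [0.0] * len(LABELS)
    if "terminate" in sentence:
        probs[1] = 0.9  # index 1 = Unilateral termination
    return probs


doc = "Welcome to our service. We may terminate your account at any time."
for sentence, hits in flag_clauses(doc, fake_predict):
    print(sentence, hits)
```

A regex splitter is fragile on legal text (abbreviations, numbered clauses), so a dedicated sentence segmenter may be a better fit in practice.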

Training

  • Dataset: LexGLUE UNFAIR-ToS (~5,500 training samples)
  • Loss: Focal Loss with class weighting
  • Optimizer: AdamW with cosine LR schedule
  • Epochs: 15 (with early stopping)
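
To illustrate the loss named in the list above, here is a minimal pure-Python sketch of binary focal loss with per-class weights for one multi-label sample. The exact formulation, the focusing parameter `gamma`, and the weight values used in training are not stated in this card, so the defaults and the function name `focal_loss` are assumptions.

```python
import math
from typing import Sequence


def focal_loss(
    probs: Sequence[float],
    targets: Sequence[int],
    class_weights: Sequence[float],
    gamma: float = 2.0,   # assumed value; not taken from this model's training config
    eps: float = 1e-7,
) -> float:
    """Mean weighted binary focal loss over the labels of one sample.

    For each label, p_t is the probability assigned to the correct outcome.
    The factor (1 - p_t) ** gamma shrinks the loss of labels that are
    already classified confidently, so hard labels dominate the total.
    """
    total = 0.0
    for p, y, w in zip(probs, targets, class_weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p
        total += -w * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Hypothetical weights: the third (rare) class is weighted up.
weights = [1.0, 1.0, 3.0]
easy = focal_loss([0.95, 0.05, 0.90], [1, 0, 1], weights)
hard = focal_loss([0.95, 0.05, 0.20], [1, 0, 1], weights)
print(f"confident sample: {easy:.4f}")
print(f"missed rare label: {hard:.4f}")
```

With `gamma=0` and all weights set to 1, the expression reduces to ordinary binary cross-entropy, which is a convenient sanity check.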

Limitations

  • Arbitration class has lower recall (~38%) due to limited training samples
  • Optimized for English legal text
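
One common way to trade precision for recall on a weak class is to lower the decision threshold for that class only. The sketch below is illustrative: the threshold values are hypothetical and would need tuning on a validation set, and `decode_with_thresholds` is not part of this model's API.

```python
from typing import Dict, List, Optional, Sequence

LABELS = ["Limitation of liability", "Unilateral termination", "Unilateral change",
          "Content removal", "Contract by using", "Choice of law", "Jurisdiction", "Arbitration"]

DEFAULT_THRESHOLD = 0.5
# Hypothetical override for the low-recall class; tune on held-out data.
PER_CLASS_THRESHOLD: Dict[str, float] = {"Arbitration": 0.3}


def decode_with_thresholds(
    probs: Sequence[float],
    thresholds: Optional[Dict[str, float]] = None,
    default: float = DEFAULT_THRESHOLD,
) -> List[str]:
    """Return labels whose probability exceeds their class-specific threshold."""
    if thresholds is None:
        thresholds = PER_CLASS_THRESHOLD
    return [
        label
        for label, p in zip(LABELS, probs)
        if p > thresholds.get(label, default)
    ]


probs = [0.1, 0.2, 0.1, 0.0, 0.1, 0.4, 0.1, 0.35]
print(decode_with_thresholds(probs))
# → ['Arbitration']
```

Lowering a threshold raises recall for that class at the cost of more false positives, so the effect on precision should be checked before adopting it.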

Citation

@misc{agreemind-unfair-tos,
  author = {Agreemind},
  title = {modernbert-unfair-tos},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Agreemind/modernbert-unfair-tos}
}