Model Card for Model ID

Model Details

Model Description

This model is a fine-tuned version of ModernBERT trained on a collection of publicly available user review datasets (hotels, restaurants, airlines, activities, and social media) to perform multi-label classification of reviews into three distinct groups: travelers with pets, travelers with children, and travelers with disabilities. The fine-tuning was performed as part of an academic project with the goal of improving performance on review understanding. The model is intended strictly for research and educational use and is not licensed for commercial applications.

  • Developed by: Emma Lhuillery & Albin Morisseau, University of Klagenfurt

  • Model type: Transformer-based language model (ModernBERT fine-tuned for review classification)

  • Language(s) (NLP): English

  • License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0). This model is released for academic research and educational purposes only. Any commercial use is strictly prohibited. Redistribution must respect the licenses of the original datasets used for training.

  • Finetuned from model [optional]: answerdotai/modernbert-base

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

The model was fine-tuned on a limited and heterogeneous set of publicly available review datasets. As a result, it may reflect biases present in online user-generated content, such as demographic, cultural, or opinion biases. The relatively small size of the fine-tuning corpus may also limit the model’s generalization capabilities and topic diversity. Outputs should therefore be interpreted with caution.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned checkpoint (repository: AlbinMorisseau/bert-finetuned-review).
tokenizer = AutoTokenizer.from_pretrained("AlbinMorisseau/bert-finetuned-review")
model = AutoModelForSequenceClassification.from_pretrained("AlbinMorisseau/bert-finetuned-review")
model.eval()

# Category order must match the label order used during fine-tuning.
labels = ["child", "pet", "handicap"]

text = "This hotel was very clean and pet-friendly."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label setup: one sigmoid per category; keep every label whose
# probability exceeds the 0.5 decision threshold.
probs = torch.sigmoid(logits)[0]
predicted_labels = [label for label, p in zip(labels, probs) if p > 0.5]
print(predicted_labels)

Training Details

Training Data

The model was fine-tuned on a curated subset of publicly available review datasets originally collected from multiple online platforms. These datasets were selected from a larger pool of 15 datasets reviewed for licensing compatibility and relevance. Only datasets allowing academic or non-commercial use were included.

The original datasets are primarily available through Kaggle and other public repositories. The following sources were used in this work:

All datasets consist of user-generated review text related to travel and leisure activities.

Training Procedure

The model was fine-tuned from the answerdotai/ModernBERT-large transformer for multi-label classification of travel reviews, where a review can belong to multiple categories simultaneously. Reviews were chunked into 128-token windows centered on domain-specific keywords to ensure relevant context is captured.
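
The keyword-centered windowing described above can be sketched as follows. This is a simplified illustration over whitespace tokens with a hypothetical keyword; the actual pipeline presumably operates on ModernBERT tokenizer output.

```python
def keyword_centered_chunk(tokens, keyword, window=128):
    """Return at most `window` tokens centered on the first
    occurrence of `keyword` (illustrative sketch)."""
    try:
        center = tokens.index(keyword)
    except ValueError:
        return tokens[:window]  # fallback: keep the leading window
    half = window // 2
    start = max(0, center - half)
    end = min(len(tokens), start + window)
    start = max(0, end - window)  # re-anchor when near the right edge
    return tokens[start:end]

# Toy review: the keyword sits at token index 300 of 401.
tokens = "great stay ,".split() * 100 + ["wheelchair"] + "accessible rooms".split() * 50
chunk = keyword_centered_chunk(tokens, "wheelchair")
print(len(chunk), chunk[64])  # 128-token window with the keyword centered
```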

We split the annotated dataset into an 80/20 train/test split, maintaining class proportions across categories. The training set contained 2,834 chunks, and the test set contained 709 chunks.
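
The 80/20 split with preserved class proportions can be approximated with scikit-learn by stratifying on the combination of labels; exact multi-label stratification would typically use iterative stratification instead. The data below is synthetic.

```python
from sklearn.model_selection import train_test_split

# Synthetic chunks with hypothetical binary labels [child, pet, handicap].
chunks = [f"chunk {i}" for i in range(100)]
labels = [[i % 2, (i // 2) % 2, 0] for i in range(100)]

# Stratify on the label combination as a simple approximation of
# preserving per-category proportions across the split.
keys = ["".join(map(str, y)) for y in labels]
X_train, X_test, y_train, y_test = train_test_split(
    chunks, labels, test_size=0.20, random_state=3407, stratify=keys
)
print(len(X_train), len(X_test))  # 80 20
```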

Fine-tuning was performed for 3 epochs using the Unsloth framework, optimizing for multi-label classification with a decision threshold of 0.5 for each category. Mixed-precision training (FP16/BF16) and the 8-bit AdamW optimizer were used to reduce memory footprint and accelerate training.
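
The card does not name the loss function; for multi-label classification the standard choice is binary cross-entropy over per-category logits, which pairs with the sigmoid-plus-threshold decision rule used at inference. A minimal sketch under that assumption:

```python
import torch

# One training example with logits for [child, pet, handicap]
# and a multi-hot target (the example belongs to two categories).
loss_fn = torch.nn.BCEWithLogitsLoss()
logits = torch.tensor([[2.0, -1.0, 0.5]])
targets = torch.tensor([[1.0, 0.0, 1.0]])
loss = loss_fn(logits, targets)
print(round(loss.item(), 4))
```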

Preprocessing [optional]

All reviews were preprocessed to ensure linguistic and semantic consistency:

  • Non-English reviews were translated into English.
  • Text chunks of 128 tokens were generated around keywords, with a maximum overlap of 25% allowed.
  • Manual annotation was applied to ensure high-quality labels.
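
The 25% maximum-overlap constraint amounts to advancing at least 75% of the window size per step when sliding over a long review. A minimal sketch (the tail-handling policy here is an assumption):

```python
def sliding_chunks(tokens, window=128, max_overlap=0.25):
    """Fixed-size windows whose overlap with the previous window
    never exceeds `max_overlap` of the window size (sketch)."""
    stride = int(window * (1 - max_overlap))  # advance >= 96 tokens
    return [tokens[s:s + window]
            for s in range(0, max(1, len(tokens) - window + 1), stride)]

tokens = [f"t{i}" for i in range(400)]
chunks = sliding_chunks(tokens)
print(len(chunks))  # 3 windows; consecutive windows share 32 tokens (25%)
```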

Training Hyperparameters

The hyperparameters used for fine-tuning are summarized in Table 1:

Hyperparameter               Value
Batch size (train / eval)    32 / 32
Number of epochs             3
Learning rate                5e-5
Optimizer                    AdamW 8-bit
Weight decay                 0.01
Warmup steps                 100
Learning rate scheduler      Linear
Evaluation steps             10% of steps
Checkpoint strategy          Save every 50 steps, max 2 checkpoints
Mixed precision              FP16 / BF16 if supported
Random seed                  3407
Dataloader pin memory        False
Logging steps                25
Monitoring / Reporting       Weights & Biases (wandb)

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a held-out test set of 709 manually chunked reviews. These reviews were sampled from the same datasets used for training (see Training Data), covering travelers with pets, children, or disabilities.

Factors

  • Review length: varied, but limited to 128-token chunks centered on keywords.
  • Category balance: proportions preserved from original datasets.
  • Multi-label classification: each chunk may belong to multiple categories.

Metrics

  • Precision, Recall, F1-Score were used as primary evaluation metrics for multi-label classification.
  • Decision threshold for positive classification: 0.5.
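
Given per-category sigmoid probabilities and the 0.5 threshold, these metrics can be computed with scikit-learn. The data below is synthetic, and micro-averaging is an assumption, since the card does not state the averaging mode.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical sigmoid outputs for 4 chunks over [child, pet, handicap].
probs = np.array([
    [0.91, 0.12, 0.05],
    [0.40, 0.85, 0.07],
    [0.66, 0.72, 0.10],
    [0.08, 0.20, 0.95],
])
y_true = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])

y_pred = (probs > 0.5).astype(int)  # 0.5 decision threshold per category
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro", zero_division=0
)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```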

Results

- Precision: 83.7%
- Recall: 85.5%
- F1-Score: 84.6%

The model classifies reviews by traveller type reliably, with balanced precision (83.7%) and recall (85.5%) across the three categories.

Summary

Model Examination [optional]

The fine-tuned model effectively identifies relevant travel reviews from noisy inputs using ModernBERT with keyword-centered chunking. Performance may vary with the size and quality of the original datasets or with different keyword definitions.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA GeForce RTX 4070 (GPU)
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

@article{lhuillery2026bertfinetuning,
  title={Fine-tuning ModernBERT for multi-label classification of travel reviews},
  author={Lhuillery, Emma and Morisseau, Albin and Zanker, Markus},
  year={2026},
  note={Unpublished}
}

APA:

Lhuillery, E., Morisseau, A., & Zanker, M. (2026). Fine-tuning ModernBERT for multi-label classification of travel reviews. Unpublished.

Glossary [optional]

  • Chunk: A segment of a review limited to 128 tokens, centered on a keyword.
  • Multi-label classification: Each input may belong to multiple categories simultaneously.

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Model size: 0.4B parameters (Safetensors, tensor type F32).