How to use rhdang/Yelp_Review with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="rhdang/Yelp_Review")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("rhdang/Yelp_Review", dtype="auto")
```
Model Description
This model predicts the star rating (1 to 5) of a Yelp review from its text. Two architectures were fine-tuned for the task, GPT-2 and BERT; BERT performed best, reaching 75% validation accuracy. Training used a weighted loss to address class imbalance, and hyperparameters were tuned to improve generalization.
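As a rough illustration of how a five-class classifier output becomes a star rating, the sketch below applies a softmax to one review's logits and picks the most likely class. The function name `logits_to_star_rating`, the example logits, and the mapping of class index 0 to 1 star are assumptions made for illustration; the model's actual label mapping should be checked in its configuration.

```python
import math


def logits_to_star_rating(logits):
    """Return (stars, confidence) for one review's class logits."""
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)

    # Assumption: class index 0 means 1 star, index 4 means 5 stars.
    return best + 1, probs[best]


# Made-up logits for a single review (not real model output).
stars, confidence = logits_to_star_rating([-1.2, -0.3, 0.4, 2.1, 1.0])
print(f"{stars} stars (p = {confidence:.2f})")
```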
Training Details
Dataset: Yelp Reviews dataset (100,000 samples used)
Preprocessing:
- GPT-2 tokenizer, whose byte-level Byte-Pair Encoding (BPE) splits rare words into subword units
- Truncation to 128 tokens, with padding so that all inputs have a uniform length
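The truncation and padding step can be sketched in plain Python on a list of token IDs that has already been produced by a tokenizer. The helper name `pad_or_truncate`, right-side padding, and the pad ID are assumptions: GPT-2 ships without a dedicated padding token, so this sketch reuses its end-of-text ID (50256), which is a common workaround rather than a detail confirmed by this model card.

```python
def pad_or_truncate(token_ids, max_length=128, pad_id=50256):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    # Truncation: keep only the first max_length tokens.
    ids = list(token_ids[:max_length])

    # Real tokens get mask value 1.
    attention_mask = [1] * len(ids)

    # Padding: fill the remainder on the right; padded positions get mask 0.
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, attention_mask + [0] * n_pad


# A short review is padded, a long one is cut to the limit.
short_ids, short_mask = pad_or_truncate([5, 6, 7], max_length=5, pad_id=0)
long_ids, long_mask = pad_or_truncate(list(range(200)))
print(short_ids, short_mask, len(long_ids))
```

The attention mask lets the model ignore padded positions, so padding changes the input shape without changing the content the model attends to.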
Models Trained:
- GPT-2: Fine-tuned with a custom classification head, achieving 67% validation accuracy
- BERT: Fine-tuned with bidirectional attention, achieving 75% validation accuracy
Loss Function: Weighted Cross-Entropy Loss to counteract class imbalance
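A minimal sketch of a weighted cross-entropy loss follows. The model card does not say how the class weights were chosen, so the inverse-frequency scheme here is an assumption, as are the function names and the toy labels. The averaging divides by the sum of the target weights, which is how PyTorch's `nn.CrossEntropyLoss(weight=...)` behaves with its default mean reduction.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes=5):
    """Weight each class by total / (num_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    # Classes absent from the data get weight 0.0 to avoid dividing by zero.
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def weighted_cross_entropy(logits_batch, targets, class_weights):
    """Weighted mean of per-example negative log-likelihoods."""
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, y in zip(logits_batch, targets):
        # Stable log-sum-exp gives the log of the softmax normalizer.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        nll = log_sum_exp - logits[y]
        weighted_sum += class_weights[y] * nll
        weight_total += class_weights[y]
    return weighted_sum / weight_total


# Toy, imbalanced labels (class 4 = 5 stars dominates); not real data.
labels = [4, 4, 4, 4, 3, 3, 2, 1, 0, 4]
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights, a mistake on a rare rating contributes more to the loss than a mistake on the dominant rating, which pushes the model away from always predicting the majority class.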
Limitations
- Performance may degrade on highly informal or extremely short reviews
- Class imbalance still affects predictions for underrepresented ratings
- The model was trained on English-language reviews only