Add AMD MI300X GPU reference

README.md CHANGED:

```diff
@@ -12,6 +12,7 @@ tags:
 - text-classification
 - modernbert
 - multilingual
+- amd-mi300x
 base_model: jhu-clsp/mmBERT-base
 datasets:
 - llm-semantic-router/feedback-detector-dataset
@@ -39,7 +40,7 @@ model-index:
 
 # mmBERT Feedback Detector
 
-A high-performance multilingual 4-class feedback classification model fine-tuned on [mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base).
+A high-performance multilingual 4-class feedback classification model fine-tuned on [mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base) using an **AMD MI300X GPU**.
 
 ## Model Description
 
@@ -65,7 +66,7 @@ This model classifies user feedback into 4 categories:
 - **Dataset**: [llm-semantic-router/feedback-detector-dataset](https://huggingface.co/datasets/llm-semantic-router/feedback-detector-dataset)
 - **Size**: 51,694 examples (46,524 train / 5,170 validation)
 - **Languages**: English, Japanese, Turkish
-- **Labeling**: GPT-OSS-120B via vLLM
+- **Labeling**: GPT-OSS-120B via vLLM on AMD MI300X
 - **Sources**: MultiWOZ, SGD, INSCIT, MIMICS, Hazumi, Consumer Complaints
 
 ## Training Configuration
@@ -78,7 +79,15 @@ This model classifies user feedback into 4 categories:
 | Learning Rate | 2e-5 |
 | Max Length | 512 |
 | Optimizer | AdamW |
-
+
+### Hardware
+
+| Component | Specification |
+|-----------|---------------|
+| **GPU** | AMD Instinct MI300X |
+| **VRAM** | 192 GB HBM3 |
+| **Framework** | PyTorch with ROCm |
+| **Training Time** | ~2 minutes |
 
 ## Usage
 
```
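The card's Usage section is not visible in this view. As a minimal, stdlib-only sketch of what handling a 4-class classifier's output looks like (the label names and logit values below are hypothetical, not from the card; in practice the checkpoint would be loaded with the `transformers` library):

```python
import math

# Hypothetical placeholder labels: the card states the model has 4 feedback
# classes, but the actual class names are not shown in this diff.
LABELS = ["label_0", "label_1", "label_2", "label_3"]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Map raw 4-class logits to (label, confidence)."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]

# Example with made-up logits for the four classes.
label, confidence = classify([2.0, 0.5, -1.0, 0.1])
```

This mirrors the standard post-processing step for any sequence-classification head: softmax the logits, take the argmax, and report the winning label with its probability.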