HuaminChen committed
Commit b0d00e1 · verified · 1 Parent(s): 42d2102

Add AMD MI300X GPU reference

Files changed (1): README.md (+12 −3)
README.md CHANGED
@@ -12,6 +12,7 @@ tags:
 - text-classification
 - modernbert
 - multilingual
+- amd-mi300x
 base_model: jhu-clsp/mmBERT-base
 datasets:
 - llm-semantic-router/feedback-detector-dataset
@@ -39,7 +40,7 @@ model-index:
 
 # mmBERT Feedback Detector
 
-A high-performance multilingual 4-class feedback classification model fine-tuned on [mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base).
+A high-performance multilingual 4-class feedback classification model fine-tuned on [mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base) using an **AMD MI300X GPU**.
 
 ## Model Description
 
@@ -65,7 +66,7 @@ This model classifies user feedback into 4 categories:
 - **Dataset**: [llm-semantic-router/feedback-detector-dataset](https://huggingface.co/datasets/llm-semantic-router/feedback-detector-dataset)
 - **Size**: 51,694 examples (46,524 train / 5,170 validation)
 - **Languages**: English, Japanese, Turkish
-- **Labeling**: GPT-OSS-120B via vLLM
+- **Labeling**: GPT-OSS-120B via vLLM on AMD MI300X
 - **Sources**: MultiWOZ, SGD, INSCIT, MIMICS, Hazumi, Consumer Complaints
 
 ## Training Configuration
@@ -78,7 +79,15 @@ This model classifies user feedback into 4 categories:
 | Learning Rate | 2e-5 |
 | Max Length | 512 |
 | Optimizer | AdamW |
-| Hardware | AMD MI300X (ROCm) |
+
+### Hardware
+
+| Component | Specification |
+|-----------|---------------|
+| **GPU** | AMD Instinct MI300X |
+| **VRAM** | 192 GB HBM3 |
+| **Framework** | PyTorch with ROCm |
+| **Training Time** | ~2 minutes |
 
 ## Usage
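The README's Usage section begins right after the last hunk, but its body is not part of this diff, and the model's own repo id is not shown here. As a minimal sketch of what a 4-class text-classification head like this one produces, the snippet below applies the standard post-processing (softmax over the class logits, then argmax) to hypothetical logit values; the actual class names come from the model's `config.id2label` and are not stated in this diff.

```python
import math

# Hypothetical raw logits for one input. The real values come from the
# model's classification head; the 4 positions correspond to the model's
# feedback classes (names defined in its config.id2label, not shown here).
logits = [2.1, -0.3, 0.4, -1.2]

# Softmax: shift by the max for numerical stability, exponentiate, normalize.
exps = [math.exp(x - max(logits)) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# Predicted class id = argmax over the probabilities.
pred = max(range(len(probs)), key=lambda i: probs[i])
print(pred, [round(p, 3) for p in probs])
```

In practice the same result comes from the `transformers` text-classification pipeline, which performs this softmax/argmax step internally and maps the winning index to a label string.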