Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ pipeline_tag: text-generation
 
 <img src="https://huggingface.co/entfane/math_genious-7B/resolve/main/math-genious.png" width="400" height="400"/>
 
-# Math
+# Math Genius 7B
 
 This model is a Math Chain-of-Thought fine-tuned version of Mistral 7B v0.3 Instruct model.
 
@@ -30,7 +30,7 @@ Model was fine-tuned on [entfane/Mixture-Of-Thoughts-Math-No-COT](https://huggin
 
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
-model_name = "entfane/math-
+model_name = "entfane/math-genius-7B"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name)
 messages = [
@@ -48,5 +48,5 @@ The model was evaluated on a randomly sampled subset of 1,000 records from the t
 Math Genius 7B achieved an accuracy of 93.1% in producing the correct final answer under the pass@1 evaluation metric.
 
 #### AIME
-Math
+Math Genius 7B was evaluated on [90 problems from AIME 22, AIME 23, and AIME 24](https://huggingface.co/datasets/AI-MO/aimo-validation-aime).
 The model has successfully solved 3/90 of the problems.