---
library_name: transformers
license: apache-2.0
base_model: mistralai/Mistral-7B-v0.3
language: en
datasets:
- Word2Li/MiddOptimized
tags:
- llama-factory
- full
pipeline_tag: text-generation
model-index:
- name: Mistral-7B-v0.3-Middo-Alpaca-4o-mini
  results:
    - task:
        type: text-generation
      dataset:
        name: MMLU
        type: MMLU
      metrics:
        - name: weighted accuracy
          type: weighted accuracy
          value: 28.83
          verified: true
    - task:
        type: text-generation
      dataset:
        name: IFEval
        type: IFEval
      metrics:
        - name: overall accuracy
          type: overall accuracy
          value: 47.92
          verified: true
    - task:
        type: text-generation
      dataset:
        name: GSM8K
        type: GSM8K
      metrics:
        - name: accuracy
          type: accuracy
          value: 48.90
          verified: true
    - task:
        type: text-generation
      dataset:
        name: MATH
        type: MATH
      metrics:
        - name: accuracy
          type: accuracy
          value: 11.34
          verified: true
    - task:
        type: text-generation
      dataset:
        name: HumanEval
        type: HumanEval
      metrics:
        - name: humaneval_pass@1
          type: humaneval_pass@1
          value: 35.37
          verified: true
    - task:
        type: text-generation
      dataset:
        name: MBPP
        type: MBPP
      metrics:
        - name: score
          type: score
          value: 38.40
          verified: true
    - task:
        type: text-generation
      dataset:
        name: Hellaswag
        type: Hellaswag
      metrics:
        - name: accuracy
          type: accuracy
          value: 42.63
          verified: true
    - task:
        type: text-generation
      dataset:
        name: GPQA
        type: GPQA
      metrics:
        - name: accuracy
          type: accuracy
          value: 27.27
          verified: true
metrics:
- accuracy
---

# Mistral-7B-v0.3-Middo-Alpaca-4o-mini

Paper: [Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning](https://arxiv.org/abs/2508.21589)

Code: https://github.com/Word2VecT/Middo

## Model description

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3) on the [MiddOptimized/mistral_alpaca_4o_mini](https://huggingface.co/datasets/Word2Li/MiddOptimized/viewer/default/mistral_alpaca_4o_mini) dataset.
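
A minimal inference sketch with `transformers`, assuming the checkpoint is published on the Hugging Face Hub. The repository id in `MODEL_ID` is a placeholder and is not taken from this card; replace it with the actual id. The plain-text prompt is also an assumption, since the card does not state which prompt template was used during fine-tuning.

```python
# Placeholder repository id (hypothetical): substitute the real Hub namespace.
MODEL_ID = "<hub-namespace>/Mistral-7B-v0.3-Middo-Alpaca-4o-mini"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Return a greedy completion for `prompt` (prompt text excluded)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # requires the `accelerate` package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Slice off the prompt tokens so only newly generated text is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain gradient accumulation in two sentences.")` downloads the weights on first use; a 7B model in bfloat16 needs roughly 15 GB of accelerator memory.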

## Training and evaluation data

### Training data

The training set is the Middo-optimized version of [Word2Li/Alpaca-4o-mini](https://huggingface.co/datasets/Word2Li/Alpaca-4o-mini), with the optimization performed on [mistralai/Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3).

### Evaluation data

- General
  - MMLU
  - IFEval
- Math
  - GSM8K
  - MATH
- Code
  - HumanEval
  - MBPP
- Reasoning
  - Hellaswag
  - GPQA

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1.0
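
The relationship between the batch-size entries, and the shape of the learning-rate schedule, can be sketched as follows. This is an illustration under stated assumptions rather than the training code itself: the schedule function follows the common linear-warmup plus half-cosine-decay formula, and `total_steps` is a hypothetical argument because the card does not list the number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-5
PER_DEVICE_TRAIN_BATCH_SIZE = 4
NUM_DEVICES = 8
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_RATIO = 0.03


def total_train_batch_size() -> int:
    """Samples consumed per optimizer step across all devices."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * NUM_DEVICES * GRADIENT_ACCUMULATION_STEPS


def cosine_lr(step: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    `total_steps` is an assumed input; it is not reported in this card.
    """
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```

`total_train_batch_size()` reproduces the reported value of 256 (4 per device × 8 devices × 8 accumulation steps). The schedule starts at zero, peaks at `1e-05` once warmup ends, and decays back to zero at the final step.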

### Training results

- epoch: 0.9995355318160706
- total_flos: 2.6430782143515853e+18
- train_loss: 0.8186407917937382
- train_runtime: 3697.6094
- train_samples_per_second: 18.627
- train_steps_per_second: 0.073
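
As a rough consistency check, the throughput figures above can be combined to estimate how many samples and optimizer steps the run covered. The results are estimates only, because `train_steps_per_second` is rounded to three decimals; the variable names are illustrative.

```python
# Values copied from the training results above.
TRAIN_RUNTIME_S = 3697.6094
SAMPLES_PER_SECOND = 18.627
STEPS_PER_SECOND = 0.073

# Throughput multiplied by wall-clock time gives approximate totals.
approx_samples = SAMPLES_PER_SECOND * TRAIN_RUNTIME_S
approx_steps = STEPS_PER_SECOND * TRAIN_RUNTIME_S

# Samples per optimizer step should sit close to total_train_batch_size.
approx_samples_per_step = approx_samples / approx_steps

print(f"samples ~ {approx_samples:.0f}")
print(f"steps ~ {approx_steps:.0f}")
print(f"samples per step ~ {approx_samples_per_step:.1f}")
```

The estimate comes out to about 270 optimizer steps at roughly 255 samples per step, which is in line with the reported `total_train_batch_size` of 256.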

### Framework versions

- Transformers 4.45.2
- PyTorch 2.5.1+cu121
- Datasets 2.21.0
- Tokenizers 0.20.1
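
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library; the `compare_versions` helper is hypothetical, and the mapping of "PyTorch" to the `torch` distribution name is an assumption about how the package is installed.

```python
from importlib import metadata

# Versions copied from the list above, keyed by pip distribution name.
PINNED = {
    "transformers": "4.45.2",
    "torch": "2.5.1+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.20.1",
}


def compare_versions(pinned=PINNED):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (expected, installed)
    return report


for name, (expected, installed) in compare_versions().items():
    status = "ok" if installed == expected else "differs"
    print(f"{name}: expected {expected}, installed {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the list records the environment used for training.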