Paper: FuseChat: Knowledge Fusion of Chat Models (arXiv:2408.07990)
A merge of four high-performing Llama-3.1-based models using the SCE merge method. This model was created with mergekit.
Models Merged
Base model: mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated
The following YAML configuration was used:

```yaml
models:
  - model: Hermes-3-Llama-3.1-8B
  - model: Llama-3.1-8B-Lexi-Uncensored-V2
  - model: Llama-3.1-SuperNova-Lite
  - model: Llama-3.1-Storm-8B
merge_method: sce
base_model: Meta-Llama-3.1-8B-Instruct-abliterated
parameters:
  select_topk: 1.5
dtype: bfloat16
```
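For illustration, here is a minimal sketch of how an SCE-style merge operates on flat parameter vectors, following the select/calculate/erase steps described in the FuseChat paper. The function name and exact details (tie-breaking, weight normalization) are assumptions for this sketch, not mergekit's actual implementation:

```python
import numpy as np

def sce_merge(base, models, select_topk=0.5):
    # Task vectors: each model's parameter delta from the base model.
    deltas = np.stack([m - base for m in models])          # shape (n_models, n_params)
    # Select: keep only the top-k fraction of elements by cross-model variance.
    var = deltas.var(axis=0)
    k = max(1, int(select_topk * var.size))
    mask = np.zeros(var.size, dtype=bool)
    mask[np.argsort(var)[-k:]] = True
    deltas = deltas * mask
    # Calculate: derive per-model fusion weights from the energy of selected elements.
    weights = (deltas ** 2).sum(axis=1)
    weights = weights / weights.sum()
    # Erase: zero out elements whose sign disagrees with the weighted majority sign.
    majority = np.sign((weights[:, None] * deltas).sum(axis=0))
    deltas = np.where(np.sign(deltas) == majority, deltas, 0.0)
    # Fuse: apply the weighted combination of surviving deltas to the base parameters.
    return base + (weights[:, None] * deltas).sum(axis=0)
```

In mergekit, `select_topk` controls the selection step above (the fraction of elements retained by variance); the config itself would be applied with the `mergekit-yaml` command-line tool.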
Detailed and summarized evaluation results are available on the Open LLM Leaderboard.
| Metric | Value (%) |
|---|---|
| Average | 29.43 |
| IFEval (0-shot) | 78.35 |
| BBH (3-shot) | 32.55 |
| MATH Lvl 5 (4-shot) | 16.16 |
| GPQA (0-shot) | 9.73 |
| MuSR (0-shot) | 8.20 |
| MMLU-PRO (5-shot) | 31.60 |