---
license: mit
language:
- en
- ja
- ko
- vi
base_model:
- Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
library_name: transformers
---

# NxMobileLM-1.5B-SFT
## Model Description

`NxMobileLM-1.5B-SFT` is a fine-tuned version of the base model `Qwen2.5-1.5B`, optimized for mobile and edge applications. The model was trained on proprietary instruction datasets curated to enhance performance in natural language understanding and generation tasks tailored to specific applications.

### Key Features:

- **Base Model:** Qwen2.5-1.5B
- **Parameter Count:** 1.5 billion
- **Fine-tuning Objective:** Supervised fine-tuning (SFT) on instruction datasets.
- **Specialization:** Lightweight and efficient performance for mobile environments.
- **Multilingual Support:** Designed to handle multiple languages effectively, enabling robust cross-lingual capabilities for diverse applications.
## Model Details

### Training Data

The model was fine-tuned using a proprietary dataset designed for diverse instruction-following tasks, including question answering, summarization, and dialogue. The dataset emphasizes:

- Multi-domain generalization
- Task-specific instruction understanding
- Multilingual coverage, with samples from several major languages to enhance cross-lingual understanding
### Training Configuration

- **Framework:** PyTorch
- **Optimizer:** AdamW
- **Learning Rate:** 5e-5
- **Batch Size:** 128
- **Epochs:** 3
- **Mixed Precision:** FP16
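For reference, the hyperparameters above map roughly onto a Hugging Face `TrainingArguments` setup. This is a minimal sketch rather than the actual training script: the output directory, the per-device batch size / accumulation split, and any scheduler or warmup settings are assumptions, since the instruction data and full configuration are proprietary.

```python
from transformers import TrainingArguments

# Minimal sketch of the reported hyperparameters (not the original training script).
training_args = TrainingArguments(
    output_dir="nxmobilelm-1.5b-sft",   # hypothetical output path
    per_device_train_batch_size=16,     # assumes 8 devices for a global batch size of 128
    gradient_accumulation_steps=1,
    learning_rate=5e-5,                 # AdamW learning rate
    num_train_epochs=3,
    optim="adamw_torch",                # AdamW optimizer
    fp16=True,                          # FP16 mixed precision
)
```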
### Evaluation

The model was evaluated on a variety of benchmarks, demonstrating:

- **High Accuracy:** Achieves strong performance across general natural language tasks.
- **Efficiency:** Optimized for low-latency inference on edge devices.
- **Multilingual Competence:** Strong performance across multiple languages, making it suitable for global applications.
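Latency depends heavily on the target hardware, so it is worth measuring on your own device. The snippet below is an illustrative sketch only; the prompt, token budget, and timing method are arbitrary choices and not part of the official evaluation:

```python
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "NTQAI/NxMobileLM-1.5B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

inputs = tokenizer("Summarize: small language models trade capacity for speed.", return_tensors="pt")

# Time a fixed generation budget to get a rough tokens-per-second estimate.
start = time.perf_counter()
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
elapsed = time.perf_counter() - start

generated = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{generated} tokens in {elapsed:.2f}s ({generated / elapsed:.1f} tok/s)")
```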
### Performance Comparison

#### Open-LLM Leaderboard

As of January 15, 2025, NxMobileLM-1.5B-SFT ranked among the top 10 edge-device models with fewer than 3 billion parameters, and first among models with fewer than 2 billion parameters, according to the [OpenLLM leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?params=0%2C3).

![NxMobileLM-1.5B-SFT on the OpenLLM leaderboard](https://cdn-uploads.huggingface.co/production/uploads/5ee1b417636bdb3834e2da19/4V3oMn1tGnJfMHxzPTDwj.png)
#### P-MMEval

To evaluate the model's multilingual capabilities, we conducted evaluations on several benchmarks across three languages: English (en), Japanese (ja), and Vietnamese (vi). For detailed benchmark information, refer to [P-MMEval](https://huggingface.co/datasets/Qwen/P-MMEval).

| Benchmark   | Llama-3.2-1B-Instruct | SmolLM2-1.7B-Instruct | Qwen2.5-1.5B-Instruct | NxMobileLM-1.5B-SFT |
|-------------|-----------------------|-----------------------|-----------------------|---------------------|
| mifeval-en  | 44.79                 | 43.75                 | 50                    | **57.29**           |
| mifeval-ja  | 22.92                 | 23.96                 | 29.17                 | **30.21**           |
| mifeval-vi  | 30.21                 | 25                    | 28.12                 | **46.88**           |
| mmmlu-EN-US | 35.25                 | 42.5                  | **45.5**              | 45.25               |
| mmmlu-JA-JP | 31.5                  | 26.25                 | 36.00                 | **41.00**           |
| mmmlu-VI-VT | 22.75                 | 22.25                 | **39.00**             | 38.00               |
| xnli-en     | 35.83                 | 35.83                 | 59.17                 | **66.67**           |
| xnli-ja     | 34.17                 | 35.83                 | 52.5                  | **57.5**            |
| xnli-vi     | 37.5                  | 34.17                 | 45.83                 | **55.83**           |
| **Average** | 32.21                 | 31.61                 | 42.93                 | **48.07**           |
#### LightEval

The table below compares `NxMobileLM-1.5B-SFT` with other instruction-tuned models on various benchmarks. Results were obtained with the [lighteval](https://github.com/huggingface/lighteval) evaluation framework; baseline numbers are taken from [Hugging Face TB](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct):

| Metric                       | SmolLM2-1.7B-Instruct | Llama-3.2-1B-Instruct | Qwen2.5-1.5B-Instruct | SmolLM1-1.7B-Instruct | NxMobileLM-1.5B-SFT |
|------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|---------------------|
| IFEval (Average prompt/inst) | 56.7                  | 53.5                  | 47.4                  | 23.1                  | **64.2**            |
| HellaSwag                    | **66.1**              | 56.1                  | 60.9                  | 55.5                  | 63.57               |
| ARC (Average)                | **51.7**              | 41.6                  | 46.2                  | 43.7                  | 45.21               |
| PIQA                         | **74.4**              | 72.3                  | 73.2                  | 71.6                  | 72.91               |
| MMLU-Pro (MCF)               | 19.3                  | 12.7                  | **24.2**              | 11.7                  | 15.43               |
| BBH (3-shot)                 | 32.2                  | 27.6                  | **35.3**              | 25.7                  | 31.44               |
| GSM8K (5-shot)               | 48.2                  | 26.8                  | 42.8                  | 4.62                  | **59.51**           |
| **Average**                  | 49.8                  | 41.5                  | 47.1                  | 33.7                  | **50.3**            |
### Limitations

While `NxMobileLM-1.5B-SFT` excels in many areas, it may not perform well on tasks outside the scope of the fine-tuned dataset. Biases inherent in the training data may also affect outcomes.
## Intended Use

`NxMobileLM-1.5B-SFT` is designed for use in:

- Mobile virtual assistants
- Real-time language-based applications
- Compact edge AI solutions
- Multilingual scenarios requiring cross-lingual communication and understanding

**Misuse Warning:** The model is not intended for use in generating harmful, biased, or illegal content.
## How to Use

Here is a sample code snippet to load and use the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "NTQAI/NxMobileLM-1.5B-SFT"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example usage
inputs = tokenizer("What is the capital of Vietnam?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
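Because the model is instruction-tuned, chat-formatted prompts may work better than raw text. The sketch below assumes the tokenizer ships a chat template (e.g., inherited from Qwen2.5); check `tokenizer.chat_template` before relying on it:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "NTQAI/NxMobileLM-1.5B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format a single-turn conversation with the tokenizer's chat template
# (assumed to be present; verify tokenizer.chat_template first).
messages = [{"role": "user", "content": "What is the capital of Vietnam?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```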
## Citation

If you use this model in your research, please cite it as:

```
@misc{NxMobileLM-1.5B-SFT,
  title={NxMobileLM-1.5B-SFT},
  author={NTQAI},
  year={2025},
  url={https://huggingface.co/NTQAI/NxMobileLM-1.5B-SFT}
}
```
## License

This model is licensed under the MIT License.

## Contact

For questions or issues, please contact us via our website: https://ntq.ai