---
library_name: transformers
tags:
- qwen3
- sft
- unsloth
- philosophical
- esoteric
base_model:
- Qwen/Qwen3-4B
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---

# Model Card for prophet-qwen3-4b-sft

Model Image

## Model Details

### Model Description

This model is a fine-tuned version of `Qwen/Qwen3-4B`. Training was conducted with **Supervised Fine-Tuning (SFT)** using the `Unsloth` library on a custom reasoning and non-reasoning dataset. The model focuses on philosophical and esoteric topics and is multilingual.

- **Developed by:** radm
- **Finetuned from model:** `Qwen/Qwen3-4B`
- **Model type:** Causal LM based on the Qwen3 architecture
- **Language(s):** Multilingual
- **License:** Apache 2.0 (inherited from the base model)

## Uses

This is a reasoning model, but you can append `\n/no_think` to user prompts or system messages to disable the model's thinking mode on a per-turn basis (see the inference sketch at the end of this card).

### Out-of-Scope Use

The model is not designed for generating harmful, unethical, biased, or factually incorrect content. Performance on tasks outside its training domain (philosophical/esoteric chat) may be suboptimal.

## Bias, Risks, and Limitations

The model inherits biases from its base model (`Qwen/Qwen3-4B`) and the fine-tuning datasets. It may generate plausible-sounding but incorrect or nonsensical information, especially on complex topics. Its "understanding" is based on patterns in the data, not genuine comprehension or consciousness. Use the outputs with critical judgment.

## Training Details

### Training Data

The model was fine-tuned on a custom dataset containing both reasoning and non-reasoning examples.

### Training Procedure

Training was performed using the `Unsloth` library integrated with `trl`'s `SFTTrainer`; a hedged configuration sketch follows the hyperparameter list below.

- **Framework:** Unsloth + SFTTrainer
- **Base Model:** `Qwen/Qwen3-4B`
- **LoRA Configuration:**
  - `r`: 768
  - `lora_alpha`: 768
  - `lora_dropout`: 0.0
  - `bias`: "none"
  - `target_modules`: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  - `use_rslora`: True
  - `use_dora`: True
- **Precision:** Auto (bfloat16 / float16)
- **Quantization (load):** 4-bit
- **Optimizer:** Paged AdamW 8-bit
- **Learning Rate:** 2e-5
- **LR Scheduler:** Cosine
- **Warmup Steps:** 10
- **Batch Size (per device):** 1
- **Gradient Accumulation Steps:** 64 (effective batch size: 64)
- **Max Sequence Length:** 4096
- **Epochs:** 1
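
The following is a minimal sketch of how the hyperparameters above could be assembled with Unsloth and `trl`'s `SFTTrainer`. It is not the exact training script; the dataset path and `output_dir` are placeholders, and argument names may need adjusting for your Unsloth/trl versions (in particular, DoRA support depends on the Unsloth release).

```python
# Hedged sketch of the fine-tuning setup described above.
# The dataset file and output directory are placeholders, not the actual training data.
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

max_seq_length = 4096

# Load the base model in 4-bit with auto precision (bfloat16 / float16).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-4B",
    max_seq_length=max_seq_length,
    dtype=None,          # auto-detect bfloat16 / float16
    load_in_4bit=True,
)

# Attach LoRA adapters mirroring the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=768,
    lora_alpha=768,
    lora_dropout=0.0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,
    use_dora=True,       # as stated in the card; requires an Unsloth version with DoRA support
)

# Placeholder dataset of chat-formatted examples with a "text" column.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=64,   # effective batch size 64
        learning_rate=2e-5,
        lr_scheduler_type="cosine",
        warmup_steps=10,
        num_train_epochs=1,
        optim="paged_adamw_8bit",
        max_seq_length=max_seq_length,
        output_dir="outputs",             # placeholder
    ),
)
trainer.train()
```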
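
For reference, a minimal inference sketch with `transformers` showing the `/no_think` toggle described in the Uses section. The repository id and generation parameters are illustrative assumptions.

```python
# Hedged inference sketch; the model id is assumed from this card's title and author.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "radm/prophet-qwen3-4b-sft"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Append "\n/no_think" to a user message to disable thinking mode for that turn.
messages = [
    {"role": "user",
     "content": "What does the Tao Te Ching say about effortless action?\n/no_think"},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```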