---
library_name: transformers
tags:
- qwen3
- sft
- unsloth
- philosophical
- esoteric
base_model:
- Qwen/Qwen3-4B
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---

# Model Card for prophet-qwen3-4b-sft

Model Image

## Model Details

### Model Description

This model is a fine-tuned version of `Qwen/Qwen3-4B`. Training was conducted with **Supervised Fine-Tuning (SFT)** using the `Unsloth` library on a custom reasoning and non-reasoning dataset. The model focuses on philosophical and esoteric topics and is multilingual.

- **Developed by:** radm
- **Finetuned from model:** `Qwen/Qwen3-4B`
- **Model type:** Causal LM based on the Qwen3 architecture
- **Language(s):** Multilingual
- **License:** Apache 2.0 (inherited from the base model)

## Uses

This is a reasoning model, but you can append `\n/no_think` to user prompts or system messages to disable the model's thinking mode on a per-turn basis (see the inference sketch at the end of this card).

### Out-of-Scope Use

The model is not designed for generating harmful, unethical, biased, or factually incorrect content. Performance on tasks outside its training domain (philosophical/esoteric chat) may be suboptimal.

## Bias, Risks, and Limitations

The model inherits biases from its base model (`Qwen/Qwen3-4B`) and the fine-tuning datasets. It may generate plausible-sounding but incorrect or nonsensical information, especially on complex topics. Its "understanding" is based on patterns in the data, not genuine comprehension or consciousness. Use the outputs with critical judgment.

## Training Details

### Training Data

The model was fine-tuned on a custom dataset containing both reasoning and non-reasoning examples.

### Training Procedure

Training was performed using the `Unsloth` library integrated with `trl`'s `SFTTrainer`; a hedged configuration sketch follows the hyperparameter list below.

- **Framework:** Unsloth + SFTTrainer
- **Base Model:** `Qwen/Qwen3-4B`
- **LoRA Configuration:**
  - `r`: 768
  - `lora_alpha`: 768
  - `lora_dropout`: 0.0
  - `bias`: "none"
  - `target_modules`: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  - `use_rslora`: True
  - `use_dora`: True
- **Precision:** Auto (bfloat16 / float16)
- **Quantization (load):** 4-bit
- **Optimizer:** Paged AdamW 8-bit
- **Learning Rate:** 2e-5
- **LR Scheduler:** Cosine
- **Warmup Steps:** 10
- **Batch Size (per device):** 1
- **Gradient Accumulation Steps:** 64 (effective batch size: 64)
- **Max Sequence Length:** 4096
- **Epochs:** 1
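
The following is a minimal sketch of how the hyperparameters above could be assembled with Unsloth and `trl`'s `SFTTrainer`. It is not the exact training script; the dataset path and `output_dir` are placeholders, and argument names may need adjusting for your Unsloth/trl versions (in particular, DoRA support depends on the Unsloth release).

```python
# Hedged sketch of the fine-tuning setup described above.
# The dataset file and output directory are placeholders, not the actual training data.
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

max_seq_length = 4096

# Load the base model in 4-bit with auto precision (bfloat16 / float16).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-4B",
    max_seq_length=max_seq_length,
    dtype=None,          # auto-detect bfloat16 / float16
    load_in_4bit=True,
)

# Attach LoRA adapters mirroring the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=768,
    lora_alpha=768,
    lora_dropout=0.0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,
    use_dora=True,       # as stated in the card; requires an Unsloth version with DoRA support
)

# Placeholder dataset of chat-formatted examples with a "text" column.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=64,   # effective batch size 64
        learning_rate=2e-5,
        lr_scheduler_type="cosine",
        warmup_steps=10,
        num_train_epochs=1,
        optim="paged_adamw_8bit",
        max_seq_length=max_seq_length,
        output_dir="outputs",             # placeholder
    ),
)
trainer.train()
```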
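
For reference, a minimal inference sketch with `transformers` showing the `/no_think` toggle described in the Uses section. The repository id and generation parameters are illustrative assumptions.

```python
# Hedged inference sketch; the model id is assumed from this card's title and author.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "radm/prophet-qwen3-4b-sft"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Append "\n/no_think" to a user message to disable thinking mode for that turn.
messages = [
    {"role": "user",
     "content": "What does the Tao Te Ching say about effortless action?\n/no_think"},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```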