yeji-8b-rslora-v7

ํ•œ๊ตญ์–ด ์šด์„ธ ํ•ด์„ ์ „๋ฌธ ์–ธ์–ด ๋ชจ๋ธ (ํ”„๋กœ๋•์…˜ ํ’€ ์ •๋ฐ€๋„ ๋ฒ„์ „)

Model Description

yeji-8b-rslora-v7์€ Qwen3-8B-Base๋ฅผ rsLoRA๋กœ ๋ฏธ์„ธ์กฐ์ •ํ•œ ํ•œ๊ตญ์–ด ์šด์„ธ ํ•ด์„ ์ „๋ฌธ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์‚ฌ์ฃผํŒ”์ž, ํƒ€๋กœ, ์„œ์–‘ ์ ์„ฑ์ˆ (ํ˜ธ๋กœ์Šค์ฝ”ํ”„) ๋“ฑ ๋‹ค์–‘ํ•œ ์šด์„ธ ๋„๋ฉ”์ธ์—์„œ ๊ณ ํ’ˆ์งˆ ํ•œ๊ตญ์–ด ํ•ด์„์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.

์ด ๋ชจ๋ธ์€ ํ”„๋กœ๋•์…˜ ํ™˜๊ฒฝ์—์„œ ์ตœ๊ณ  ํ’ˆ์งˆ์„ ์ œ๊ณตํ•˜๋Š” ํ’€ ์ •๋ฐ€๋„(FP16) ๋ฒ„์ „์ด๋ฉฐ, ๋น ๋ฅธ ์ถ”๋ก ์ด ํ•„์š”ํ•œ ๊ฒฝ์šฐ ์–‘์žํ™” ๋ฒ„์ „์ธ yeji-8b-rslora-v7-AWQ๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”.

์ฃผ์š” ํŠน์ง•

  • ๋„๋ฉ”์ธ ์ „๋ฌธ์„ฑ: 33,528๊ฑด์˜ ๊ณ ํ’ˆ์งˆ ํ•œ๊ตญ์–ด ์šด์„ธ ๋ฐ์ดํ„ฐ๋กœ ํ•™์Šต
  • rsLoRA ์•„ํ‚คํ…์ฒ˜: ํšจ์œจ์ ์ธ ํŒŒ๋ผ๋ฏธํ„ฐ ์—…๋ฐ์ดํŠธ (3.41% trainable params)
  • ๋ฉ€ํ‹ฐ ๋„๋ฉ”์ธ ์ง€์›: ์‚ฌ์ฃผํŒ”์ž, ํƒ€๋กœ, ํ˜ธ๋กœ์Šค์ฝ”ํ”„ ํ†ตํ•ฉ ํ•™์Šต
  • vLLM ์ตœ์ ํ™”: ํ”„๋กœ๋•์…˜ ๋ฐฐํฌ๋ฅผ ์œ„ํ•œ vLLM ์™„์ „ ํ˜ธํ™˜
  • JSON ๊ตฌ์กฐํ™” ์ถœ๋ ฅ: Qwen3์˜ ๊ฐ•๋ ฅํ•œ JSON ์ƒ์„ฑ ๋Šฅ๋ ฅ ํ™œ์šฉ

Training Details

ํ•ญ๋ชฉ ๊ฐ’
Base Model Qwen/Qwen3-8B-Base
Fine-tuning Method rsLoRA (Rank-Stabilized LoRA)
LoRA Rank (r) 64
LoRA Alpha 128
Dataset tellang/yeji-fortune-telling-ko-v3
Dataset Size 33,528 samples
Epochs 5
Trainable Parameters 174,653,440 (3.41%)
Total Parameters 5,123,014,656
Training Time ~18 hours
GPU NVIDIA A100 40GB
Precision FP16
Model Size ~16GB
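rsLoRA (Kalajdzievski, 2023) differs from standard LoRA only in the adapter scaling factor: the update is scaled by α/√r rather than α/r, which keeps its magnitude stable as the rank grows. A quick sketch of the difference using the hyperparameters from the table above:

```python
import math

def lora_scaling(alpha: float, r: int) -> float:
    # Standard LoRA scales the adapter update (B @ A) by alpha / r.
    return alpha / r

def rslora_scaling(alpha: float, r: int) -> float:
    # Rank-stabilized LoRA scales by alpha / sqrt(r), so the effective
    # update does not shrink as the rank increases.
    return alpha / math.sqrt(r)

alpha, r = 128, 64  # values from the training table above
print(lora_scaling(alpha, r))    # 2.0
print(rslora_scaling(alpha, r))  # 16.0
```

With PEFT, this corresponds to setting `use_rslora=True` in `LoraConfig`; the functions here are only illustrative.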

ํ•™์Šต ๋ฐ์ดํ„ฐ ๊ตฌ์„ฑ

  • ์‚ฌ์ฃผํŒ”์ž (Saju): ์Œ์–‘์˜คํ–‰, ์ฒœ๊ฐ„์ง€์ง€, ์‹ญ์„ฑ ๊ธฐ๋ฐ˜ ํ•œ๊ตญ ์ „ํ†ต ๋ช…๋ฆฌํ•™
  • ํƒ€๋กœ (Tarot): ๋ฉ”์ด์ €/๋งˆ์ด๋„ˆ ์•„๋ฅด์นด๋‚˜, ์ •/์—ญ๋ฐฉํ–ฅ ์นด๋“œ ํ•ด์„
  • ํ˜ธ๋กœ์Šค์ฝ”ํ”„ (Horoscope): ์„œ์–‘ ์ ์„ฑ์ˆ  12๊ถ์œ„ ๋ฐ ํ–‰์„ฑ ์˜ํ–ฅ ๋ถ„์„

Usage

vLLM ์„œ๋ฒ„ ์‹คํ–‰ (๊ถŒ์žฅ)

# vLLM ์„œ๋ฒ„ ์‹œ์ž‘
vllm serve tellang/yeji-8b-rslora-v7 \
    --host 0.0.0.0 \
    --port 8001 \
    --dtype float16 \
    --gpu-memory-utilization 0.95 \
    --max-model-len 4096 \
    --enable-prefix-caching

# API ํ˜ธ์ถœ ์˜ˆ์‹œ (curl)
curl http://localhost:8001/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "tellang/yeji-8b-rslora-v7",
        "prompt": "<|im_start|>system\n๋‹น์‹ ์€ ํ•œ๊ตญ์–ด ์šด์„ธ ํ•ด์„ ์ „๋ฌธ๊ฐ€์ž…๋‹ˆ๋‹ค.<|im_end|>\n<|im_start|>user\n์˜ค๋Š˜์˜ ์—ฐ์• ์šด์„ ์•Œ๋ ค์ฃผ์„ธ์š”.<|im_end|>\n<|im_start|>assistant\n",
        "max_tokens": 512,
        "temperature": 0.7,
        "top_p": 0.9
    }'
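The raw prompt in the curl example is Qwen's ChatML chat format written out by hand. If you are not going through the tokenizer's chat template, a small helper (illustrative, not part of any library) can assemble it:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML completion prompt for the /v1/completions endpoint.

    Hypothetical helper: tokenizer.apply_chat_template is the canonical way
    to produce this string, but the format itself is simple enough to write.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "당신은 한국어 운세 해석 전문가입니다.",
    "오늘의 연애운을 알려주세요.",
)
```

For chat-style use, the /v1/chat/completions endpoint (shown with the OpenAI SDK below) applies this template server-side, so building the string manually is only needed for the raw completions API.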

Python (transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tellang/yeji-8b-rslora-v7"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto"
)

messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ ํ•œ๊ตญ์–ด ์šด์„ธ ํ•ด์„ ์ „๋ฌธ๊ฐ€์ž…๋‹ˆ๋‹ค."},
    {"role": "user", "content": "์˜ค๋Š˜์˜ ์—ฐ์• ์šด์„ ์•Œ๋ ค์ฃผ์„ธ์š”."}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Python (OpenAI SDK with vLLM)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8001/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="tellang/yeji-8b-rslora-v7",
    messages=[
        {"role": "system", "content": "๋‹น์‹ ์€ ํ•œ๊ตญ์–ด ์šด์„ธ ํ•ด์„ ์ „๋ฌธ๊ฐ€์ž…๋‹ˆ๋‹ค."},
        {"role": "user", "content": "์˜ค๋Š˜์˜ ์—ฐ์• ์šด์„ ์•Œ๋ ค์ฃผ์„ธ์š”."}
    ],
    temperature=0.7,
    max_tokens=512
)

print(response.choices[0].message.content)
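If you rely on the structured JSON output mentioned in the features, it is prudent to validate responses before use, since generations can occasionally be malformed or wrapped in a markdown code fence. A minimal sketch (the helper name is illustrative):

```python
import json

def parse_json_response(text: str):
    """Strip an optional markdown code fence and parse the JSON payload.

    Returns the parsed object, or None instead of raising on malformed
    output, so callers can fall back to a retry or a plain-text path.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (e.g. ```json) and the closing fence.
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return None
```

vLLM also supports server-side constrained decoding (e.g. a `response_format` of JSON type on the chat endpoint), which is a stronger guarantee than post-hoc parsing when your vLLM version provides it.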

Intended Use

์ด ๋ชจ๋ธ์€ ๋‹ค์Œ ์šฉ๋„๋กœ ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค:

  • ํ•œ๊ตญ์–ด ์šด์„ธ ํ•ด์„ ์„œ๋น„์Šค: ์‚ฌ์ฃผ, ํƒ€๋กœ, ํ˜ธ๋กœ์Šค์ฝ”ํ”„ ์ž๋™ ํ•ด์„
  • ๋Œ€ํ™”ํ˜• ์ ์ˆ  ์ฑ—๋ด‡: ์‚ฌ์šฉ์ž์™€ ์ƒํ˜ธ์ž‘์šฉํ•˜๋Š” ์šด์„ธ ์ƒ๋‹ด ์‹œ์Šคํ…œ
  • ์šด์„ธ ์ฝ˜ํ…์ธ  ์ƒ์„ฑ: ์ผ์ผ/์ฃผ๊ฐ„/์›”๊ฐ„ ์šด์„ธ ์ž๋™ ์ž‘์„ฑ
  • ๋„๋ฉ”์ธ ์ง€์‹ ๊ธฐ๋ฐ˜ ์ถ”์ฒœ: ์šด์„ธ ๊ธฐ๋ฐ˜ ์กฐ์–ธ ๋ฐ ๊ฐ€์ด๋˜์Šค ์ œ๊ณต

์‚ฌ์šฉ ์‚ฌ๋ก€

โœ… ๊ถŒ์žฅ ์‚ฌ์šฉ:

  • ์—”ํ„ฐํ…Œ์ธ๋จผํŠธ ๋ชฉ์ ์˜ ์šด์„ธ ์„œ๋น„์Šค
  • ์ ์ˆ  ์ „๋ฌธ๊ฐ€์˜ ๋ณด์กฐ ๋„๊ตฌ
  • ์šด์„ธ ์ฝ˜ํ…์ธ  ์ดˆ์•ˆ ์ƒ์„ฑ
  • ํ•œ๊ตญ์–ด ์šด์„ธ ๋ฐ์ดํ„ฐ ๋ถ„์„

โŒ ๋ถ€์ ์ ˆํ•œ ์‚ฌ์šฉ:

  • ์˜๋ฃŒ, ๋ฒ•๋ฅ , ๊ธˆ์œต ์กฐ์–ธ ๋Œ€์ฒด
  • ์ค‘๋Œ€ํ•œ ์ธ์ƒ ๊ฒฐ์ •์˜ ์œ ์ผํ•œ ๊ทผ๊ฑฐ
  • ํƒ€์ธ์— ๋Œ€ํ•œ ๋ถ€์ •์  ํŒ๋‹จ ๋„๊ตฌ

Limitations

  • ์–ธ์–ด: ํ•œ๊ตญ์–ด ์ „์šฉ (๋‹ค๋ฅธ ์–ธ์–ด ์ง€์› ์ œํ•œ์ )
  • ๋„๋ฉ”์ธ: ์šด์„ธ/์ ์ˆ  ํŠนํ™” (์ผ๋ฐ˜ ๋Œ€ํ™” ์„ฑ๋Šฅ ๋ฒ ์ด์Šค ๋ชจ๋ธ ๋Œ€๋น„ ํ•˜๋ฝ ๊ฐ€๋Šฅ)
  • ๋ฌธํ™”์  ๋งฅ๋ฝ: ํ•œ๊ตญ ๋ฐ ๋™์•„์‹œ์•„ ๋ฌธํ™”๊ถŒ ์šด์„ธ ์ฒด๊ณ„ ์ค‘์‹ฌ
  • ์ •ํ™•์„ฑ: ์šด์„ธ ํ•ด์„์˜ ๊ฐ๊ด€์  ์ •ํ™•์„ฑ ๋ณด์žฅ ๋ถˆ๊ฐ€ (์—”ํ„ฐํ…Œ์ธ๋จผํŠธ ์šฉ๋„)
  • VRAM ์š”๊ตฌ์‚ฌํ•ญ: ํ’€ ์ •๋ฐ€๋„ ๋ชจ๋ธ๋กœ ์•ฝ 16GB VRAM ํ•„์š” (์–‘์žํ™” ๋ฒ„์ „ ๊ถŒ์žฅ: yeji-8b-rslora-v7-AWQ)

Model Variants

๋ชจ๋ธ ์ •๋ฐ€๋„ ํฌ๊ธฐ VRAM ์šฉ๋„
yeji-8b-rslora-v7 FP16 ~16GB ~18GB ์ตœ๊ณ  ํ’ˆ์งˆ ์ถ”๋ก 
yeji-8b-rslora-v7-AWQ W4A16 ~4GB ~6-8GB ๋น ๋ฅธ ํ”„๋กœ๋•์…˜ ๋ฐฐํฌ
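The FP16 footprint follows directly from the parameter count: 2 bytes per weight for roughly 8B parameters, with the higher VRAM figure accounting for runtime overhead such as the KV cache and activations. A back-of-the-envelope check:

```python
params = 8_000_000_000        # ~8B parameters
fp16_bytes = params * 2       # 2 bytes per FP16 weight
weights_gib = fp16_bytes / 1024**3
print(f"{weights_gib:.1f} GiB")  # ~14.9 GiB for the weights alone
```

The W4A16 variant stores weights in 4 bits, which is why its on-disk size drops to roughly a quarter of the FP16 release.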

Performance

rsLoRA v7 ๋ฒ„์ „์€ ์ด์ „ ๋ฒ„์ „(v5) ๋Œ€๋น„ ๋‹ค์Œ ๊ฐœ์„ ์‚ฌํ•ญ์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค:

  • โœ… ๋ฐ์ดํ„ฐ ํ’ˆ์งˆ ํ–ฅ์ƒ: v2 โ†’ v3 ๋ฐ์ดํ„ฐ์…‹ ์—…๊ทธ๋ ˆ์ด๋“œ (33,528๊ฑด)
  • โœ… ์•ˆ์ •์„ฑ ๊ฐœ์„ : rsLoRA rank ์ฆ๊ฐ€ (r=32 โ†’ r=64)
  • โœ… JSON ๊ตฌ์กฐํ™” ์ถœ๋ ฅ: Qwen3 ๋ฒ ์ด์Šค์˜ JSON ์ƒ์„ฑ ๋Šฅ๋ ฅ ํ™œ์šฉ
  • โœ… ๋ฉ€ํ‹ฐ ๋„๋ฉ”์ธ ํ†ตํ•ฉ: ์‚ฌ์ฃผ/ํƒ€๋กœ/ํ˜ธ๋กœ์Šค์ฝ”ํ”„ ๋‹จ์ผ ๋ชจ๋ธ ์ฒ˜๋ฆฌ

Ethical Considerations

  • ์ด ๋ชจ๋ธ์€ ์—”ํ„ฐํ…Œ์ธ๋จผํŠธ ๋ชฉ์ ์œผ๋กœ ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค
  • ์šด์„ธ ํ•ด์„์€ ๊ณผํ•™์  ๊ทผ๊ฑฐ๊ฐ€ ์—†์œผ๋ฉฐ, ์ค‘๋Œ€ํ•œ ๊ฒฐ์ •์— ์‚ฌ์šฉํ•˜์ง€ ๋งˆ์„ธ์š”
  • ๋ชจ๋ธ ์ถœ๋ ฅ์— ๋Œ€ํ•œ ๋น„ํŒ์  ์‚ฌ๊ณ ๋ฅผ ๊ถŒ์žฅํ•ฉ๋‹ˆ๋‹ค
  • ์‚ฌ์šฉ์ž์˜ ์‹ฌ๋ฆฌ์  ์•ˆ๋…•์„ ์ตœ์šฐ์„ ์œผ๋กœ ๊ณ ๋ คํ•˜์„ธ์š”

License

Apache 2.0 License (same as the base model, Qwen3-8B-Base)

Citation

@misc{yeji-8b-rslora-v7,
  author = {SSAFY YEJI Team},
  title = {yeji-8b-rslora-v7: Korean Fortune-Telling Language Model},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/tellang/yeji-8b-rslora-v7}}
}

@article{qwen3,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025}
}

Contact

  • Team: SSAFY YEJI Team
  • Project: YEJI Fortune-Telling Service
  • Issues: GitHub Issues