AI & ML interests
None yet
Seongyun/qwen3-8b-thinking-rare-ckpt-100
8B • Updated
Seongyun/qwen3-4b-thinking-rare-ckpt-109
4B • Updated
Seongyun/qwen3-4b-thinking-rl-ckpt-109
4B • Updated
Seongyun/qwen3-4b-thinking-rl-ckpt60
4B • Updated • 2
Seongyun/exaone_deep_2.4b_non_math_only_mcqa_format
Updated
Seongyun/math_only_mcqa_format
2B • Updated • 1
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_sample_190k_6
Updated
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_sample_190k_5
2B • Updated
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_sample_190k_4
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_mcqa_repetition_penalty_2
Text Generation • 2B • Updated
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_mcqa_repetition_penalty
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_math_2
Text Generation • 2B • Updated
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_math
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_sample_190k_3
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_sample_190k_2
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_sample_190k
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_code_code2
Updated
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_code_code
Text Generation • 2B • Updated
Seongyun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_pref_repetition_penalty
Text Generation • 2B • Updated • 2
Seongyun/ds_r1_distill_qwen_1.5b_grpo_pref
Updated
Seongyun/OLMo300m-Fineweb-Edu-100bt-latest
Updated