Models: 5,819

Active filters: 2305.18290

sayantan0013/ultrafeedback-binarized-preferences-cleaned_math-stack_Qwen3-0_ramp

Text Generation • 0.6B • Updated Jun 9, 2025 • 1

kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.4-beta-0.6-2-epochs

Text Generation • 3B • Updated Jun 9, 2025 • 1

lindsaybordier/Qwen3-0.6B-DPO_not-robust_final-dataset_acc4_beta0.07

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

sayantan0013/rubi_DPO_phase_1

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

obiwit/qwen2.5-3b-dpo-coarse-no-tools

Text Generation • 3B • Updated Jun 16, 2025 • 1

obiwit/qwen2.5-3b-dpo-finegrained-no-tools

Text Generation • 3B • Updated Jun 16, 2025 • 1

kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.6-beta-0.6-2-epochs

Text Generation • 3B • Updated Jun 10, 2025 • 1

sayantan0013/Qwen_DPO_phase_1

Text Generation • 0.6B • Updated Jun 10, 2025 • 2

Hugues898/MNLP_M3_dpo_model

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

lindsaybordier/Qwen3-0.6B-DPO_not-robust_final-dataset_acc4_beta0.10

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

obiwit/qwen2.5-3b-dpo-vanilla-no-tools

Text Generation • 3B • Updated Jun 16, 2025

kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.8-beta-0.6-2-epochs

Text Generation • 3B • Updated Jun 10, 2025 • 1

sergioalves/6d4842a1-29c4-40b6-a2ea-65bcc11b3644

Text Generation • Updated Jun 10, 2025 • 1

sayantan0013/rubi_only_ultrafeeback_phase_2

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

lindsaybordier/Qwen3-0.6B-DPO_not-robust_final-dataset_acc4_beta0.13

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

AmberYifan/Qwen2.5-7B-Instruct-ultrafeedback-20k

Text Generation • 8B • Updated Jun 10, 2025 • 1 • 1

kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-1-beta-0.6-2-epochs

Text Generation • 3B • Updated Jun 10, 2025 • 2

sayantan0013/Qwen_DPO_phase_2

Text Generation • 0.6B • Updated Jun 10, 2025 • 1

sayantan0013/Qwen_DPO_phase_3

Text Generation • 0.6B • Updated Jun 10, 2025

sergioalves/31cfe109-ea20-4dd2-a1e1-65a270ccc234

Text Generation • Updated Jun 10, 2025

Cusul/SFT_Stem_2_lab_smo_01

Text Generation • 0.6B • Updated Jun 10, 2025

Cusul/SFT_Stem_2_lab_smo_03

Text Generation • 0.6B • Updated Jun 10, 2025

sayantan0013/rubi_DPO_phase_2

Text Generation • 0.6B • Updated Jun 10, 2025 • 3

sayantan0013/rubi_DPO_phase_3

Text Generation • 0.6B • Updated Jun 10, 2025

skrd3/2534acab-4b2d-4145-af32-300cf461eed4

Text Generation • Updated Jun 10, 2025 • 1

sayantan0013/rubi_DPO_ramp_phase_2

Text Generation • 0.6B • Updated Jun 10, 2025

TK47/tinyllama-sft-dpo-t1

Updated Jun 11, 2025

TK47/tinyllama-sft-dpo-t2

Updated Jun 11, 2025

sayantan0013/rubi_no_reason_phase_1

Text Generation • 0.6B • Updated Jun 11, 2025 • 4

sayantan0013/rubi_no_reason_phase_only_ultrafeed_1

Text Generation • 0.6B • Updated Jun 11, 2025 • 1