Models: 33

Active filters: nm-vllm

RedHatAI/TinyLlama-1.1B-Chat-v1.0-pruned2.4

Text Generation • Updated Mar 5, 2024 • 46 • 1

RedHatAI/MiniChat-2-3B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 2

RedHatAI/OpenHermes-2.5-Mistral-7B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 7

RedHatAI/OpenHermes-2.5-Mistral-7B-pruned50

Text Generation • Updated Mar 5, 2024 • 5 • 1

RedHatAI/Nous-Hermes-2-SOLAR-10.7B-pruned2.4

Text Generation • Updated Mar 5, 2024

RedHatAI/Nous-Hermes-2-Yi-34B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 2

RedHatAI/Nous-Hermes-2-Yi-34B-pruned50

Text Generation • Updated Mar 5, 2024 • 3

RedHatAI/zephyr-7b-beta-marlin

Text Generation • 1B • Updated Mar 6, 2024 • 24

RedHatAI/llama2.c-stories110M-pruned2.4

Text Generation • Updated Mar 5, 2024 • 4

RedHatAI/llama2.c-stories110M-pruned50

Text Generation • Updated Mar 5, 2024 • 851

RedHatAI/phi-2-pruned50

Text Generation • 3B • Updated Mar 5, 2024 • 5

RedHatAI/TinyLlama-1.1B-Chat-v1.0-marlin

Text Generation • 0.3B • Updated Mar 6, 2024 • 90 • 2

RedHatAI/OpenHermes-2.5-Mistral-7B-marlin

Text Generation • 1B • Updated Mar 6, 2024 • 21 • 2

RedHatAI/Nous-Hermes-2-Yi-34B-marlin

Text Generation • 5B • Updated Mar 6, 2024 • 5 • 5

softmax/Llama-2-70b-chat-hf-marlin

Text Generation • 10B • Updated Mar 17, 2024 • 1

softmax/falcon-180B-chat-marlin

Text Generation • 26B • Updated Mar 21, 2024 • 3

dtransposed/llama2.c-stories110M-pruned50-compressed-tensors

Text Generation • Updated Apr 23, 2024 • 4

mradermacher/Nous-Hermes-2-SOLAR-10.7B-pruned2.4-GGUF

11B • Updated Apr 10, 2025 • 111

mradermacher/Nous-Hermes-2-SOLAR-10.7B-pruned2.4-i1-GGUF

11B • Updated Apr 10, 2025 • 184

tensorblock/llama2.c-stories110M-pruned50-GGUF

0.1B • Updated 23 days ago • 66

mradermacher/phi-2-pruned50-GGUF

3B • Updated Aug 1, 2025 • 143

mradermacher/llama2.c-stories110M-pruned50-GGUF

0.1B • Updated Apr 10, 2025 • 73

mradermacher/OpenHermes-2.5-Mistral-7B-pruned50-GGUF

7B • Updated Apr 10, 2025 • 64 • 1

mradermacher/MiniChat-2-3B-pruned2.4-GGUF

3B • Updated Apr 10, 2025 • 54

mradermacher/OpenHermes-2.5-Mistral-7B-pruned50-i1-GGUF

7B • Updated Apr 10, 2025 • 130

mradermacher/llama2.c-stories110M-pruned50-i1-GGUF

0.1B • Updated Apr 10, 2025 • 77

mradermacher/OpenHermes-2.5-Mistral-7B-pruned2.4-GGUF

7B • Updated Apr 10, 2025 • 60

mradermacher/OpenHermes-2.5-Mistral-7B-pruned2.4-i1-GGUF

7B • Updated Apr 10, 2025 • 105

tensorblock/OpenHermes-2.5-Mistral-7B-pruned2.4-GGUF

7B • Updated 23 days ago • 40

tensorblock/OpenHermes-2.5-Mistral-7B-pruned50-GGUF

7B • Updated 23 days ago • 80