Active filters: modelopt
nvidia/Gemma-4-31B-IT-NVFP4
Text Generation
• 21B • Updated • 292k
• 310
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4
Text Generation
• 67B • Updated • 1.68M
• 258
bg-digitalservices/Gemma-4-26B-A4B-it-NVFP4
Text Generation
• 15B • Updated • 25k
• 17
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8
Text Generation
• 124B • Updated • 1.09M
• 226
nvidia/MiniMax-M2.5-NVFP4
Text Generation
• 116B • Updated • 29.6k
• 21
nvidia/Qwen3.5-397B-A17B-NVFP4
Text Generation
• Updated • 398k
• 88
Image-Text-to-Text
• 7B • Updated • 41.9k
• 11
Alexzander85/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-NVFP4-MLP-FP8KV
Text Generation
• 8B • Updated • 1.4k
• 9
bg-digitalservices/Gemma-4-26B-A4B-it-NVFP4A16
Text Generation
• 15B • Updated • 2.66k
• 4
nvidia/Qwen3-30B-A3B-NVFP4
Text Generation
• 16B • Updated • 212k
• 28
NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4
Text Generation
• 16B • Updated • 15.1k
• 23
Text Generation
• Updated • 796k
• 78
Text Generation
• 183B • Updated • 3.19k
• 13
cosmicproc/gemma-4-E4B-it-NVFP4
Image-Text-to-Text
• 6B • Updated • 4.36k
• 3
nvidia/Phi-4-multimodal-instruct-NVFP4
4B • Updated • 1.88k
• 10
Text Generation
• 8B • Updated • 271k
• 7
nvidia/Qwen3-Next-80B-A3B-Instruct-NVFP4
Text Generation
• Updated • 20.2k
• 37
chankhavu/Nemotron-Cascade-2-30B-A3B-NVFP4
Text Generation
• 16B • Updated • 13.8k
• 9
Neural-ICE/Gemma-4-26B-A4B-it-NVFP4
Text Generation
• 15B • Updated • 5
• 2
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4
56B • Updated • 63.9k
• 29
nvidia/Phi-4-reasoning-plus-NVFP4
8B • Updated • 1.24k
• 8
nvidia/Llama-3.1-8B-Instruct-NVFP4
5B • Updated • 109k
• 8
Text Generation
• 5B • Updated • 30.6k
• 16
Text Generation
• 17B • Updated • 144k
• 14
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4
Text Generation
• 5B • Updated • 11.4k
• 14
nvidia/Kimi-K2-Thinking-NVFP4
Text Generation
• Updated • 7.15k
• 30
nvidia/Qwen3-Next-80B-A3B-Thinking-NVFP4
Text Generation
• Updated • 18k
• 55
nvidia/Qwen3-235B-A22B-Thinking-2507-NVFP4
Text Generation
• Updated • 765
• 7
nvidia/Qwen3-235B-A22B-Instruct-2507-NVFP4
Text Generation
• 120B • Updated • 3.13k
• 7
nvidia/Qwen3-Coder-480B-A35B-Instruct-NVFP4
Text Generation
• 241B • Updated • 2k
• 10