Prince Canuma PRO

prince-canuma

AI & ML interests

None yet

Recent Activity

updated a model about 4 hours ago

mlx-community/Voxtral-Mini-3B-2507-bf16

updated a model 1 day ago

mlx-community/Molmo2-8B-fp16

published a model 1 day ago

mlx-community/Molmo2-8B-fp16

upvoted an article 6 days ago

Article

Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR

7 days ago • 60

upvoted a collection 6 days ago

Nemotron Speech

Open, state-of-the-art, production‑ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization and S2S • 15 items • Updated about 13 hours ago • 14

upvoted 2 articles 6 days ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

+4

26 days ago • 112

Article

NVIDIA brings agents to life with DGX Spark and Reachy Mini

+1

8 days ago • 49

upvoted a collection about 2 months ago

INTELLECT 3

5 items • Updated Nov 27, 2025 • 1

upvoted a collection 4 months ago

EmbeddingGemma

7 items • Updated Sep 4, 2025 • 3

upvoted 2 collections 5 months ago

Gemma 3-270m

20 items • Updated Aug 14, 2025 • 5

Gemma 3 QAT

Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 29 items • Updated Aug 14, 2025 • 32

upvoted 2 collections 8 months ago

Perception Encoder

17 items • Updated Jul 11, 2025 • 74

LLaMA-Omni

13 items • Updated May 17, 2025 • 19

upvoted 2 collections 9 months ago

VideoChat-R1

VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning • 4 items • Updated Sep 28, 2025 • 8

Gemma 3 QAT

Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 15 items • Updated Jul 10, 2025 • 215

upvoted 7 papers 9 months ago

DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models

Paper • 2504.02882 • Published Apr 2, 2025 • 7

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 203

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7, 2025 • 110

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers

Paper • 2504.00502 • Published Apr 1, 2025 • 26

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3, 2025 • 57

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

Paper • 2504.00883 • Published Apr 1, 2025 • 67

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

Paper • 2504.02587 • Published Apr 3, 2025 • 32

upvoted a collection 9 months ago

ModernBert

16 items • Updated Apr 3, 2025 • 2