NVIDIA-Flamingo3 / README.md
PatoFlamejanteTV's picture
Update README.md
04976e9 verified

A newer version of the Gradio SDK is available: 6.9.0

Upgrade
metadata
title: NVIDIA Flamingo3
emoji: 🦩
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: Interface for NVIDIA Audio Flamingo 3

Audio Flamingo 3 — Hugging Face Space template

This repo provides a minimal Gradio-based HF Space that interfaces with nvidia/audio-flamingo-3-hf.

How to use

  1. Upload an audio file or paste an audio URL.
  2. Write an instruction (e.g., "Transcribe the input speech.").
  3. Press Submit and wait for the model output.

Caveats & notes

  • License: The model is released under the NVIDIA OneWay Noncommercial License — use for research/demo only. See the model card for details.
  • Resource requirements: AF3 is large and benefits from GPU. Running on CPU will be very slow or not feasible in Spaces.
  • Transformers: The model page recommends a recent transformers from GitHub. If you see errors, try git+https://github.com/huggingface/transformers in requirements.txt.
  • Audio length: AF3 processes audio in 30-second windows with a 10-minute cap per sample (longer inputs are truncated).

Deployment on Hugging Face Spaces

  • Create a new Space (Gradio).
  • Push these files (app.py, requirements.txt, README.md, runtime.txt).
  • Choose a GPU-backed hardware type if you expect interactive performance.

References

  • Model card: nvidia/audio-flamingo-3-hf (Hugging Face)