Pocket TTS - GGUF Q8_0
Q8_0 quantized version of kyutai/pocket-tts-without-voice-cloning in GGUF format, for browser-based TTS inference via WebAssembly.
Model Details
| | Original | GGUF Q8_0 |
|---|---|---|
| File | tts_b6369a24.safetensors | pocket-tts-q8_0.gguf |
| Size | 236 MB (BF16) | 128 MB |
| Format | safetensors | GGUF |
| Reduction | - | 46% |
What's included
This GGUF contains the TTS decoder pipeline only: the transformer backbone, flow matching network, mimi decoder + decoder transformer, and the DummyQuantizer output projection.
The mimi encoder (SEANet encoder, encoder transformer, downsample conv) is excluded because TTS only needs the decoder path. This saves ~52 MB (28%) compared to a full-model GGUF.
Quantization
Per-block Q8_0 quantization (block size 32): 2-byte f16 scale + 32 int8 values per block.
56 tensors quantized: all linear/projection weights in the transformer backbone, flow matching network, and mimi decoder transformer.
114 tensors kept as F32: norms, biases, embeddings, SEANet decoder convolutions, quantizer, and resampling convolutions.
Validation SQNR: >40 dB on all tensors.
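
To make the block layout and the SQNR check concrete, here is a minimal pure-Python sketch of a Q8_0 round trip. It is not the conversion script used for this model: the function names, the synthetic Gaussian weights, and the tie-breaking behaviour of `round` are assumptions for illustration. Only the layout (one f16 scale plus 32 int8 values per block) comes from the description above.

```python
import math
import random
import struct

BLOCK = 32  # Q8_0 block size stated above


def to_f16(x: float) -> float:
    """Round a float to the nearest IEEE half-precision value (the 2-byte scale)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_q8_0(values):
    """Return a list of (scale, [32 int8 values]) blocks."""
    assert len(values) % BLOCK == 0, "length must be a multiple of 32"
    blocks = []
    for i in range(0, len(values), BLOCK):
        chunk = values[i:i + BLOCK]
        amax = max(abs(v) for v in chunk)
        scale = to_f16(amax / 127.0)
        if scale == 0.0:
            q = [0] * BLOCK  # all-zero (or vanishingly small) block
        else:
            # Clamp because rounding the scale to f16 can push v/scale past 127.
            # Python's round() breaks ties to even; other implementations may differ.
            q = [max(-127, min(127, round(v / scale))) for v in chunk]
        blocks.append((scale, q))
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate float values from Q8_0 blocks."""
    return [scale * q for scale, qs in blocks for q in qs]


def sqnr_db(original, restored):
    """Signal-to-quantization-noise ratio in decibels."""
    signal = sum(v * v for v in original)
    noise = sum((a - b) ** 2 for a, b in zip(original, restored))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)


# Synthetic stand-in for one weight tensor (hypothetical data, not model weights).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]
blocks = quantize_q8_0(weights)
restored = dequantize_q8_0(blocks)

print(f"SQNR: {sqnr_db(weights, restored):.1f} dB")
print(f"Storage: {(2 + BLOCK) * 8 / BLOCK} bits per weight")  # 34 bytes per 32 weights
```

Each block costs 34 bytes for 32 weights, or 8.5 bits per weight, compared with 16 bits per weight for BF16.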
Runtime
Weights stay quantized as Q8_0 at runtime. Matmuls use a tiled WASM SIMD128 quantized matmul kernel (from a fork of candle), achieving ~2x realtime on desktop (M-series Mac, Chrome).
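
The idea of multiplying against weights that stay quantized can be sketched in scalar Python as follows. This is only an illustration of the arithmetic, not the WASM SIMD128 kernel: it has no tiling or SIMD, it assumes float activations (a real kernel may quantize activations as well), and the names `dot_q8_0` and `matvec_q8_0` are hypothetical.

```python
BLOCK = 32  # Q8_0 block size


def dot_q8_0(blocks, x):
    """Dot product of one Q8_0 weight row with a float activation vector.

    `blocks` is a list of (scale, [32 int8 values]) pairs. The int8 values
    are never expanded into a full float row; each block contributes
    scale * sum(q_j * x_j).
    """
    assert len(x) == len(blocks) * BLOCK, "activation length must match the row"
    total = 0.0
    for b, (scale, qs) in enumerate(blocks):
        base = b * BLOCK
        acc = 0.0
        for j, q in enumerate(qs):
            acc += q * x[base + j]
        total += scale * acc  # apply the per-block scale once
    return total


def matvec_q8_0(rows, x):
    """Matrix-vector product where every weight row is stored as Q8_0 blocks."""
    return [dot_q8_0(row, x) for row in rows]


# Hand-built example: one row of two blocks, activations all equal to 1.0.
row = [(0.5, [1] * BLOCK), (0.25, [2] * BLOCK)]
x = [1.0] * (2 * BLOCK)
print(matvec_q8_0([row], x))  # [32.0]  (0.5 * 32 + 0.25 * 64)
```

Applying the scale once per block rather than once per weight is what keeps the inner loop cheap, and it is the part that a SIMD kernel would vectorize.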
Files
| File | Size | Description |
|---|---|---|
| pocket-tts-q8_0.gguf | 128 MB | Model weights (Q8_0 + F32, decoder only) |
| tokenizer.model | 58 KB | SentencePiece unigram tokenizer |
Voice embeddings are unchanged; use them from the original repo.
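
A quick way to sanity-check a downloaded GGUF file is to read its fixed-size header. The sketch below assumes the little-endian GGUF v2+ header layout (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata key/value count). The helper name `read_gguf_header` is hypothetical, and the demo runs on a synthetic header rather than the real file.

```python
import io
import struct


def read_gguf_header(stream):
    """Parse the fixed GGUF header from a binary stream (v2+ layout assumed)."""
    magic = stream.read(4)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    (version,) = struct.unpack("<I", stream.read(4))
    if version < 2:
        raise ValueError("this sketch only handles GGUF version 2 or later")
    tensor_count, kv_count = struct.unpack("<QQ", stream.read(16))
    return {"version": version, "tensors": tensor_count, "metadata_kv": kv_count}


# Synthetic header for demonstration: version 3, 170 tensors, 0 metadata entries.
# 170 is 56 quantized + 114 F32 tensors from the counts above; the version and
# metadata count here are made up.
fake = io.BytesIO(b"GGUF" + struct.pack("<IQQ", 3, 170, 0))
header = read_gguf_header(fake)
print(header)

# With the real file (path assumed to be local):
# with open("pocket-tts-q8_0.gguf", "rb") as f:
#     print(read_gguf_header(f))
```

If the tensor counts listed under Quantization cover every tensor in the file, the real header should report 170 tensors.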
Usage
This model is designed for use with tts-web, a browser-based TTS engine built with Candle and WebAssembly.
Acknowledgments
Based on Kyutai's Pocket TTS, a 100M-parameter text-to-speech model.
Disclaimer
This is an independent port by idle intelligence, not affiliated with or endorsed by Kyutai Labs.
License
CC-BY-4.0 (same as the original model).