# Pocket TTS - GGUF Q8_0

Q8_0 quantized version of kyutai/pocket-tts-without-voice-cloning in GGUF format, for browser-based TTS inference via WebAssembly.

Try the demo →

## Model Details

|           | Original                 | GGUF Q8_0           |
|-----------|--------------------------|---------------------|
| File      | tts_b6369a24.safetensors | pocket-tts-q8_0.gguf |
| Size      | 236 MB (BF16)            | 128 MB              |
| Format    | safetensors              | GGUF                |
| Reduction |                          | 46%                 |

## What's included

This GGUF contains the TTS decoder pipeline only: the transformer backbone, flow matching network, mimi decoder + decoder transformer, and the DummyQuantizer output projection.

The mimi encoder (SEANet encoder, encoder transformer, downsample conv) is excluded; TTS only needs the decoder path. This saves ~52 MB (28%) compared to a full-model GGUF.

## Quantization

Per-block Q8_0 quantization (block size 32): each block stores a 2-byte f16 scale plus 32 int8 values, i.e. 34 bytes per 32 weights (~8.5 bits/weight).
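A minimal numpy sketch of this layout (function names are illustrative, not from the actual converter): the scale is the block's max magnitude divided by 127, and each weight is rounded to an int8 code against that scale.

```python
import numpy as np

BLOCK = 32  # Q8_0 block size

def q8_0_quantize(x):
    """Quantize a 1-D float array (length a multiple of 32) into Q8_0 blocks.

    Each block stores one f16 scale and 32 int8 codes:
    2 + 32 = 34 bytes per 32 weights, i.e. ~8.5 bits/weight.
    """
    blocks = x.reshape(-1, BLOCK).astype(np.float32)
    d = (np.abs(blocks).max(axis=1, keepdims=True) / 127.0).astype(np.float16)
    scale = np.where(d == 0, 1.0, d.astype(np.float32))  # guard all-zero blocks
    q = np.clip(np.round(blocks / scale), -127, 127).astype(np.int8)
    return d, q  # (n_blocks, 1) f16 scales, (n_blocks, 32) int8 codes

def q8_0_dequantize(d, q):
    """Reconstruct floats: code * per-block scale."""
    return (q.astype(np.float32) * d.astype(np.float32)).reshape(-1)

x = np.random.default_rng(0).standard_normal(64).astype(np.float32)
d, q = q8_0_quantize(x)
x_hat = q8_0_dequantize(d, q)
# per-element reconstruction error is bounded by half a quantization step
```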

56 tensors quantized: all linear/projection weights in the transformer backbone, flow matching network, and mimi decoder transformer.

114 tensors kept in F32: norms, biases, embeddings, SEANet decoder convolutions, quantizer, and resampling convolutions.

Validation: SQNR > 40 dB on all quantized tensors.
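SQNR here is the ratio of signal power to quantization-noise power, in dB. A self-contained sketch of how such a check can be computed, round-tripping a random tensor through a per-block int8 scheme mimicking the Q8_0 layout (the real validation runs on the model's actual tensors):

```python
import numpy as np

def sqnr_db(x, x_hat):
    """Signal-to-quantization-noise ratio: 10 * log10(P_signal / P_noise)."""
    x = x.astype(np.float64)
    err = x - x_hat.astype(np.float64)
    return 10.0 * np.log10((x ** 2).sum() / (err ** 2).sum())

# Quantize per block of 32 with one scale per block, then reconstruct.
rng = np.random.default_rng(0)
x = rng.standard_normal(4096).astype(np.float32)
blocks = x.reshape(-1, 32)
d = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
q = np.clip(np.round(blocks / d), -127, 127)
x_hat = (q * d).reshape(-1).astype(np.float32)

print(f"SQNR: {sqnr_db(x, x_hat):.1f} dB")
```

For Gaussian-distributed weights, per-block int8 typically lands in the mid-40s dB, comfortably above the 40 dB threshold.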

## Runtime

Weights stay in Q8_0 at runtime. Matmuls run through a tiled WASM SIMD128 quantized-matmul kernel (a fork of candle's), reaching roughly 2x realtime on desktop (M-series Mac, Chrome).
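The real kernel uses integer SIMD dot products on tiles; the following numpy sketch (names illustrative) shows only the semantics that make this possible: each block contributes an int8 dot product scaled by its f16 scale, so no dequantized f32 weight matrix is ever materialized.

```python
import numpy as np

BLOCK = 32

def quantize_q8_0(W):
    """Per-row, per-block Q8_0: (rows, n_blocks) f16 scales + int8 codes."""
    blocks = W.reshape(W.shape[0], -1, BLOCK)
    d = (np.abs(blocks).max(axis=2) / 127.0).astype(np.float16)
    q = np.clip(np.round(blocks / d[..., None].astype(np.float32)), -127, 127)
    return d, q.astype(np.int8)

def q8_0_matvec(d, q, x):
    """y = W @ x from the quantized form: per-block dot products of the
    int8 codes with the activation slice, scaled and summed per row."""
    xb = x.reshape(-1, BLOCK).astype(np.float32)           # (n_blocks, 32)
    dots = np.einsum("rbk,bk->rb", q.astype(np.float32), xb)
    return (dots * d.astype(np.float32)).sum(axis=1)

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 64)).astype(np.float32)
x = rng.standard_normal(64).astype(np.float32)
d, q = quantize_q8_0(W)
y = q8_0_matvec(d, q, x)   # close to W @ x, up to quantization noise
```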

## Files

| File | Size | Description |
|------|------|-------------|
| pocket-tts-q8_0.gguf | 128 MB | Model weights (Q8_0 + F32, decoder only) |
| tokenizer.model | 58 KB | SentencePiece unigram tokenizer |

Voice embeddings are unchanged; use the ones from the original repo.
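Every GGUF file starts with a fixed little-endian header (magic `GGUF`, u32 format version, u64 tensor count, u64 metadata KV count), which is enough to sanity-check a download. A minimal parser, run here on fabricated header bytes for illustration (version 3, 170 tensors matching the 56 + 114 above, and a hypothetical 5 KV pairs; the real counts come from the file itself):

```python
import struct

def read_gguf_header(buf: bytes):
    """Parse the fixed GGUF header: magic, version, tensor count, KV count."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Fabricated header bytes for illustration only.
hdr = struct.pack("<4sIQQ", b"GGUF", 3, 170, 5)
print(read_gguf_header(hdr))  # (3, 170, 5)
```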

## Usage

This model is designed for use with tts-web, a browser-based TTS engine built with Candle and WebAssembly.

## Acknowledgments

Based on Kyutai's Pocket TTS, a 100M-parameter text-to-speech model.

## Disclaimer

This is an independent port by idle intelligence, not affiliated with or endorsed by Kyutai Labs.

## License

CC-BY-4.0 (same as the original model).
