# Model Quantization Notebook
This notebook converts a pre-trained Keras violence detection model into TensorFlow Lite (TFLite) format using three different quantization strategies, making it suitable for deployment on edge/mobile devices.
## Overview
| Property | Details |
|---|---|
| Framework | TensorFlow / TFLite |
| Base Model | `modelv2.keras` – a Keras video violence detection model |
| Input Shape | (1, 16, 224, 224, 3) – batch × frames × height × width × channels |
| Architecture | CNN + LSTM (contains dynamic LSTM loops) |
| Platform | Kaggle (GPU hidden to avoid CuDNN conflicts) |
## Quantization Methods
### A – Dynamic Range Quantization

- Output file: `model_dynamic_quant.tflite`
- Quantizes weights from float32 to int8 at conversion time.
- Activations are quantized dynamically at inference time.
- Fastest to convert; no calibration data required.
- Good balance between size reduction and accuracy.
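A minimal sketch of this path with the `TFLiteConverter` API. The tiny `Sequential` model here is a stand-in so the snippet is self-contained; the notebook converts the loaded `modelv2.keras` instead:

```python
import tensorflow as tf

# Tiny stand-in model for illustration; the notebook uses the loaded
# modelv2.keras instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optimize.DEFAULT with no representative dataset = dynamic range quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('model_dynamic_quant.tflite', 'wb') as f:
    f.write(tflite_model)
```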
### B – Float16 Quantization

- Output file: `model_fp16_quant.tflite`
- Reduces weight precision from float32 to float16.
- Ideal for GPU-accelerated edge devices that support fp16 natively.
- Smaller model size with minimal accuracy loss.
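The float16 path differs from the dynamic range path only in the `supported_types` setting. Again a tiny stand-in model is used here so the sketch runs on its own:

```python
import tensorflow as tf

# Tiny stand-in model; the notebook uses the loaded modelv2.keras.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # store weights as fp16
tflite_model = converter.convert()

with open('model_fp16_quant.tflite', 'wb') as f:
    f.write(tflite_model)
```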
### C – Full Integer (INT8) Quantization

- Output file: `model_full_int8.tflite`
- Quantizes both weights and activations to int8.
- Requires a representative dataset for calibration (currently uses random dummy data – replace with real video samples for best results).
- Input and output tensors are also forced to int8.
- Smallest model size; best suited for CPU-only or microcontroller deployment.
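A sketch of the full INT8 path. The tiny stand-in model and its `(1, 4)` calibration samples are illustration-only assumptions; the notebook calibrates the loaded `modelv2.keras` with `(1, 16, 224, 224, 3)` clips, and its own conversion path additionally sets the LSTM compatibility flags described under Important Notes:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; the notebook uses the loaded modelv2.keras.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1),
])

def representative_data_gen():
    # Random dummy calibration data, as in the notebook; replace with real
    # samples from the training set for accurate quantization ranges.
    for _ in range(10):
        yield [np.random.rand(1, 4).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8   # force int8 input tensor
converter.inference_output_type = tf.int8  # force int8 output tensor
tflite_model = converter.convert()

with open('model_full_int8.tflite', 'wb') as f:
    f.write(tflite_model)
```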
## Requirements

```
tensorflow
numpy
```
## Usage
### 1. Load the Base Model

```python
import tensorflow as tf

tf.config.set_visible_devices([], 'GPU')  # Hide GPU to avoid CuDNN issues
model = tf.keras.models.load_model('path/to/modelv2.keras')
```
### 2. Run Quantization
Open and run the notebook cells in order:
- Cells 1–2 – Load the model
- Cells 3–4 – Dynamic range quantization → `model_dynamic_quant.tflite`
- Cells 5–6 – Float16 quantization → `model_fp16_quant.tflite`
- Cells 7–8 – Full INT8 quantization → `model_full_int8.tflite`
## Important Notes

- **Representative dataset:** The INT8 quantization cell uses random dummy data for calibration. For production use, replace `dummy_data` in `representative_data_gen()` with real video frames from your training set to get accurate quantization ranges.
- **LSTM compatibility flags:** The model contains dynamic LSTM loops. The following flags are set in all conversion paths to prevent conversion failures:

  ```python
  converter.target_spec.supported_ops = [
      tf.lite.OpsSet.TFLITE_BUILTINS,
      tf.lite.OpsSet.SELECT_TF_OPS,
  ]
  converter._experimental_lower_tensor_list_ops = False
  ```

- **Static input shape:** The INT8 path uses `tf.function` with a `tf.TensorSpec` to lock the input shape to `(1, 16, 224, 224, 3)` before conversion; this is required for correct INT8 LSTM quantization.
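The static-shape and LSTM-flag notes can be sketched together. The small CNN+LSTM model below is a stand-in that only mimics the (frames, height, width, channels) layout of `modelv2.keras`; the `TensorSpec`, `supported_ops`, and `_experimental_lower_tensor_list_ops` settings are the ones quoted in the notes:

```python
import tensorflow as tf

# Small CNN+LSTM stand-in with the same (16, 224, 224, 3) frame layout as
# modelv2.keras; the notebook uses the loaded model instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16, 224, 224, 3)),
    tf.keras.layers.TimeDistributed(tf.keras.layers.GlobalAveragePooling2D()),
    tf.keras.layers.LSTM(4),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Lock the input shape to a static (1, 16, 224, 224, 3) with a tf.TensorSpec.
@tf.function(input_signature=[tf.TensorSpec([1, 16, 224, 224, 3], tf.float32)])
def serving_fn(x):
    return model(x)

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [serving_fn.get_concrete_function()], model)
# LSTM compatibility flags quoted in the notes above:
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
converter._experimental_lower_tensor_list_ops = False
tflite_model = converter.convert()
```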
## Output Files

| File | Method | Precision |
|---|---|---|
| `model_dynamic_quant.tflite` | Dynamic Range | Weights: INT8, Activations: float32 |
| `model_fp16_quant.tflite` | Float16 | Weights & Activations: float16 |
| `model_full_int8.tflite` | Full Integer | Weights & Activations: INT8 |
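Any of the exported files can be sanity-checked with `tf.lite.Interpreter`. The sketch below converts a tiny stand-in model in memory so it runs on its own; in practice, pass `model_path='model_dynamic_quant.tflite'` (or another file from the table) instead of `model_content`:

```python
import numpy as np
import tensorflow as tf

# Build and convert a tiny stand-in so the example is self-contained; swap in
# model_path=... to load one of the exported .tflite files instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# For the full INT8 model, inp['dtype'] is np.int8 and inputs must be
# quantized accordingly; this stand-in stays float32.
x = np.random.rand(1, 4).astype(inp['dtype'])
interpreter.set_tensor(inp['index'], x)
interpreter.invoke()
pred = interpreter.get_tensor(out['index'])
```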