VIT: Optimized for Qualcomm Devices

VIT (Vision Transformer) is a machine learning model that classifies images into the classes of the ImageNet dataset. It can also serve as a backbone for more complex models built for specific use cases.

This model is based on the implementation of VIT found here. This repository contains pre-exported model files optimized for Qualcomm® devices. To export with custom configurations, use the Qualcomm® AI Hub Models library. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| ONNX | w8a16 | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| ONNX | w8a8_mixed_int16 | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| QNN_DLC | float | Universal | QAIRT 2.43 | Download |
| QNN_DLC | w8a16 | Universal | QAIRT 2.43 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.43 | Download |
| TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |

For more device-specific assets and performance metrics, visit VIT on Qualcomm® AI Hub.
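Once a downloaded asset has been run with its runtime, the classifier output still has to be turned into ranked predictions. The following is a minimal sketch of that post-processing step in plain Python. It assumes the model returns one raw score (logit) per class; whether a given asset emits logits or probabilities is not stated here, so check the asset's output specification. The function names `softmax` and `top_k` are illustrative and are not part of any Qualcomm library.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (max is subtracted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=5):
    """Return the k most likely (class_index, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Toy 4-class score vector standing in for the model's real output (assumption).
example_logits = [0.1, 2.0, -1.0, 0.5]
predictions = top_k(example_logits, k=2)
for class_index, probability in predictions:
    print(f"class {class_index}: {probability:.3f}")
```

Mapping a class index to a human-readable label requires the label list that matches the checkpoint, which is not included in this sketch.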

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for VIT on GitHub for usage instructions.

Model Details

Model Type: Image classification

Model Stats:

  • Model checkpoint: ImageNet
  • Input resolution: 224x224
  • Number of parameters: 86.6M
  • Model size (float): 330 MB
  • Model size (w8a16): 86.2 MB
  • Model size (w8a8): 83.2 MB
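The listed sizes can be sanity-checked against the parameter count with simple arithmetic. The sketch below assumes 32-bit weights for the float model, 8-bit weights for the quantized models, and that "MB" in the stats means binary megabytes (1024 × 1024 bytes). None of these assumptions is stated in the source; the helper name `weights_size_mib` is hypothetical.

```python
PARAMS = 86.6e6   # parameter count from the model stats
MIB = 1024 ** 2   # bytes per binary megabyte (assumed unit)


def weights_size_mib(num_params, bits_per_weight):
    """Estimate the storage needed for the weights alone, in MiB."""
    return num_params * bits_per_weight / 8 / MIB


float_mib = weights_size_mib(PARAMS, 32)  # assumed fp32 weights
int8_mib = weights_size_mib(PARAMS, 8)    # assumed 8-bit weights

print(f"float weights: {float_mib:.1f} MiB")  # about 330.4
print(f"8-bit weights: {int8_mib:.1f} MiB")   # about 82.6
```

Under these assumptions the float estimate matches the listed 330 MB. The listed quantized sizes (86.2 MB and 83.2 MB) are slightly above the 8-bit estimate, which would be consistent with additional stored data such as quantization parameters, although the source does not say what accounts for the difference.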

Performance Summary

| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
|---|---|---|---|---|---|---|
| VIT | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.667 | 1 - 355 | NPU |
| VIT | ONNX | float | Snapdragon® X2 Elite | 3.876 | 170 - 170 | NPU |
| VIT | ONNX | float | Snapdragon® X Elite | 11.109 | 170 - 170 | NPU |
| VIT | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 7.167 | 0 - 377 | NPU |
| VIT | ONNX | float | Qualcomm® QCS8550 (Proxy) | 10.487 | 0 - 195 | NPU |
| VIT | ONNX | float | Qualcomm® QCS9075 | 14.321 | 0 - 4 | NPU |
| VIT | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.987 | 0 - 344 | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.797 | 0 - 301 | NPU |
| VIT | ONNX | w8a16 | Snapdragon® X2 Elite | 3.894 | 86 - 86 | NPU |
| VIT | ONNX | w8a16 | Snapdragon® X Elite | 11.261 | 86 - 86 | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 7.372 | 0 - 379 | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS6490 | 1108.405 | 35 - 60 | CPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 10.713 | 0 - 477 | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS9075 | 12.785 | 0 - 3 | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCM6690 | 614.255 | 73 - 88 | CPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 5.306 | 0 - 295 | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 594.848 | 86 - 107 | CPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.581 | 0 - 351 | NPU |
| VIT | ONNX | w8a8 | Snapdragon® X2 Elite | 5.106 | 85 - 85 | NPU |
| VIT | ONNX | w8a8 | Snapdragon® X Elite | 13.631 | 85 - 85 | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.8 | 0 - 466 | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS6490 | 354.711 | 21 - 79 | CPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 12.932 | 0 - 100 | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS9075 | 13.588 | 0 - 3 | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCM6690 | 135.114 | 14 - 32 | CPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 7.277 | 0 - 319 | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 128.787 | 24 - 43 | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite Gen 5 Mobile | 63.145 | 0 - 268 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® X2 Elite | 53.502 | 79 - 79 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® X Elite | 174.304 | 79 - 79 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Gen 3 Mobile | 81.808 | 67 - 429 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS6490 | 728.091 | 97 - 126 | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS8550 (Proxy) | 101.273 | 0 - 382 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS9075 | 130.774 | 68 - 70 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCM6690 | 387.043 | 103 - 124 | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite For Galaxy Mobile | 88.161 | 68 - 340 | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 7 Gen 4 Mobile | 372.061 | 35 - 55 | CPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.052 | 1 - 344 | NPU |
| VIT | QNN_DLC | float | Snapdragon® X2 Elite | 4.478 | 1 - 1 | NPU |
| VIT | QNN_DLC | float | Snapdragon® X Elite | 11.859 | 1 - 1 | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 7.716 | 0 - 369 | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 40.633 | 1 - 339 | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 11.125 | 1 - 407 | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8775P | 64.028 | 1 - 339 | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS9075 | 14.893 | 1 - 3 | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 19.096 | 0 - 350 | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA7255P | 40.633 | 1 - 339 | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8295P | 17.182 | 1 - 332 | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 5.295 | 0 - 338 | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.094 | 0 - 278 | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 5.87 | 0 - 318 | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 35.876 | 0 - 288 | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 8.002 | 0 - 3 | NPU |
| VIT | TFLITE | float | Qualcomm® SA8775P | 11.156 | 0 - 288 | NPU |
| VIT | TFLITE | float | Qualcomm® QCS9075 | 11.649 | 0 - 174 | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 13.857 | 0 - 292 | NPU |
| VIT | TFLITE | float | Qualcomm® SA7255P | 35.876 | 0 - 288 | NPU |
| VIT | TFLITE | float | Qualcomm® SA8295P | 13.383 | 0 - 262 | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 3.943 | 0 - 291 | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 2.295 | 0 - 88 | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 4.735 | 0 - 182 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS6490 | 78.182 | 1 - 99 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8275 (Proxy) | 14.341 | 0 - 85 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 6.776 | 0 - 4 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8775P | 7.074 | 0 - 86 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS9075 | 7.561 | 0 - 89 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCM6690 | 101.681 | 1 - 185 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 8.676 | 0 - 181 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA7255P | 14.341 | 0 - 85 | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8295P | 9.695 | 0 - 89 | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 3.374 | 0 - 84 | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 20.157 | 1 - 74 | NPU |
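When choosing a runtime and precision for a given chipset, it can help to convert the latencies into throughput and relative speedups. The sketch below does this for a few Snapdragon® 8 Elite Gen 5 Mobile rows copied from the performance summary. It assumes single-image inferences run back to back with no pipelining or batching, so the throughput figure is a rough upper bound rather than a measured value. The function names are illustrative.

```python
# (runtime, precision) -> inference time in ms on Snapdragon 8 Elite Gen 5 Mobile,
# copied from the performance summary.
LATENCY_MS = {
    ("ONNX", "float"): 3.667,
    ("ONNX", "w8a8"): 4.581,
    ("QNN_DLC", "float"): 4.052,
    ("TFLITE", "float"): 3.094,
    ("TFLITE", "w8a8"): 2.295,
}


def throughput_per_second(latency_ms):
    """Inferences per second, assuming back-to-back single-image runs."""
    return 1000.0 / latency_ms


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


fastest = min(LATENCY_MS, key=LATENCY_MS.get)
tflite_gain = speedup(LATENCY_MS[("TFLITE", "float")], LATENCY_MS[("TFLITE", "w8a8")])

print("fastest configuration:", fastest)
print(f"approx. throughput: {throughput_per_second(LATENCY_MS[fastest]):.0f} inferences/s")
print(f"TFLITE w8a8 vs float speedup: {tflite_gain:.2f}x")
```

Note that quantization does not always reduce latency in these measurements: on this chipset the ONNX w8a8 row is slower than the ONNX float row, so comparing the specific rows for your target device is worthwhile.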

License

  • The license for the original implementation of VIT can be found here.

References

Community
