AIMET Quantization Tool

AIMET (AI Model Efficiency Toolkit) is a quantization tool for deep learning models built with frameworks such as PyTorch and ONNX. AIMET improves the runtime performance of deep learning models by reducing their computational load and memory usage.

With AIMET, developers can quickly iterate to find the optimal quantization configuration, achieving the best balance between accuracy and latency. Developers can compile and deploy quantized models exported from AIMET on Qualcomm NPUs using QAIRT, or run them directly with ONNX-Runtime.

AIMET helps developers with:

  • Quantization simulation
  • Model quantization using Post-Training Quantization (PTQ) techniques
  • Quantization-Aware Training (QAT) on PyTorch models using AIMET-Torch
  • Visualizing and experimenting with how quantizing activations and weights at different precisions affects model accuracy
  • Creating mixed-precision models
  • Exporting quantized models to deployable ONNX format

(Figure: AIMET Overview)

AIMET System Requirements

  • 64-bit x86 processor
  • Ubuntu 22.04
  • Python 3.10
  • Nvidia GPU
  • Nvidia driver version 455 or higher
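
You can check the installed driver version with nvidia-smi, which prints it in the header of its output:

X86 Linux PC
nvidia-smi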

AIMET Installation

Create Python Environment

AIMET requires a Python 3.10 environment, which can be created using Anaconda.

tip

After installing Anaconda, create and activate a Python 3.10 environment using the terminal:

X86 Linux PC
conda create -n aimet python=3.10
conda activate aimet

Install AIMET

AIMET provides two Python packages:

  • AIMET-ONNX: Quantizes ONNX models using PTQ techniques

    X86 Linux PC
    pip3 install aimet-onnx
  • AIMET-Torch: Performs QAT on PyTorch models

    X86 Linux PC
    pip3 install aimet-torch
  • Install jupyter-notebook

    AIMET examples are provided as Jupyter notebooks, so you also need to register a Jupyter kernel for the aimet Python environment.

    X86 Linux PC
    pip3 install jupyter ipykernel
    python3 -m ipykernel install --user --name=aimet
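
If you installed both packages, a quick import check from the aimet environment confirms they are usable (the import names are aimet_onnx and aimet_torch):

X86 Linux PC
python3 -c "import aimet_onnx; print('aimet-onnx OK')"
python3 -c "import aimet_torch; print('aimet-torch OK')"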

AIMET Usage Example

This example demonstrates PTQ (Post-Training Quantization) on the PyTorch ResNet50 image classification model, which is first converted to ONNX format and then quantized using AIMET-ONNX. For implementation details, please refer to the ResNet50 example notebook.

tip

The model exported in this example can be used in the QAIRT SDK Usage Example to port an AIMET-quantized model to the NPU.
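
For orientation, below is a condensed sketch of the flow the notebook implements: export the PyTorch model to ONNX, wrap it in a quantization simulator, calibrate, and export. The exact QuantizationSimModel and compute_encodings signatures vary across aimet-onnx releases, and the random calibration tensors here are stand-ins for real ImageNet-Mini batches, so treat this as a sketch rather than the notebook's exact code:

import os
import onnx
import torch
from torchvision.models import resnet50, ResNet50_Weights
from aimet_onnx.quantsim import QuantizationSimModel

# 1. Export the pretrained FP32 model to ONNX
model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "resnet50_fp32.onnx",
                  input_names=["input"], output_names=["output"])

# 2. Wrap the ONNX model in a quantization simulator (8-bit by default)
sim = QuantizationSimModel(onnx.load("resnet50_fp32.onnx"))

# 3. Calibrate: run representative data through the simulator's
#    onnxruntime session so AIMET can compute encodings per tensor.
#    Replace the random tensors with real ImageNet-Mini batches.
calibration_batches = [torch.randn(8, 3, 224, 224) for _ in range(4)]

def pass_calibration_data(session, _args):
    for images in calibration_batches:
        session.run(None, {"input": images.numpy()})

sim.compute_encodings(pass_calibration_data, None)

# 4. Export resnet50.onnx + resnet50.encodings for QAIRT / ONNX-Runtime
os.makedirs("./aimet_quant", exist_ok=True)
sim.export("./aimet_quant", "resnet50")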

Prepare the Example Notebook

Clone the AIMET Repository

X86 Linux PC
git clone https://github.com/quic/aimet.git && cd aimet

Configure PYTHONPATH

X86 Linux PC
export PYTHONPATH=$PYTHONPATH:$(pwd)

Download the Example Notebook

X86 Linux PC
cd Examples/onnx/quantization
wget https://github.com/ZIFENG278/resnet50_qairt_example/raw/refs/heads/main/notebook/quantsim-resnet50.ipynb

Download the Dataset

Prepare a calibration dataset. To reduce download time, we'll use ImageNet-Mini as a substitute for the full ImageNet dataset.

  • Download the ImageNet-Mini dataset from Kaggle
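
A minimal calibration-loader sketch, assuming the Kaggle archive extracts to ImageFolder-style <split>/<class>/<image> directories (point DATASET_DIR at one split, e.g. train); the normalization constants are the standard ImageNet values used by torchvision's pretrained ResNet50:

import torch
from torchvision import datasets, transforms

DATASET_DIR = '<ImageNet-Mini Path>'  # Please replace this with a real directory

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

calib_set = datasets.ImageFolder(DATASET_DIR, transform=preprocess)
calib_loader = torch.utils.data.DataLoader(calib_set, batch_size=32, shuffle=True)

A few hundred images are typically enough for PTQ calibration, which is why the mini dataset is sufficient here.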

Run the Example Notebook

Start jupyter-notebook

X86 Linux PC
cd aimet
jupyter-notebook
tip

After jupyter-notebook starts, it should open automatically in your default browser. If it doesn't, click the URL printed in the terminal.

Change the Kernel

On the jupyter-notebook homepage, open Examples/onnx/quantization/quantsim-resnet50.ipynb

In the notebook's menu bar at the top left, select Kernel -> Change Kernel -> Select Kernel and choose the aimet kernel created during the AIMET installation.

(Figure: Change Notebook Kernel)

Update Dataset Path

Modify the DATASET_DIR path in the Dataset section to point to your downloaded ImageNet-Mini dataset folder.

DATASET_DIR = '<ImageNet-Mini Path>'  # Please replace this with a real directory

Run the Entire Notebook

In the notebook's menu bar at the top left, select Run -> Run All Cells to execute the entire notebook.

(Figure: Run All Cells)

The exported ResNet50 model files are saved in the aimet_quant folder: resnet50.onnx and resnet50.encodings.
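
Since resnet50.onnx remains a standard ONNX graph (the quantization parameters live in resnet50.encodings), a quick onnxruntime sanity check confirms the exported model loads and runs; the random input below is illustrative only:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("aimet_quant/resnet50.onnx")
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
logits = sess.run(None, {input_name: dummy})[0]
print(logits.shape)  # expect (1, 1000) ImageNet class scores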

Deploying AIMET Models

AIMET exports models from different frameworks into the specified file formats as shown in the table below:

Framework     Exported Format
PyTorch       .onnx
ONNX          .onnx
TensorFlow    .h5 or .pb

You can use the QAIRT tool to deploy the quantized output files from AIMET to edge devices. For the deployment process, please refer to the QAIRT SDK Usage Example.

Complete Documentation

For more detailed documentation about AIMET, please refer to the official AIMET documentation.

More Examples

For more AIMET examples, please refer to the Examples directory in the AIMET repository.