Note
This tutorial can be used interactively with Google Colab! You can also click here to run the Jupyter notebook locally.
3. microTVM Ahead-of-Time (AOT) Compilation¶
Authors: Mehrdad Hessar, Alan MacDonald
This tutorial is showcasing microTVM host-driven AoT compilation with a TFLite model. AoTExecutor reduces the overhead of parsing graph at runtime compared to GraphExecutor. Also, we can have better memory management using ahead of time compilation. This tutorial can be executed on a x86 CPU using C runtime (CRT) or on Zephyr platform on a microcontroller/board supported by Zephyr.
Install microTVM Python dependencies¶
TVM does not include a package for Python serial communication, so we must install one before using microTVM. We will also need TFLite to load models.
%%shell pip install pyserial==3.5 tflite==2.1
import os
# By default, this tutorial runs on x86 CPU using TVM's C runtime. If you would like
# to run on real Zephyr hardware, you must export the `TVM_MICRO_USE_HW` environment
# variable. Otherwise (if you are using the C runtime), you can skip installing
# Zephyr. It takes ~20 minutes to install Zephyr.
use_physical_hw = bool(os.getenv("TVM_MICRO_USE_HW"))
Install Zephyr¶
%%shell # Install west and ninja python3 -m pip install west apt-get install -y ninja-build # Install ZephyrProject ZEPHYR_PROJECT_PATH="/content/zephyrproject" export ZEPHYR_BASE=${ZEPHYR_PROJECT_PATH}/zephyr west init ${ZEPHYR_PROJECT_PATH} cd ${ZEPHYR_BASE} git checkout v3.2-branch cd .. west update west zephyr-export chmod -R o+w ${ZEPHYR_PROJECT_PATH} # Install Zephyr SDK cd /content ZEPHYR_SDK_VERSION="0.15.2" wget "https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v${ZEPHYR_SDK_VERSION}/zephyr-sdk-${ZEPHYR_SDK_VERSION}_linux-x86_64.tar.gz" tar xvf "zephyr-sdk-${ZEPHYR_SDK_VERSION}_linux-x86_64.tar.gz" mv "zephyr-sdk-${ZEPHYR_SDK_VERSION}" zephyr-sdk rm "zephyr-sdk-${ZEPHYR_SDK_VERSION}_linux-x86_64.tar.gz" # Install python dependencies python3 -m pip install -r "${ZEPHYR_BASE}/scripts/requirements.txt"
Import Python dependencies¶
import numpy as np
import pathlib
import json
import tvm
from tvm import relay
import tvm.micro.testing
from tvm.relay.backend import Executor, Runtime
from tvm.contrib.download import download_testdata
Import a TFLite model¶
To begin with, download and import a Keyword Spotting TFLite model. This model is originally from MLPerf Tiny repository. To test this model, we use samples from KWS dataset provided by Google.
Note: By default this tutorial runs on x86 CPU using CRT, if you would like to run on Zephyr platform you need to export TVM_MICRO_USE_HW environment variable.
MODEL_URL = "https://github.com/mlcommons/tiny/raw/bceb91c5ad2e2deb295547d81505721d3a87d578/benchmark/training/keyword_spotting/trained_models/kws_ref_model.tflite"
MODEL_PATH = download_testdata(MODEL_URL, "kws_ref_model.tflite", module="model")
SAMPLE_URL = "https://github.com/tlc-pack/web-data/raw/main/testdata/microTVM/data/keyword_spotting_int8_6.pyc.npy"
SAMPLE_PATH = download_testdata(SAMPLE_URL, "keyword_spotting_int8_6.pyc.npy", module="data")
tflite_model_buf = open(MODEL_PATH, "rb").read()
try:
import tflite
tflite_model = tflite.Model.GetRootAsModel(tflite_model_buf, 0)
except AttributeError:
import tflite.Model
tflite_model = tflite.Model.Model.GetRootAsModel(tflite_model_buf, 0)
input_shape = (1, 49, 10, 1)
INPUT_NAME = "input_1"
relay_mod, params = relay.frontend.from_tflite(
tflite_model, shape_dict={INPUT_NAME: input_shape}, dtype_dict={INPUT_NAME: "int8"}
)
Defining the target¶
Now we need to define the target, runtime and executor. In this tutorial, we focused on using AOT host driven executor. We use the host micro target which is for running a model on x86 CPU using CRT runtime or running a model with Zephyr platform on qemu_x86 simulator board. In the case of a physical microcontroller, we get the target model for the physical board (E.g. nucleo_l4r5zi) and change BOARD to supported Zephyr board.
# Use the C runtime (crt) and enable static linking by setting system-lib to True
RUNTIME = Runtime("crt", {"system-lib": True})
# Simulate a microcontroller on the host machine. Uses the main() from `src/runtime/crt/host/main.cc`.
# To use physical hardware, replace "host" with something matching your hardware.
TARGET = tvm.micro.testing.get_target("crt")
# Use the AOT executor rather than graph or vm executors. Don't use unpacked API or C calling style.
EXECUTOR = Executor("aot")
if use_physical_hw:
BOARD = os.getenv("TVM_MICRO_BOARD", default="nucleo_l4r5zi")
SERIAL = os.getenv("TVM_MICRO_SERIAL", default=None)
TARGET = tvm.micro.testing.get_target("zephyr", BOARD)
Compile the model¶
Now, we compile the model for the target:
with tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True}):
module = tvm.relay.build(
relay_mod, target=TARGET, params=params, runtime=RUNTIME, executor=EXECUTOR
)
Create a microTVM project¶
Now that we have the compiled model as an IRModule, we need to create a firmware project to use the compiled model with microTVM. To do this, we use Project API. We have defined CRT and Zephyr microTVM template projects which are used for x86 CPU and Zephyr boards respectively.
template_project_path = pathlib.Path(tvm.micro.get_microtvm_template_projects("crt"))
project_options = {} # You can use options to provide platform-specific options through TVM.
if use_physical_hw:
template_project_path = pathlib.Path(tvm.micro.get_microtvm_template_projects("zephyr"))
project_options = {
"project_type": "host_driven",
"board": BOARD,
"serial_number": SERIAL,
"config_main_stack_size": 4096,
"zephyr_base": os.getenv("ZEPHYR_BASE", default="/content/zephyrproject/zephyr"),
}
temp_dir = tvm.contrib.utils.tempdir()
generated_project_dir = temp_dir / "project"
project = tvm.micro.generate_project(
template_project_path, module, generated_project_dir, project_options
)
Build, flash and execute the model¶
Next, we build the microTVM project and flash it. Flash step is specific to physical microcontrollers and it is skipped if it is simulating a microcontroller via the host main.cc or if a Zephyr emulated board is selected as the target. Next, we define the labels for the model output and execute the model with a sample with expected value of 6 (label: left).
project.build()
project.flash()
labels = [
"_silence_",
"_unknown_",
"yes",
"no",
"up",
"down",
"left",
"right",
"on",
"off",
"stop",
"go",
]
with tvm.micro.Session(project.transport()) as session:
aot_executor = tvm.runtime.executor.aot_executor.AotModule(session.create_aot_executor())
sample = np.load(SAMPLE_PATH)
aot_executor.get_input(INPUT_NAME).copyfrom(sample)
aot_executor.run()
result = aot_executor.get_output(0).numpy()
print(f"Label is `{labels[np.argmax(result)]}` with index `{np.argmax(result)}`")
Label is `left` with index `6`