Deploy the Pretrained Model on Adreno™ with tvmc Interface
Author: Siva Rama Krishna
This article is a step-by-step tutorial on deploying a pretrained Keras ResNet50 model on Adreno™.
You also need TVM built for Android. See the following instructions on how to build it and set up the RPC environment.
import os
import tvm
import numpy as np
from tvm import relay
from tvm.driver import tvmc
from tvm.driver.tvmc.model import TVMCPackage
from tvm.contrib import utils
Configuration
Specify the Adreno target before compiling to generate kernels that leverage textures and get all the benefits of textures.
Note: This example was generated on our x86 server for demonstration. To run it on an Android device, we need to specify the device's instruction set. Set local_demo to False if you want to run this tutorial with a real device over RPC.
local_demo = True
# By default, the model executes on the CPU target.
# Select one of 'llvm', 'opencl', or 'opencl -device=adreno'.
target = "llvm"
# Change target configuration.
# Run `adb shell cat /proc/cpuinfo` to find the arch.
arch = "arm64"
target_host = "llvm -mtriple=%s-linux-android" % arch
# Auto tuning is compute intensive and time consuming, so it is disabled for the default run. Enable it if required.
is_tuning = False
tune_log = "adreno-resnet50.log"
# Set to True to enable the OpenCLML accelerated operator library.
enable_clml = False
cross_compiler = (
    os.getenv("ANDROID_NDK_HOME", "")
    + "/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
)
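The configuration choices above can be gathered into one small helper. The following is a minimal sketch, not part of TVM: `select_config`, its dictionary keys, and the example NDK path `/opt/android-ndk` are hypothetical, and the helper only assembles the same strings used in this tutorial. It assumes `opencl -device=adreno` is the target chosen for a real device.

```python
import os


def select_config(local_demo, arch="arm64", ndk_home=None):
    """Return the target settings used in this tutorial as a dict.

    Hypothetical helper for illustration only; it is not a TVM API.
    """
    if local_demo:
        # Local x86 demonstration: plain CPU target, no cross compilation.
        return {"target": "llvm", "target_host": None, "cross": None}

    if ndk_home is None:
        ndk_home = os.getenv("ANDROID_NDK_HOME", "")
    if not ndk_home:
        raise ValueError("ANDROID_NDK_HOME must be set to cross compile for Android")

    return {
        # Assumption: the Adreno OpenCL target is wanted on a real device.
        "target": "opencl -device=adreno",
        "target_host": "llvm -mtriple=%s-linux-android" % arch,
        "cross": ndk_home
        + "/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang",
    }


print(select_config(local_demo=True))
# "/opt/android-ndk" is a placeholder path, not a real installation.
print(select_config(local_demo=False, ndk_home="/opt/android-ndk"))
```

Failing early when the NDK location is unknown avoids passing a cross compiler path that starts with `/toolchains/...` to the compilation step.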
Make a Keras Resnet50 Model
from tensorflow.keras.applications.resnet50 import ResNet50
tmp_path = utils.tempdir()
model_file_name = tmp_path.relpath("resnet50.h5")
model = ResNet50(include_top=True, weights="imagenet", input_shape=(224, 224, 3), classes=1000)
model.save(model_file_name)
Load Model
Convert a model from a supported framework to a TVM Relay module. tvmc.load supports models from many frameworks (such as TensorFlow saved_model, ONNX, TFLite, etc.) and auto-detects the file type.
tvmc_model = tvmc.load(model_file_name)
print(tvmc_model.mod)
# tvmc_model consists of tvmc_model.mod, which is the Relay module, and tvmc_model.params, which holds the parameters of the module.
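To make the idea of file type auto-detection concrete, here is a minimal sketch that guesses a frontend name from a file extension. It is an illustration only: `guess_frontend` and the extension table are hypothetical and are not meant to reproduce how tvmc.load is implemented.

```python
import os

# Hypothetical extension table, chosen for illustration only.
EXTENSION_TO_FRONTEND = {
    ".h5": "keras",
    ".onnx": "onnx",
    ".tflite": "tflite",
    ".pb": "tensorflow",
}


def guess_frontend(path):
    """Guess a frontend name from the extension of a model file."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in EXTENSION_TO_FRONTEND:
        raise ValueError("cannot guess the model format of %r" % path)
    return EXTENSION_TO_FRONTEND[ext]


print(guess_frontend("resnet50.h5"))
```

A lookup like this is why saving the Keras model with an `.h5` name, as done above, is enough for the loader to pick a frontend without being told which one to use.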
AutoTuning
The API below can be used to autotune the model for any target. Tuning requires an RPC setup; please refer to Deploy to Adreno GPU.
rpc_tracker_host = os.environ.get("TVM_TRACKER_HOST", "127.0.0.1")
rpc_tracker_port = int(os.environ.get("TVM_TRACKER_PORT", 9190))
rpc_key = "android"
rpc_tracker = rpc_tracker_host + ":" + str(rpc_tracker_port)
# Auto tuning is a compute intensive and time consuming task.
# It is set to False in the configuration above because this script runs on x86 for demonstration.
# Please set :code:`is_tuning` to True to enable auto tuning.
# Also, :code:`target` is set to :code:`llvm` in this example to keep it compatible with the x86 demonstration.
# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for an RPC target in the configuration above.
if is_tuning:
    tvmc.tune(
        tvmc_model,
        target=target,
        tuning_records=tune_log,
        target_host=target_host,
        hostname=rpc_tracker_host,
        port=rpc_tracker_port,
        rpc_key=rpc_key,
        tuner="xgb",
        repeat=30,
        trials=3,
        early_stopping=0,
    )
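Because tuning is disabled by default, the file named by `tune_log` may not exist when a later step refers to it. The sketch below makes that check explicit. `existing_tuning_records` is a hypothetical helper, not a TVM function, and this tutorial does not state how tvmc itself treats a missing log, so the check is an assumption about what a careful script might do.

```python
import os
import tempfile


def existing_tuning_records(path):
    """Return path if it names a non-empty file, otherwise None."""
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return path
    return None


# Demonstrate with a temporary directory instead of a real tuning log.
with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "adreno-resnet50.log")
    print(existing_tuning_records(log))  # the file has not been written yet
    with open(log, "w") as f:
        f.write("placeholder record\n")  # stand-in content, not a real record
    print(existing_tuning_records(log) == log)
```

The returned value could then be passed wherever a `tuning_records` argument is expected, so that an untuned run simply supplies `None`.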
Compilation
Compile the model to produce TVM artifacts.
# This example was generated on our x86 server for demonstration.
# To deploy and run on a real target over RPC, please set :code:`local_demo` to False in the configuration section above.
# OpenCLML offloading will try to accelerate supported operators by using the proprietary OpenCLML operator library.
# By default, :code:`enable_clml` is set to False in the configuration section above.
if not enable_clml:
    if local_demo:
        tvmc_package = tvmc.compile(
            tvmc_model,
            target=target,
        )
    else:
        tvmc_package = tvmc.compile(
            tvmc_model,
            target=target,
            target_host=target_host,
            cross=cross_compiler,
            tuning_records=tune_log,
        )
else:
    # Alternatively, we can save the compilation output as a TVMCPackage.
    # This allows loading the compiled module later without compiling again.
    target = target + ", clml"
    pkg_path = tmp_path.relpath("keras-resnet50.tar")
    tvmc.compile(
        tvmc_model,
        target=target,
        target_host=target_host,
        cross=cross_compiler,
        tuning_records=tune_log,
        package_path=pkg_path,
    )
    # Load the compiled package
    tvmc_package = TVMCPackage(package_path=pkg_path)
# tvmc_package consists of tvmc_package.lib_path, tvmc_package.graph, tvmc_package.params
# A saved TVMCPackage is simply a tar archive containing mod.so, mod.json and mod.params.
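Since the saved package is described above as a tar archive, its contents can be inspected with the standard library alone. The following is a minimal sketch: `list_package_members` and `make_dummy_package` are hypothetical helpers, and the archive built here holds empty placeholder files named after the members listed above, not a real compiled model.

```python
import io
import os
import tarfile
import tempfile


def list_package_members(package_path):
    """Return the sorted member names of a tar archive."""
    with tarfile.open(package_path) as tar:
        return sorted(tar.getnames())


def make_dummy_package(package_path, names=("mod.so", "mod.json", "mod.params")):
    """Write a tar archive of empty placeholder files (not a real TVM package)."""
    with tarfile.open(package_path, "w") as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keras-resnet50.tar")
    make_dummy_package(path)
    print(list_package_members(path))
```

Pointing `list_package_members` at a package written through `package_path` is a quick way to confirm that the expected members are present before deploying it.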
Deploy & Run
Deploy and run the compiled model over RPC. Let tvmc fill the inputs with random values.
# Run on RPC setup
if local_demo:
    result = tvmc.run(tvmc_package, device="cpu", fill_mode="random")
else:
    result = tvmc.run(
        tvmc_package,
        device="cl",
        rpc_key=rpc_key,
        hostname=rpc_tracker_host,
        port=rpc_tracker_port,
        fill_mode="random",
    )
# result is a dictionary of outputs.
print("Result:", result)