TVM Golang Runtime for Deep Learning Deployment


Introduction

TVM is an open deep learning compiler stack to compile various deep learning models from different frameworks to CPU, GPU or specialized accelerators. TVM supports model compilation from a wide range of front ends like Tensorflow, Onnx, Keras, Mxnet, Darknet, CoreML and Caffe2. TVM compiled modules can be deployed on backends like LLVM (Javascript or WASM, AMD GPU, ARM or X86), NVidia GPU (CUDA), OpenCL and Metal.

TVM supports runtime bindings for programming languages like Javascript, Java, Python, C++… and now Golang. With a wide range of frontend, backend and runtime bindings, TVM enables developers to integrate and deploy deep learning models from a variety of frameworks to a choice of hardware via many programming languages.

The TVM import and compilation process generates a graph JSON, a module and a params. Any application that integrates the TVM runtime can load these compiled modules and perform inference. A detailed tutorial of module import and compilation using TVM can be found at tutorials.

TVM now supports deploying compiled modules through Golang. Golang applications can make use of this to deploy the deep learning models through TVM. The scope of this blog is the introduction of gotvm package, the package build process and a sample application using gotvm to load a compiled module and perform inference.

Package

The golang package gotvm is built on top of TVM’s C runtime interface. The API in this package abstracts the native C types and provides Golang compatible types. The package source can be found at gotvm.

This package leverages golang’s interface, slices, function closures and implicitly handles the necessary conversions across API calls.

image

Golang Interface over TVM Runtime

How to

As shown in the below diagram gotvm enables golang applications to integrate deep learning models from various frameworks without the hassle of understanding each framework related interface API. Developers can make use of TVM to import and compile deep learning models and generate TVM artifacts. gotvm package provides golang friendly API to load, configure, feed input and get output.

image

Import, Compile, Integrate and Deploy

TVM Compile Deep Learning Models tutorials are available to compile models from all frameworks supported by the TVM frontend. This compilation process generates the artifacts required to integrate and deploy the model on a target.

API

gotvm package provides a handful of datatypes and API functions to initialize, load and infer from a golang application. Like any other golang package we just need to import gotvm package here.

  • Module : The Module API can be used to load a TVM compiled module into TVM runtime and access any functions.
  • Value : The Value API provides helper functions to set arguments or get return values in golang types like basic types or slices.
  • Function : The Function API is useful for getting handles to functions and invoking them.
  • Array : The Array API is useful for setting and getting Tensor data via golang slice.
  • Context : The Context API contains helper functions to build backend context handles.

Example

A simple example with inline documentation of loading a compiled module and performing inference is shown below. For simplicity the error handling is ignored here, but is important in real applications.


package main

// Import compiled gotvm package.
import (
    "./gotvm"
)

// Some constants for TVM compiled model paths.
// modLib : Is the compiled library exported out of compilation.
// modJson : TVM graph JSON.
// modParams : Exported params out of TVM compilation process.
const (
    modLib    = "./libdeploy.so"
    modJSON   = "./deploy.json"
    modParams = "./deploy.params"
)

// main
func main() {
    // Some util API to query underlying TVM and DLPack version information.
    fmt.Printf("TVM Version   : v%v\n", gotvm.TVMVersion)
    fmt.Printf("DLPACK Version: v%v\n\n", gotvm.DLPackVersion)

    // Import tvm module (so).
    modp, _ := gotvm.LoadModuleFromFile(modLib)

    // Load module on tvm runtime - call tvm.graph_runtime.create
    // with module and graph JSON.
    bytes, _ := ioutil.ReadFile(modJSON)
    jsonStr := string(bytes)
    funp, _ := gotvm.GetGlobalFunction("tvm.graph_runtime.create")
    graphrt, _ := funp.Invoke(jsonStr, modp, (int64)(gotvm.KDLCPU), (int64)(0))
    graphmod := graphrt.AsModule()


    // Allocate input & output arrays and fill some data for input.
    tshapeIn  := []int64{1, 224, 224, 3}
    tshapeOut := []int64{1, 1001}
    inX, _ := gotvm.Empty(tshapeIn, "float32", gotvm.CPU(0))
    out, _ := gotvm.Empty(tshapeOut)
    inSlice := make([]float32, (244 * 244 * 3))
    rand.Seed(10)
    rand.Shuffle(len(inSlice), func(i, j int) {inSlice[i],
                                               inSlice[j] = rand.Float32(),
                                               rand.Float32() })
    inX.CopyFrom(inSlice)

    // Load params
    bytes, _ = ioutil.ReadFile(modParams)
    funp, _ = graphmod.GetFunction("load_params")
    funp.Invoke(bytes)


    // Set module input
    funp, _ = graphmod.GetFunction("set_input")
    funp.Invoke("input", inX)

    // Run or Execute the graph
    funp, _ = graphmod.GetFunction("run")
    funp.Invoke()

    // Get output from runtime.
    funp, _ = graphmod.GetFunction("get_output")
    funp.Invoke(int64(0), out)

    // Access output tensor data.
    outIntf, _ := out.AsSlice()
    outSlice := outIntf.([]float32)

    // outSlice here holds flattened output data as a golang slice.
}

gotvm extends the TVM packed function system to support golang function closures as packed functions. Examples available to register golang closure as TVM packed function and invoke the same across programming language barriers.

Show me the code

References