tvm.relay.backend

Backend codegen modules for relay.

The Python interface to the Relay reference interpreter.

class tvm.relay.backend.interpreter.ConstructorValue(tag, fields, constructor)
class tvm.relay.backend.interpreter.RefValue(value)
class tvm.relay.backend.interpreter.Executor

An abstract interface for executing Relay programs.

evaluate(expr=None, binds=None)

Evaluate a Relay expression on the executor.

Parameters
  • expr (Optional[tvm.relay.Expr]) – The expression to evaluate.

  • binds (Optional[Map[tvm.relay.Var, tvm.relay.Expr]]) – Additional bindings of free variables.

Returns

val – The evaluation result.

Return type

Union[function, Object]

class tvm.relay.backend.interpreter.Interpreter(mod, ctx, target)

Simple interpreter interface.

Parameters
  • mod (tvm.IRModule) – The module to support the execution.

  • ctx (TVMContext) – The runtime context to run the code on.

  • target (tvm.Target) – The target option to build the function.

optimize()

Optimize functions in a module.

Returns

opt_mod – The optimized module.

Return type

tvm.IRModule

Backend code generation engine.

class tvm.relay.backend.compile_engine.LoweredOutput(outputs, implement)

Lowered output

class tvm.relay.backend.compile_engine.CCacheKey(source_func, target)

Key in the CompileEngine.

Parameters
  • source_func (tvm.relay.Function) – The source function.

  • target (tvm.Target) – The target we want to run the function on.

class tvm.relay.backend.compile_engine.CCacheValue

Value in the CompileEngine, including usage statistics.

tvm.relay.backend.compile_engine.get_shape(shape)

Convert the shape to the correct dtype and vars.

tvm.relay.backend.compile_engine.get_valid_implementations(op, attrs, inputs, out_type, target)

Get all valid implementations from the op strategy.

Note that this function doesn’t support op with symbolic input shapes.

Parameters
  • op (tvm.ir.Op) – Relay operator.

  • attrs (object) – The op attribute.

  • inputs (List[tvm.te.Tensor]) – Input tensors to the op.

  • out_type (relay.Type) – The output type.

  • target (tvm.target.Target) – The target to compile the op.

Returns

ret – The list of all valid op implementations.

Return type

List[relay.op.OpImplementation]

tvm.relay.backend.compile_engine.select_implementation(op, attrs, inputs, out_type, target, use_autotvm=True)

Select the best implementation from the op strategy.

If use_autotvm is True, it will first try to find the best implementation based on AutoTVM profile results. If no AutoTVM profile result is found, it will choose the implementation with the highest plevel.

If use_autotvm is False, it will directly choose the implementation with the highest plevel.

Note that this function doesn’t support op with symbolic input shapes.

Parameters
  • op (tvm.ir.Op) – Relay operator.

  • attrs (object) – The op attribute.

  • inputs (List[tvm.te.Tensor]) – Input tensors to the op.

  • out_type (relay.Type) – The output type.

  • target (tvm.target.Target) – The target to compile the op.

  • use_autotvm (bool) – Whether to query AutoTVM to pick the best implementation.

Returns

ret – The best op implementation and the corresponding output tensors.

Return type

tuple(relay.op.OpImplementation, List[tvm.te.Tensor])
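The selection policy described above can be illustrated with a plain-Python sketch (this is NOT the real TVM implementation; the dict shapes and the name pick_implementation are hypothetical). Each implementation carries a priority level (plevel), and tuned stands in for AutoTVM profile results mapping implementation names to measured costs:

```python
def pick_implementation(impls, tuned, use_autotvm=True):
    """Hypothetical sketch of the policy: prefer a profiled implementation,
    otherwise fall back to the highest plevel."""
    if use_autotvm:
        profiled = [i for i in impls if i["name"] in tuned]
        if profiled:
            # An AutoTVM record exists: take the cheapest measured one.
            return min(profiled, key=lambda i: tuned[i["name"]])
    # No profile results (or AutoTVM disabled): highest plevel wins.
    return max(impls, key=lambda i: i["plevel"])

impls = [
    {"name": "conv2d.generic", "plevel": 10},
    {"name": "conv2d.x86", "plevel": 15},
]
print(pick_implementation(impls, tuned={})["name"])  # highest plevel: conv2d.x86
print(pick_implementation(impls, tuned={"conv2d.generic": 0.1})["name"])  # profiled: conv2d.generic
```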

class tvm.relay.backend.compile_engine.CompileEngine

CompileEngine to get lowered code.

lower(source_func, target=None)

Lower a source_func to a CachedFunc.

Parameters
  • source_func (tvm.relay.Function) – The source function.

  • target (tvm.Target) – The target platform.

Returns

cached_func – The result of lowering.

Return type

CachedFunc

jit(source_func, target=None)

JIT a source_func to a tvm.runtime.PackedFunc.

Parameters
  • source_func (tvm.relay.Function) – The source function.

  • target (tvm.Target) – The target platform.

Returns

jited_func – The JIT-compiled function.

Return type

tvm.runtime.PackedFunc

clear()

Clear the existing cached functions.

items()

List items in the cache.

Returns

item_list – The list of items.

Return type

List[Tuple[CCacheKey, CCacheValue]]

dump()

Return a string representation of engine dump.

Returns

dump – The dumped string representation.

Return type

str

tvm.relay.backend.compile_engine.get()

Get the global compile engine.

Returns

engine – The compile engine.

Return type

tvm.relay.backend.CompileEngine

A compiler from a Relay expression to TVM’s graph runtime.

The compiler is built from a few pieces.

First we define a compiler from a single Relay expression to the graph language. We require the expression to be a function. The function’s parameters correspond to the placeholders/inputs and model parameters found in the computation graph representation. The body of the function represents the computation graph.

The compiler’s output is a program in the graph language, which is composed of Node, NodeRef, InputNode, and OpNode. This “little language” represents programs in TVM’s graph format.

To connect to the graph runtime, we use a printer that converts our graph format into TVM’s JSON format. The resulting string can be loaded by contrib.graph_runtime or any other TVM runtime compatible systems.

class tvm.relay.backend.graph_runtime_codegen.GraphRuntimeCodegen(mod, target)

The compiler from Relay to the TVM runtime system.

codegen(func)

Compile a single function into a graph.

Parameters

func (tvm.relay.Expr) – The function to compile.

Returns

  • graph_json (str) – The graph JSON that can be consumed by the graph runtime.

  • mod (IRModule or Dict[str, IRModule]) – The lowered functions.

  • params (Dict[str, tvm.nd.NDArray]) – Additional constant parameters.

The Relay Virtual Machine.

Implements a Python interface to compiling and executing on the Relay VM.

tvm.relay.backend.vm.compile(mod, target=None, target_host=None, params=None)

Compile the module to a VM executable. A helper function for VMCompiler.

Parameters
  • mod (tvm.IRModule) – The Relay module to build.

  • target (str, tvm.target.Target, or dict of str (i.e. device/context name) to str/tvm.target.Target, optional) – For heterogeneous compilation, it is a dictionary indicating context to target mapping. For homogeneous compilation, it is a build target.

  • target_host (str or tvm.target.Target, optional) – Host compilation target, if target is device. When TVM compiles device-specific programs such as CUDA, we also need host (CPU) side code to interact with the driver to set up the dimensions and parameters correctly. target_host is used to specify the host-side codegen target. By default, llvm is used if it is enabled; otherwise a stackvm interpreter is used.

  • params (dict of str to NDArray) – Input parameters to the graph that do not change during inference time. Used for constant folding.

Returns

exec – The VM executable that contains both library code and bytecode.

Return type

tvm.runtime.vm.Executable

class tvm.relay.backend.vm.VMCompiler

Compiler that compiles Relay module to VM executable.

set_params(params)

Set constant parameters for the model.

Parameters

params (dict of str to NDArray) – Input parameters to the graph that do not change during inference time. Used for constant folding.

get_params()

Return the updated weights.

lower(mod, target=None, target_host=None)

Lower the module to VM bytecode.

Parameters
  • mod (tvm.IRModule) – The Relay module to build.

  • target (str, tvm.target.Target, or dict of str (i.e. device/context name) to str/tvm.target.Target, optional) – For heterogeneous compilation, it is a dictionary indicating context to target mapping. For homogeneous compilation, it is a build target.

  • target_host (str or tvm.target.Target, optional) – Host compilation target, if target is device. When TVM compiles device-specific programs such as CUDA, we also need host (CPU) side code to interact with the driver to set up the dimensions and parameters correctly. target_host is used to specify the host-side codegen target. By default, llvm is used if it is enabled; otherwise a stackvm interpreter is used.

codegen()

Generate the kernel library.

optimize(mod, target=None, params=None)

Helper method that optimizes a Relay module via VM.

Parameters
  • mod (tvm.IRModule) –

  • target (str, tvm.target.Target, or dict of str (i.e. device/context name) to str/tvm.target.Target, optional) –

  • params (dict of str to NDArray) – Input parameters to the graph that do not change during inference time. Used for constant folding.

Returns

  • mod (tvm.IRModule) – The optimized Relay module.

  • params (dict) – The parameters of the final module.

get_exec()

Get the VM executable.

Returns

exec – The VM executable that contains both library code and bytecode.

Return type

tvm.runtime.vm.Executable

class tvm.relay.backend.vm.VMExecutor(mod, ctx, target)

An implementation of the executor interface for the Relay VM.

Useful for experimentation and debugging; the VM can also be used directly through the API supported by tvm.runtime.vm.

Parameters
  • mod (IRModule) – The module to support the execution.

  • ctx (TVMContext) – The runtime context to run the code on.

  • target (Target) – The target option to build the function.