tvm.contrib

Contrib APIs of TVM python package.

Contrib API provides many useful non-core features, including utilities for interacting with third-party libraries and tools.

tvm.contrib.cblas

External function interface to BLAS libraries.

tvm.contrib.cblas.matmul(lhs, rhs, transa=False, transb=False, **kwargs)

Create an extern op that computes matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor
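
The transa/transb semantics described above can be sketched in pure Python (a reference for the math only; the real op dispatches to CBLAS):

```python
# Reference semantics for matmul(lhs, rhs, transa, transb) on nested lists.
def matmul_ref(lhs, rhs, transa=False, transb=False):
    # transa/transb transpose the corresponding operand before the multiply.
    a = [list(row) for row in zip(*lhs)] if transa else lhs
    b = [list(row) for row in zip(*rhs)] if transb else rhs
    # Standard row-by-column dot products.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

For example, matmul_ref([[1, 2], [3, 4]], [[5, 6], [7, 8]]) yields [[19, 22], [43, 50]].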

tvm.contrib.cblas.batch_matmul(lhs, rhs, transa=False, transb=False, iterative=False, **kwargs)

Create an extern op that computes batched matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor
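
The batched variant applies the same per-matrix semantics over the leading batch dimension; a pure-Python reference sketch (the real op dispatches to CBLAS):

```python
# Reference semantics for batch_matmul on 3-D nested lists [batch][row][col].
def batch_matmul_ref(lhs, rhs, transa=False, transb=False):
    out = []
    for a, b in zip(lhs, rhs):  # iterate over the batch dimension
        if transa:
            a = [list(row) for row in zip(*a)]
        if transb:
            b = [list(row) for row in zip(*b)]
        out.append(
            [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
        )
    return out
```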

tvm.contrib.clang

Util to invoke clang in the system.

tvm.contrib.clang.find_clang(required=True)

Find clang in system.

Parameters:

required (bool) – Whether the compiler is required. If True, a RuntimeError is raised when no clang can be found.

Returns:

valid_list – List of possible paths.

Return type:

list of str

Note

This function will first search for a clang whose version matches the major LLVM version that TVM was built with.
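
This kind of compiler search can be sketched with shutil.which alone; the candidate names below are illustrative, not the exact list TVM probes:

```python
# Hedged sketch of a find_clang-style search (candidate names are assumptions).
import shutil

def find_clang_like(candidates=("clang-17", "clang"), required=False):
    # Keep only candidates that resolve to an executable on PATH.
    valid_list = [p for p in (shutil.which(c) for c in candidates) if p]
    if required and not valid_list:
        raise RuntimeError("cannot find clang")
    return valid_list
```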

tvm.contrib.clang.create_llvm(inputs, output=None, options=None, cc=None)

Create llvm text ir.

Parameters:
  • inputs (list of str) – List of input files name or code source.

  • output (str, optional) – Output file. If it is None, a temporary file is created.

  • options (list) – The list of additional options string.

  • cc (str, optional) – The clang compiler, if not specified, we will try to guess the matched clang version.

Returns:

code – The generated llvm text IR.

Return type:

str

tvm.contrib.cc

Util to invoke C/C++ compilers in the system.

tvm.contrib.cc.get_cc()

Return the path to the default C/C++ compiler.

Returns:

out – The path to the default C/C++ compiler, or None if none was found.

Return type:

Optional[str]

tvm.contrib.cc.create_shared(output, objects, options=None, cc=None, cwd=None, ccache_env=None)

Create shared library.

Parameters:
  • output (str) – The target shared library.

  • objects (List[str]) – List of object files.

  • options (List[str]) – The list of additional options string.

  • cc (Optional[str]) – The compiler command.

  • cwd (Optional[str]) – The current working directory.

  • ccache_env (Optional[Dict[str, str]]) – The environment variable for ccache. Set None to disable ccache by default.
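
Conceptually, create_shared assembles a linker command from these arguments and runs it; a minimal sketch of the assumed command shape (flags and default compiler are illustrative):

```python
# Sketch of the command a create_shared-style helper assembles (assumed shape,
# not TVM's exact implementation).
def shared_lib_cmd(output, objects, options=None, cc="g++"):
    cmd = [cc, "-shared", "-fPIC", "-o", output]
    cmd += list(objects)          # object files to link
    if options:
        cmd += list(options)      # extra flags, e.g. ["-lm"]
    return cmd
```

The returned list would then be passed to something like subprocess.run(cmd, cwd=cwd).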

tvm.contrib.cc.create_staticlib(output, inputs, ar=None)

Create static library.

Parameters:
  • output (str) – The target static library.

  • inputs (List[str]) – List of input files. Each input file can be a tarball of objects or an object file.

  • ar (Optional[str]) – Path to the ar command to be used

tvm.contrib.cc.create_executable(output, objects, options=None, cc=None, cwd=None, ccache_env=None)

Create executable binary.

Parameters:
  • output (str) – The target executable.

  • objects (List[str]) – List of object files.

  • options (List[str]) – The list of additional options string.

  • cc (Optional[str]) – The compiler command.

  • cwd (Optional[str]) – The current working directory.

  • ccache_env (Optional[Dict[str, str]]) – The environment variable for ccache. Set None to disable ccache by default.

tvm.contrib.cc.get_global_symbol_section_map(path, *, nm=None) dict[str, str]

Get global symbols from a library via nm -g

Parameters:
  • path (str) – The library path

  • nm (str) – The path to nm command

Returns:

symbol_section_map – A map from defined global symbol to their sections

Return type:

Dict[str, str]
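
Parsing nm -g output into such a map can be sketched in a few lines (the three-column "address, section letter, symbol" format is the usual nm layout; undefined symbols, which lack an address column, are skipped):

```python
# Hedged sketch of turning `nm -g` text into a symbol -> section map.
def parse_nm_output(text):
    symbol_section_map = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:             # address, section letter, symbol name
            _, section, symbol = parts
            symbol_section_map[symbol] = section
    return symbol_section_map
```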

tvm.contrib.cc.get_target_by_dump_machine(compiler)

Create a get_target_triple function that obtains the target triple using the given compiler.

Parameters:

compiler (Optional[str]) – The compiler.

Returns:

out – A function that can get the target triple according to the -dumpmachine option of the compiler.

Return type:

Callable

tvm.contrib.cc.cross_compiler(compile_func, options=None, output_format=None, get_target_triple=None, add_files=None)

Create a cross compiler function by specializing compile_func with options.

This function can be used to construct compile functions that can be passed to AutoTVM measure or export_library.

Parameters:
  • compile_func (Union[str, Callable[[str, str, Optional[str]], None]]) – Function that performs the actual compilation

  • options (Optional[List[str]]) – List of additional option strings.

  • output_format (Optional[str]) – Library output format.

  • get_target_triple (Optional[Callable]) – Function that can get the target triple according to the -dumpmachine option of the compiler.

  • add_files (Optional[List[str]]) – List of paths to additional object, source, library files to pass as part of the compilation.

Returns:

fcompile – A compilation function that can be passed to export_library.

Return type:

Callable[[str, str, Optional[str]], None]

Examples

from tvm.contrib import cc, ndk
# export using arm gcc
mod = build_runtime_module()
mod.export_library(path_dso,
                   fcompile=cc.cross_compiler("arm-linux-gnueabihf-gcc"))
# specialize ndk compilation options.
specialized_ndk = cc.cross_compiler(
    ndk.create_shared,
    ["--sysroot=/path/to/sysroot", "-shared", "-fPIC", "-lm"])
mod.export_library(path_dso, fcompile=specialized_ndk)

tvm.contrib.coreml_runtime

CoreML runtime that loads and runs CoreML models.

tvm.contrib.coreml_runtime.create(symbol, compiled_model_path, device)

Create a runtime executor module given a coreml model and context.

Parameters:
  • symbol (str) – The symbol that represents the Core ML model.

  • compiled_model_path (str) – The path of the compiled model to be deployed.

  • device (Device) – The device to deploy the module. It can be local or remote when there is only one Device.

Returns:

coreml_runtime – Runtime coreml module that can be used to execute the coreml model.

Return type:

CoreMLModule

class tvm.contrib.coreml_runtime.CoreMLModule(module)

Wrapper runtime module.

This is a thin wrapper of the underlying TVM module. You can also directly call set_input, run, and get_output of the underlying module.

Parameters:

module (Module) – The internal tvm module that holds the actual coreml functions.

module

The internal tvm module that holds the actual coreml functions.

Type:

Module

tvm.contrib.cublas

External function interface to cuBLAS libraries.

tvm.contrib.cublas.matmul(lhs, rhs, transa=False, transb=False, dtype=None)

Create an extern op that computes matrix multiplication of lhs and rhs with cuBLAS.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.cublas.batch_matmul(lhs, rhs, transa=False, transb=False, dtype=None)

Create an extern op that computes batched matrix multiplication of lhs and rhs with cuBLAS.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.cublaslt

External function interface to cuBLASLt libraries.

tvm.contrib.cublaslt.matmul(lhs, rhs, transa=False, transb=False, n=0, m=0, dtype=None)

Create an extern op that computes matrix multiplication of lhs and rhs with cuBLASLt.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.cudnn

External function interface to CuDNN v7 library.

tvm.contrib.cudnn.exists()

Checks whether the local machine can use CuDNN.

Returns:

exists – True if CuDNN support is enabled and a CuDNN-capable GPU exists. Otherwise, False.

Return type:

bool

tvm.contrib.cudnn.algo_to_index(algo_type, algo_name)

Return an index representing the algorithm, which can be used when calling cuDNN functions.

Parameters:
  • algo_type (str) – One of "fwd", "bwd_filter", or "bwd_data".

  • algo_name (str) –

    Algorithm name as defined in cuDNN. For example:

    • fwd: CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM, etc.

    • bwd_filter: CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0, etc.

    • bwd_data: CUDNN_CONVOLUTION_BWD_DATA_ALGO_0, etc.

Returns:

algo – Algorithm index

Return type:

int
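
The mapping from algorithm name to index amounts to a position lookup in per-type name tables; a sketch (the table contents and ordering here are assumptions for illustration, not the full cuDNN enum lists):

```python
# Illustrative sketch of an algo_to_index-style lookup table.
FWD_ALGOS = [
    "CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM",
    "CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM",
]

def algo_to_index_sketch(algo_type, algo_name):
    # One name table per algo_type ("fwd", "bwd_filter", "bwd_data").
    table = {"fwd": FWD_ALGOS}
    return table[algo_type].index(algo_name)
```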

tvm.contrib.cudnn.conv_output_shape(tensor_format, pad, stride, dilation, x_shape, w_shape, data_dtype, conv_dtype, groups=1)

Get output shape of 2D or 3D convolution

Parameters:
  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • x_shape (list) – input shape

  • w_shape (list) – weight shape

  • data_dtype (str) – data type

  • conv_dtype (str) – convolution type

  • groups (int) – number of groups

Returns:

oshape – output shape

Return type:

list
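
The per-dimension arithmetic behind the output shape is the standard convolution formula; a pure-Python sketch (assuming a single symmetric pad per side):

```python
# Output size of one spatial dimension of a convolution (standard formula).
def conv_out_dim(in_size, kernel, pad, stride, dilation=1):
    effective_kernel = dilation * (kernel - 1) + 1  # dilated kernel extent
    return 1 + (in_size + 2 * pad - effective_kernel) // stride
```

For example, a 5-wide input with a 3-wide kernel, pad 1, stride 1 stays 5 wide; with stride 2 it becomes 3 wide.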

tvm.contrib.cudnn.conv_dgrad_shape(tensor_format, pad, stride, dilation, dy_shape, w_shape, output_padding=(0, 0), groups=1)

Get output shape of conv2d gradient with respect to data

Parameters:
  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • dy_shape (list) – output gradient shape

  • w_shape (list) – weight shape

  • data_dtype (str) – data type

  • conv_dtype (str) – convolution type

  • groups (int) – number of groups

Returns:

oshape – output shape

Return type:

list
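
The data-gradient shape inverts the forward output-size formula (transposed-convolution arithmetic); a sketch assuming a single symmetric pad per side and the standard formula:

```python
# Input (dgrad) size of one spatial dimension, inverting the forward formula.
def dgrad_out_dim(dy_size, kernel, pad, stride, dilation=1, output_padding=0):
    effective_kernel = dilation * (kernel - 1) + 1  # dilated kernel extent
    return (dy_size - 1) * stride - 2 * pad + effective_kernel + output_padding
```

output_padding disambiguates cases where the forward stride discarded a remainder.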

tvm.contrib.cudnn.conv_forward_find_algo(tensor_format, pad, stride, dilation, x_shape, w_shape, y_shape, data_dtype, conv_dtype, groups=1, verbose=True)

Choose the best forward algorithm for the given input.

Parameters:
  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • x_shape (list) – input shape

  • w_shape (list) – weight shape

  • y_shape (list) – output shape

  • data_dtype (str) – data type

  • conv_dtype (str) – convolution type

  • groups (int) – number of groups

  • verbose (bool) – whether to show the selection trials

Returns:

algo – algo chosen by CUDNN

Return type:

int

tvm.contrib.cudnn.conv_backward_data_find_algo(tensor_format, pad, stride, dilation, dy_shape, w_shape, dx_shape, data_dtype, conv_dtype, groups=1, verbose=True)

Choose the best backward data algorithm for the given input.

Parameters:
  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • dy_shape (list) – output gradient shape

  • w_shape (list) – weight shape

  • dx_shape (list) – dgrad shape

  • data_dtype (str) – data type

  • conv_dtype (str) – convolution type

  • groups (int) – number of groups

  • verbose (bool) – whether to show the selection trials

Returns:

algo – algo chosen by CUDNN

Return type:

int

tvm.contrib.cudnn.conv_backward_filter_find_algo(tensor_format, pad, stride, dilation, dy_shape, x_shape, dw_shape, data_dtype, conv_dtype, groups=1, verbose=True)

Choose the best backward filter algorithm for the given input.

Parameters:
  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • dy_shape (list) – output gradient shape

  • x_shape (list) – input shape

  • dw_shape (list) – wgrad shape

  • data_dtype (str) – data type

  • conv_dtype (str) – convolution type

  • groups (int) – number of groups

  • verbose (bool) – whether to show the selection trials

Returns:

algo – algo chosen by CUDNN

Return type:

int

tvm.contrib.cudnn.conv_forward(x, w, pad, stride, dilation, conv_mode, tensor_format, algo, conv_dtype, groups=1, verbose=True)

Create an extern op that computes 2D or 3D convolution with cuDNN.

Parameters:
  • x (Tensor) – input feature map

  • w (Tensor) – convolution weight

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • conv_mode (int) – 0: CUDNN_CONVOLUTION 1: CUDNN_CROSS_CORRELATION

  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C

  • algo (int) – Forward algorithm; get the index from the algo_to_index function. If algo == -1, the best algorithm will be chosen by cuDNN.

  • conv_dtype (str) – convolution type

  • groups (int) – the number of groups

  • verbose (bool) – whether to show the selection trials

Returns:

y – The result tensor

Return type:

Tensor

tvm.contrib.cudnn.conv_backward_data(dy, w, pad, stride, dilation, conv_mode, tensor_format, conv_dtype, groups=1, output_padding=(0, 0))

Create a CuDNN extern op that computes the gradient of 2D convolution with respect to data.

Parameters:
  • dy (Tensor) – output gradient

  • w (Tensor) – convolution weight

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • conv_mode (int) – 0: CUDNN_CONVOLUTION 1: CUDNN_CROSS_CORRELATION

  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC

  • conv_dtype (str) – convolution type

  • groups (int) – the number of groups

Returns:

dx – dgrad tensor

Return type:

Tensor

tvm.contrib.cudnn.conv_backward_filter(dy, x, kernel_size, pad, stride, dilation, conv_mode, tensor_format, conv_dtype, groups=1)

Create a CuDNN extern op that computes the gradient of 2D convolution with respect to weight.

Parameters:
  • dy (Tensor) – output gradient

  • x (Tensor) – input tensor

  • kernel_size (a pair of int) – The spatial size of the corresponding forward convolution kernel

  • pad (int or list) – padding

  • stride (int or list) – stride

  • dilation (int or list) – dilation

  • conv_mode (int) – 0: CUDNN_CONVOLUTION 1: CUDNN_CROSS_CORRELATION

  • tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC

  • conv_dtype (str) – convolution type

  • groups (int) – the number of groups

Returns:

dw – wgrad tensor

Return type:

Tensor

tvm.contrib.cudnn.softmax(x, axis=-1)

Compute softmax using CuDNN

Parameters:
  • x (tvm.te.Tensor) – The input tensor

  • axis (int) – The axis to compute the softmax

Returns:

ret – The result tensor

Return type:

tvm.te.Tensor

tvm.contrib.cudnn.log_softmax(x, axis=-1)

Compute log_softmax using CuDNN

Parameters:
  • x (tvm.te.Tensor) – The input tensor

  • axis (int) – The axis to compute log softmax over

Returns:

ret – The result tensor

Return type:

tvm.te.Tensor
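
A pure-Python reference for the 1-D softmax semantics computed here (a numerically stable max-subtraction sketch; the real op dispatches to cuDNN):

```python
# Reference softmax over a 1-D list, with max subtraction for stability.
import math

def softmax_ref(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]
```

log_softmax is simply the elementwise log of this result, usually computed directly as (x - m) - log(s) for stability.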

tvm.contrib.dlpack

Wrapping functions to bridge frameworks with DLPack support to TVM

tvm.contrib.dlpack.convert_func(tvm_func, tensor_type, to_dlpack_func)

Convert a tvm function into one that accepts a tensor from another framework, provided the other framework supports DLPack.

Parameters:
  • tvm_func (Function) – Built tvm function operating on arrays

  • tensor_type (Type) – Type of the tensors of the target framework

  • to_dlpack_func (Function) – Function to convert the source tensors to DLPACK

tvm.contrib.dlpack.to_pytorch_func(tvm_func)

Convert a tvm function into one that accepts PyTorch tensors

Parameters:

tvm_func (Function) – Built tvm function operating on arrays

Returns:

wrapped_func – Wrapped tvm function that operates on PyTorch tensors

Return type:

Function

tvm.contrib.dnnl

External function interface to the DNNL (oneDNN) library.

tvm.contrib.dnnl.matmul(lhs, rhs, transa=False, transb=False, **kwargs)

Create an extern op that computes matrix multiplication of lhs and rhs with DNNL. This function serves as an example of how to call external libraries.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.dnnl.dnnl_conv2d(src, weights, stride, padding, dilation, groups, channel_last=False, out_dtype='float32', **kwargs)

Convolution operator in NCHW layout.

Parameters:
  • src (tvm.te.Tensor) – 4-D with shape [batch, in_channel, in_height, in_width]

  • weights (tvm.te.Tensor) – 4-D with shape [num_filter, in_channel, filter_height, filter_width]

  • stride (int or a list/tuple of two ints) – Stride size, or [stride_height, stride_width]

  • padding (int or a list/tuple of 2 or 4 ints) – padding size, or [pad_height, pad_width] for 2 ints, or [pad_top, pad_left, pad_bottom, pad_right] for 4 ints

  • dilation (int or a list/tuple of two ints) – dilation size, or [dilation_height, dilation_width]

  • groups (int) – The number of groups.

  • channel_last (bool) – Whether the input/output data format is channel-last (NHWC) rather than plain (NCHW).

  • out_dtype (str) – Output data type; currently only float32 is supported.

Returns:

Output – 4-D with shape [batch, out_channel, out_height, out_width]

Return type:

tvm.te.Tensor

tvm.contrib.download

Helper utility for downloading

tvm.contrib.download.download(url, path, overwrite=False, size_compare=False, retries=3)

Downloads the file from the internet. Set the options to control overwriting and size comparison.

Parameters:
  • url (str) – Download url.

  • path (str) – Local file path to save downloaded file.

  • overwrite (bool, optional) – Whether to overwrite existing file, defaults to False.

  • size_compare (bool, optional) – Whether to do size compare to check downloaded file, defaults to False

  • retries (int, optional) – Number of times to retry the download, defaults to 3.
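
The retry behavior can be sketched as a generic retry loop (logic assumed for illustration, not TVM's exact implementation; fetch stands in for the actual HTTP request):

```python
# Sketch of a download-with-retries loop; `fetch` is any zero-argument
# callable that raises on transient failure.
def fetch_with_retries(fetch, retries=3):
    last_err = None
    for _ in range(retries):
        try:
            return fetch()
        except Exception as err:  # broad catch is intentional here
            last_err = err
    raise RuntimeError(f"download failed after {retries} retries") from last_err
```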

tvm.contrib.download.download_testdata(url, relpath, module=None, overwrite=False)

Downloads the test data from the internet.

Parameters:
  • url (str) – Download url.

  • relpath (str) – Relative file path.

  • module (Union[str, list, tuple], optional) – Subdirectory paths under test data folder.

  • overwrite (bool, defaults to False) – If True, will download a fresh copy of the file regardless of the cache. If False, will only download the file if a cached version is missing.

Returns:

abspath – Absolute file path of downloaded file

Return type:

str

tvm.contrib.emcc

Util to invoke emscripten compilers in the system.

tvm.contrib.emcc.create_tvmjs_wasm(output, objects, options=None, cc='emcc', libs=None)

Create wasm that is supposed to run with the tvmjs.

Parameters:
  • output (str) – The target shared library.

  • objects (list) – List of object files.

  • options (str) – The additional options.

  • cc (str, optional) – The compile string.

  • libs (list) – List of user-defined library files (e.g. .bc files) to add into the wasm.

tvm.contrib.hipblas

External function interface to hipBLAS libraries.

tvm.contrib.hipblas.matmul(lhs, rhs, transa=False, transb=False, dtype=None)

Create an extern op that computes matrix multiplication of lhs and rhs with hipBLAS.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.hipblas.batch_matmul(lhs, rhs, transa=False, transb=False, dtype=None)

Create an extern op that computes batched matrix multiplication of lhs and rhs with hipBLAS.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.mkl

External function interface to the MKL library.

tvm.contrib.mkl.matmul(lhs, rhs, transa=False, transb=False, **kwargs)

Create an extern op that computes matrix multiplication of lhs and rhs with MKL. This function serves as an example of how to call external libraries.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.mkl.matmul_u8s8s32(lhs, rhs, transa=False, transb=False, **kwargs)

Create an extern op that computes matrix multiplication of lhs and rhs with MKL. This function serves as an example of how to call external libraries.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.mkl.batch_matmul(lhs, rhs, transa=False, transb=False, iterative=False, **kwargs)

Create an extern op that computes batched matrix multiplication of lhs and rhs with MKL. This function serves as an example of how to call external libraries.

Parameters:
  • lhs (Tensor) – The left matrix operand

  • rhs (Tensor) – The right matrix operand

  • transa (bool) – Whether transpose lhs

  • transb (bool) – Whether transpose rhs

Returns:

C – The result tensor.

Return type:

Tensor

tvm.contrib.ndk

Util to invoke NDK compiler toolchain.

tvm.contrib.ndk.create_shared(output, objects, options=None)

Create shared library.

Parameters:
  • output (str) – The target shared library.

  • objects (list) – List of object files.

  • options (list of str, optional) – The additional options.

tvm.contrib.ndk.create_staticlib(output, inputs)

Create static library.

Parameters:
  • output (str) – The target static library.

  • inputs (list) – List of object files or tar files

tvm.contrib.ndk.get_global_symbol_section_map(path, *, nm=None) dict[str, str]

Get global symbols from a library via nm -gU in NDK

Parameters:
  • path (str) – The library path

  • nm (str) – The path to nm command

Returns:

symbol_section_map – A map from defined global symbol to their sections

Return type:

Dict[str, str]

tvm.contrib.nnpack

External function interface to NNPACK libraries.

tvm.contrib.nnpack.is_available()

Check whether NNPACK is available, that is, nnp_initialize() returns nnp_status_success.

tvm.contrib.nnpack.fully_connected_inference(lhs, rhs, nthreads=1)

Create an extern op that computes a fully connected operation of 1D tensor lhs and 2D tensor rhs with nnpack.

Parameters:
  • lhs (Tensor) – 1D array input[input_channels] of FP32 elements

  • rhs (Tensor) – 2D matrix kernel[output_channels][input_channels] of FP32 elements

Returns:

C – 1D array out[output_channels] of FP32 elements.

Return type:

Tensor

tvm.contrib.nnpack.convolution_inference(data, kernel, bias, padding, stride, nthreads=1, algorithm=0)

Create an extern op to do inference convolution of 4D tensor data and 4D tensor kernel and 1D tensor bias with nnpack.

Parameters:
  • data (Tensor) – data 4D tensor input[batch][input_channels][input_height][input_width] of FP32 elements.

  • kernel (Tensor) – kernel 4D tensor kernel[output_channels][input_channels][kernel_height] [kernel_width] of FP32 elements.

  • bias (Tensor) – bias 1D array bias[output_channels] of FP32 elements.

  • padding (list) – padding A 4-dim list of [pad_top, pad_bottom, pad_left, pad_right], which indicates the padding around the feature map.

  • stride (list) – stride A 2-dim list of [stride_height, stride_width], which indicates the stride.

Returns:

output – output 4D tensor output[batch][output_channels][output_height][output_width] of FP32 elements.

Return type:

Tensor

tvm.contrib.nnpack.convolution_inference_without_weight_transform(data, transformed_kernel, bias, padding, stride, nthreads=1, algorithm=0)

Create an extern op to do inference convolution of 4D tensor data and 4D pre-transformed tensor kernel and 1D tensor bias with nnpack.

Parameters:
  • data (Tensor) – data 4D tensor input[batch][input_channels][input_height][input_width] of FP32 elements.

  • transformed_kernel (Tensor) – transformed_kernel 4D tensor kernel[output_channels][input_channels][tile] [tile] of FP32 elements.

  • bias (Tensor) – bias 1D array bias[output_channels] of FP32 elements.

  • padding (list) – padding A 4-dim list of [pad_top, pad_bottom, pad_left, pad_right], which indicates the padding around the feature map.

  • stride (list) – stride A 2-dim list of [stride_height, stride_width], which indicates the stride.

Returns:

output – output 4D tensor output[batch][output_channels][output_height][output_width] of FP32 elements.

Return type:

Tensor

tvm.contrib.nnpack.convolution_inference_weight_transform(kernel, nthreads=1, algorithm=0, dtype='float32')

Create an extern op that pre-transforms a 4D tensor kernel with nnpack, for use with convolution_inference_without_weight_transform.

Parameters:

kernel (Tensor) – kernel 4D tensor kernel[output_channels][input_channels][kernel_height] [kernel_width] of FP32 elements.

Returns:

output – output 4D tensor output[output_channels][input_channels][tile][tile] of FP32 elements.

Return type:

Tensor

tvm.contrib.nvcc

Utility to invoke nvcc compiler in the system

tvm.contrib.nvcc.compile_cuda(code, target_format=None, arch=None, options=None, path_target=None, compiler='nvcc')

Compile CUDA code with NVCC or NVRTC.

Parameters:
  • code (str) – The CUDA code.

  • target_format (str) – The target format of the compiler (“ptx”, “cubin”, or “fatbin”).

  • arch (str) – The CUDA architecture.

  • options (str or list of str) – The additional options.

  • path_target (str, optional) – Output file.

  • compiler (str, optional) – Compiler backend: “nvcc” or “nvrtc”. This can be set by the TVM_CUDA_COMPILE_MODE environment variable.

Returns:

res_binary – The bytearray of the compiled binary (ptx/cubin/fatbin).

Return type:

bytearray

Notes

  • NVRTC is a “runtime” compilation library and can be faster for JIT compilation.

  • NVRTC requires cuda-python: pip install cuda-python

tvm.contrib.nvcc.find_cuda_path()

Utility function to find CUDA path

Returns:

path – Path to CUDA root.

Return type:

str

tvm.contrib.nvcc.get_cuda_version(cuda_path=None)

Utility function to get CUDA version

Parameters:

cuda_path (Optional[str]) – Path to CUDA root. If None is passed, will use find_cuda_path() as default.

Returns:

version – The CUDA version

Return type:

float

tvm.contrib.nvcc.find_nvshmem_paths() tuple[str, str]

Searches for the NVSHMEM include and library directories.

Return type:

A tuple containing the path to the include directory and the library directory.

tvm.contrib.nvcc.parse_compute_version(compute_version)

Parse compute capability string to divide major and minor version

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.0”)

Returns:

  • major (int) – major version number

  • minor (int) – minor version number
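
Splitting a compute-capability string into major and minor parts is a two-line operation; a pure-Python reference for the behavior described above:

```python
# Reference for parsing a compute capability string like "6.0" or "7.5".
def parse_compute_version_ref(compute_version):
    major, minor = compute_version.split(".")
    return int(major), int(minor)
```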

tvm.contrib.nvcc.have_fp16(compute_version)

Whether fp16 support is provided by the given compute capability.

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.0”)

tvm.contrib.nvcc.have_int8(compute_version)

Whether int8 support is provided by the given compute capability.

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.1”)

tvm.contrib.nvcc.have_tensorcore(compute_version=None, target=None)

Whether TensorCore support is provided by the given compute capability.

Parameters:
  • compute_version (str, optional) – compute capability of a GPU (e.g. “7.0”).

  • target (tvm.target.Target, optional) – The compilation target, will be used to determine arch if compute_version isn’t specified.

tvm.contrib.nvcc.have_cudagraph()

Whether CUDA Graph support is provided.

tvm.contrib.pickle_memoize

Memoize result of function via pickle, used for cache testcases.

class tvm.contrib.pickle_memoize.Cache(key, save_at_exit)

A cache object for result cache.

Parameters:
  • key (str) – The file key to the function

  • save_at_exit (bool) – Whether to save the cache to file when the program exits

property cache

Return the cache, initializing on first use.

tvm.contrib.pickle_memoize.memoize(key, save_at_exit=False)

Memoize the result of a function and reuse it multiple times.

Parameters:
  • key (str) – The unique key to the file

  • save_at_exit (bool) – Whether to save the cache to file when the program exits

Returns:

fmemoize – The decorator function to perform memoization.

Return type:

function
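
The core memoization behavior can be sketched in-process (the real decorator additionally pickles the cache to a file named after key; that persistence is omitted here):

```python
# Minimal in-memory sketch of a memoize(key) decorator; `key` names the
# on-disk cache file in the real implementation and is unused here.
import functools

def memoize_sketch(key):
    def decorator(fn):
        cache = {}
        @functools.wraps(fn)
        def wrapper(*args):
            if args not in cache:
                cache[args] = fn(*args)   # compute once per argument tuple
            return cache[args]
        wrapper.cache = cache
        return wrapper
    return decorator
```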

tvm.contrib.popen_pool

Multiprocessing via Popen.

This module provides a multi-processing pool backed by Popen, with additional timeout support.

tvm.contrib.popen_pool.kill_child_processes(pid)

Kill all child processes recursively for a given pid.

Parameters:

pid (int) – The given process id.

class tvm.contrib.popen_pool.StatusKind(value)

Running and return value status.

class tvm.contrib.popen_pool.MapResult(status, value)

Result of map_with_error_catching.

Parameters:
  • status (StatusKind) – The status of the result.

  • value (Any) – The result value.

class tvm.contrib.popen_pool.PopenWorker(initializer=None, initargs=(), maximum_uses=None, stdout=None, stderr=None)

A subprocess worker via Popen.

PopenWorker provides a low-level API to interact with a separate process via Popen.

Parameters:
  • initializer (callable or None) – A callable initializer, or None

  • initargs (Tuple[object]) – A tuple of args for the initializer

  • maximum_uses (Optional[int]) – The maximum number of times a process can be used before being recycled, i.e. killed and restarted. If None, the process will be reused until an operation times out.

  • stdout (Union[None, int, IO[Any]]) – The standard output streams handler specified for the popen process.

  • stderr (Union[None, int, IO[Any]]) – The standard error streams handler specified for the popen process.

kill()

Kill the current running process and cleanup.

Note

The worker can start a new process when send is called again.

join(timeout=None)

Join the current process worker before it terminates.

Parameters:

timeout (Optional[number]) – Timeout value, block at most timeout seconds if it is a positive number.

is_alive()

Check if the process is alive

send(fn, args=(), kwargs=None, timeout=None)

Send a new function task fn(*args, **kwargs) to the subprocess.

Parameters:
  • fn (function) – The function to be invoked.

  • args (list) – Positional argument.

  • kwargs (dict) – Keyword arguments

  • timeout (float) – Timeout value when executing the function

Note

The caller must call recv before calling the next send in order to make sure the timeout and child process exit won’t affect the later requests.

recv()

Receive the result of the last send.

Returns:

result – The result of the last send.

Return type:

object
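
The send/recv round trip can be illustrated with a raw subprocess.Popen (plain stdlib code, not the TVM class): the parent writes a request to the child's stdin and reads the reply from its stdout.

```python
# Minimal parent/child round trip mirroring the send/recv pattern that
# PopenWorker builds on (the child here is a trivial doubling "worker").
import subprocess
import sys

child = subprocess.Popen(
    [sys.executable, "-c", "print(int(input()) * 2)"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
reply, _ = child.communicate("21\n")  # "send" the task, "recv" the result
```

After this, reply.strip() is "42".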

class tvm.contrib.popen_pool.PopenPoolExecutor(max_workers=None, timeout=None, initializer=None, initargs=(), maximum_process_uses=None, stdout=None, stderr=None)

A parallel executor backed by Popen processes.

Parameters:
  • max_workers (int) – Maximum number of workers

  • timeout (float) – Timeout value for each function submit.

  • initializer (callable or None) – A callable initializer, or None

  • initargs (Tuple[object]) – A tuple of args for the initializer

  • maximum_process_uses (Optional[int]) – The maximum number of times each process can be used before being recycled, i.e. killed and restarted. If None, processes will be reused until an operation times out.

  • stdout (Union[None, int, IO[Any]]) – The standard output streams handler specified for the workers in the pool.

  • stderr (Union[None, int, IO[Any]]) – The standard error streams handler specified for the workers in the pool.

Note

If max_workers is None, the number returned by os.cpu_count() is used. This aligns with the behavior of multiprocessing.Pool.

shutdown(wait=True)

Shutdown the executor and clean up resources.

Parameters:

wait (bool) – If True, wait for pending work to complete.

Note

DEADLOCK WARNING: This method can deadlock when called during garbage collection due to exception reference cycles. When exceptions occur, Python creates reference cycles that delay garbage collection. The deadlock happens when: exception creates reference cycle → new pool creates worker → GC cleans old pool → old pool’s __del__ calls shutdown() which tries to acquire locks again.

submit(fn, *args, **kwargs) Future

Submit a new function job to the pool

Parameters:
  • fn (function) – The function to be invoked.

  • args (list) – Positional argument.

  • kwargs (dict) – Keyword arguments

Returns:

future – A future that can be used to access the result.

Return type:

concurrent.futures.Future

map_with_error_catching(fn, iterator)

Same as map, but catches exceptions and returns them instead of raising.

Parameters:
  • fn (function) – The function to be invoked.

  • iterator (Iterator) – Input iterator.

Returns:

out_iter – The result iterator.

Return type:

Iterator[MapResult]
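
PopenPoolExecutor follows the same submit/future calling convention as the standard library's executors. As a rough sketch of that pattern, using the stdlib ProcessPoolExecutor as a stand-in for the Popen-backed pool:

```python
from concurrent.futures import ProcessPoolExecutor

def run_jobs(values):
    # Submit jobs and collect results through futures, mirroring
    # PopenPoolExecutor.submit(fn, *args, **kwargs) followed by
    # future.result().
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(pow, v, 2) for v in values]
        return [f.result() for f in futures]
```

Unlike the stdlib executor, the TVM pool additionally recycles workers after maximum_process_uses calls, and map_with_error_catching wraps exceptions in MapResult values instead of raising.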

tvm.contrib.random

External function interface to random library.

tvm.contrib.random.randint(low, high, size, dtype='int32')

Return random integers from the “discrete uniform” distribution of the specified dtype in the half-open interval [low, high): low is inclusive, high is exclusive.

Parameters:
  • low (int) – Lowest (signed) integer to be drawn from the distribution

  • high (int) – One above the largest (signed) integer to be drawn from the distribution

  • size (tuple of ints) – Output shape

  • dtype (str, optional) – The dtype of the output, "int32" by default

Returns:

out – A tensor with specified size and dtype

Return type:

Tensor
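
The half-open interval semantics match NumPy's; a NumPy sketch of the same contract (illustration only, not the TVM extern op):

```python
import numpy as np

# Draw from the discrete uniform distribution over [low, high),
# mirroring the documented semantics of randint.
rng = np.random.default_rng(seed=0)
out = rng.integers(low=0, high=10, size=(2, 3), dtype=np.int32)

# Every sample is >= low and strictly < high.
assert out.min() >= 0 and out.max() < 10
assert out.shape == (2, 3) and out.dtype == np.int32
```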

tvm.contrib.random.uniform(low, high, size)

Draw samples from a uniform distribution.

Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high). In other words, any value within the given interval is equally likely to be drawn by uniform.

Parameters:
  • low (float) – Lower boundary of the output interval. All values generated will be greater than or equal to low.

  • high (float) – Upper boundary of the output interval. All values generated will be less than high.

  • size (tuple of ints) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn.

Returns:

out – A tensor with specified size and dtype.

Return type:

Tensor

tvm.contrib.random.normal(loc, scale, size)

Draw samples from a normal distribution.

Parameters:
  • loc (float) – Mean (location) of the distribution.

  • scale (float) – Standard deviation of the distribution.

  • size (tuple of ints) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn.

Returns:

out – A tensor with specified size and dtype

Return type:

Tensor

tvm.contrib.rocm

Utility for ROCm backend

tvm.contrib.rocm.find_lld(required=True)

Find ld.lld in system.

Parameters:

required (bool) – Whether the linker is required; a runtime error is raised if it is required but not found.

Returns:

valid_list – List of possible paths.

Return type:

list of str

Note

This function first searches for an ld.lld that matches the major LLVM version that TVM was built with.

tvm.contrib.rocm.rocm_link(in_file, out_file, lld=None)

Link relocatable ELF object to shared ELF object using lld

Parameters:
  • in_file (str) – Input file name (relocatable ELF object file)

  • out_file (str) – Output file name (shared ELF object file)

  • lld (str, optional) – The lld linker; if not specified, we will try to guess the matched lld version.

tvm.contrib.rocm.parse_compute_version(compute_version)

Parse a compute capability string into major and minor version numbers

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.0”)

Returns:

  • major (int) – major version number

  • minor (int) – minor version number
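
A hypothetical re-implementation (not TVM's actual code) illustrating the major/minor split:

```python
def parse_compute_version(compute_version):
    # Split a "major.minor" compute capability string into its
    # numeric parts, e.g. "6.0" -> (6, 0).
    major, minor = compute_version.split(".")[:2]
    return int(major), int(minor)
```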

tvm.contrib.rocm.have_matrixcore(compute_version=None)

Check whether MatrixCore support is provided for the given compute capability

Parameters:

compute_version (str, optional) – compute capability of a GPU (e.g. “7.0”).

Returns:

have_matrixcore – True if MatrixCore support is provided, False otherwise

Return type:

bool

tvm.contrib.rocm.find_rocm_path()

Utility function to find ROCm path

Returns:

path – Path to ROCm root.

Return type:

str

tvm.contrib.spirv

Utility for Interacting with SPIRV Tools

tvm.contrib.spirv.optimize(spv_bin)

Optimize SPIRV using spirv-opt via CLI

Note that spirv-opt is still experimental.

Parameters:

spv_bin (bytearray) – The spirv file

Returns:

cobj_bin – The HSA Code Object

Return type:

bytearray

tvm.contrib.tar

Util to invoke tarball in the system.

tvm.contrib.tar.tar(output, files)

Create tarball containing all files in root.

Parameters:
  • output (str) – The target shared library.

  • files (list) – List of files to be bundled.

tvm.contrib.tar.untar(tar_file, directory)

Unpack all tar files into the directory

Parameters:
  • tar_file (str) – The source tar file.

  • directory (str) – The target directory

tvm.contrib.tar.normalize_file_list_by_unpacking_tars(temp, file_list)

Normalize the file list by unpacking tars in list.

When a filename is a tar, it is untarred into a unique directory in temp and the list of files inside the tar is returned. When a filename is a normal file, it is simply added to the list.

This is useful to untar objects in tar and then turn them into a library.

Parameters:
  • temp (tvm.contrib.utils.TempDirectory) – A temp dir to hold the untarred files.

  • file_list (List[str]) – List of files to be normalized.

Returns:

ret_list – An updated list of files

Return type:

List[str]
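
The behavior can be sketched with the stdlib tarfile module (a hypothetical illustration, not TVM's implementation):

```python
import os
import tarfile

def normalize_file_list(temp_dir, file_list):
    """Expand any .tar entries into unique per-tar directories under temp_dir."""
    ret_list = []
    for idx, path in enumerate(file_list):
        if path.endswith(".tar"):
            # Untar into a unique directory and collect the members.
            dest = os.path.join(temp_dir, "tar_%d" % idx)
            os.makedirs(dest, exist_ok=True)
            with tarfile.open(path) as tf:
                tf.extractall(dest)
                ret_list.extend(os.path.join(dest, m) for m in tf.getnames())
        else:
            # Normal files pass through unchanged.
            ret_list.append(path)
    return ret_list
```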

tvm.contrib.thrust

Utilities for thrust

tvm.contrib.tvmjs

Namespace to store utilities for building web runtime.

class tvm.contrib.tvmjs.TensorCacheShardingManager(cache_dir: str, prefix: str, shard_cap_nbytes: int, initial_shard_records: Mapping[str, Any] | None = None)

Internal helper to shard ndarrays.

append_or_update(data, name, shape, dtype, encode_format, allow_update: bool = False)

Commit a record to the manager.

Parameters:
  • data (bytes) – Raw bytes to be appended.

  • name (str) – The name of the parameter

  • shape (tuple) – The shape of the array

  • dtype (str) – The dtype information

  • encode_format – The encode format of the entry

  • allow_update (bool) – If True, update the record when it already exists; otherwise an existing record raises an error.

update_single_record(rec, data)

Update a single record in a shard file.

commit()

Commit a record

finish()

Finish building and return shard records.

property pending_nbytes

Return total bytes stored so far
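
The shard-cap bookkeeping can be illustrated with a small hypothetical sketch: records accumulate in the current shard until appending one would exceed shard_cap_nbytes, at which point a new shard is started.

```python
def shard_records(records, shard_cap_nbytes):
    # records: iterable of (name, data_bytes) pairs.
    # Returns the record names grouped into shards under the byte cap.
    shards, current, current_nbytes = [], [], 0
    for name, data in records:
        if current and current_nbytes + len(data) > shard_cap_nbytes:
            shards.append(current)
            current, current_nbytes = [], 0
        current.append(name)
        current_nbytes += len(data)
    if current:
        shards.append(current)
    return shards
```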

tvm.contrib.tvmjs.dump_tensor_cache(params: Mapping[str, ndarray | Tensor] | Iterator[tuple[str, ndarray | Tensor]], cache_dir: str, encode_format='f32-to-bf16', meta_data=None, shard_cap_mb=32, show_progress: bool = True, update_if_exists: bool = False)

Dump parameters to Tensor cache.

Parameters:
  • params (Union[Mapping[str, Union[np.ndarray, tvm.runtime.Tensor]], Iterator[Tuple[str, Union[np.ndarray, tvm.runtime.Tensor]]]]) – The parameter dictionary or generator

  • cache_dir (str) – The path to the cache

  • encode_format ({"f32-to-bf16", "raw"}) – Encoding format.

  • meta_data (json-compatible-struct or Callable[[], Any]) – Extra meta_data to be stored in the cache json file, or a callable that returns the metadata.

  • shard_cap_mb (int) – Maximum number of MB to be kept per shard

  • show_progress (bool) – Whether to show the dump progress.

  • update_if_exists (bool) – If the cache already exists, update the cache. When set to False, it will overwrite the existing files.

tvm.contrib.tvmjs.load_tensor_cache(cachepath: str, device: Device)

Load the tensor cache from the directory or json.

Parameters:
  • cachepath (str) – Path to the location or json file.

  • device (tvm.runtime.Device) – The device we would like to load the data from.

tvm.contrib.tvmjs.export_runtime(runtime_dir)

Export TVMJS runtime to the runtime_dir

Parameters:

runtime_dir (str) – The runtime directory

tvm.contrib.utils

Common system utilities

exception tvm.contrib.utils.DirectoryCreatedPastAtExit

Raised when a TempDirectory is created after the atexit hook runs.

class tvm.contrib.utils.TempDirectory(custom_path=None, keep_for_debug=None)

Helper object to manage temp directory during testing.

Automatically removes the directory when it goes out of scope.

classmethod set_keep_for_debug(set_to=True)

Keep temporary directories past program exit for debugging.

remove()

Remove the tmp dir

relpath(name)

Relative path in temp dir

Parameters:

name (str) – The name of the file.

Returns:

path – The concatenated path.

Return type:

str

listdir()

List contents in the dir.

Returns:

names – The content of directory

Return type:

list

tvm.contrib.utils.tempdir(custom_path=None, keep_for_debug=None)

Create a temp dir which deletes its contents on exit.

Parameters:
  • custom_path (str, optional) – Manually specify the exact temp dir path

  • keep_for_debug (bool) – Keep temp directory for debugging purposes

Returns:

temp – The temp directory object

Return type:

TempDirectory
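
Typical usage follows the stdlib tempfile pattern; a sketch using tempfile.TemporaryDirectory as a stand-in for tempdir()/relpath():

```python
import os
import tempfile

# tempdir().relpath("lib.so") is analogous to joining a filename
# against a stdlib temporary directory; in both cases the contents
# are removed when the directory goes out of scope.
with tempfile.TemporaryDirectory() as temp:
    path = os.path.join(temp, "lib.so")
    with open(path, "wb"):
        pass
    assert os.listdir(temp) == ["lib.so"]
```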

class tvm.contrib.utils.FileLock(path)

File lock object

Parameters:

path (str) – The path to the lock

release()

Release the lock

tvm.contrib.utils.filelock(path)

Create a file lock which locks on path

Parameters:

path (str) – The path to the lock

Returns:

lock

Return type:

File lock object
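
On POSIX systems a lock of this shape can be sketched with fcntl (a hypothetical illustration; TVM's FileLock is the supported API):

```python
import fcntl

class SimpleFileLock:
    """Exclusive advisory lock on a path (POSIX only)."""

    def __init__(self, path):
        self.lock_file = open(path, "w")
        # Blocks until the exclusive lock is acquired.
        fcntl.lockf(self.lock_file, fcntl.LOCK_EX)

    def release(self):
        fcntl.lockf(self.lock_file, fcntl.LOCK_UN)
        self.lock_file.close()
```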

tvm.contrib.utils.is_source_path(path)

Check if path is source code path.

Parameters:

path (str) – A possible path

Returns:

valid – Whether path is a possible source path

Return type:

bool

tvm.contrib.utils.which(exec_name)

Try to find full path of exec_name

Parameters:

exec_name (str) – The executable name

Returns:

path – The full path of executable if found, otherwise returns None

Return type:

str
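
The documented behavior matches the stdlib's shutil.which; a one-line stand-in:

```python
import shutil

def which(exec_name):
    # Full path of the executable if found on PATH, otherwise None.
    return shutil.which(exec_name)
```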

tvm.contrib.xcode

Utility to invoke Xcode compiler toolchain

tvm.contrib.xcode.xcrun(cmd)

Run xcrun and return the output.

Parameters:

cmd (list of str) – The command sequence.

Returns:

out – The output string.

Return type:

str

tvm.contrib.xcode.create_dylib(output, objects, arch, sdk='macosx', min_os_version=None)

Create dynamic library.

Parameters:
  • output (str) – The target shared library.

  • objects (list) – List of object files.

  • min_os_version (str, optional) – The minimum OS deployment version.

  • arch (str) – Target major architectures

  • sdk (str) – The sdk to be used.

tvm.contrib.xcode.compile_metal(code, path_target=None, sdk='macosx', min_os_version=None)

Compile Metal with CLI tool from env.

Parameters:
  • code (str) – The Metal code.

  • path_target (str, optional) – Output file.

  • sdk (str, optional) – The target platform SDK.

Returns:

metallib – The bytearray of the metallib

Return type:

bytearray

tvm.contrib.xcode.compile_coreml(model, model_name='main', out_dir='.')

Compile coreml model and return the compiled model path.

tvm.contrib.cutlass

BYOC support for CUTLASS.

tvm.contrib.cutlass.has_cutlass()

Returns true if the CUTLASS custom codegen is available

tvm.contrib.cutlass.finalize_modules(lib, lib_path='compile.so', tmp_dir='./tmp')

Returns lib with any C source, LLVM, and static library modules compiled and linked in, ready for use by the graph or AOT executors. This method is not specific to CUTLASS; however, it assumes nvcc will be used for final compilation and linking. It is provided here for convenience.

Parameters:
  • lib (runtime.Module) – The output from build.

  • lib_path (string) – The path to a shared library which will be generated as the result of the build process.

  • tmp_dir (string) – A temporary directory where intermediate compiled artifacts will be stored.

Returns:

updated_lib – The updated library with all compilation and linking completed.

Return type:

runtime.Module

tvm.contrib.hexagon

Hexagon APIs.

class tvm.contrib.hexagon.ContainerSession(base_image_name: str = '')

Docker container session

Parameters:

base_image_name (str) – Docker image name to use. Empty string means to use default “tlcpack/ci-hexagon” base image.

exec(cmd) str

Execute command inside docker container

get_env(key: str) str

Return env var value from docker container

copy_to(host_file_path: str) str

Upload file to docker container

copy_from(container_file_path: str, host_file_path: str)

Download file from docker container

close()

Close docker container session

tvm.contrib.hexagon.allocate_hexagon_array(dev, tensor_shape=None, dtype=None, data=None, axis_separators=None, mem_scope=None)

Allocate a hexagon array which could be a 2D array on physical memory defined by axis_separators

tvm.contrib.hexagon.create_aot_shared(so_name: str | Path, files, hexagon_arch: str, options=None)

Export Hexagon AOT module.

tvm.contrib.hexagon.create_shared(output, objects, options=None, cc=None, cwd=None, ccache_env=None)

Create shared library.

Parameters:
  • output (str) – The target shared library.

  • objects (List[str]) – List of object files.

  • options (List[str]) – The list of additional options string.

  • cc (Optional[str]) – The compiler command.

  • cwd (Optional[str]) – The current working directory.

  • ccache_env (Optional[Dict[str, str]]) – The environment variable for ccache. Set None to disable ccache by default.

tvm.contrib.hexagon.export_module(module, out_dir, binary_name='test_binary.so')

Export Hexagon shared object to a file.

tvm.contrib.hexagon.hexagon_clang_plus() str

Return path to the Hexagon clang++.

tvm.contrib.hexagon.link_shared(so_name, objs, extra_args=None)

Link Hexagon shared library using docker container with proper tooling.

Parameters:
  • so_name (str) – Name of the shared library file.

  • objs (list[str, tvm.tir.StringImm])

  • extra_args (dict (str->str) or Map<String,String>) –

    Additional arguments:

    ’hex_arch’ - Hexagon architecture, e.g. v68

Returns:

ret_val – This function returns 0 at the moment.

Return type:

int

tvm.contrib.hexagon.pack_imports(module: Module, is_system_lib: bool, c_symbol_prefix: str, workspace_dir: str)

Create an ELF object file that contains the binary data for the modules imported in module. This is a callback function for use as fpack_imports in export_library.

Parameters:
  • module (tvm.runtime.Module) – Module whose imported modules need to be serialized.

  • is_system_lib (bool) – Flag whether the exported module will be used as a system library.

  • c_symbol_prefix (str) – Prefix to prepend to the blob symbol.

  • workspace_dir (str) – Location for created files.

Returns:

file_name – The name of the created object file.

Return type:

str

tvm.contrib.hexagon.register_global_func(func_name: str | Callable[[...], Any], f: Callable[[...], Any] | None = None, override: bool = False) Any

Register global function.

Parameters:
  • func_name – The function name

  • f – The function to be registered.

  • override – Whether override existing entry.

Returns:

Register function if f is not specified.

Return type:

fregister

Examples

import tvm_ffi

# we can use decorator to register a function
@tvm_ffi.register_global_func("mytest.echo")
def echo(x):
    return x


# After registering, we can get the function by its name
f = tvm_ffi.get_global_func("mytest.echo")
assert f(1) == 1

# we can also directly register a function
tvm_ffi.register_global_func("mytest.add_one", lambda x: x + 1)
f = tvm_ffi.get_global_func("mytest.add_one")
assert f(1) == 2

See also

tvm_ffi.get_global_func(), tvm_ffi.remove_global_func()

tvm.contrib.hexagon.register_linker(f)

Register a function that will return the path to the Hexagon linker.

tvm.contrib.hexagon.toolchain_version(toolchain=None) list[int]

Return the version of the Hexagon toolchain.

Parameters:

toolchain (str, optional) – Path to the Hexagon toolchain. If not provided, the environment variable HEXAGON_TOOLCHAIN is used.

Returns:

version – List of numerical components of the version number. E.g. for version “8.5.06” it will be [8, 5, 6].

Return type:

List[int]
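
The numeric split described above ("8.5.06" → [8, 5, 6]) can be sketched as follows (a hypothetical helper, not the TVM function itself):

```python
import re

def version_components(version_str):
    # Extract the numeric components of a version string,
    # dropping leading zeros: "8.5.06" -> [8, 5, 6].
    return [int(part) for part in re.findall(r"\d+", version_str)]
```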