tvm.contrib
Contrib APIs of TVM python package.
The contrib API provides many useful non-core features. Some of them are utilities for interacting with third-party libraries and tools.
tvm.contrib.cblas
External function interface to BLAS libraries.
- tvm.contrib.cblas.matmul(lhs, rhs, transa=False, transb=False, **kwargs)
Create an extern op that computes the matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.
- tvm.contrib.cblas.batch_matmul(lhs, rhs, transa=False, transb=False, iterative=False, **kwargs)
Create an extern op that computes the batched matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.
tvm.contrib.clang
Util to invoke clang in the system.
- tvm.contrib.clang.find_clang(required=True)
Find clang in system.
- Parameters:
required (bool) – Whether clang is required. A runtime error is raised if it is required and cannot be found.
- Returns:
valid_list – List of possible paths.
- Return type:
Note
This function first searches for a clang that matches the major LLVM version TVM was built with.
- tvm.contrib.clang.create_llvm(inputs, output=None, options=None, cc=None)
Create llvm text ir.
- Parameters:
inputs (list of str) – List of input file names or source code.
output (str, optional) – Output file. If it is None, a temporary file is created.
options (list) – The list of additional option strings.
cc (str, optional) – The clang compiler. If not specified, we will try to guess the matched clang version.
- Returns:
code – The generated llvm text IR.
- Return type:
tvm.contrib.cc
Util to invoke C/C++ compilers in the system.
- tvm.contrib.cc.get_cc()
Return the path to the default C/C++ compiler.
- Returns:
out – The path to the default C/C++ compiler, or None if none was found.
- Return type:
Optional[str]
- tvm.contrib.cc.create_shared(output, objects, options=None, cc=None, cwd=None, ccache_env=None)
Create shared library.
- Parameters:
output (str) – The target shared library.
objects (List[str]) – List of object files.
options (List[str]) – The list of additional option strings.
cc (Optional[str]) – The compiler command.
cwd (Optional[str]) – The current working directory.
ccache_env (Optional[Dict[str, str]]) – The environment variable for ccache. Set None to disable ccache by default.
- tvm.contrib.cc.create_staticlib(output, inputs, ar=None)
Create static library.
- tvm.contrib.cc.create_executable(output, objects, options=None, cc=None, cwd=None, ccache_env=None)
Create executable binary.
- Parameters:
output (str) – The target executable.
objects (List[str]) – List of object files.
options (List[str]) – The list of additional option strings.
cc (Optional[str]) – The compiler command.
cwd (Optional[str]) – The current working directory.
ccache_env (Optional[Dict[str, str]]) – The environment variable for ccache. Set None to disable ccache by default.
- tvm.contrib.cc.get_global_symbol_section_map(path, *, nm=None) -> dict[str, str]
Get global symbols from a library via nm -g
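A minimal sketch of how the text printed by nm -g could be turned into a symbol-to-section map. The helper name parse_nm_output and the sample text are hypothetical, and the three-column line format (address, section letter, symbol name) is an assumption about typical nm output, not a description of TVM's implementation.

```python
def parse_nm_output(text):
    """Map each defined global symbol to its nm section letter (hypothetical helper)."""
    symbol_section = {}
    for line in text.splitlines():
        fields = line.split()
        # Defined symbols print as "<address> <section> <name>".
        # Undefined ones ("U name") carry no address and are skipped here.
        if len(fields) != 3:
            continue
        _address, section, name = fields
        symbol_section[name] = section
    return symbol_section


# Hand-written sample in the usual `nm -g` layout (not captured from a real library).
sample = """\
0000000000001139 T add_one
                 U printf
0000000000004010 D global_counter
"""
symbols = parse_nm_output(sample)
print(symbols)
```

Keeping the parsing separate from the subprocess call makes it easy to check against saved nm output.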
- tvm.contrib.cc.get_target_by_dump_machine(compiler)
Functor of get_target_triple that can get the target triple using compiler.
- Parameters:
compiler (Optional[str]) – The compiler.
- Returns:
out – A function that can get the target triple according to the -dumpmachine option of the compiler.
- Return type:
Callable
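The idea of returning a callable that asks the compiler for its target triple can be sketched with the standard library alone. The name target_by_dump_machine is hypothetical, and returning None when the compiler is missing or fails is an assumption of this sketch, not documented behavior of get_target_by_dump_machine.

```python
import shutil
import subprocess


def target_by_dump_machine(compiler):
    """Return a function that queries `compiler -dumpmachine` (illustrative sketch)."""

    def get_target_triple():
        # No compiler given, or not found on PATH: nothing to query.
        if not compiler or shutil.which(compiler) is None:
            return None
        proc = subprocess.run(
            [compiler, "-dumpmachine"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=False,
        )
        if proc.returncode != 0:
            return None
        return proc.stdout.decode("utf-8").strip()

    return get_target_triple


get_triple = target_by_dump_machine("gcc")
print(get_triple())  # a triple string if gcc is installed, otherwise None
```

Deferring the subprocess call to the returned function means the compiler is only invoked when the triple is actually needed.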
- tvm.contrib.cc.cross_compiler(compile_func, options=None, output_format=None, get_target_triple=None, add_files=None)
Create a cross compiler function by specializing compile_func with options.
This function can be used to construct compile functions that can be passed to AutoTVM measure or export_library.
- Parameters:
compile_func (Union[str, Callable[[str, str, Optional[str]], None]]) – Function that performs the actual compilation
options (Optional[List[str]]) – List of additional option strings.
output_format (Optional[str]) – Library output format.
get_target_triple (Optional[Callable]) – Function that can get the target triple according to the -dumpmachine option of the compiler.
add_files (Optional[List[str]]) – List of paths to additional object, source, library files to pass as part of the compilation.
- Returns:
fcompile – A compilation function that can be passed to export_library.
- Return type:
Examples
    from tvm.contrib import cc, ndk

    # export using arm gcc
    mod = build_runtime_module()
    mod.export_library(path_dso, fcompile=cc.cross_compiler("arm-linux-gnueabihf-gcc"))

    # specialize ndk compilation options.
    specialized_ndk = cc.cross_compiler(
        ndk.create_shared,
        ["--sysroot=/path/to/sysroot", "-shared", "-fPIC", "-lm"])
    mod.export_library(path_dso, fcompile=specialized_ndk)
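To make "specializing compile_func with options" concrete, here is a small stand-alone sketch of the pattern: a closure that binds default options to a compile function. The names specialize_compiler and fake_create_shared are hypothetical stand-ins, and the order in which options are merged is an assumption of the sketch; it does not call TVM.

```python
def specialize_compiler(compile_func, options=None):
    """Bind default options to compile_func (hypothetical stand-in for cross_compiler)."""
    base_options = list(options or [])

    def fcompile(output, objects, options=None):
        # Per-call options are appended after the specialized defaults.
        all_options = base_options + list(options or [])
        return compile_func(output, objects, options=all_options)

    return fcompile


calls = []


def fake_create_shared(output, objects, options=None):
    # Stand-in for a real compile function; it only records its arguments.
    calls.append((output, list(objects), list(options or [])))


fcompile = specialize_compiler(fake_create_shared, ["-shared", "-fPIC"])
fcompile("lib.so", ["a.o", "b.o"], options=["-lm"])
print(calls[0])
```

The returned function keeps the (output, objects, options) calling convention, which is why it can be passed wherever a plain compile function is expected.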
tvm.contrib.coreml_runtime
CoreML runtime that loads and runs CoreML models.
- tvm.contrib.coreml_runtime.create(symbol, compiled_model_path, device)
Create a runtime executor module given a coreml model and context.
- Parameters:
- Returns:
coreml_runtime – Runtime coreml module that can be used to execute the coreml model.
- Return type:
- class tvm.contrib.coreml_runtime.CoreMLModule(module)
Wrapper runtime module.
This is a thin wrapper of the underlying TVM module. You can also directly call the set_input, run, and get_output functions of the underlying module.
- Parameters:
module (Module) – The internal tvm module that holds the actual coreml functions.
- module
The internal tvm module that holds the actual coreml functions.
- Type:
Module
tvm.contrib.cublas
External function interface to cuBLAS libraries.
- tvm.contrib.cublas.matmul(lhs, rhs, transa=False, transb=False, dtype=None)
Create an extern op that computes the matrix multiplication of lhs and rhs with cuBLAS.
- tvm.contrib.cublas.batch_matmul(lhs, rhs, transa=False, transb=False, dtype=None)
Create an extern op that computes the batched matrix multiplication of lhs and rhs with cuBLAS.
tvm.contrib.cublaslt
External function interface to cuBLASLt libraries.
- tvm.contrib.cublaslt.matmul(lhs, rhs, transa=False, transb=False, n=0, m=0, dtype=None)
Create an extern op that computes the matrix multiplication of lhs and rhs with cuBLASLt.
tvm.contrib.cudnn
External function interface to CuDNN v7 library.
- tvm.contrib.cudnn.exists()
Checks whether the local machine can use CuDNN.
- Returns:
exists – True if CuDNN support is enabled and a CuDNN-capable GPU exists. Otherwise, False.
- Return type:
- tvm.contrib.cudnn.algo_to_index(algo_type, algo_name)
Return an index that represents the algorithm, which can be used when calling CuDNN functions.
- Parameters:
- Returns:
algo – Algorithm index
- Return type:
- tvm.contrib.cudnn.conv_output_shape(tensor_format, pad, stride, dilation, x_shape, w_shape, data_dtype, conv_dtype, groups=1)
Get output shape of 2D or 3D convolution
Parameters
- tensor_format: int
0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C
- pad: int or list
padding
- stride: int or list
stride
- dilation: int or list
dilation
- x_shape: list
input shape
- w_shape: list
weight shape
- data_dtype: str
data type
- conv_dtype: str
convolution type
- groups: int
number of groups
- returns:
oshape – output shape
- rtype:
list
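As a worked illustration of the arithmetic behind a convolution output shape, the sketch below applies the standard output-size formula to an NCHW input. It is textbook arithmetic with hypothetical helper names, covers only the 2D NCHW case, and ignores groups; the shape returned by conv_output_shape itself comes from cuDNN.

```python
def conv_out_dim(in_size, kernel, pad, stride, dilation):
    """Standard output-size formula for one spatial dimension."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


def conv2d_output_shape_nchw(x_shape, w_shape, pad, stride, dilation):
    """Output shape for NCHW input and OIHW weight (illustrative, 2D only)."""
    batch, _, in_h, in_w = x_shape
    out_channels, _, k_h, k_w = w_shape
    out_h = conv_out_dim(in_h, k_h, pad[0], stride[0], dilation[0])
    out_w = conv_out_dim(in_w, k_w, pad[1], stride[1], dilation[1])
    return [batch, out_channels, out_h, out_w]


# 7x7 kernel, padding 3, stride 2 on a 224x224 image halves the spatial size.
oshape = conv2d_output_shape_nchw(
    [1, 3, 224, 224], [64, 3, 7, 7], pad=[3, 3], stride=[2, 2], dilation=[1, 1]
)
print(oshape)  # [1, 64, 112, 112]
```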
- tvm.contrib.cudnn.conv_dgrad_shape(tensor_format, pad, stride, dilation, dy_shape, w_shape, output_padding=(0, 0), groups=1)
Get output shape of conv2d gradient with respect to data
Parameters
- tensor_format: int
0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC
- pad: int or list
padding
- stride: int or list
stride
- dilation: int or list
dilation
- dy_shape: list
output gradient shape
- w_shape: list
weight shape
- data_dtype: str
data type
- conv_dtype: str
convolution type
- groups: int
number of groups
- returns:
oshape – output shape
- rtype:
list
- tvm.contrib.cudnn.conv_forward_find_algo(tensor_format, pad, stride, dilation, x_shape, w_shape, y_shape, data_dtype, conv_dtype, groups=1, verbose=True)
Choose the best forward algorithm for the given input.
Parameters
- tensor_format: int
0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C
- pad: int or list
padding
- stride: int or list
stride
- dilation: int or list
dilation
- x_shape: list
input shape
- w_shape: list
weight shape
- y_shape: list
output shape
- data_dtype: str
data type
- conv_dtype: str
convolution type
- groups: int
number of groups
- returns:
algo – algo chosen by CUDNN
- rtype:
int
- tvm.contrib.cudnn.conv_backward_data_find_algo(tensor_format, pad, stride, dilation, dy_shape, w_shape, dx_shape, data_dtype, conv_dtype, groups=1, verbose=True)
Choose the best backward data algorithm for the given input.
Parameters
- tensor_format: int
0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C
- pad: int or list
padding
- stride: int or list
stride
- dilation: int or list
dilation
- dy_shape: list
output gradient shape
- w_shape: list
weight shape
- dx_shape: list
dgrad shape
- data_dtype: str
data type
- conv_dtype: str
convolution type
- groups: int
number of groups
- verbose: bool
whether to show the selection trials
- returns:
algo – algo chosen by CUDNN
- rtype:
int
- tvm.contrib.cudnn.conv_backward_filter_find_algo(tensor_format, pad, stride, dilation, dy_shape, x_shape, dw_shape, data_dtype, conv_dtype, groups=1, verbose=True)
Choose the best backward filter algorithm for the given input.
Parameters
- tensor_format: int
0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C
- pad: int or list
padding
- stride: int or list
stride
- dilation: int or list
dilation
- dy_shape: list
output gradient shape
- x_shape: list
input shape
- dw_shape: list
wgrad shape
- data_dtype: str
data type
- conv_dtype: str
convolution type
- groups: int
number of groups
- verbose: bool
whether to show the selection trials
- returns:
algo – algo chosen by CUDNN
- rtype:
int
- tvm.contrib.cudnn.conv_forward(x, w, pad, stride, dilation, conv_mode, tensor_format, algo, conv_dtype, groups=1, verbose=True)
Create an extern op that computes 2D or 3D convolution with CuDNN.
- Parameters:
x (Tensor) – input feature map
w (Tensor) – convolution weight
conv_mode (int) – 0: CUDNN_CONVOLUTION 1: CUDNN_CROSS_CORRELATION
tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC 2: CUDNN_TENSOR_NCHW_VECT_C
algo (int) – Forward algorithm; get the index from the algo_to_index function. If algo == -1, the best algo will be chosen by CUDNN.
conv_dtype (str) – convolution type
groups (int) – the number of groups
verbose (bool) – whether to show the selection trials
- Returns:
y – The result tensor
- Return type:
- tvm.contrib.cudnn.conv_backward_data(dy, w, pad, stride, dilation, conv_mode, tensor_format, conv_dtype, groups=1, output_padding=(0, 0))
Create a CuDNN extern op that computes the gradient of 2D convolution with respect to data.
- Parameters:
- Returns:
dx – dgrad tensor
- Return type:
- tvm.contrib.cudnn.conv_backward_filter(dy, x, kernel_size, pad, stride, dilation, conv_mode, tensor_format, conv_dtype, groups=1)
Create a CuDNN extern op that computes the gradient of 2D convolution with respect to weight.
- Parameters:
dy (Tensor) – output gradient
x (Tensor) – input tensor
kernel_size (a pair of int) – The spatial size of the corresponding forward convolution kernel
conv_mode (int) – 0: CUDNN_CONVOLUTION 1: CUDNN_CROSS_CORRELATION
tensor_format (int) – 0: CUDNN_TENSOR_NCHW 1: CUDNN_TENSOR_NHWC
conv_dtype (str) – convolution type
groups (int) – the number of groups
- Returns:
dw – wgrad tensor
- Return type:
- tvm.contrib.cudnn.softmax(x, axis=-1)
Compute softmax using CuDNN
- Parameters:
x (tvm.te.Tensor) – The input tensor
axis (int) – The axis to compute the softmax
- Returns:
ret – The result tensor
- Return type:
- tvm.contrib.cudnn.log_softmax(x, axis=-1)
Compute log_softmax using CuDNN
- Parameters:
x (tvm.te.Tensor) – The input tensor
axis (int) – The axis to compute log softmax over
- Returns:
ret – The result tensor
- Return type:
tvm.contrib.dlpack
Wrapping functions to bridge frameworks with DLPack support to TVM
- tvm.contrib.dlpack.convert_func(tvm_func, tensor_type, to_dlpack_func)
Convert a tvm function into one that accepts a tensor from another framework, provided the other framework supports DLPack.
tvm.contrib.dnnl
External function interface to BLAS libraries.
- tvm.contrib.dnnl.matmul(lhs, rhs, transa=False, transb=False, **kwargs)
Create an extern op that computes the matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.
- tvm.contrib.dnnl.dnnl_conv2d(src, weights, stride, padding, dilation, groups, channel_last=False, out_dtype='float32', **kwargs)
Convolution operator in NCHW layout.
- Parameters:
src (tvm.te.Tensor) – 4-D with shape [batch, in_channel, in_height, in_width]
weights (tvm.te.Tensor) – 4-D with shape [num_filter, in_channel, filter_height, filter_width]
stride (int or a list/tuple of two ints) – Stride size, or [stride_height, stride_width]
padding (int or a list/tuple of 2 or 4 ints) – padding size, or [pad_height, pad_width] for 2 ints, or [pad_top, pad_left, pad_bottom, pad_right] for 4 ints
dilation (int or a list/tuple of two ints) – dilation size, or [dilation_height, dilation_width]
groups (int) – number of groups
channel_last (bool) – Choose whether the input/output data format is channel-last (NHWC) or plain (NCHW).
out_dtype (str) – output data type; only float32 is currently supported
- Returns:
Output – 4-D with shape [batch, out_channel, out_height, out_width]
- Return type:
tvm.contrib.download
Helper utility for downloading
- tvm.contrib.download.download(url, path, overwrite=False, size_compare=False, retries=3)
Downloads the file from the internet. Set the input options correctly to overwrite or do the size comparison
- Parameters:
url (str) – Download url.
path (str) – Local file path to save downloaded file.
overwrite (bool, optional) – Whether to overwrite existing file, defaults to False.
size_compare (bool, optional) – Whether to do size compare to check downloaded file, defaults to False
retries (int, optional) – Number of times to retry the download, defaults to 3.
- tvm.contrib.download.download_testdata(url, relpath, module=None, overwrite=False)
Downloads the test data from the internet.
- Parameters:
url (str) – Download url.
relpath (str) – Relative file path.
module (Union[str, list, tuple], optional) – Subdirectory paths under test data folder.
overwrite (bool, defaults to False) – If True, will download a fresh copy of the file regardless of the cache. If False, will only download the file if a cached version is missing.
- Returns:
abspath – Absolute file path of downloaded file
- Return type:
tvm.contrib.emcc
Util to invoke emscripten compilers in the system.
- tvm.contrib.emcc.create_tvmjs_wasm(output, objects, options=None, cc='emcc', libs=None)
Create wasm that is supposed to run with the tvmjs.
tvm.contrib.hipblas
External function interface to hipBLAS libraries.
- tvm.contrib.hipblas.matmul(lhs, rhs, transa=False, transb=False, dtype=None)
Create an extern op that computes the matrix multiplication of lhs and rhs with hipBLAS.
- tvm.contrib.hipblas.batch_matmul(lhs, rhs, transa=False, transb=False, dtype=None)
Create an extern op that computes the batched matrix multiplication of lhs and rhs with hipBLAS.
tvm.contrib.mkl
External function interface to BLAS libraries.
- tvm.contrib.mkl.matmul(lhs, rhs, transa=False, transb=False, **kwargs)
Create an extern op that computes the matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.
- tvm.contrib.mkl.matmul_u8s8s32(lhs, rhs, transa=False, transb=False, **kwargs)
Create an extern op that computes the matrix multiplication of lhs and rhs with CBLAS. This function serves as an example of how to call external libraries.
- tvm.contrib.mkl.batch_matmul(lhs, rhs, transa=False, transb=False, iterative=False, **kwargs)
Create an extern op that computes the batched matrix multiplication of lhs and rhs with mkl. This function serves as an example of how to call external libraries.
tvm.contrib.ndk
Util to invoke NDK compiler toolchain.
Create shared library.
- tvm.contrib.ndk.create_staticlib(output, inputs)
Create static library.
tvm.contrib.nnpack
External function interface to NNPACK libraries.
- tvm.contrib.nnpack.is_available()
Check whether NNPACK is available, that is, nnp_initialize() returns nnp_status_success.
- tvm.contrib.nnpack.fully_connected_inference(lhs, rhs, nthreads=1)
Create an extern op that computes a fully connected layer of 1D tensor lhs and 2D tensor rhs with nnpack.
- tvm.contrib.nnpack.convolution_inference(data, kernel, bias, padding, stride, nthreads=1, algorithm=0)
Create an extern op to do inference convolution of 4D tensor data and 4D tensor kernel and 1D tensor bias with nnpack.
- Parameters:
data (Tensor) – data 4D tensor input[batch][input_channels][input_height][input_width] of FP32 elements.
kernel (Tensor) – kernel 4D tensor kernel[output_channels][input_channels][kernel_height][kernel_width] of FP32 elements.
bias (Tensor) – bias 1D array bias[output_channels] of FP32 elements.
padding (list) – padding A 4-dim list of [pad_top, pad_bottom, pad_left, pad_right], which indicates the padding around the feature map.
stride (list) – stride A 2-dim list of [stride_height, stride_width], which indicates the stride.
- Returns:
output – output 4D tensor output[batch][output_channels][output_height][output_width] of FP32 elements.
- Return type:
- tvm.contrib.nnpack.convolution_inference_without_weight_transform(data, transformed_kernel, bias, padding, stride, nthreads=1, algorithm=0)
Create an extern op to do inference convolution of 4D tensor data and 4D pre-transformed tensor kernel and 1D tensor bias with nnpack.
- Parameters:
data (Tensor) – data 4D tensor input[batch][input_channels][input_height][input_width] of FP32 elements.
transformed_kernel (Tensor) – transformed_kernel 4D tensor kernel[output_channels][input_channels][tile][tile] of FP32 elements.
bias (Tensor) – bias 1D array bias[output_channels] of FP32 elements.
padding (list) – padding A 4-dim list of [pad_top, pad_bottom, pad_left, pad_right], which indicates the padding around the feature map.
stride (list) – stride A 2-dim list of [stride_height, stride_width], which indicates the stride.
- Returns:
output – output 4D tensor output[batch][output_channels][output_height][output_width] of FP32 elements.
- Return type:
- tvm.contrib.nnpack.convolution_inference_weight_transform(kernel, nthreads=1, algorithm=0, dtype='float32')
Create an extern op to do inference convolution of 3D tensor data and 4D tensor kernel and 1D tensor bias with nnpack.
tvm.contrib.nvcc
Utility to invoke nvcc compiler in the system
- tvm.contrib.nvcc.compile_cuda(code, target_format=None, arch=None, options=None, path_target=None, compiler='nvcc')
Compile CUDA code with NVCC or NVRTC.
- Parameters:
code (str) – The CUDA code.
target_format (str) – The target format of the compiler (“ptx”, “cubin”, or “fatbin”).
arch (str) – The CUDA architecture.
path_target (str, optional) – Output file.
compiler (str, optional) – Compiler backend: “nvcc” or “nvrtc”. This can be set by the TVM_CUDA_COMPILE_MODE environment variable.
- Returns:
res_binary – The bytearray of the compiled binary (ptx/cubin/fatbin).
- Return type:
Notes
NVRTC is a “runtime” compilation library and can be faster for JIT compilation.
NVRTC requires cuda-python: pip install cuda-python
- tvm.contrib.nvcc.find_cuda_path()
Utility function to find CUDA path
- Returns:
path – Path to CUDA root.
- Return type:
- tvm.contrib.nvcc.get_cuda_version(cuda_path=None)
Utility function to get CUDA version
- tvm.contrib.nvcc.find_nvshmem_paths() -> tuple[str, str]
Searches for the NVSHMEM include and library directories.
- Return type:
A tuple containing the path to the include directory and the library directory.
- tvm.contrib.nvcc.parse_compute_version(compute_version)
Parse compute capability string to divide major and minor version
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.0”)
- Returns:
major (int) – major version number
minor (int) – minor version number
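Splitting a compute capability string into integers matters because comparing the raw strings would order "10.0" before "9.0". The sketch below shows the idea in plain Python; parse_compute_version_sketch and at_least are hypothetical names, not TVM functions, and the error handling is an assumption.

```python
def parse_compute_version_sketch(compute_version):
    """Split 'major.minor' into two ints (illustrative, not TVM's code)."""
    parts = compute_version.split(".")
    if len(parts) != 2:
        raise ValueError(f"expected 'major.minor', got {compute_version!r}")
    return int(parts[0]), int(parts[1])


def at_least(compute_version, minimum):
    """Hypothetical helper: compare capabilities as (major, minor) tuples."""
    have = parse_compute_version_sketch(compute_version)
    need = parse_compute_version_sketch(minimum)
    return have >= need


major, minor = parse_compute_version_sketch("6.1")
print(major, minor)             # 6 1
print(at_least("7.5", "7.0"))   # True
print(at_least("10.0", "9.0"))  # True, unlike a plain string comparison
```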
- tvm.contrib.nvcc.have_fp16(compute_version)
Whether fp16 support is provided by the given compute capability.
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.0”)
- tvm.contrib.nvcc.have_int8(compute_version)
Whether int8 support is provided by the given compute capability.
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.1”)
- tvm.contrib.nvcc.have_tensorcore(compute_version=None, target=None)
Whether TensorCore support is provided by the given compute capability.
- Parameters:
compute_version (str, optional) – compute capability of a GPU (e.g. “7.0”).
target (tvm.target.Target, optional) – The compilation target, will be used to determine arch if compute_version isn’t specified.
- tvm.contrib.nvcc.have_cudagraph()
Whether CUDA Graph support is provided.
tvm.contrib.pickle_memoize
Memoize result of function via pickle, used for cache testcases.
- class tvm.contrib.pickle_memoize.Cache(key, save_at_exit)
A cache object for result cache.
- Parameters:
- property cache
Return the cache, initializing on first use.
- tvm.contrib.pickle_memoize.memoize(key, save_at_exit=False)
Memoize the result of function and reuse multiple times.
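The general technique of memoizing through pickle can be sketched with the standard library. This is an illustration of the idea only: memoize_to_file, its cache_dir argument, and the eager write after every new result are assumptions of the sketch, not the behavior of tvm.contrib.pickle_memoize.

```python
import functools
import os
import pickle
import tempfile


def memoize_to_file(key, cache_dir):
    """Cache a function's results in a pickle file named after key (sketch)."""
    path = os.path.join(cache_dir, key + ".pkl")

    def decorator(func):
        # Load earlier results if a cache file already exists.
        if os.path.exists(path):
            with open(path, "rb") as handle:
                cache = pickle.load(handle)
        else:
            cache = {}

        @functools.wraps(func)
        def wrapper(*args):
            if args not in cache:
                cache[args] = func(*args)
                # Persist eagerly; a real cache might defer this until exit.
                with open(path, "wb") as handle:
                    pickle.dump(cache, handle)
            return cache[args]

        return wrapper

    return decorator


computed = []  # records every real evaluation of square()
cache_dir = tempfile.mkdtemp()


@memoize_to_file("demo.square", cache_dir)
def square(x):
    computed.append(x)
    return x * x


first = square(4)
second = square(4)  # served from the cache; the body of square() is not re-run
print(first, second, len(computed))
```

Pickle files should only be loaded from trusted locations, which fits the stated use of caching test cases.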
tvm.contrib.popen_pool
Multiprocessing via Popen.
This module provides a multi-processing pool backed by Popen, with additional timeout support.
- tvm.contrib.popen_pool.kill_child_processes(pid)
Kill all child processes recursively for a given pid.
- Parameters:
pid (int) – The given process id.
- class tvm.contrib.popen_pool.StatusKind(value)
Running and return value status.
- class tvm.contrib.popen_pool.MapResult(status, value)
Result of map_with_error_catching.
- Parameters:
status (StatusKind) – The status of the result.
value (Any) – The result value.
- class tvm.contrib.popen_pool.PopenWorker(initializer=None, initargs=(), maximum_uses=None, stdout=None, stderr=None)
A subprocess worker via Popen.
PopenWorker provides a low-level API to interact with a separate process via Popen.
- Parameters:
initializer (callable or None) – A callable initializer, or None
initargs (Tuple[object]) – A tuple of args for the initializer
maximum_uses (Optional[int]) – The maximum number of times a process can be used before being recycled, i.e. killed and restarted. If None, the process will be reused until an operation times out.
stdout (Union[None, int, IO[Any]]) – The standard output streams handler specified for the popen process.
stderr (Union[None, int, IO[Any]]) – The standard error streams handler specified for the popen process.
- kill()
Kill the current running process and cleanup.
Note
The worker can start a new process when send is called again.
- join(timeout=None)
Join the current process worker before it terminates.
- Parameters:
timeout (Optional[number]) – Timeout value, block at most timeout seconds if it is a positive number.
- is_alive()
Check if the process is alive
- send(fn, args=(), kwargs=None, timeout=None)
Send a new function task fn(*args, **kwargs) to the subprocess.
- Parameters:
Note
The caller must call recv before calling the next send in order to make sure the timeout and child process exit won’t affect the later requests.
- recv()
Receive the result of the last send.
- Returns:
result – The result of the last send.
- Return type:
- Raises:
ChildProcessError – if the child process exited abnormally.
TimeoutError – if timeout happens.
Exception – if other exception happens during the execution.
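The send/recv pairing described above might be used as in the following sketch. It relies only on the PopenWorker methods documented in this section (send, recv, kill); the wrapper name add_in_worker and the 10 second timeout are hypothetical, and the function needs a working TVM installation to run.

```python
import operator


def add_in_worker(a, b, timeout=10):
    """Run operator.add(a, b) in a PopenWorker subprocess (requires TVM)."""
    # Imported lazily so this file still loads where TVM is not installed.
    from tvm.contrib.popen_pool import PopenWorker

    worker = PopenWorker()
    try:
        worker.send(operator.add, args=(a, b), timeout=timeout)
        # recv must be called before the next send on the same worker.
        return worker.recv()
    finally:
        # Always clean up the subprocess, even if recv raised.
        worker.kill()
```

Calling add_in_worker(1, 2) would be expected to return the sum computed in the child process; a module-level function such as operator.add is used here on the assumption that the task has to be serialized to reach the child.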
- class tvm.contrib.popen_pool.PopenPoolExecutor(max_workers=None, timeout=None, initializer=None, initargs=(), maximum_process_uses=None, stdout=None, stderr=None)
A parallel executor backed by Popen processes.
- Parameters:
max_workers (int) – Maximum number of workers
timeout (float) – Timeout value for each function submit.
initializer (callable or None) – A callable initializer, or None
initargs (Tuple[object]) – A tuple of args for the initializer
maximum_process_uses (Optional[int]) – The maximum number of times each process can be used before being recycled, i.e. killed and restarted. If None, processes will be reused until an operation times out.
stdout (Union[None, int, IO[Any]]) – The standard output streams handler specified for the workers in the pool.
stderr (Union[None, int, IO[Any]]) – The standard error streams handler specified for the workers in the pool.
Note
If max_workers is None, the number returned by os.cpu_count() is used. This aligns with the behavior of multiprocessing.pool().
- shutdown(wait=True)
Shutdown the executor and clean up resources.
- Parameters:
wait (bool) – If True, wait for pending work to complete.
Note
DEADLOCK WARNING: This method can deadlock when called during garbage collection due to exception reference cycles. When exceptions occur, Python creates reference cycles that delay garbage collection. The deadlock happens when: exception creates reference cycle → new pool creates worker → GC cleans old pool → old pool’s __del__ calls shutdown() which tries to acquire locks again.
- submit(fn, *args, **kwargs) -> Future
Submit a new function job to the pool
- Parameters:
- Returns:
future – A future that can be used to access the result.
- Return type:
tvm.contrib.random
External function interface to random library.
- tvm.contrib.random.randint(low, high, size, dtype='int32')
Return random integers from low (inclusive) to high (exclusive). Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high).
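The half-open interval and the role of size can be illustrated with Python's standard random module. This is a sketch of the sampling semantics only: randint_sketch is a hypothetical helper that returns a flat list rather than a tensor, and the seed argument is an addition for reproducibility.

```python
import math
import random


def randint_sketch(low, high, size, seed=None):
    """Draw prod(size) integers uniformly from [low, high) as a flat list."""
    rng = random.Random(seed)
    count = math.prod(size)
    # randrange excludes the upper bound, matching the half-open interval.
    return [rng.randrange(low, high) for _ in range(count)]


values = randint_sketch(0, 10, size=(2, 3, 4), seed=0)
print(len(values))                         # 24, i.e. 2 * 3 * 4
print(min(values) >= 0, max(values) < 10)  # True True
```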
- tvm.contrib.random.uniform(low, high, size)
Draw samples from a uniform distribution.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high). In other words, any value within the given interval is equally likely to be drawn by uniform.
- Parameters:
low (float) – Lower boundary of the output interval. All values generated will be greater than or equal to low.
high (float) – Upper boundary of the output interval. All values generated will be less than high.
size (tuple of ints) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn.
- Returns:
out – A tensor with specified size and dtype.
- Return type:
- tvm.contrib.random.normal(loc, scale, size)
Draw samples from a normal distribution.
Return random samples from a normal distribution.
tvm.contrib.rocm
Utility for ROCm backend
- tvm.contrib.rocm.find_lld(required=True)
Find ld.lld in system.
- Parameters:
required (bool) – Whether ld.lld is required. A runtime error is raised if it is required and cannot be found.
- Returns:
valid_list – List of possible paths.
- Return type:
Note
This function first searches for an ld.lld that matches the major LLVM version TVM was built with.
- tvm.contrib.rocm.rocm_link(in_file, out_file, lld=None)
Link relocatable ELF object to shared ELF object using lld
- tvm.contrib.rocm.parse_compute_version(compute_version)
Parse compute capability string to divide major and minor version
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.0”)
- Returns:
major (int) – major version number
minor (int) – minor version number
- tvm.contrib.rocm.have_matrixcore(compute_version=None)
Whether MatrixCore support is provided by the given compute capability.
tvm.contrib.spirv
Utility for Interacting with SPIRV Tools
tvm.contrib.tar
Util to invoke tarball in the system.
- tvm.contrib.tar.tar(output, files)
Create tarball containing all files in root.
- tvm.contrib.tar.untar(tar_file, directory)
Unpack all tar files into the directory
- tvm.contrib.tar.normalize_file_list_by_unpacking_tars(temp, file_list)
Normalize the file list by unpacking tars in list.
When a filename is a tar, it will be untarred into a unique dir in temp and the list of files in the tar is returned. When a filename is a normal file, it is simply added to the list.
This is useful to untar objects in tar and then turn them into a library.
- Parameters:
temp (tvm.contrib.utils.TempDirectory) – A temp dir to hold the untarred files.
file_list (List[str]) – List of paths
- Returns:
ret_list – An updated list of files
- Return type:
List[str]
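The normalization described above can be sketched with the standard tarfile module. The helper normalize_file_list is hypothetical: detecting tarballs by the ".tar" suffix, naming the unpack directories by index, and listing only top-level members are all assumptions of this sketch, not TVM's implementation.

```python
import os
import tarfile
import tempfile


def normalize_file_list(temp_dir, file_list):
    """Expand .tar entries into their member files; keep other paths as-is."""
    result = []
    for index, file_name in enumerate(file_list):
        if file_name.endswith(".tar"):
            # Each tarball gets its own directory so member names cannot collide.
            unpack_dir = os.path.join(temp_dir, f"unpacked_{index}")
            os.makedirs(unpack_dir)
            with tarfile.open(file_name) as archive:
                archive.extractall(unpack_dir)
            for name in sorted(os.listdir(unpack_dir)):
                result.append(os.path.join(unpack_dir, name))
        else:
            result.append(file_name)
    return result


# Build a small tarball with two placeholder "object" files to exercise the helper.
work_dir = tempfile.mkdtemp()
tar_path = os.path.join(work_dir, "objs.tar")
with tarfile.open(tar_path, "w") as archive:
    for name in ("a.o", "b.o"):
        member_path = os.path.join(work_dir, name)
        with open(member_path, "w") as handle:
            handle.write(name)
        archive.add(member_path, arcname=name)

plain_path = os.path.join(work_dir, "c.o")
with open(plain_path, "w") as handle:
    handle.write("c.o")

files = normalize_file_list(tempfile.mkdtemp(), [tar_path, plain_path])
print([os.path.basename(f) for f in files])  # ['a.o', 'b.o', 'c.o']
```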
tvm.contrib.thrust
Utilities for thrust
tvm.contrib.tvmjs
Namespace to store utilities for building web runtime.
- class tvm.contrib.tvmjs.TensorCacheShardingManager(cache_dir: str, prefix: str, shard_cap_nbytes: int, initial_shard_records: Mapping[str, Any] | None = None)
Internal helper to shard ndarrays.
- append_or_update(data, name, shape, dtype, encode_format, allow_update: bool = False)
Commit a record to the manager.
- Parameters:
- update_single_record(rec, data)
Update a single record in a shard file.
- commit()
Commit a record
- finish()
Finish building and return shard records.
- property pending_nbytes
Return total bytes stored so far
- tvm.contrib.tvmjs.dump_tensor_cache(params: Mapping[str, ndarray | Tensor] | Iterator[tuple[str, ndarray | Tensor]], cache_dir: str, encode_format='f32-to-bf16', meta_data=None, shard_cap_mb=32, show_progress: bool = True, update_if_exists: bool = False)
Dump parameters to Tensor cache.
- Parameters:
params (Union[Mapping[str, Union[np.ndarray, tvm.runtime.Tensor]], Iterator[Tuple[str, Union[np.ndarray, tvm.runtime.Tensor]]]]) – The parameter dictionary or generator
cache_dir (str) – The path to the cache
encode_format ({"f32-to-bf16", "raw"}) – Encoding format.
meta_data (json-compatible-struct or Callable[[], Any]) – Extra meta_data to be stored in the cache json file, or a callable that returns the metadata.
shard_cap_mb (int) – Maximum number of MB to be kept per shard
show_progress (bool) – Whether to show the dump progress.
update_if_exists (bool) – If the cache already exists, update the cache. When set to False, it will overwrite the existing files.
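To show what an "f32-to-bf16" encoding involves, the sketch below converts a float32 value to a bfloat16 bit pattern by keeping the top 16 bits with round-to-nearest-even. The helper names are hypothetical, NaN inputs are not handled, and whether dump_tensor_cache uses exactly this rounding is an assumption.

```python
import struct


def f32_to_bf16_bits(value):
    """Round a float to bfloat16 and return its 16-bit pattern (no NaN handling)."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Round to nearest, ties to even, before dropping the low 16 bits.
    rounding_bias = ((bits >> 16) & 1) + 0x7FFF
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_f32(bits16):
    """Expand a bfloat16 bit pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]


encoded = f32_to_bf16_bits(1.0)
print(hex(encoded))               # 0x3f80
print(bf16_bits_to_f32(encoded))  # 1.0

# Values that need more than 8 significant bits lose precision.
approx = bf16_bits_to_f32(f32_to_bf16_bits(3.14159))
print(approx)                     # 3.140625
```

Halving the storage per element is the point of the encoding; the "raw" format keeps the original bytes instead.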
tvm.contrib.utils
Common system utilities
- exception tvm.contrib.utils.DirectoryCreatedPastAtExit
Raised when a TempDirectory is created after the atexit hook runs.
- class tvm.contrib.utils.TempDirectory(custom_path=None, keep_for_debug=None)
Helper object to manage temp directory during testing.
Automatically removes the directory when it goes out of scope.
- classmethod set_keep_for_debug(set_to=True)
Keep temporary directories past program exit for debugging.
- remove()
Remove the tmp dir
- relpath(name)
Relative path in temp dir
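The relpath/remove pattern can be mirrored with the standard library to show how such a helper is typically used. TempDirSketch is a hypothetical stand-in, not the TVM class; in particular, joining name onto the directory path is an assumption about what relpath returns.

```python
import os
import shutil
import tempfile


class TempDirSketch:
    """Stdlib analogue of a temp-directory helper (hypothetical class)."""

    def __init__(self):
        self.temp_dir = tempfile.mkdtemp()

    def relpath(self, name):
        # Returns a path located inside the temp dir for the given file name.
        return os.path.join(self.temp_dir, name)

    def remove(self):
        # Delete the directory and everything in it.
        shutil.rmtree(self.temp_dir, ignore_errors=True)


tmp = TempDirSketch()
lib_path = tmp.relpath("lib.so")
with open(lib_path, "w") as handle:
    handle.write("placeholder")
existed = os.path.exists(lib_path)
tmp.remove()
print(existed, os.path.exists(lib_path))  # True False
```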
- tvm.contrib.utils.tempdir(custom_path=None, keep_for_debug=None)
Create a temp dir that deletes its contents on exit.
- Parameters:
- Returns:
temp – The temp directory object
- Return type:
- class tvm.contrib.utils.FileLock(path)
File lock object
- Parameters:
path (str) – The path to the lock
- release()
Release the lock
- tvm.contrib.utils.filelock(path)
Create a file lock which locks on path
- Parameters:
path (str) – The path to the lock
- Returns:
lock
- Return type:
File lock object
- tvm.contrib.utils.is_source_path(path)
Check if path is source code path.
tvm.contrib.xcode
Utility to invoke Xcode compiler toolchain
- tvm.contrib.xcode.xcrun(cmd)
Run xcrun and return the output.
- tvm.contrib.xcode.create_dylib(output, objects, arch, sdk='macosx', min_os_version=None)
Create dynamic library.
- tvm.contrib.xcode.compile_metal(code, path_target=None, sdk='macosx', min_os_version=None)
Compile Metal with CLI tool from env.
- tvm.contrib.xcode.compile_coreml(model, model_name='main', out_dir='.')
Compile coreml model and return the compiled model path.
tvm.contrib.cutlass
BYOC support for CUTLASS.
- tvm.contrib.cutlass.has_cutlass()
Returns true if the CUTLASS custom codegen is available
- tvm.contrib.cutlass.finalize_modules(lib, lib_path='compile.so', tmp_dir='./tmp')
Returns lib with any C source, LLVM and static library modules compiled and linked in, ready for use by the graph or AOT executors. This method is not specific to CUTLASS; however, it does assume nvcc will be used for final compilation and linking. It is provided here for convenience.
- Parameters:
lib (runtime.Module) – The output from build.
lib_path (string) – The path to a shared library which will be generated as the result of the build process.
tmp_dir (string) – A temporary directory where intermediate compiled artifacts will be stored.
- Returns:
updated_lib – The updated library with all compilation and linking completed.
- Return type:
runtime.Module
tvm.contrib.hexagon
Hexagon APIs.
- class tvm.contrib.hexagon.ContainerSession(base_image_name: str = '')
Docker container session
- Parameters:
base_image_name (str) – Docker image name to use. Empty string means to use default “tlcpack/ci-hexagon” base image.
- close()
Close docker container session
- tvm.contrib.hexagon.allocate_hexagon_array(dev, tensor_shape=None, dtype=None, data=None, axis_separators=None, mem_scope=None)
Allocate a hexagon array which could be a 2D array on physical memory defined by axis_separators
Export Hexagon AOT module.
Create shared library.
- Parameters:
output (str) – The target shared library.
objects (List[str]) – List of object files.
options (List[str]) – The list of additional options string.
cc (Optional[str]) – The compiler command.
cwd (Optional[str]) – The current working directory.
ccache_env (Optional[Dict[str, str]]) – The environment variable for ccache. Set None to disable ccache by default.
- tvm.contrib.hexagon.export_module(module, out_dir, binary_name='test_binary.so')
Export Hexagon shared object to a file.
Link Hexagon shared library using docker container with proper tooling.
- tvm.contrib.hexagon.pack_imports(module: Module, is_system_lib: bool, c_symbol_prefix: str, workspace_dir: str)
Create an ELF object file that contains the binary data for the modules imported in module. This is a callback function for use as fpack_imports in export_library.
- Parameters:
- Returns:
file_name – The name of the created object file.
- Return type:
- tvm.contrib.hexagon.register_global_func(func_name: str | Callable[[...], Any], f: Callable[[...], Any] | None = None, override: bool = False) -> Any
Register global function.
- Parameters:
func_name – The function name
f – The function to be registered.
override – Whether to override an existing entry.
- Returns:
Register function if f is not specified.
- Return type:
fregister
Examples
    import tvm_ffi

    # we can use decorator to register a function
    @tvm_ffi.register_global_func("mytest.echo")
    def echo(x):
        return x

    # After registering, we can get the function by its name
    f = tvm_ffi.get_global_func("mytest.echo")
    assert f(1) == 1

    # we can also directly register a function
    tvm_ffi.register_global_func("mytest.add_one", lambda x: x + 1)
    f = tvm_ffi.get_global_func("mytest.add_one")
    assert f(1) == 2
See also
tvm_ffi.get_global_func(), tvm_ffi.remove_global_func()
- tvm.contrib.hexagon.register_linker(f)
Register a function that will return the path to the Hexagon linker.