tvm.relax.frontend

Frontends for constructing Relax programs, with the model importers

tvm.relax.frontend.detach_params(mod: tvm.ir.module.IRModule) Tuple[tvm.ir.module.IRModule, Dict[str, List[tvm.runtime.ndarray.NDArray]]]

Detach the “params” attribute from the functions of the input IRModule and return it as a separate dictionary of params.

Parameters

mod (tvm.IRModule) – The IRModule whose functions’ “params” attribute is going to be detached.

Returns

  • detached_mod (tvm.IRModule) – The IRModule after the detachment.

  • params_dict (Dict[str, List[tvm.nd.NDArray]]) – The detached params. The dict keys correspond to the names of the functions in the input IRModule that have the attribute “params”.

tvm.relax.frontend.nn

A PyTorch-like API to build IRModules.

class tvm.relax.frontend.nn.Effect

Effect is a special non-user facing type that is used to represent operations with side effects, for example, print. It is used to represent the output of a computation.

emit_init(name_hint: str, builder: tvm.relax.block_builder.BlockBuilder) List[tvm.relax.expr.DataflowVar]

Emit the initialization of the effect. This method is called by the compiler to initialize the effect.

create(name_hint: str) List[tvm.relax.expr.Var]

Create the implicit inputs to a relax.Function that represents the side effect

set_state(state_vars: List[tvm.relax.expr.Var]) None

Set the variables that represent the effect

finalize() List[tvm.relax.expr.Var]

Finalize the effect as the implicit return value of a relax.Function

to(dtype: Optional[str] = None) None

Convert the effect to a specific dtype. This is a no-op for most effects.

class tvm.relax.frontend.nn.Module

Base class for neural network components. Subclass it to build your models. Modules can nest within each other in a tree structure using regular attribute assignment.

named_parameters(prefix: str = '') Iterator[Tuple[str, tvm.relax.frontend.nn.core.Parameter]]

This method provides an iterator over module parameters, yielding both the parameter name and its corresponding value.

Parameters

prefix (str) – Prefix to prepend to all parameter names.

Yields

(str, Parameter) - Tuple containing the name and parameter
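As a rough illustration of how a recursive traversal can yield (name, parameter) pairs from a module tree built by attribute assignment, here is a minimal pure-Python sketch. ToyParameter, ToyModule, ToyLinear and ToyMLP are hypothetical stand-ins, not TVM classes, and the dotted naming scheme is an assumption made for the example.

```python
from typing import Iterator, Tuple


class ToyParameter:
    """Hypothetical stand-in for nn.Parameter; it only records a shape."""

    def __init__(self, shape):
        self.shape = shape


class ToyModule:
    """Hypothetical stand-in for nn.Module that walks its attributes recursively."""

    def named_parameters(self, prefix: str = "") -> Iterator[Tuple[str, ToyParameter]]:
        for name, value in vars(self).items():
            # Assumed naming scheme: join nested attribute names with dots.
            full_name = f"{prefix}.{name}" if prefix else name
            if isinstance(value, ToyParameter):
                yield full_name, value
            elif isinstance(value, ToyModule):
                yield from value.named_parameters(prefix=full_name)


class ToyLinear(ToyModule):
    def __init__(self, in_features: int, out_features: int):
        self.weight = ToyParameter((out_features, in_features))
        self.bias = ToyParameter((out_features,))


class ToyMLP(ToyModule):
    def __init__(self):
        # Submodules nest through regular attribute assignment.
        self.fc1 = ToyLinear(4, 8)
        self.fc2 = ToyLinear(8, 2)


names = [name for name, _ in ToyMLP().named_parameters()]
print(names)
```

Iterating only over the second element of each pair gives the behaviour described for parameters().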

parameters() Iterator[tvm.relax.frontend.nn.core.Parameter]

This method provides an iterator over module parameters, yielding only the Parameter value.

Yields

Parameter - The module’s parameter

state_dict(*, prefix: str = '', destination: Optional[Dict[str, tvm.relax.frontend.nn.core.Parameter]] = None) Dict[str, tvm.relax.frontend.nn.core.Parameter]

Returns a dictionary containing references to the whole state of the module.

Parameters
  • prefix (str) – Prefix to prepend to all parameter names.

  • destination (Optional[Dict[str, Parameter]]) – Dictionary to which state will be saved. If None, a new dictionary is created.

Returns

dict – a dictionary containing a whole state of the module

Return type

Dict[str, Parameter]

load_state_dict(state_dict: Dict[str, tvm.relax.frontend.nn.core.Parameter], strict: bool = True) Tuple[List[str], List[str]]

This function copies parameters and buffers from the state_dict into the current module and its descendants. If strict is set to True, the keys in the state_dict must exactly match the keys returned by the state_dict() function of this module.

Parameters
  • state_dict (Dict[str, Parameter]) – A dictionary containing a whole state of the module

  • strict (bool = True) – Whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function.

Returns

(missing_keys, unexpected_keys) – A tuple of two lists: the missing keys and the unexpected keys.

Return type

Tuple[List[str], List[str]]
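The meaning of the two returned lists can be sketched with plain Python containers. The helper below is hypothetical (it is not part of TVM), and raising KeyError in strict mode is an assumption about how a mismatch might be reported.

```python
from typing import Iterable, List, Tuple


def diff_state_keys(
    module_keys: Iterable[str],
    incoming_keys: Iterable[str],
    strict: bool = True,
) -> Tuple[List[str], List[str]]:
    """Compare the keys a module expects with the keys found in a state dict."""
    module_keys = list(module_keys)
    incoming_keys = list(incoming_keys)
    # Keys the module has but the incoming state dict does not provide.
    missing = [k for k in module_keys if k not in incoming_keys]
    # Keys the incoming state dict provides but the module does not know.
    unexpected = [k for k in incoming_keys if k not in module_keys]
    if strict and (missing or unexpected):
        # Assumption: strict mode rejects any mismatch with an exception.
        raise KeyError(f"missing={missing}, unexpected={unexpected}")
    return missing, unexpected


missing_keys, unexpected_keys = diff_state_keys(
    ["fc.weight", "fc.bias"],
    ["fc.weight", "fc.scale"],
    strict=False,
)
print(missing_keys, unexpected_keys)
```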

to(dtype: Optional[str] = None) None

Convert the module to a specific dtype recursively

export_tvm(spec: _spec.ModuleSpecType, debug: bool = False, allow_extern: bool = False) Union[Tuple[tvm.ir.module.IRModule, List[Tuple[str, tvm.relax.frontend.nn.core.Parameter]]], Tuple[tvm.ir.module.IRModule, List[Tuple[str, tvm.relax.frontend.nn.core.Parameter]], List[ExternModule]]]

Export the module to TVM IRModule and parameters

Parameters
  • spec (_spec.ModuleSpecType) – A dictionary mapping each input name to a specification that defines the inputs shape and dtype.

  • debug (bool) – If set to True, then the exported module will support effects. This enables things like printing in the graph.

Returns

  • irmodule (tvm.ir.IRModule) – The converted tvm IR representation of the model.

  • params (List[Tuple[str, Parameter]]) – A list of Parameters corresponding to the weights of the model.

  • ext_mods (List[nn.ExternModule]) – A list of ExternModules that are used in the model.

jit(spec: _spec.ModuleSpec, device: Union[str, tvm._ffi.runtime_ctypes.Device] = 'cpu', pipeline: Union[None, str, tvm.ir.transform.Pass] = 'default_build', out_format: str = 'torch', debug: bool = False) Any

Just-in-time compilation of an nn.Module into an executable

class tvm.relax.frontend.nn.ModuleList(modules: List[tvm.relax.frontend.nn.core.Module])

Holds submodules in a list.

append(module: tvm.relax.frontend.nn.core.Module)

Add a module to the end of the ModuleList

to(dtype: Optional[str] = None) None

Convert the module to a specific dtype recursively

forward(x)

Feed-forward pass of the module

class tvm.relax.frontend.nn.Object(*, _expr: tvm.ir.expr.RelayExpr, _name: str)

A wrapper on top of relax.Expr whose struct_info is the base ObjectStructInfo (rather than any of its subclasses). Object effectively represents non-tensor frontend components such as KV caches.

class tvm.relax.frontend.nn.Parameter(shape: Sequence[Union[int, str, tvm.ir.expr.PrimExpr]], dtype: Optional[str] = None)

A parameter represents the weight of a neural network layer. It is a special tensor which could be bound or not bound to concrete values. If a parameter is bound to a concrete value, it is called a bound parameter, otherwise it is called an unbound parameter.

property data: Optional[tvm.runtime.ndarray.NDArray]

Returns the concrete value of the parameter if it is bound to a concrete value, otherwise returns None. The returned value is a tvm.runtime.NDArray.

to(dtype: Optional[str] = None) None

Change the dtype of the parameter if it is not bound to any concrete data

class tvm.relax.frontend.nn.Tensor(*, _expr: tvm.ir.expr.RelayExpr)

A wrapper on top of relax.Expr whose struct_info is a TensorStructInfo, providing more convenient access to shape and dtype information. Tensor is always symbolic and not bound to any concrete values. Shape and dtype inference is done eagerly upon tensor creation, i.e. when operators are applied on tensors, the shape and dtype information is already available.

static from_const(data) tvm.relax.frontend.nn.core.Tensor

Construct a tensor from numpy constants.

static from_scalar(data: Union[int, float], dtype: str) tvm.relax.frontend.nn.core.Tensor

Construct a tensor from a scalar with dtype specified.

static from_struct_info(struct_info: tvm.relax.struct_info.TensorStructInfo, name: str = 'tensor') tvm.relax.frontend.nn.core.Tensor

Construct a nn.Tensor from relax TensorStructInfo

static placeholder(shape: Sequence[Union[int, str, tvm.ir.expr.PrimExpr]], dtype: str, name: str = 'tensor') tvm.relax.frontend.nn.core.Tensor

Create a placeholder tensor with given shape and dtype. A placeholder tensor should never be created directly by users in usual cases, and the only exception is to indicate the shape/dtype of return values of an external function.

If shape is a string name, we create a symbolic shape tvm.tir.Var(name, “int64”).

property shape: List[Union[int, tvm.ir.expr.PrimExpr]]

Returns the shape of the tensor as a list of integers.

An integer can be a python int or tvm.tir.PrimExpr, depending on whether the shape is fully static, for example, [1, 2, tvm.tir.Var(“n”)] is a valid shape where the last dimension is dynamic while the first two dimensions are always static constants.

Returns

shape – The shape of the tensor

Return type

List[Union[int, tir.PrimExpr]]

property ndim: int

Returns the number of dimensions of the tensor.

Returns

ndim – The number of dimensions of the tensor

Return type

int

property dtype: str

Returns the data type of the tensor.

Returns

dtype – The data type of the tensor

Return type

str

tvm.relax.frontend.nn.add_extern(mod: tvm.relax.frontend.nn.extern.ExternModule) None

Add an external module to the exporter.

class tvm.relax.frontend.nn.ExternModule(symbols: Dict[str, Callable])

The abstract base class for external modules. External modules are designed to help incorporate user-provided handcrafted kernels into the exported TVM IRModule.

load() tvm.runtime.module.Module

Loads the external module into a TVM runtime module.

class tvm.relax.frontend.nn.ObjectModule(symbols: Dict[str, Callable], filepath: pathlib.Path)

A subclass of nn.ExternModule, which allows users to provide an object .o file to be linked into the compiled artifact.

load() tvm.runtime.module.Module

Loads the external module into a TVM runtime module.

class tvm.relax.frontend.nn.SourceModule(symbols: Dict[str, Callable], source_code: Union[str, pathlib.Path], source_format: str, compile_options: Optional[List[str]] = None, compiler: Optional[str] = None, output_format: str = 'obj')

A subclass of nn.ExternModule. It compiles C++/CUDA source code and links it into the eventual IRModule.

Shape/dtype inference. The nn.ExternModule system requires users to provide additional information to work, namely, symbols. It is a dictionary that maps each symbol in the external object file to its shape/dtype inference function. Consider a case where function my_func accepts two tensors, a of shape (x, y, 1), and b of shape (y, z, 5), and produces a tensor c of shape (x, y, z, 9), the shape/dtype inference function should look like:

def shape_dtype_inference(a, b):
    x, y, _ = a.shape
    _, z, _ = b.shape
    return nn.Tensor.placeholder((x, y, z, 9), dtype="float32")

and the symbols dictionary should be provided as:

symbols={
    "my_func": shape_dtype_inference,
}
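To see what the inference function above computes without a TVM installation, the sketch below runs the same logic on a hypothetical stand-in type. FakeTensor is not a TVM class; it only mimics the shape and dtype attributes, and the concrete shapes are made up for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for nn.Tensor, exposing only `shape` and `dtype`.
FakeTensor = namedtuple("FakeTensor", ["shape", "dtype"])


def shape_dtype_inference(a, b):
    x, y, _ = a.shape
    _, z, _ = b.shape
    # Real code would return nn.Tensor.placeholder((x, y, z, 9), dtype="float32").
    return FakeTensor(shape=(x, y, z, 9), dtype="float32")


# Map each exported symbol name to its shape/dtype inference function.
symbols = {"my_func": shape_dtype_inference}

a = FakeTensor(shape=(2, 3, 1), dtype="float32")  # (x, y, 1)
b = FakeTensor(shape=(3, 4, 5), dtype="float32")  # (y, z, 5)
c = symbols["my_func"](a, b)
print(c.shape, c.dtype)
```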

Calling convention. All external modules now follow the “destination-passing-style” (DPS) calling convention, which means the returned tensors are pre-allocated by the system and passed in as arguments of the external function.

Reusing the example above, the implementation of my_func should include three parameters in its signature, where tensors are represented using DLTensor from DLPack, the de facto standard for in-memory representation of tensors. More details: https://github.com/dmlc/dlpack/blob/v0.8/include/dlpack/dlpack.h#L163-L206.

To expose the symbol, TVM_DLL_EXPORT_TYPED_FUNC(symbol, function) is guaranteed to be available:

// those headers are guaranteed to be available
#include <dlpack/dlpack.h>
#include <tvm/runtime/data_type.h>
#include <tvm/runtime/packed_func.h>

namespace {
// anonymous namespace hides the symbol `_my_func_impl` from other translation units
int _my_func_impl(DLTensor* a, DLTensor* b, DLTensor* c) {
    // `a` and `b` are inputs, and `c` is the output
    return 0;  // placeholder return value; the kernel body goes above
}
}
// expose symbol `my_func` instead of `_my_func_impl`
TVM_DLL_EXPORT_TYPED_FUNC(my_func, _my_func_impl);

A compiler pass AttachExternModules. It is introduced to attach a list of nn.ExternModule instances to an IRModule at any stage of the compilation pipeline, and to attach the compiled external modules as runtime.Module objects in the IRModule’s external_mods attribute. Linking in relax.build requires this attribute, but with this pass in place, source compilation can be deferred to an arbitrary stage of TVM compilation.

Caveats. nn.add_extern must be called exactly once per external module during export_tvm. Each symbol should be registered exactly once to avoid potential conflicts; otherwise, an error is raised.

static tvm_home() pathlib.Path

Find TVM’s home directory. If TVM_HOME environment variable is set, use it. Otherwise, use the directory where the tvm Python package is installed. As a sanity check, it is required to have include and 3rdparty as direct subdirectories.

Returns

tvm_home – The TVM home directory, and it is guaranteed to have include and 3rdparty as direct subdirectories.

Return type

pathlib.Path
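The lookup rule described above can be sketched in a few lines. find_tvm_home is a hypothetical helper, not the TVM implementation: the package directory is passed in explicitly, the environment is injectable for testing, and the exception type is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def find_tvm_home(package_dir: Path, environ: Mapping[str, str] = os.environ) -> Path:
    """Prefer $TVM_HOME; otherwise fall back to the given package directory."""
    env_value = environ.get("TVM_HOME")
    home = Path(env_value) if env_value else Path(package_dir)
    # Sanity check: both subdirectories must exist directly under the home directory.
    for sub in ("include", "3rdparty"):
        if not (home / sub).is_dir():
            raise FileNotFoundError(f"{home} has no '{sub}' subdirectory")
    return home


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "include").mkdir()
    (root / "3rdparty").mkdir()
    # The environment variable wins when it is set ...
    found_via_env = find_tvm_home(Path("/nonexistent"), {"TVM_HOME": tmp}) == root
    # ... and the package directory is used when it is not.
    found_via_fallback = find_tvm_home(root, {}) == root
```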

static get_includes(tvm_pkg: Optional[List[str]] = None) List[pathlib.Path]

Returns the default include paths according to tvm_home(). By default, it includes TVM, DLPack, and DMLC-Core. With tvm_pkg provided, it also includes the specified package under tvm_home/3rdparty.

Parameters

tvm_pkg (Optional[List[str]]) – The list of packages to be included under tvm_home/3rdparty. Each element should be a relative path to tvm_home/3rdparty.

Returns

includes – The list of include paths.

Return type

List[pathlib.Path]

static get_compile_options(source_format: str, tvm_pkg: Optional[List[str]] = None) List[str]

Returns the default compile options depending on source_format, including the default include paths w.r.t. tvm_home() and the default flags to configure DMLC-Core. By default, it uses “-O3” and “-std=c++17”.

Parameters
  • source_format (str) – The source code format. It can be either “cpp” or “cu”.

  • tvm_pkg (Optional[List[str]]) – The list of packages to be included under tvm_home/3rdparty. Each element should be a relative path to tvm_home/3rdparty.

Returns

compile_options – The list of compilation flags.

Return type

List[str]

compile(output_path: pathlib.Path) None

Compiles the source code in a provided directory and returns the compiled artifact.

load() tvm.runtime.module.Module

Loads the external module into a TVM runtime module.

class tvm.relax.frontend.nn.GELU

relax.frontend.nn.Module for GELU activation layer.

class tvm.relax.frontend.nn.Conv1D(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, padding: int = 0, dilation: int = 1, groups: int = 1, bias: bool = True, dtype: Optional[str] = None)

relax.frontend.nn.Module for conv1d layer.

forward(x: tvm.relax.frontend.nn.core.Tensor) tvm.relax.frontend.nn.core.Tensor

Forward method for conv1d layer.

Parameters

x (Tensor) – The input tensor.

Returns

ret – The output tensor for the conv1d layer.

Return type

Tensor

class tvm.relax.frontend.nn.ConvTranspose1D(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, padding: int = 0, output_padding: int = 0, dilation: int = 1, groups: int = 1, bias: bool = True, dtype: Optional[str] = None)

relax.frontend.nn.Module for ConvTranspose1D layer.

forward(x: tvm.relax.frontend.nn.core.Tensor) tvm.relax.frontend.nn.core.Tensor

Forward method for conv transpose 1d layer.

Parameters

x (Tensor) – The input tensor.

Returns

ret – The output tensor for the conv transpose 1d layer.

Return type

Tensor

class tvm.relax.frontend.nn.Embedding(num: Union[int, str, tvm.ir.expr.PrimExpr], dim: Union[int, str, tvm.ir.expr.PrimExpr], dtype: Optional[str] = None)

relax.frontend.nn.Module for embedding layer.

forward(x: tvm.relax.frontend.nn.core.Tensor)

Forward method for embedding layer.

Parameters

x (Tensor) – The input tensor.

Returns

ret – The output tensor for the embedding layer.

Return type

Tensor

class tvm.relax.frontend.nn.GroupNorm(num_groups: int, num_channels: int, eps: float = 1e-05, affine: bool = True, dtype: Optional[str] = None)

relax.frontend.nn.Module for group norm layer.

forward(x: tvm.relax.frontend.nn.core.Tensor, channel_axis: int = 1, axes: Optional[List[int]] = None)

Forward method for group norm layer.

Parameters
  • x (Tensor) – The input tensor.

  • channel_axis (int) – Channel axis of the input data.

  • axes (Optional[List[int]]) – Optional list of axes to compute the norm over. If not specified, it is assumed that the first two axes should be left alone.

Returns

ret – The output tensor for the group norm layer.

Return type

Tensor

class tvm.relax.frontend.nn.IOEffect

Modeling IO side effect, for example, printing the content of NDArrays on screen, inserting debug breakpoints, etc.

emit_init(name_hint, builder: tvm.relax.block_builder.BlockBuilder) List[tvm.relax.expr.DataflowVar]

Emit the initialization of the effect. This method is called by the compiler to initialize the effect.

create(name_hint: str) List[tvm.relax.expr.Var]

Create the implicit inputs to a relax.Function that represents the side effect

set_state(state_vars: List[tvm.relax.expr.Var]) None

Set the variables that represent the effect

finalize() List[tvm.relax.expr.Var]

Finalize the effect as the implicit return value of a relax.Function

class tvm.relax.frontend.nn.KVCache(init_seq_len: int, unit_shape: Sequence[int], dtype: Optional[str] = None)

Effect to implement KVCache.

emit_init(name_hint: str, bb: tvm.relax.block_builder.BlockBuilder)

Emit the initialization of the KVCache effect.

Parameters
  • name_hint (str) – The name hint of the initialization binding Var.

  • bb (relax.BlockBuilder) – The relax BlockBuilder to emit.

create(name_hint: str) List[tvm.relax.expr.Var]

Create the implicit inputs to a relax.Function that represents the KVCache effect.

Parameters

name_hint (str) – The name hint of the relax.Var.

Returns

ret – The relax.Var for KVCache.

Return type

List[relax.Var]

set_state(state_vars: List[tvm.relax.expr.Var]) None

Set the variables that represent the effect

finalize() List[tvm.relax.expr.Var]

Finalize the KVCache effect as the implicit return value of a relax.Function.

Returns

ret – The output relax.Var as KVCache.

Return type

List[rx.Var]

to(dtype: Optional[str] = None) None

Convert the KVCache effect to a specific dtype.

Parameters

dtype (Optional[str]) – The target data type to convert.

view(seq_len: tvm.tir.expr.Var) tvm.relax.frontend.nn.core.Tensor

View the last elements in KVCache.

Parameters

seq_len (tir.Var) – The number of last elements to view.

Returns

ret – The last tensor to view.

Return type

Tensor

append(new_element: tvm.relax.frontend.nn.core.Tensor) None

Append a new element in KVCache.

Parameters

new_element (Tensor) – The new tensor to append.

class tvm.relax.frontend.nn.LayerNorm(normalized_shape: int, eps: Optional[float] = 1e-05, elementwise_affine: bool = True, dtype: Optional[str] = None)

relax.frontend.nn.Module for Layer Normalization

forward(x: tvm.relax.frontend.nn.core.Tensor) tvm.relax.frontend.nn.core.Tensor

Forward method for layer normalization layer.

Parameters

x (Tensor) – The input tensor.

Returns

ret – The output tensor for the layer normalization layer.

Return type

Tensor

class tvm.relax.frontend.nn.Linear(in_features: Union[int, str, tvm.ir.expr.PrimExpr], out_features: Union[int, str, tvm.ir.expr.PrimExpr], bias: bool = True, dtype: Optional[str] = None, out_dtype: Optional[str] = None)

relax.frontend.nn.Module for linear layer.

forward(x: tvm.relax.frontend.nn.core.Tensor) tvm.relax.frontend.nn.core.Tensor

Forward method for linear layer.

Parameters

x (Tensor) – The input tensor.

Returns

ret – The output tensor for the linear layer.

Return type

Tensor

to(dtype: Optional[str] = None) None

Override to() such that we do not convert bias if there is out_dtype. Otherwise, we might run into dtype mismatch when computing x + self.bias since x is of type out_dtype and bias becomes dtype, potentially different.

class tvm.relax.frontend.nn.ReLU

relax.frontend.nn.Module for ReLU activation layer.

class tvm.relax.frontend.nn.RMSNorm(hidden_size: int, axes: Union[int, List[int]], epsilon: float = 1e-05, bias: bool = True, dtype: Optional[str] = None)

relax.frontend.nn.Module for rms norm layer.

forward(x: tvm.relax.frontend.nn.core.Tensor)

Forward method for rms norm layer.

Parameters

x (Tensor) – The input tensor.

Returns

ret – The output tensor for the rms norm layer.

Return type

Tensor

class tvm.relax.frontend.nn.SiLU

relax.frontend.nn.Module for SiLU activation layer.

class tvm.relax.frontend.nn.SubroutineMixin

A mixin that generates a

Contains common logic for tvm.relax.frontend.nn.Module and tvm.relax.testing.nn.Module.

class tvm.relax.frontend.nn.Mutator

The mutator for nn.Module transform. Users can override the visit_* methods to apply transform in different structures, or even override the visit method to change the logic of traversal.

visit_module(name: str, node: tvm.relax.frontend.nn.core.Module) Any

The base visiting method for mutation of nn.Module nodes.

Parameters
  • name (str) – The name of the current node in parent’s attribute.

  • node (nn.Module) – The current node of nn.Module to mutate.

Returns

ret_node – The new node to replace current node.

Return type

Any

visit_effect(name: str, node: tvm.relax.frontend.nn.core.Parameter) Any

The base visiting method for mutation of nn.Parameter nodes.

Parameters
  • name (str) – The name of the current node in parent’s attribute.

  • node (nn.Parameter) – The current node of nn.Parameter to mutate.

Returns

ret_node – The new node to replace current node.

Return type

Any

visit_param(name: str, node: tvm.relax.frontend.nn.core.Effect) Any

The base visiting method for mutation of nn.Effect nodes.

Parameters
  • name (str) – The name of the current node in parent’s attribute.

  • node (nn.Effect) – The current node of nn.Effect to mutate.

Returns

ret_node – The new node to replace current node.

Return type

Any

visit_modulelist(name: str, node: tvm.relax.frontend.nn.core.ModuleList) Any

The base visiting method for mutation of nn.ModuleList nodes.

Parameters
  • name (str) – The name of the current node in parent’s attribute.

  • node (nn.ModuleList) – The current node of nn.ModuleList to mutate.

Returns

ret_node – The new node to replace current node.

Return type

Any

visit(name: str, node: Any) Any

The base dispatching method for visiting of all nodes.

Parameters
  • name (str) – The name of the current node in parent’s attribute.

  • node (Any) – The current node to visit.

Returns

ret_node – The new node to replace current node.

Return type

Any

class tvm.relax.frontend.nn.TypeVar(name, *constraints, bound=None, covariant=False, contravariant=False)

Type variable.

Usage:

T = TypeVar('T')  # Can be anything
A = TypeVar('A', str, bytes)  # Must be str or bytes

Type variables exist primarily for the benefit of static type checkers. They serve as the parameters for generic types as well as for generic function definitions. See class Generic for more information on generic types. Generic functions work as follows:

def repeat(x: T, n: int) -> List[T]:
    '''Return a list containing n references to x.'''
    return [x]*n

def longest(x: A, y: A) -> A:
    '''Return the longest of two strings.'''
    return x if len(x) >= len(y) else y

The latter example’s signature is essentially the overloading of (str, str) -> str and (bytes, bytes) -> bytes. Also note that if the arguments are instances of some subclass of str, the return type is still plain str.

At runtime, isinstance(x, T) and issubclass(C, T) will raise TypeError.

Type variables defined with covariant=True or contravariant=True can be used to declare covariant or contravariant generic types. See PEP 484 for more details. By default generic types are invariant in all type variables.

Type variables can be introspected. e.g.:

T.__name__ == 'T'
T.__constraints__ == ()
T.__covariant__ == False
T.__contravariant__ == False
A.__constraints__ == (str, bytes)

Note that only type variables defined in global scope can be pickled.

tvm.relax.frontend.nn.add(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'add') tvm.relax.frontend.nn.core.Tensor

Addition with numpy-style broadcasting.

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = add(a, b)

tvm.relax.frontend.nn.argsort(data: tvm.relax.frontend.nn.core.Tensor, axis: int = -1, descending: bool = False, dtype: str = 'int32', name='argsort')

Sorts along the given axis and returns a tensor of indices, with the same shape as the input, that index the data in sorted order.

Parameters
  • data (Tensor) – The input data tensor.

  • axis (int) – Axis along which to sort the input tensor.

  • descending (bool) – Whether to sort in descending order, the default is False

  • dtype (str) – The data type of the output indices.

  • name (str) – Name hint.

Returns

out – The indices of the sorted tensor.

Return type

Tensor
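For intuition, the index semantics on a single 1D row can be reproduced in plain Python. argsort_1d is a hypothetical helper for illustration, not the TVM operator, and it does not model how ties are ordered.

```python
from typing import List, Sequence


def argsort_1d(values: Sequence[float], descending: bool = False) -> List[int]:
    """Return the indices that would put `values` in sorted order."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=descending)


data = [3.0, 1.0, 2.0]
ascending_idx = argsort_1d(data)
descending_idx = argsort_1d(data, descending=True)

# Indexing the data with the result yields the sorted sequence.
sorted_data = [data[i] for i in ascending_idx]
print(ascending_idx, descending_idx, sorted_data)
```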

tvm.relax.frontend.nn.astype(x: tvm.relax.frontend.nn.core.Tensor, dtype: str, name: str = 'astype') tvm.relax.frontend.nn.core.Tensor

Cast input tensor to the given data type.

Parameters
  • x (Tensor) – The input data to the operator.

  • dtype (str) – The target data type

  • name (str) – Name hint.

Returns

result – The cast result.

Return type

Tensor

tvm.relax.frontend.nn.broadcast_to(x: tvm.relax.frontend.nn.core.Tensor, shape: Sequence[Union[int, tvm.ir.expr.PrimExpr]], name: str = 'broadcast_to') tvm.relax.frontend.nn.core.Tensor

Broadcasts a tensor to a specified shape.

Parameters
  • x (Tensor) – The input data to the operator.

  • shape (Sequence[IntExpr]) – The target shape.

  • name (str) – Name hint.

Returns

result – The broadcasted tensor.

Return type

Tensor
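Whether a source shape can be broadcast to a target shape follows the usual numpy-style rule: align the shapes from the trailing dimension, and require each source dimension to equal the target dimension or be 1. The helper below is a hypothetical static-shape check, not a TVM function, and it ignores symbolic dimensions.

```python
from typing import Sequence


def can_broadcast_to(src_shape: Sequence[int], dst_shape: Sequence[int]) -> bool:
    """Check numpy-style broadcast compatibility for fully static shapes."""
    if len(src_shape) > len(dst_shape):
        return False
    # Compare dimensions right-to-left; missing leading dims are treated as 1.
    for src_dim, dst_dim in zip(reversed(src_shape), reversed(dst_shape)):
        if src_dim != dst_dim and src_dim != 1:
            return False
    return True


ok_expand = can_broadcast_to((3, 1), (2, 3, 4))
bad_mismatch = can_broadcast_to((3, 2), (3, 4))
print(ok_expand, bad_mismatch)
```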

tvm.relax.frontend.nn.ccl_allgather(x: tvm.relax.frontend.nn.core.Tensor, num_workers: int, name='ccl_allgather')

CCL Allgather operator

Parameters
  • x (Tensor) – The input tensor.

  • num_workers (int) – Number of workers.

  • name (str) – Name hint for this operation.

Returns

result – The result tensor of allgather.

Return type

Tensor

tvm.relax.frontend.nn.ccl_allreduce(x: tvm.relax.frontend.nn.core.Tensor, op_type: str = 'sum', in_group: bool = True, name='ccl_allreduce')

CCL Allreduce operator

Parameters
  • x (Tensor) – The input tensor.

  • op_type (str) – The type of reduction operation to be applied to the input data. Now “sum”, “prod”, “min”, “max” and “avg” are supported.

  • in_group (bool) – Whether the reduction is performed within the group (the default) or globally.

  • name (str) – Name hint for this operation.

Returns

result – The result tensor of allreduce.

Return type

Tensor
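The effect of each op_type can be simulated on plain lists, with one list per worker. simulate_allreduce is a hypothetical single-process model for illustration; it performs no communication and does not model in_group.

```python
import operator
from functools import reduce
from typing import Callable, Dict, List, Sequence

# One reducer per supported op_type, applied elementwise across workers.
REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "prod": lambda vals: reduce(operator.mul, vals, 1),
    "min": min,
    "max": max,
    "avg": lambda vals: sum(vals) / len(vals),
}


def simulate_allreduce(worker_values: List[List[float]], op_type: str = "sum") -> List[List[float]]:
    """Reduce elementwise across workers; every worker receives the same result."""
    reducer = REDUCERS[op_type]
    # zip(*...) groups the i-th element of every worker into one column.
    reduced = [reducer(column) for column in zip(*worker_values)]
    return [list(reduced) for _ in worker_values]


workers = [[1, 2], [3, 4]]
summed = simulate_allreduce(workers, "sum")
averaged = simulate_allreduce(workers, "avg")
print(summed, averaged)
```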

tvm.relax.frontend.nn.ccl_broadcast_from_worker0(x: tvm.relax.frontend.nn.core.Tensor, name='broadcast_from_worker')

Broadcast data from worker-0 to all other workers.

Parameters
  • x (Tensor) – The tensor to be broadcast.

  • name (str) – Name hint for this operation.

Returns

result – The same tensor, which has been broadcast to all other workers.

Return type

Tensor

tvm.relax.frontend.nn.chunk(x: tvm.relax.frontend.nn.core.Tensor, chunks: int, dim: int = 0, name: str = 'chunk') tvm.relax.frontend.nn.core.Tensor

Split a tensor along dim into the specified number of chunks.

Parameters
  • x (Tensor) – Input tensor to be split.

  • chunks (int) – Number of pieces to slice x into.

  • dim (int) – Which dimension to split x.

  • name (str) – Name hint for this operation.

Returns

result – A tuple with chunks elements containing slices of x.

Return type

Tuple[Tensor]

tvm.relax.frontend.nn.concat(x: List[tvm.relax.frontend.nn.core.Tensor], dim: int, name: str = 'concat') tvm.relax.frontend.nn.core.Tensor

Concatenate a list of tensors along an axis.

Parameters
  • x (List[Tensor]) – List of tensors to concatenate.

  • dim (int) – Dimension to concatenate upon.

  • name (str) – Name hint for this operator.

Returns

result – The concatenated result.

Return type

Tensor

tvm.relax.frontend.nn.conv1d(x: tvm.relax.frontend.nn.core.Tensor, weight: tvm.relax.frontend.nn.core.Tensor, bias: Optional[tvm.relax.frontend.nn.core.Tensor] = None, stride: Optional[Union[int, Tuple]] = 1, padding: Optional[Union[int, Tuple, str]] = 0, dilation: Optional[Union[int, Tuple]] = 1, groups: Optional[int] = 1, name: str = 'conv1d') tvm.relax.frontend.nn.core.Tensor

1D convolution.

This operator takes the weight as the 1D convolution kernel and convolves it with data to produce an output.

In the default case, where the data_layout is NCW and kernel_layout is OIW, conv1d takes in a data Tensor with shape (batch_size, in_channels, width), and a weight Tensor with shape (channels, in_channels, kernel_w), where kernel_w is the length of the W kernel dimension, to produce an output Tensor with the following rule:

\[\mbox{out}[b, c, x] = \sum_{dx, k} \mbox{data}[b, k, \mbox{strides} * x + dx] * \mbox{weight}[c, k, dx]\]

Padding and dilation are applied to data and weight respectively before the computation. This operator accepts data layout specification. Semantically, the operator will convert the layout to the canonical layout (NCW for data and OIW for weight), perform the computation, then convert to the out_layout.

Parameters
  • x (Tensor) – The input data to the operator.

  • weight (Tensor) – The weight expressions.

  • bias (Optional[Tensor]) – Optional bias tensor of shape [O].

  • stride (Optional[Union[int, Tuple]]) – The stride of the convolution. It is required to have length 1.

  • padding (Optional[Union[int, Tuple, str]]) – The padding of convolution on both sides of inputs before convolution. It is required to have length either 1 or 2.

  • dilation (Optional[Union[int, Tuple]]) – Specifies the dilation rate to be used for dilated convolution. It is required to have length 1.

  • groups (Optional[int]) – Number of groups to split the input into for grouped convolution. The number of input and output channels should be divisible by the number of groups.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor
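The summation rule for conv1d in the canonical NCW/OIW layouts can be written out as nested loops. The function below is a hypothetical reference sketch on nested Python lists, restricted to no padding, dilation 1 and groups 1; it is meant for checking small cases by hand, not as the TVM implementation.

```python
from typing import List

Tensor3D = List[List[List[float]]]


def conv1d_reference(data: Tensor3D, weight: Tensor3D, stride: int = 1) -> Tensor3D:
    """data: [batch][in_channels][width]; weight: [out_channels][in_channels][kernel_w]."""
    batch, in_ch, width = len(data), len(data[0]), len(data[0][0])
    out_ch, kernel_w = len(weight), len(weight[0][0])
    out_w = (width - kernel_w) // stride + 1
    out = [[[0.0] * out_w for _ in range(out_ch)] for _ in range(batch)]
    for b in range(batch):
        for c in range(out_ch):
            for x in range(out_w):
                acc = 0.0
                # out[b, c, x] = sum over k, dx of data[b, k, stride*x + dx] * weight[c, k, dx]
                for k in range(in_ch):
                    for dx in range(kernel_w):
                        acc += data[b][k][stride * x + dx] * weight[c][k][dx]
                out[b][c][x] = acc
    return out


edge = conv1d_reference([[[1, 2, 3, 4]]], [[[1, 0, -1]]])
strided = conv1d_reference([[[1, 2, 3, 4, 5]]], [[[1, 1]]], stride=2)
print(edge, strided)
```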

tvm.relax.frontend.nn.conv1d_transpose(x: tvm.relax.frontend.nn.core.Tensor, weight: tvm.relax.frontend.nn.core.Tensor, bias: Optional[tvm.relax.frontend.nn.core.Tensor] = None, stride: Optional[Union[int, Tuple[int]]] = 1, padding: Optional[Union[int, Tuple[int, ...]]] = 0, output_padding: Optional[Union[int, Tuple[int]]] = 0, dilation: Optional[Union[int, Tuple]] = 1, groups: Optional[int] = 1, name: str = 'conv1d_transpose') tvm.relax.frontend.nn.core.Tensor

1D transposed convolution operator.

This operator can be seen as the gradient operator of conv1d.

The output shape can be explained in the simple case when data_layout == “NCW” and kernel_layout == “IOW”. Suppose data has shape (N, in_channel, in_w) and weight has shape (in_channel, out_channel, weight_w); we need to ensure that in_channel % groups == 0. The shape of the output will be (N, out_channel * groups, out_w), where

  • out_w = ((in_w - 1) * strides[0] + weight_w - 2 * padding[0] + output_padding[0])

Parameters
  • x (Tensor) – The input data to the operator.

  • weight (Tensor) – The weight tensor.

  • stride (Union[int, Tuple[int]]) – The stride of the convolution. It is required to have length 1.

  • padding (Union[int, Tuple[int, ...]]) – The padding of convolution on both sides of inputs before convolution. It is required to have length either 1 or 2.

  • output_padding (Union[int, Tuple[int, ...]], optional) – Used to disambiguate the output shape.

  • dilation (Union[int, Tuple[int]]) – Specifies the dilation rate to be used for dilated convolution. It is required to have length 1.

  • groups (int) – Number of groups to split the input into for grouped convolution. The number of input and output channels should be divisible by the number of groups.

  • data_layout (str) – Layout of the input.

  • kernel_layout (str) – Layout of the weight.

  • out_layout (Optional[str]) – Layout of the output. If not specified, it is the same as data_layout

  • out_dtype (Optional[Union[str, DataType]]) – Specifies the output data type for mixed precision conv2d.

Returns

result – The computed result.

Return type

Tensor
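The out_w expression given for conv1d_transpose is easy to evaluate directly. The helper below is a hypothetical convenience function that takes scalar stride, padding and output_padding, and, like the expression it mirrors, does not account for dilation.

```python
def conv1d_transpose_out_w(
    in_w: int,
    weight_w: int,
    stride: int = 1,
    padding: int = 0,
    output_padding: int = 0,
) -> int:
    """out_w = (in_w - 1) * stride + weight_w - 2 * padding + output_padding."""
    return (in_w - 1) * stride + weight_w - 2 * padding + output_padding


# Upsampling a width-5 input by stride 2 with a width-3 kernel.
out_w = conv1d_transpose_out_w(in_w=5, weight_w=3, stride=2, padding=1, output_padding=1)
print(out_w)
```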

tvm.relax.frontend.nn.conv2d(x: tvm.relax.frontend.nn.core.Tensor, weight: tvm.relax.frontend.nn.core.Tensor, bias: Optional[tvm.relax.frontend.nn.core.Tensor] = None, stride: Optional[Union[int, Tuple]] = 1, padding: Optional[Union[int, Tuple, str]] = 0, dilation: Optional[Union[int, Tuple]] = 1, groups: Optional[int] = 1, data_layout: Optional[str] = 'NCHW', name: str = 'conv2d') tvm.relax.frontend.nn.core.Tensor

Applies a 2D convolution over an input image composed of several input planes.

Parameters
  • x (Tensor) – Input tensor of shape [B, N, H, W]

  • weight (Tensor) – Filters of shape [O, N/groups, kH, kW]

  • bias (Optional[Tensor]) – Optional bias tensor of shape [O].

  • stride (Optional[Union[int, Tuple]]) – The stride of the convolving kernel. Can be a single number or tuple of (sH, sW).

  • padding (Optional[Union[int, Tuple, str]]) – Implicit paddings on both sides of the input.

  • dilation (Optional[Union[int, Tuple]]) – The spacing between kernel elements. Can be a single number or a tuple (dH, dW).

  • groups (Optional[int]) – Split input into a number of groups.

  • data_layout (Optional[str]) – Layout of input and output data.

  • name (str) – Name hint.

Returns

result – The computed result with shape [B, O, oH, oW].

Return type

Tensor
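
The output sizes oH and oW are not spelled out in this entry. The sketch below assumes the conventional convolution arithmetic, floor((in + 2 * pad - dilation * (k - 1) - 1) / stride) + 1 per spatial axis, with symmetric integer padding. The helper name conv_out_size is hypothetical and not part of TVM.

```python
def conv_out_size(in_size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis under the usual conv arithmetic.

    Assumption: symmetric integer padding; string padding modes are not handled.
    """
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


# A 3x3 kernel with padding 1 keeps a 32x32 image at 32x32 for stride 1,
# and halves it for stride 2.
print(conv_out_size(32, 3, stride=1, padding=1))  # 32
print(conv_out_size(32, 3, stride=2, padding=1))  # 16
```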

tvm.relax.frontend.nn.conv3d(x: tvm.relax.frontend.nn.core.Tensor, weight: tvm.relax.frontend.nn.core.Tensor, bias: Optional[tvm.relax.frontend.nn.core.Tensor] = None, stride: Optional[Union[int, Tuple]] = 1, padding: Optional[Union[int, Tuple, str]] = 0, dilation: Optional[Union[int, Tuple]] = 1, groups: Optional[int] = 1, data_layout: Optional[str] = 'NCDHW', name: str = 'conv3d') tvm.relax.frontend.nn.core.Tensor

Applies a 3D convolution over an input image composed of several input planes.

Parameters
  • x (Tensor) – Input tensor of shape [B, N, D, H, W]

  • weight (Tensor) – Filters of shape [O, N/groups, kD, kH, kW]

  • bias (Optional[Tensor]) – Optional bias tensor of shape [O].

  • stride (Optional[Union[int, Tuple]]) – The stride of the convolving kernel. Can be a single number or tuple of (sD, sH, sW).

  • padding (Optional[Union[int, Tuple, str]]) – Implicit paddings on both sides of the input.

  • dilation (Optional[Union[int, Tuple]]) – The spacing between kernel elements. Can be a single number or a tuple (dD, dH, dW).

  • groups (Optional[int]) – Split input into a number of groups.

  • data_layout (Optional[str]) – Optional layout of the input and output data.

  • name (str) – Name hint.

Returns

result – The computed result with shape [B, O, oD, oH, oW].

Return type

Tensor

tvm.relax.frontend.nn.cumsum(data: tvm.relax.frontend.nn.core.Tensor, axis: Optional[int] = None, dtype: Optional[str] = None, exclusive: Optional[bool] = None, name: str = 'cumsum') tvm.relax.frontend.nn.core.Tensor

Numpy style cumsum op. Return the cumulative inclusive sum of the elements along a given axis.

Parameters
  • data (Tensor) – The input data to the operator.

  • axis (Optional[int]) – Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.

  • dtype (Optional[str]) – Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of data.

  • exclusive (Optional[bool]) – If True, return the exclusive sum, in which the first element is not included.

  • name (str) – Name hint.

Returns

result – The result has the same size as data, and the same shape as data if axis is not None. If axis is None, the result is a 1-d array.

Return type

Tensor

Examples

a = [[1, 2, 3], [4, 5, 6]]

cumsum(a)  # if axis is not provided, cumsum is done over the flattened input.
-> [ 1,  3,  6, 10, 15, 21]

cumsum(a, dtype="float32")
-> [  1.,   3.,   6.,  10.,  15.,  21.]

cumsum(a, axis=0)  # sum over rows for each of the 3 columns
-> [[1, 2, 3],
    [5, 7, 9]]

cumsum(a, axis=1)
-> [[ 1,  3,  6],
    [ 4,  9, 15]]

a = [1, 0, 1, 0, 1, 1, 0]  # a is a boolean array
cumsum(a, dtype="int32")  # dtype should be provided to get the expected results
-> [1, 1, 2, 2, 3, 4, 4]
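
The flattened case can be mirrored in plain Python. This is a sketch of the semantics only, not TVM code; the helper name cumsum_1d is hypothetical. It assumes that exclusive=True means each output element is the sum of the elements strictly before it.

```python
def cumsum_1d(values, exclusive=False):
    """Cumulative sum over a flat list.

    Assumption: with exclusive=True, output j is the sum of values[:j].
    """
    total = 0
    out = []
    for v in values:
        if exclusive:
            out.append(total)  # record the sum before adding the current element
            total += v
        else:
            total += v
            out.append(total)  # record the sum including the current element
    return out


a = [[1, 2, 3], [4, 5, 6]]
flat = [v for row in a for v in row]  # axis=None flattens the input first
print(cumsum_1d(flat))                  # [1, 3, 6, 10, 15, 21]
print(cumsum_1d(flat, exclusive=True))  # [0, 1, 3, 6, 10, 15]
```
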
tvm.relax.frontend.nn.debug_func(name: str, *args: Union[tvm.relax.frontend.nn.core.Tensor, tvm.ir.expr.PrimExpr, int, float, str], _line_info: Optional[str] = None)

Call a debug function during runtime. The debug function must be registered with the following type signature:

@tvm.register_func(name_of_debug_func)
def debug_func(lineno: str, arg_0, arg_1, ...) -> None:
    ...
Parameters
  • name (str) – The name of the debug function to call.

  • *args (Union[Tensor, _tir.PrimExpr, int, float, str]) – The arguments to pass to the debug function.

tvm.relax.frontend.nn.divide(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'divide') tvm.relax.frontend.nn.core.Tensor

Division with numpy-style broadcasting.

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = divide(a, b)
tvm.relax.frontend.nn.empty(shape: Sequence[Union[int, tvm.ir.expr.PrimExpr]], dtype: str = 'float32', name: str = 'empty') tvm.relax.frontend.nn.core.Tensor

Construct an uninitialized tensor, with the input shape and dtype.

Parameters
  • shape (Sequence[IntExpr]) – The shape of the created tensor.

  • dtype (str) – The data type of the created tensor.

  • name (str) – Name hint.

Returns

result – The result tensor.

Return type

Tensor

tvm.relax.frontend.nn.equal(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'equal') tvm.relax.frontend.nn.core.Tensor

Broadcasted element-wise comparison for (lhs == rhs).

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.exp(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'exp') tvm.relax.frontend.nn.core.Tensor

Applies the exponential function.

\[\text{Exp}(x) = e^x\]
Parameters
  • x (Tensor) – The input data to the operator.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have float dtype

tvm.relax.frontend.nn.extern(name: str, args: Sequence[Union[tvm.relax.frontend.nn.core.Tensor, tvm.ir.expr.PrimExpr, int, float, str]], out: tvm.relax.frontend.nn.op.OutType) tvm.relax.frontend.nn.op.OutType

Invoke an extern function during runtime. The extern function must be registered with the TVM runtime using TVM_REGISTER_GLOBAL (C++), or tvm.register_func (Python).

Parameters
  • name (str) – The name of the extern function to call.

  • args (Sequence[Union[Tensor, _tir.PrimExpr, int, float, str]]) – The arguments to pass to the extern function.

  • out (Union[Tensor, List[Tensor]]) – The output tensors, only

Returns

result – The result

Return type

Tensor

tvm.relax.frontend.nn.full(shape: Sequence[Union[int, tvm.ir.expr.PrimExpr]], fill_value: tvm.relax.frontend.nn.core.Tensor, dtype: str = 'float32', name: str = 'full') tvm.relax.frontend.nn.core.Tensor

Fill array with scalar value.

Parameters
  • shape (Sequence[IntExpr]) – The shape of the created tensor.

  • fill_value (Tensor) – The value to fill. Must be a scalar tensor.

  • dtype (str) – The data type of the created tensor. If dtype is not given, it will by default use the dtype of fill_value.

  • name (str) – Name hint.

Returns

result – The result tensor.

Return type

Tensor

tvm.relax.frontend.nn.gelu(x: tvm.relax.frontend.nn.core.Tensor, approximate: Optional[str] = None, name: str = 'gelu') tvm.relax.frontend.nn.core.Tensor

Applies the Gaussian Error Linear Units function

\[\text{GeLU}(x) = 0.5 \cdot x \cdot \left(1 + \text{erf}\left(x / \sqrt{2}\right)\right)\]

where \(erf\) is the Gauss Error function.

Parameters
  • x (Tensor) – The input data

  • approximate (Optional[str]) – If set to tanh, use an approximation when calculating CDF.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have float dtype
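
A scalar sketch of the formula, using only the Python standard library. The exact form follows the definition given in this entry. The tanh form is the widely used approximation and is shown on the assumption that approximate="tanh" selects it; the function names gelu_exact and gelu_tanh are hypothetical.

```python
import math


def gelu_exact(x):
    """GeLU via the Gauss error function, as in the formula in this entry."""
    return 0.5 * x * (1.0 + math.erf(x * 0.5 ** 0.5))


def gelu_tanh(x):
    """Common tanh approximation (assumed to match approximate="tanh")."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


for value in (-1.0, 0.0, 1.0):
    print(value, gelu_exact(value), gelu_tanh(value))
```

The two variants agree closely on typical activations, which is why the approximation is often acceptable when erf is expensive on the target.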

tvm.relax.frontend.nn.get_default_dtype() str

Get the default parameter dtype if not specified. By default it is float32.

Returns

dtype – The default dtype

Return type

str

tvm.relax.frontend.nn.get_timestep_embedding(x: tvm.relax.frontend.nn.core.Tensor, embedding_dim: int, flip_sin_to_cos: bool = False, downscale_freq_shift: float = 1, scale: float = 1, max_period: int = 10000, name: str = 'get_timestep_embedding') tvm.relax.frontend.nn.core.Tensor

Timestep calculation as described in Denoising Diffusion Probabilistic Models.

Parameters
  • x (Tensor) – A 1-D Tensor of N indices.

  • embedding_dim (int) – The dimension of the output.

  • flip_sin_to_cos (bool) – If True, change the order of sine and cosine embeddings.

  • downscale_freq_shift (float) – Adjusts the frequency of the sinusoidal sampling.

  • scale (float) – Weight adjustment for embedding magnitude.

  • max_period (int) – Controls the minimum frequency of the embeddings.

  • name (str) – The name to label this operator with.

Returns

result – [N x dim] Tensor of positional embeddings.

Return type

Tensor
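
A plain Python sketch of sinusoidal timestep embeddings. It assumes the formulation commonly used in diffusion codebases (geometric frequencies from 1 down to roughly 1 / max_period, sine half followed by cosine half); the source text does not spell out the arithmetic, so treat the details as assumptions. The function name timestep_embedding is hypothetical.

```python
import math


def timestep_embedding(timesteps, embedding_dim, flip_sin_to_cos=False,
                       downscale_freq_shift=1.0, scale=1.0, max_period=10000):
    """Return one embedding row per timestep (list of lists of floats).

    Assumption: half_dim - downscale_freq_shift must be non-zero.
    """
    half_dim = embedding_dim // 2
    # Geometrically spaced frequencies; index 0 has frequency 1.
    freqs = [
        math.exp(-math.log(max_period) * i / (half_dim - downscale_freq_shift))
        for i in range(half_dim)
    ]
    rows = []
    for t in timesteps:
        args = [scale * t * f for f in freqs]
        sin_part = [math.sin(a) for a in args]
        cos_part = [math.cos(a) for a in args]
        row = cos_part + sin_part if flip_sin_to_cos else sin_part + cos_part
        if embedding_dim % 2 == 1:
            row.append(0.0)  # zero-pad odd dimensions
        rows.append(row)
    return rows


emb = timestep_embedding([0, 1], embedding_dim=4)
print(len(emb), len(emb[0]))  # 2 4
```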

tvm.relax.frontend.nn.greater(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'greater') tvm.relax.frontend.nn.core.Tensor

Broadcasted element-wise comparison for (lhs > rhs).

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.greater_equal(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'greater_equal') tvm.relax.frontend.nn.core.Tensor

Broadcasted element-wise comparison for (lhs >= rhs).

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.group_norm(x: tvm.relax.frontend.nn.core.Tensor, num_groups: int, weight: Optional[tvm.relax.frontend.nn.core.Tensor], bias: Optional[tvm.relax.frontend.nn.core.Tensor], eps: float = 1e-05, channel_axis: int = 1, axes: Optional[List[int]] = None, name: str = 'group_norm') tvm.relax.frontend.nn.core.Tensor

Applies Group Normalization over a mini-batch of inputs as described in the paper Group Normalization

\[y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
Parameters
  • x (Tensor) – Input to which group_norm will be applied.

  • num_groups (int) – Number of groups to separate the channels into.

  • weight (Tensor) – The gamma scale factor.

  • bias (Tensor) – The beta offset factor.

  • eps (float) – Small float added to the variance to avoid dividing by zero.

  • channel_axis (int) – The channel axis of the data.

  • axes (Optional[List[int]]) – Which axes to compute the groupnorm over. If None, assumes first two channels should be ignored.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.interpolate(x: tvm.relax.frontend.nn.core.Tensor, size: Optional[Union[int, Tuple[int]]] = None, scale_factor: Optional[Union[float, Tuple[float]]] = None, mode: str = 'nearest', align_corners: Optional[bool] = None, recompute_scale_factor: Optional[bool] = None, antialias: Optional[bool] = None, data_layout: Optional[str] = 'NCHW', name: str = 'interpolate')

Resize a tensor using the specified mode.

Parameters
  • x (Tensor) – Input tensor to be resized.

  • size (Optional[Union[int, Tuple[int]]]) – Requested output size, only one of size and scale_factor may be specified.

  • scale_factor (Optional[Union[float, Tuple[float]]]) – Multiplier for spatial size.

  • mode (str) – Algorithm used for sampling.

  • align_corners (Optional[bool]) – How to map pixels before and after sampling.

  • recompute_scale_factor (Optional[bool]) – Recompute the scale_factor for use in interpolation.

  • antialias (Optional[bool]) – Apply antialiasing to output.

  • data_layout (Optional[str]) – Layout of the input and output data.

  • name (str) – Name hint for this operation.

Returns

result – Output tensor with requested shape.

Return type

Tensor

tvm.relax.frontend.nn.layer_norm(x: tvm.relax.frontend.nn.core.Tensor, normalized_shape: Union[int, List[int]], weight: Optional[tvm.relax.frontend.nn.core.Tensor] = None, bias: Optional[tvm.relax.frontend.nn.core.Tensor] = None, eps: float = 1e-05, name: str = 'layer_norm') tvm.relax.frontend.nn.core.Tensor

Layer normalization (Lei Ba et al., 2016). Applies layer normalization to the n-dimensional input array. This operator takes an n-dimensional input array and normalizes the input using the given axis:

\[out = \frac{data - mean(data, axis)}{\sqrt{var(data, axis)+\epsilon}} * gamma + beta\]

Unlike batch normalization, the mean and var are computed along the channel dimension.

Assume the input has size k on axis 1, then both gamma and beta have shape (k,).

Note

This operator can be optimized away for inference.

Parameters
  • x (Tensor) – Input to which layer_norm will be applied.

  • normalized_shape (Union[int, List[int]]) – The shape of axes to normalize. If a single integer is used, it is treated as a singleton list and this module will normalize over the last dimension.

  • weight (Tensor) – The gamma scale factor.

  • bias (Tensor) – The beta offset factor.

  • eps (float) – Small float added to variance to avoid dividing by zero.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor
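
The formula can be illustrated on a single vector, which corresponds to normalizing over the last dimension. This is a plain Python sketch of the arithmetic, not the TVM implementation; the name layer_norm_1d is hypothetical, and the variance is assumed to be the population variance (divide by n).

```python
import math


def layer_norm_1d(data, gamma=None, beta=None, eps=1e-5):
    """Normalize one vector: (data - mean) / sqrt(var + eps) * gamma + beta."""
    n = len(data)
    mean = sum(data) / n
    var = sum((v - mean) ** 2 for v in data) / n  # population variance (assumed)
    inv_std = 1.0 / math.sqrt(var + eps)
    out = [(v - mean) * inv_std for v in data]
    if gamma is not None:
        out = [o * g for o, g in zip(out, gamma)]
    if beta is not None:
        out = [o + b for o, b in zip(out, beta)]
    return out


print(layer_norm_1d([1.0, 2.0, 3.0]))
print(layer_norm_1d([1.0, 2.0, 3.0], gamma=[2.0, 2.0, 2.0], beta=[1.0, 1.0, 1.0]))
```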

tvm.relax.frontend.nn.less(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'less') tvm.relax.frontend.nn.core.Tensor

Broadcasted element-wise comparison for (lhs < rhs).

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.less_equal(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'less_equal') tvm.relax.frontend.nn.core.Tensor

Broadcasted element-wise comparison for (lhs <= rhs).

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.matmul(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, out_dtype: Optional[str] = None, name: str = 'matmul') tvm.relax.frontend.nn.core.Tensor

General matrix multiplication of two tensors, with broadcasting on batched dimensions.

The semantics and output shape deduction rule are specified at https://data-apis.org/array-api/latest/API_specification/generated/array_api.matmul.html.

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • out_dtype (Optional[Union[str, DataType]]) – The data type of the matmul result. When it is not specified, the output dtype will be the same as input dtype.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = matmul(a, b)
tvm.relax.frontend.nn.maximum(x1: tvm.relax.frontend.nn.core.Tensor, x2: tvm.relax.frontend.nn.core.Tensor, name: str = 'maximum')

Element-wise maximum

Parameters
  • x1 (Tensor) – The first input tensor.

  • x2 (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = maximum(a, b)
tvm.relax.frontend.nn.minimum(x1: tvm.relax.frontend.nn.core.Tensor, x2: tvm.relax.frontend.nn.core.Tensor, name: str = 'minimum')

Element-wise minimum

Parameters
  • x1 (Tensor) – The first input tensor.

  • x2 (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = minimum(a, b)
tvm.relax.frontend.nn.multinomial_from_uniform(prob: tvm.relax.frontend.nn.core.Tensor, uniform_sample: tvm.relax.frontend.nn.core.Tensor, sample_indices: Optional[tvm.relax.frontend.nn.core.Tensor] = None, dtype: str = 'int64', name: str = 'multinomial_from_uniform')

Returns a tensor where each row contains the index sampled from the multinomial probability distribution located in the corresponding row of tensor prob.

Notes

For better cpu performance, use ‘vm.builtin.multinomial_from_uniform’. For accurate results, ensure probabilities are between 0 and 1 and sum to 1.

Parameters
  • prob (Tensor) – A 2-D tensor of shape (batch, vocab_size) representing probability distributions. Each row is a distribution across vocabulary for a batch, where: Values range from [0, 1], indicating the probability of each vocabulary item. The sum of values in each row is 1, forming a valid distribution.

  • uniform_sample (Tensor) – The uniformly sampled 2-D tensor with the shape (n, 1). Values range from 0 to 1, indicating probabilities sampled uniformly.

  • sample_indices (Optional[Tensor]) – The 2-D tensor with the shape [n, 1], which indicates the specific probability distribution to sample from. The value of sample_indices[i] determines that the ith token should be sampled from the sample_indices[i]th probability distribution. For instance, if there are 3 distinct probability distributions and the requirement is to sample 2, 3, and 4 tokens from each, then sample_indices would be [0, 0, 1, 1, 1, 2, 2, 2, 2].

  • dtype (str) – The data type of output tensor.

Returns

result – The computed tensor with shape (n, 1).

Return type

Tensor

Examples

prob = [[0.2, 0.3, 0.5], [0.3, 0.4, 0.3]]
usample = [[0.4], [0.9]]
sample_indices = [[0], [1]]

multinomial_from_uniform(prob, usample)
-> [[1], [2]]
multinomial_from_uniform(prob, usample, sample_indices)
-> [[1], [2]]
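
The example can be reproduced with inverse-CDF sampling in plain Python. This is a sketch under the assumption that each uniform value is compared against the running cumulative probability of the selected row; the tie-breaking rule (strict less-than) and the helper name multinomial_from_uniform_ref are assumptions, not TVM code.

```python
def multinomial_from_uniform_ref(prob, uniform_sample, sample_indices=None):
    """Pick, for each uniform value, the first index whose cumulative
    probability exceeds it (inverse-CDF sampling)."""
    if sample_indices is None:
        # Default: the i-th sample draws from the i-th distribution.
        sample_indices = [[i] for i in range(len(uniform_sample))]
    result = []
    for (u,), (row,) in zip(uniform_sample, sample_indices):
        cumulative = 0.0
        chosen = len(prob[row]) - 1  # fall back to the last index on rounding error
        for index, p in enumerate(prob[row]):
            cumulative += p
            if u < cumulative:
                chosen = index
                break
        result.append([chosen])
    return result


prob = [[0.2, 0.3, 0.5], [0.3, 0.4, 0.3]]
usample = [[0.4], [0.9]]
print(multinomial_from_uniform_ref(prob, usample))                # [[1], [2]]
print(multinomial_from_uniform_ref(prob, usample, [[0], [1]]))    # [[1], [2]]
```
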
tvm.relax.frontend.nn.multiply(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'mul') tvm.relax.frontend.nn.core.Tensor

Multiplication with numpy-style broadcasting.

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = multiply(a, b)
tvm.relax.frontend.nn.negative(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'neg') tvm.relax.frontend.nn.core.Tensor

Numerical negative of the input tensor.

Parameters
  • x (Tensor) – The input data to the operator.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.not_equal(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'not_equal') tvm.relax.frontend.nn.core.Tensor

Broadcasted element-wise comparison for (lhs != rhs).

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.ones(shape: Sequence[Union[int, tvm.ir.expr.PrimExpr]], dtype: str = 'float32', name: str = 'ones') tvm.relax.frontend.nn.core.Tensor

Construct a tensor of all ones, with the input shape and dtype.

Parameters
  • shape (Sequence[IntExpr]) – The shape of the created tensor.

  • dtype (str) – The data type of the created tensor.

  • name (str) – Name hint.

Returns

result – The result tensor.

Return type

Tensor

tvm.relax.frontend.nn.pad(x: tvm.relax.frontend.nn.core.Tensor, pad: List[int], mode: str = 'constant', value: int = 0, name: str = 'pad') tvm.relax.frontend.nn.core.Tensor

Apply spatial padding to the input tensor.

Parameters
  • x (Tensor) – Input tensor to be padded.

  • pad (List[int]) – List in the format of [before_0, after_0, before_1, after_1, …] indicating how much to pad each axis of x.

  • mode (str) – Padding mode to use, constant implies padded elements will use value argument.

  • value (int) – What to pad with in constant mode.

  • name (str) – Name hint for this operator.

Returns

result – Padded output tensor.

Return type

Tensor

tvm.relax.frontend.nn.permute(x: tvm.relax.frontend.nn.core.Tensor, axes: Optional[List[int]], name: str = 'permute') tvm.relax.frontend.nn.core.Tensor

Permutes the dimensions of the input tensor.

Parameters
  • x (Tensor) – The input data to the operator.

  • axes (Optional[List[int]]) – The target axes order.

  • name (str) – Name hint.

Returns

result – The transposed result.

Return type

Tensor

tvm.relax.frontend.nn.permute_dims(x: tvm.relax.frontend.nn.core.Tensor, axes: Optional[List[int]] = None, name: Optional[str] = None) tvm.relax.frontend.nn.core.Tensor

Permutes the dimensions of an array.

Parameters
  • x (Tensor) – The input data to the operator.

  • axes (Optional[List[int]]) – The target axes order, reverse order if not specified.

  • name (str) – Name hint.

Returns

result – The transposed result.

Return type

Tensor

tvm.relax.frontend.nn.print_(tensor: tvm.relax.frontend.nn.core.Tensor)

Debug printing a Tensor during runtime.

tvm.relax.frontend.nn.relu(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'relu') tvm.relax.frontend.nn.core.Tensor

Rectified Linear Unit (ReLU) activation function.

\[\text{ReLU}(x) = \text{max}(x, 0)\]
Parameters
  • x (Tensor) – The input data.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.renormalize_top_p_top_k_prob(prob, sorted_prob, top_p, top_k)

Renormalizes probabilities after filtering with top_p and top_k, ensuring they sum up to 1.

Notes

For accurate results, ensure probabilities are between 0 and 1 and sum to 1.

Parameters
  • prob (Tensor) – A 2-D tensor of shape (batch, vocab_size) representing probability distributions.

  • sorted_prob (Tensor) – Probabilities sorted in descending order.

  • top_p (Tensor) – The cumulative probability threshold with shape (batch, 1) for nucleus sampling.

  • top_k (Tensor) – A tensor with shape (batch, 1), representing the number of top probabilities to consider for top-k sampling.

Returns

result – The filtered and normalized tensor with the same shape as input prob.

Return type

Tensor

tvm.relax.frontend.nn.repeat(x: tvm.relax.frontend.nn.core.Tensor, repeats: int, axis: Optional[int] = None, name='repeat') tvm.relax.frontend.nn.core.Tensor

Repeats elements of an array.

Parameters
  • x (Tensor) – The input tensor.

  • repeats (int) – The number of repetitions.

  • axis (Optional[int]) – The axis along which to repeat values. Negative numbers are interpreted as counting from the last axis backward. By default, use the flattened input array, and return a flat output array.

  • name (str) – Name hint.

Returns

ret – The computed result.

Return type

Tensor

Examples

np_x = numpy.array([[1, 2], [3, 4]])
x = Tensor.from_const(np_x)
lv1 = repeat(x, repeats=2) # lv1 == [1, 1, 2, 2, 3, 3, 4, 4]
lv2 = repeat(x, repeats=2, axis=1)   # lv2 == [[1, 1, 2, 2],
                                     #         [3, 3, 4, 4]]
tvm.relax.frontend.nn.reshape(x: tvm.relax.frontend.nn.core.Tensor, shape: Sequence[Union[int, tvm.ir.expr.PrimExpr]], name='reshape') tvm.relax.frontend.nn.core.Tensor

Reshape the input array.

-1 infers the dimension of the output shape by using the remainder of the input dimensions keeping the size of the new array same as that of the input array. At most one dimension of shape can be -1.

x.shape = (2, 3, 4), shape = (6, 1, -1), result.shape = (6, 1, 4)
x.shape = (2, 3, 4), shape = (3, -1, 8), result.shape = (3, 1, 8)
x.shape = (2, 3, 4), shape = (-1,), result.shape = (24,)
Parameters
  • x (Tensor) – The input data to the operator.

  • shape (Sequence[IntExpr]) – The new shape. Should be compatible with the original shape.

  • name (str) – Name hint.

Returns

result – The reshaped result.

Return type

Tensor

Note

The -1 inference is only performed at compile time. That is, if the dimension length for -1 cannot be inferred at compile time, an error will be thrown.
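
The -1 rule can be sketched in plain Python for fully static shapes. This is an illustration of the arithmetic only; the helper name infer_reshape is hypothetical, and symbolic (PrimExpr) dimensions are not handled.

```python
def infer_reshape(input_shape, new_shape):
    """Resolve at most one -1 in new_shape so the element count is preserved."""
    total = 1
    for dim in input_shape:
        total *= dim
    if list(new_shape).count(-1) > 1:
        raise ValueError("at most one dimension of shape can be -1")
    known = 1
    for dim in new_shape:
        if dim != -1:
            known *= dim
    if -1 in new_shape:
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        return tuple(total // known if dim == -1 else dim for dim in new_shape)
    if known != total:
        raise ValueError("new shape is not compatible with the input shape")
    return tuple(new_shape)


print(infer_reshape((2, 3, 4), (6, 1, -1)))  # (6, 1, 4)
print(infer_reshape((2, 3, 4), (3, -1, 8)))  # (3, 1, 8)
print(infer_reshape((2, 3, 4), (-1,)))       # (24,)
```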

tvm.relax.frontend.nn.rms_norm(x: tvm.relax.frontend.nn.core.Tensor, weight: tvm.relax.frontend.nn.core.Tensor, axes: Union[int, List[int]], epsilon: float = 1e-05, name: str = 'rms_norm') tvm.relax.frontend.nn.core.Tensor

Root mean square normalization (Biao Zhang et al., 2019). Applies root mean square normalization to the n-dimensional input array. This operator takes an n-dimensional input array and normalizes the input using the given axis:

\[out = \frac{data}{\sqrt{mean(data^2, axis)+\epsilon}} * weight\]
Parameters
  • x (Tensor) – Input to which rms_norm will be applied.

  • weight (Tensor) – The scale factor.

  • axes (Union[int, List[int]]) – The axes along which the normalization is applied.

  • epsilon (float) – Small float added to square mean to avoid dividing by zero.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor
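
A plain Python sketch of the normalization on a single vector. It assumes the mean is taken over the squared elements, as the phrase "square mean" in the epsilon description suggests; the name rms_norm_1d is hypothetical and this is not the TVM implementation.

```python
import math


def rms_norm_1d(data, weight, epsilon=1e-5):
    """Scale each element by 1 / sqrt(mean(data^2) + epsilon), then by weight."""
    mean_square = sum(v * v for v in data) / len(data)
    inv_rms = 1.0 / math.sqrt(mean_square + epsilon)
    return [v * inv_rms * w for v, w in zip(data, weight)]


print(rms_norm_1d([3.0, 4.0], [1.0, 1.0]))
```

Unlike layer normalization, no mean is subtracted, so the result is a pure rescaling of the input.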

tvm.relax.frontend.nn.sample_top_p_top_k_from_sorted_prob(sorted_prob: tvm.relax.frontend.nn.core.Tensor, sorted_index: tvm.relax.frontend.nn.core.Tensor, top_p: tvm.relax.frontend.nn.core.Tensor, top_k: tvm.relax.frontend.nn.core.Tensor, uniform_sample: tvm.relax.frontend.nn.core.Tensor, sample_indices: Optional[tvm.relax.frontend.nn.core.Tensor] = None)

Samples indices from a sorted probability tensor based on top_p and top_k criteria.

Notes

For accurate results, ensure probabilities are between 0 and 1 and sum to 1.

Parameters
  • sorted_prob (Tensor) – A 2-D tensor with shape (batch, vocab_size) that contains probabilities sorted in descending order.

  • sorted_index (Tensor) – The indices tensor with shape (batch, vocab_size), corresponding to the sorted_prob. Potentially from applying argsort on the original probability tensor in descending order.

  • top_p (Tensor) – The cumulative probability threshold with shape (batch, 1) for nucleus sampling.

  • top_k (Tensor) – A tensor with shape (batch, 1), representing the number of top probabilities to consider for top-k sampling.

  • uniform_sample (Tensor) – Uniformly sampled values with shape (n, 1) are used to select the output indices.

  • sample_indices (Optional[Tensor]) – The 2-D tensor with the shape [n, 1], which indicates the specific probability distribution to sample from. The value of sample_indices[i] determines that the ith token should be sampled from the sample_indices[i]th probability distribution. For instance, if there are 3 distinct probability distributions and the requirement is to sample 2, 3, and 4 tokens from each, then sample_indices would be [0, 0, 1, 1, 1, 2, 2, 2, 2].

Returns

result – The selected indices with shape (n, 1).

Return type

Tensor

Examples

prob = [[0.1 , 0.4, 0.5],
        [0.3, 0.3, 0.4]]
sorted_prob = [[0.5, 0.4, 0.1],
               [0.4, 0.3, 0.3]]
sorted_index = [[2, 1, 0],
                [2, 0, 1]]
top_p = [[0.6],[0.9]]
top_k = [[3],[2]]
uniform_sample = [[0.5], [0.6]]
sample_indices = [[0], [1]]

sample_top_p_top_k_from_sorted_prob(
    sorted_prob, sorted_index,top_p, top_k, uniform_sample, sample_indices)
-> [2, 0]
tvm.relax.frontend.nn.scaled_dot_product_attention(query: tvm.relax.frontend.nn.core.Tensor, key: tvm.relax.frontend.nn.core.Tensor, value: tvm.relax.frontend.nn.core.Tensor, attn_mask: Optional[tvm.relax.frontend.nn.core.Tensor] = None, is_causal: Optional[bool] = False, scale: Optional[float] = None, name: str = 'scaled_dot_product_attention')

Computes a scaled dot product attention on provided attention query, key, and values. Compliant with the functional torch implementation.

Parameters
  • query (Tensor) – Tensor representing current attention lookup of shape [batch, seq_len, num_heads, head_size].

  • key (Tensor) – Tensor representing cross attention mapping of shape [batch, seq_len_kv, num_heads_kv, head_size].

  • value (Tensor) – Tensor representing embedded attention values of shape [batch, seq_len_kv, num_heads_kv, head_size_value].

  • attn_mask (Optional[Tensor]) – Optional mask for attention, not yet supported.

  • is_causal (Optional[bool]) – If set, uses a causal attention mask.

  • scale (Optional[float]) – Optional extra scaling argument applied to attention.

  • name (str) – Name hint for this function.

tvm.relax.frontend.nn.sigmoid(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'sigmoid') tvm.relax.frontend.nn.core.Tensor

Computes sigmoid.

\[\text{sigmoid}(x) = \frac{1}{1 + \exp(-x)}\]
Parameters
  • x (Tensor) – The input data to the operator.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have float dtype

tvm.relax.frontend.nn.silu(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'silu') tvm.relax.frontend.nn.core.Tensor

Sigmoid Linear Unit function

\[\text{SiLU}(x) = x * \text{sigmoid}(x)\]
Parameters
  • x (Tensor) – The input data.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have float dtype
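
A scalar sketch of sigmoid and SiLU using only the standard library. The branch on the sign of x is a common way to avoid overflow in exp for large negative inputs; it is a choice made for this illustration, not a statement about how TVM computes the result.

```python
import math


def sigmoid(x):
    """1 / (1 + exp(-x)), evaluated without overflowing for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) is at most 1
    return z / (1.0 + z)


def silu(x):
    """SiLU(x) = x * sigmoid(x)."""
    return x * sigmoid(x)


for value in (-2.0, 0.0, 2.0):
    print(value, sigmoid(value), silu(value))
```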

tvm.relax.frontend.nn.softmax(x: tvm.relax.frontend.nn.core.Tensor, axis: int = - 1, name: str = 'softmax') tvm.relax.frontend.nn.core.Tensor

Computes softmax.

\[\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}\]
Parameters
  • x (Tensor) – The input data to the operator.

  • axis (int) – The axis to sum over when computing softmax. If not specified, it is by default the last axis of the input tensor. Supports negative indexing.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have float dtype
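
The formula can be sketched for a single vector (one slice along the chosen axis). Subtracting the maximum before exponentiating is a standard stability trick that leaves the result unchanged; it is an assumption of this illustration, and the name softmax_1d is hypothetical.

```python
import math


def softmax_1d(x):
    """exp(x_i) / sum_j exp(x_j), shifted by max(x) for numerical stability."""
    shift = max(x)
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax_1d([1.0, 2.0, 3.0]))
print(softmax_1d([1000.0, 1000.0]))  # [0.5, 0.5]
```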

tvm.relax.frontend.nn.sort(x: tvm.relax.frontend.nn.core.Tensor, axis: int = - 1, descending: bool = False, name='sort')

Performs sorting along the given axis and returns an array in sorted order.

Parameters
  • x (Tensor) – The input tensor.

  • axis (int) – Axis along which to sort the input tensor. By default the last axis of the input is used.

  • descending (bool) – Whether to sort in descending order, the default is False

  • name (str) – Name hint.

Returns

out – The sorted tensor.

Return type

Tensor

tvm.relax.frontend.nn.split(ary: tvm.relax.frontend.nn.core.Tensor, indices_or_sections: Union[int, Sequence[int]], axis: int = 0, name: str = 'split') Tuple[tvm.relax.frontend.nn.core.Tensor, ...]

Split an array into multiple sub-arrays.

Parameters
  • ary (Tensor) – Input tensor to be split.

  • indices_or_sections (Union[int, Sequence[int]]) – Indices or sections to split into.

  • axis (int) – The axis along which to split, default is 0.

  • name (str) – Name hint.

Returns

result – A tuple of sub-arrays as the outcome of splitting.

Return type

Tuple[Tensor, …]
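
The two forms of indices_or_sections can be illustrated on a flat list. This sketch assumes numpy.split-like semantics: an integer requests that many equal sections, and a sequence gives the split points. It only handles the evenly divisible case for integers, since the text above does not say what happens otherwise; the name split_1d is hypothetical.

```python
def split_1d(values, indices_or_sections):
    """Split a flat list either into N equal sections or at explicit indices."""
    if isinstance(indices_or_sections, int):
        sections = indices_or_sections
        if len(values) % sections != 0:
            # Uneven division is left out of this sketch on purpose.
            raise ValueError("length is not divisible by the number of sections")
        step = len(values) // sections
        bounds = [step * i for i in range(1, sections)]
    else:
        bounds = list(indices_or_sections)
    pieces = []
    start = 0
    for end in bounds + [len(values)]:
        pieces.append(values[start:end])
        start = end
    return tuple(pieces)


data = [0, 1, 2, 3, 4, 5]
print(split_1d(data, 3))       # ([0, 1], [2, 3], [4, 5])
print(split_1d(data, [1, 4]))  # ([0], [1, 2, 3], [4, 5])
```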

tvm.relax.frontend.nn.sqrt(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'sqrt') tvm.relax.frontend.nn.core.Tensor

Computes the element-wise sqrt of the input tensor.

Parameters
  • x (Tensor) – The input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have float dtype

tvm.relax.frontend.nn.square(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'square') tvm.relax.frontend.nn.core.Tensor

Computes the element-wise square of the input tensor.

Parameters
  • x (Tensor) – The input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

tvm.relax.frontend.nn.squeeze(x: tvm.relax.frontend.nn.core.Tensor, axis: int = - 1, name: str = 'squeeze') tvm.relax.frontend.nn.core.Tensor

Squeeze axes in the array.

Parameters
  • x (Tensor) – The input data to the operator.

  • axis (Optional[Union[int, List[int]]]) – The set of axes to remove. If axis = None, remove all axes of size 1. It is an error if any specified axis has a size other than 1.

  • name (str) – Name hint.

Returns

result – The squeezed result.

Return type

Tensor

tvm.relax.frontend.nn.subtract(a: tvm.relax.frontend.nn.core.Tensor, b: tvm.relax.frontend.nn.core.Tensor, name: str = 'subtract') tvm.relax.frontend.nn.core.Tensor

Subtraction with numpy-style broadcasting.

Parameters
  • a (Tensor) – The first input tensor.

  • b (Tensor) – The second input tensor.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Examples

c = subtract(a, b)

tvm.relax.frontend.nn.sum(x: tvm.relax.frontend.nn.core.Tensor, axis: Optional[Union[int, List[int]]] = None, keepdims: bool = False, name: str = 'sum') tvm.relax.frontend.nn.core.Tensor

Computes the sum of tensor elements over given axes.

Parameters
  • x (Tensor) – The input data tensor.

  • axis (Optional[Union[int, List[int]]]) – Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input tensor. Negative indexing is supported.

  • keepdims (bool) – If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input tensor.

  • name (str) – Name hint for this operation.

Returns

result – The computed result.

Return type

Tensor
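The effect of axis and keepdims on the result shape can be sketched in plain Python. This models only the shape rule described above, not the reduction itself; reduced_shape is a hypothetical helper and not a TVM function.

```python
from typing import List, Optional, Sequence, Union


def reduced_shape(
    shape: Sequence[int],
    axis: Optional[Union[int, List[int]]] = None,
    keepdims: bool = False,
) -> List[int]:
    """Shape of a sum over the given axes (hypothetical helper)."""
    ndim = len(shape)
    if axis is None:
        # Default: reduce over every axis.
        axes = set(range(ndim))
    else:
        given = [axis] if isinstance(axis, int) else list(axis)
        # Negative indices count from the last axis.
        axes = {a % ndim for a in given}
    if keepdims:
        # Reduced axes stay in place with size one.
        return [1 if i in axes else dim for i, dim in enumerate(shape)]
    return [dim for i, dim in enumerate(shape) if i not in axes]


print(reduced_shape([2, 3, 4]))                         # scalar result
print(reduced_shape([2, 3, 4], axis=-1))                # last axis removed
print(reduced_shape([2, 3, 4], axis=1, keepdims=True))  # axis kept with size one
```

With keepdims=True the result has the same rank as the input, which is why it broadcasts against the input without reshaping.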

tvm.relax.frontend.nn.take(x: tvm.relax.frontend.nn.core.Tensor, indices: tvm.relax.frontend.nn.core.Tensor, axis: Optional[int] = None, name='take') tvm.relax.frontend.nn.core.Tensor

Take elements from a tensor along an axis. Its semantics are mostly similar to numpy.take (https://numpy.org/doc/stable/reference/generated/numpy.take.html), which covers torch.take (https://pytorch.org/docs/stable/generated/torch.take.html) and onnx.gather (https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Gather-13).

Parameters
  • x (Tensor) – The source tensor.

  • indices (Tensor) – The indices of the values to extract.

  • axis (Optional[int]) – The axis over which to select values. If it is None, the input tensor is required to be one-dimensional.

  • name (str) – Name hint.

Returns

ret – The taken result.

Return type

Tensor
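Selecting along an axis can be illustrated on a nested list. The sketch below is a plain-Python model that assumes numpy.take-like behaviour with one-dimensional indices on a two-dimensional input; take_2d is a hypothetical helper, not part of TVM.

```python
from typing import List


def take_2d(matrix: List[List[int]], indices: List[int], axis: int) -> List[List[int]]:
    """Model take on a 2-D nested list (hypothetical helper)."""
    if axis == 0:
        # Select whole rows, in the order given by indices.
        return [list(matrix[i]) for i in indices]
    if axis == 1:
        # Select columns within every row.
        return [[row[i] for i in indices] for row in matrix]
    raise ValueError("this sketch only supports axis 0 or 1")


m = [[10, 11, 12], [20, 21, 22]]
print(take_2d(m, [1, 0], axis=0))  # rows in reversed order
print(take_2d(m, [2, 0], axis=1))  # columns 2 and 0 of each row
```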

tvm.relax.frontend.nn.tanh(x: tvm.relax.frontend.nn.core.Tensor, name: str = 'tanh') tvm.relax.frontend.nn.core.Tensor

Applies the hyperbolic tangent function.

\[\text{Tanh}(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}\]

Parameters
  • x (Tensor) – The input data to the operator.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Tensor

Note

The input tensor is required to have a float dtype.
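As a quick check of the definition, the formula can be evaluated for a scalar with the Python standard library and compared with math.tanh. This is an illustration of the formula only; tanh_from_definition is a hypothetical name and the code does not call TVM.

```python
import math


def tanh_from_definition(x: float) -> float:
    """Evaluate (e^x - e^-x) / (e^x + e^-x) directly for a scalar."""
    pos, neg = math.exp(x), math.exp(-x)
    return (pos - neg) / (pos + neg)


for value in (-1.0, 0.0, 0.5):
    # The direct formula agrees with the library function up to rounding.
    print(value, tanh_from_definition(value), math.tanh(value))
```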

tvm.relax.frontend.nn.tensor_expr_op(tensor_expr_func: Callable, name_hint: str, args: List[Union[tvm.relax.frontend.nn.core.Tensor, tvm.tir.expr.Var, int]], *, attrs: Optional[Dict[str, Any]] = None)

Build the given tensor_expr_func with te.

Parameters
  • tensor_expr_func (Callable) – A function that returns a te tensor or a list of tensors.

  • name_hint (str) – Name hint.

  • args (List[Union[Tensor, _tir.Var, int]]) – Arguments passed to the function.

  • attrs (Optional[Dict[str, Any]]) – A dict of attributes to apply to the function.

Returns

result – The result tensor.

Return type

Tensor

tvm.relax.frontend.nn.tensor_ir_inplace_op(func: tvm.tir.function.PrimFunc, name_hint: str, args: Union[tvm.relax.frontend.nn.core.Tensor, Sequence[Union[tvm.relax.frontend.nn.core.Tensor, tvm.relax.expr.ShapeExpr, tvm.ir.expr.PrimExpr]]], inplace_indices: Union[int, List[int]], out: tvm.relax.frontend.nn.op.OutType) tvm.relax.frontend.nn.op.OutType

Create a call_tir_inplace binding with the given PrimFunc.

Parameters
  • func (_tir.PrimFunc) – The PrimFunc to call.

  • name_hint (str) – Name hint.

  • args (Union[Tensor, Sequence[Union[Tensor, rx.ShapeExpr, _tir.PrimExpr]]]) – The arguments to pass to the PrimFunc.

  • inplace_indices (Union[int, List[int]]) – Specify which arguments should be used for in-place computations. If inplace_indices is a single integer, it will be made into a singleton list. Suppose inplace_indices[i] = j, where j >= 0. Then the i-th output will be an alias of args[j]. If inplace_indices[i] = -1, then the i-th output will be a freshly allocated tensor. At least one member of inplace_indices must not be -1.

  • out (Union[Tensor, List[Tensor]]) – The output tensors.

Returns

result – The result tensor.

Return type

Tensor
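The aliasing rule for inplace_indices can be modelled with ordinary Python objects. This sketch only mirrors the rule as stated above: resolve_outputs is a hypothetical helper, the fresh outputs are stand-in objects, and nothing here calls TVM.

```python
from typing import List, Union


def resolve_outputs(args: List[object], inplace_indices: Union[int, List[int]]) -> List[object]:
    """Model which outputs alias which arguments (hypothetical helper)."""
    if isinstance(inplace_indices, int):
        # A single integer is treated as a singleton list.
        inplace_indices = [inplace_indices]
    if all(j == -1 for j in inplace_indices):
        raise ValueError("at least one member of inplace_indices must not be -1")
    outputs = []
    for j in inplace_indices:
        # -1 stands for a freshly allocated output; otherwise the output is args[j].
        outputs.append(object() if j == -1 else args[j])
    return outputs


a, b = [1.0, 2.0], [3.0, 4.0]
outs = resolve_outputs([a, b], [1, -1])
print(outs[0] is b)  # the first output aliases args[1]
print(outs[1] is a or outs[1] is b)  # the second output is a new object
```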

tvm.relax.frontend.nn.tensor_ir_op(func: tvm.tir.function.PrimFunc, name_hint: str, args: Union[tvm.relax.frontend.nn.core.Tensor, Sequence[Union[tvm.relax.frontend.nn.core.Tensor, tvm.relax.expr.ShapeExpr, tvm.ir.expr.PrimExpr]]], out: tvm.relax.frontend.nn.op.OutType) tvm.relax.frontend.nn.op.OutType

Create a call_tir binding with the given PrimFunc.

Parameters
  • func (_tir.PrimFunc) – The PrimFunc to call.

  • name_hint (str) – Name hint.

  • args (Union[Tensor, Sequence[Union[Tensor, rx.ShapeExpr, _tir.PrimExpr]]]) – The arguments to pass to the PrimFunc.

  • out (Union[Tensor, List[Tensor]]) – The output tensors.

Returns

result – The result tensor.

Return type

Tensor

tvm.relax.frontend.nn.topk(data: tvm.relax.frontend.nn.core.Tensor, k: int = 1, axis: int = - 1, ret_type: str = 'both', largest: bool = True, dtype: str = 'int32', name: str = 'topk')

Get the top k elements in an input tensor along the given axis.

ret_type specifies the return type and can be one of (“both”, “values”, “indices”).

Parameters
  • data (Tensor) – The input data tensor.

  • k (int) – Number of top elements to select. Return all elements if k < 1.

  • axis (int) – Axis along which to sort the input tensor.

  • ret_type (str) – The return type [both, values, indices]. “both”: return both top k data and indices. “values”: return top k data only. “indices”: return top k indices only.

  • largest (bool) – Whether to return largest or smallest elements. The k smallest elements are returned if largest is False.

  • dtype (str) – The data type of the indices output.

  • name (str) – Name hint.

Returns

out – The computed result.

Return type

Tensor or Tuple[Tensor, Tensor]
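The interaction of k, ret_type and largest can be sketched for a one-dimensional list. This is a plain-Python model of the parameter descriptions above, not TVM code; topk_1d is a hypothetical helper, and the order of tied elements is an assumption (ties keep their input order here).

```python
from typing import List


def topk_1d(values: List[float], k: int = 1, ret_type: str = "both", largest: bool = True):
    """Model topk on a 1-D list (hypothetical helper)."""
    # Rank positions by value; Python's sort is stable, so ties keep input order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=largest)
    if k >= 1:
        # k < 1 means "return all elements", so only truncate for k >= 1.
        order = order[:k]
    top_values = [values[i] for i in order]
    if ret_type == "both":
        return top_values, order
    if ret_type == "values":
        return top_values
    if ret_type == "indices":
        return order
    raise ValueError(f"unknown ret_type: {ret_type}")


scores = [3, 1, 4, 2]
print(topk_1d(scores, k=2))                                     # values and indices
print(topk_1d(scores, k=2, ret_type="indices", largest=False))  # two smallest, indices only
```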

tvm.relax.frontend.nn.triu(x: tvm.relax.frontend.nn.core.Tensor, diagonal: int = 0, name: str = 'triu') tvm.relax.frontend.nn.core.Tensor

Return the upper triangular part of a matrix or a batch of matrices.

Parameters
  • x (Tensor) – The tensor that triu will be applied to. It is required to have at least two dimensions.

  • diagonal (int) – The index indicating the diagonal below which to zero elements. If diagonal = 0, the diagonal is the main diagonal. If diagonal < 0, the diagonal is below the main diagonal. If diagonal > 0, the diagonal is above the main diagonal.

  • name (str) – Name hint.

Returns

ret – The result tensor.

Return type

Tensor
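The role of the diagonal offset can be seen on a small nested list. The sketch below is a plain-Python model that keeps an element at (row, col) when col - row is at least the offset and zeroes it otherwise; triu_2d is a hypothetical helper, not part of TVM.

```python
from typing import List


def triu_2d(matrix: List[List[int]], diagonal: int = 0) -> List[List[int]]:
    """Model triu on a 2-D nested list (hypothetical helper)."""
    return [
        # Keep entries on or above the chosen diagonal, zero the rest.
        [value if col - row >= diagonal else 0 for col, value in enumerate(values)]
        for row, values in enumerate(matrix)
    ]


m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(triu_2d(m))      # main diagonal and above
print(triu_2d(m, 1))   # strictly above the main diagonal
print(triu_2d(m, -1))  # also keeps the first sub-diagonal
```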

tvm.relax.frontend.nn.unsqueeze(x: tvm.relax.frontend.nn.core.Tensor, dim: int, name: str = 'unsqueeze') tvm.relax.frontend.nn.core.Tensor

Add a new axis to a tensor.

Parameters
  • x (Tensor) – Input tensor to expand.

  • dim (int) – Dimension to expand.

  • name (str) – Name hint for this operator.

Returns

result – Expanded result.

Return type

Tensor
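The effect of dim on the result shape can be sketched in plain Python. This is a shape-only model; unsqueezed_shape is a hypothetical helper, and the treatment of negative dim (counting from the end of the expanded shape, as in PyTorch's unsqueeze) is an assumption rather than something stated above.

```python
from typing import List, Sequence


def unsqueezed_shape(shape: Sequence[int], dim: int) -> List[int]:
    """Shape after inserting a size-1 axis at position dim (hypothetical helper)."""
    ndim = len(shape)
    if not -ndim - 1 <= dim <= ndim:
        raise ValueError(f"dim {dim} is out of range for a {ndim}-D shape")
    if dim < 0:
        # Assumed convention: negative dim counts from the end of the new shape.
        dim += ndim + 1
    return list(shape[:dim]) + [1] + list(shape[dim:])


print(unsqueezed_shape([2, 3], 0))   # new leading axis
print(unsqueezed_shape([2, 3], -1))  # new trailing axis
```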

tvm.relax.frontend.nn.where(condition: tvm.relax.frontend.nn.core.Tensor, x1: tvm.relax.frontend.nn.core.Tensor, x2: tvm.relax.frontend.nn.core.Tensor, name: str = 'where') tvm.relax.frontend.nn.core.Tensor

Select elements from either of the input tensors, depending on the value of the condition.

For a given position, return the corresponding value in x1 if condition is True, and return the corresponding value in x2 otherwise.

Parameters
  • condition (Tensor) – When True, yield x1; otherwise, yield x2. Must be broadcasting compatible with x1 and x2. Must have boolean dtype.

  • x1 (Tensor) – The first input tensor. Must be broadcasting compatible with condition and x2.

  • x2 (Tensor) – The second input tensor. Must be broadcasting compatible with condition and x1.

  • name (str) – Name hint.

Returns

result – The result tensor.

Return type

Tensor
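Element-wise selection can be illustrated on flat lists of equal length. This plain-Python sketch leaves out broadcasting entirely; where_1d is a hypothetical helper and not a TVM function.

```python
from typing import List


def where_1d(condition: List[bool], x1: List[int], x2: List[int]) -> List[int]:
    """Pick from x1 where condition is True, otherwise from x2 (hypothetical helper)."""
    if not len(condition) == len(x1) == len(x2):
        raise ValueError("this sketch requires inputs of equal length")
    return [a if c else b for c, a, b in zip(condition, x1, x2)]


print(where_1d([True, False, True], [1, 2, 3], [10, 20, 30]))
```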

tvm.relax.frontend.nn.wrap_nested(expr: tvm.ir.expr.RelayExpr, name: str) Union[tvm.relax.frontend.nn.core.Tensor, Sequence[tvm.relax.frontend.nn.core.Tensor]]

Wrap the given relax.Expr, emit it using the current BlockBuilder, and automatically handle nested cases if the expr represents a Tuple.

Parameters
  • expr (relax.Expr) – The Expr to be wrapped.

  • name (str) – Name hint.

Returns

result – The computed result.

Return type

Union[Tensor, Tuple[Tensor]]

tvm.relax.frontend.nn.zeros(shape: Sequence[Union[int, tvm.ir.expr.PrimExpr]], dtype: str = 'float32', name: str = 'zeros') tvm.relax.frontend.nn.core.Tensor

Construct a tensor of all zeros, with the input shape and dtype.

Parameters
  • shape (Sequence[IntExpr]) – The shape of the created tensor.

  • dtype (str) – The data type of the created tensor.

  • name (str) – Name hint.

Returns

result – The result tensor.

Return type

Tensor

tvm.relax.frontend.onnx

Tools for converting ONNX graphs into Relax graphs.

tvm.relax.frontend.onnx.from_onnx(model: onnx.onnx_ml_pb2.GraphProto, shape_dict: Optional[Dict[str, List]] = None, dtype_dict: Optional[Union[str, Dict[str, str]]] = 'float32', opset: Optional[int] = None, keep_params_in_input: bool = False, sanitize_input_names: bool = True) Tuple[tvm.ir.module.IRModule, Dict]

Convert an ONNX model into an equivalent Relax Function. ONNX graphs are represented as Python Protobuf objects.

The current implementation assumes that the input model was produced by a version of ONNX after v1.1.0.

Parameters
  • model (protobuf object) – ONNX ModelProto after ONNX v1.1.0

  • shape_dict (dict of str to tuple, optional) – The input shape to the graph

  • dtype_dict (str or dict of str to str, optional) – The input types to the graph

  • opset (int, optional) – Override for the autodetected opset. This can be helpful for some testing.

  • keep_params_in_input (bool) – If True, parameters will be treated as input variables. If False, parameters are treated as constants and folded directly into the graph.

  • sanitize_input_names (bool, optional) – Whether to sanitize the input names to ensure they are valid Relax identifiers.

Returns

  • mod (tvm.IRModule) – The relax module for compilation

  • params (dict of str to tvm.nd.NDArray) – The parameter dict to be used by relax

tvm.relax.frontend.stablehlo

StableHLO Frontends for constructing Relax programs, with the model importers

tvm.relax.frontend.stablehlo.from_stablehlo(stablehlo_module, input_info: Optional[List[Tuple[Tuple[int], str]]] = None) tvm.ir.module.IRModule

Convert a StableHLO Module to a Relax program

Parameters
  • stablehlo_module (Union[str, mlir.ir.Module]) – The StableHLO Module to convert.

  • input_info (List[Tuple[Tuple[int], str]]) – A list of shapes and data types of input tensors.

Returns

output – The result IRModule with entry function “main”

Return type

tvm.IRModule

tvm.relax.frontend.torch

PyTorch Frontends for constructing Relax programs, with the model importers

tvm.relax.frontend.torch.from_fx(model, input_info: List[Tuple[Tuple[int], str]], *, keep_params_as_input: bool = False, unwrap_unit_return_tuple: bool = False, no_bind_return_tuple: bool = False, custom_convert_map: Optional[dict] = None) tvm.ir.module.IRModule

Convert a PyTorch FX GraphModule to a Relax program

Parameters
  • model (fx.GraphModule) – The PyTorch FX GraphModule to convert.

  • input_info (List[Tuple[Tuple[int], str]]) – A list of shapes and data types of input tensors.

  • keep_params_as_input (bool) – Whether to keep model parameters as input variables.

  • unwrap_unit_return_tuple (bool) – A boolean flag indicating whether to unwrap the return value when it is a unit tuple. When the return value is not a unit tuple, no unwrapping takes place.

  • no_bind_return_tuple (bool) – A boolean flag indicating whether to bind the return tuple as a relax var. If the flag is true and the return value is a tuple, it will not bind it to a var.

  • custom_convert_map (Dictionary of str to Relax op) – A custom op conversion map in the same format as TorchFXImporter.convert_map

Returns

output – The import result IRModule, with the function “main” containing the translated logic. If keep_params_as_input is true, the “main” function has an attribute “params” that contains the weights of the input model. The weights can be detached by relax.frontend.detach_params.

Return type

tvm.IRModule

Examples

Users can use the FX tracer or dynamo.export() to extract a fx.GraphModule from a PyTorch model. The following code shows how to convert a PyTorch model to a Relax program.

# Import the importer and the libraries used below.
import numpy as np
import torch
import tvm
from torch import fx
from torch import _dynamo as dynamo
from tvm.relax.frontend.torch import from_fx

# Define the module.
class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(in_features=10, out_features=7, bias=True)

    def forward(self, input):
        return self.linear(input)

# Instantiate the model and create the input info list.
torch_model = MyModule()
input_info = [((128, 10), "float32")]
input_tensors = [
    torch.as_tensor(np.random.randn(*shape).astype(dtype))
    for shape, dtype in input_info
]

# Use FX tracer to trace the PyTorch model.
graph_module = fx.symbolic_trace(torch_model)

# Alternatively, use dynamo.export() to export the PyTorch model to FX.
# Note: the exact call form and return value depend on the PyTorch version.
try:
    graph_module = dynamo.export(torch_model, *input_tensors)
except Exception as err:
    raise RuntimeError("Failed to export the PyTorch model to FX.") from err

# Use the importer to import the PyTorch model to Relax.
mod: tvm.IRModule = from_fx(graph_module, input_info)

# Print out the imported model.
print(mod.script())

Notes

For a given PyTorch model, to look up the names of the model inputs in FX, one can use

fx.symbolic_trace(model).graph.print_tabular()

to print out the tabular representation of the PyTorch module, and then check the placeholder rows at the beginning of the table.

tvm.relax.frontend.torch.relax_dynamo(pipeline: Optional[tvm.ir.transform.Pass] = None)

A helper function to create a relax backend.

Parameters

pipeline (Optional[tvm.transform.Pass]) – The pipeline to be applied to the relax module before it is sent to build.

Returns

backend – The relax dynamo backend.

Return type

Callable[[torch.fx.GraphModule, List[torch.Tensor]], Callable]

tvm.relax.frontend.torch.dynamo_capture_subgraphs(model, *params, **kwargs) tvm.ir.module.IRModule

Capture subgraphs of the PyTorch model using torch.compile into an IRModule.

Parameters
  • model (torch.nn.Module) – The PyTorch model to be captured.

  • params (List[torch.Tensor]) – The parameters of the PyTorch model.

  • keep_params_as_input (bool) – Whether to keep model parameters as input variables of the captured Relax functions.

Returns

output – The output of translation, including the translated IRModule. If keep_params_as_input is true, the functions in the IRModule have an attribute “params” that contains the weights of the input model. The weights can be detached by relax.frontend.detach_params.

Return type

tvm.IRModule