tvm.te¶

Namespace for Tensor Expression Language

Functions:

`any`(*args[, span])	Create a new expression of the union of all conditions in the arguments
`all`(*args[, span])	Create a new expression of the intersection of all conditions in the arguments
`min_value`(dtype[, span])	minimum value of dtype
`max_value`(dtype[, span])	maximum value of dtype
`trace`(args[, trace_action])	Trace tensor data at the runtime.
`exp`(x)	Take exponential of input x.
`erf`(x)	Take gauss error function of the input x.
`tanh`(x)	Take hyperbolic tanh of input x.
`sigmoid`(x)	Quick function to get sigmoid
`log`(x)	Take log of input x.
`tan`(x)	Take tan of input x.
`cos`(x)	Take cos of input x.
`sin`(x)	Take sin of input x.
`sqrt`(x)	Take square root of input x.
`rsqrt`(x)	Take reciprocal of square root of input x.
`floor`(x[, span])	Take floor of float input x.
`ceil`(x[, span])	Take ceil of float input x.
`sinh`(x)	Take sinh of input x.
`cosh`(x)	Take cosh of input x.
`log2`(x)	Take log2 of input x.
`log10`(x)	Take log10 of input x.
`asin`(x)	Take asin of input x.
`asinh`(x)	Take asinh of input x.
`acos`(x)	Take acos of input x.
`acosh`(x)	Take acosh of input x.
`atan`(x)	Take atan of input x.
`atanh`(x)	Take atanh of input x.
`trunc`(x[, span])	Get truncated value of the input.
`abs`(x[, span])	Get absolute value of the input element-wise.
`round`(x[, span])	Round elements of the array to the nearest integer.
`nearbyint`(x[, span])	Round elements of the array to the nearest integer.
`power`(x, y[, span])	x power y
`popcount`(x)	Count the number of set bits in input x.
`fmod`(x, y)	Return the remainder of x divided by y with the same sign as x.
`if_then_else`(cond, t, f[, span])	Conditional selection expression.
`isnan`(x[, span])	Check if input value is Nan.
`isfinite`(x[, span])	Check if input value is finite.
`isinf`(x[, span])	Check if input value is infinite.
`div`(a, b[, span])	Compute a / b as in C/C++ semantics.
`indexdiv`(a, b[, span])	Compute floor(a / b) where a and b are non-negative.
`indexmod`(a, b[, span])	Compute the remainder of indexdiv.
`truncdiv`(a, b[, span])	Compute the truncdiv of two expressions.
`truncmod`(a, b[, span])	Compute the truncmod of two expressions.
`floordiv`(a, b[, span])	Compute the floordiv of two expressions.
`floormod`(a, b[, span])	Compute the floormod of two expressions.
`comm_reducer`(fcombine, fidentity[, name])	Create a commutative reducer for reduction.
`min`(expr, axis[, where, init])	Create a min expression over axis.
`max`(expr, axis[, where, init])	Create a max expression over axis.
`sum`(expr, axis[, where, init])	Create a sum expression over axis.
`add`(lhs, rhs[, span])	Generic add operator.
`subtract`(lhs, rhs[, span])	Generic subtract operator.
`multiply`(lhs, rhs[, span])	Generic multiply operator.
`create_schedule`(ops)	Create a schedule for list of ops
`decl_tensor_intrin`(op, fcompute[, name, ...])	Declare a tensor intrinsic function.
`tag_scope`(tag)	The operator tag scope.
`placeholder`(shape[, dtype, name])	Construct an empty tensor object.
`compute`(shape, fcompute[, name, tag, attrs, ...])	Construct a new tensor by computing over the shape domain.
`scan`(init, update, state_placeholder[, ...])	Construct new tensors by scanning over axis.
`extern`(shape, inputs, fcompute[, name, ...])	Compute several tensors via an extern function.
`var`([name, dtype, span])	Create a new variable with specified name and dtype
`size_var`([name, dtype, span])	Create a new variable represents a tensor shape size, which is non-negative.
`const`(value[, dtype, span])	Create a new constant with specified value and dtype
`thread_axis`([dom, tag, name, span])	Create a new IterVar to represent thread index.
`reduce_axis`(dom[, name, thread_tag, span])	Create a new IterVar for reduction.
`create_prim_func`(ops[, index_dtype_override])	Create a TensorIR PrimFunc from tensor expression
`extern_primfunc`(input_tensors, primfunc, ...)	Compute tensors via a schedulable TIR PrimFunc
`gradient`(output, inputs[, head])	Perform reverse-mode automatic differentiation.

Classes:

`Schedule`	Schedule for all the stages.
`Stage`	A Stage represents schedule for one operation.
`SpecializedCondition`(conditions)	Specialized condition to enable op specialization.
`TensorSlice`(tensor, indices)	Auxiliary data structure for enabling slicing syntax on a tensor.
`Tensor`	Tensor object, to construct, see function.Tensor
`PlaceholderOp`	Placeholder operation.
`ComputeOp`	Scalar operation.
`TensorComputeOp`	Tensor operation.
`ScanOp`	Scan operation.
`ExternOp`	External operation.
`HybridOp`	Hybrid operation.

tvm.te.any(*args, span=None)¶

Create a new expression of the union of all conditions in the arguments

Parameters

args (list) – List of symbolic boolean expressions
span (Optional[Span]) – The location of this operator in the source code.

Returns

expr – Expression

Return type

Expr

Alias of tvm.tir.any()

tvm.te.all(*args, span=None)¶

Create a new expression of the intersection of all conditions in the arguments

Parameters

args (list) – List of symbolic boolean expressions
span (Optional[Span]) – The location of this operator in the source code.

Returns

expr – Expression

Return type

Expr

Alias of tvm.tir.all()

tvm.te.min_value(dtype, span=None)¶

minimum value of dtype

Parameters

dtype (str) – The data type.
span (Optional[Span]) – The location of this operator in the source code.

Returns

value – The minimum value of dtype.

Return type

tvm.Expr

Alias of tvm.tir.min_value()

tvm.te.max_value(dtype: str, span: Optional[tvm.ir.base.Span] = None) → Any¶

maximum value of dtype

Parameters

dtype (str) – The data type.
span (Optional[Span]) – The location of this operator in the source code.

Returns

value – The maximum value of dtype.

Return type

tvm.Expr

Alias of tvm.tir.max_value()

tvm.te.trace(args, trace_action='tvm.default_trace_action')¶

Trace tensor data at the runtime.

The trace function traces a specific tensor at runtime. The value to trace must be passed as the last argument. The trace action is selected by name; if none is given, tvm.default_trace_action is used.

Parameters

args (list of Expr or Buffers) – Positional arguments.
trace_action (str) – The name of the trace action.

Returns

call – The call expression.

Return type

PrimExpr

See also

tvm.tir.call_packed: Creates packed function.

Alias of tvm.tir.trace()

tvm.te.exp(x)¶

Take exponential of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.exp()

tvm.te.erf(x)¶

Take gauss error function of the input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.erf()

tvm.te.tanh(x)¶

Take hyperbolic tanh of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.tanh()

tvm.te.sigmoid(x)¶

Quick function to get sigmoid

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.sigmoid()

tvm.te.log(x)¶

Take log of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.log()

tvm.te.tan(x)¶

Take tan of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.tan()

tvm.te.cos(x)¶

Take cos of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.cos()

tvm.te.sin(x)¶

Take sin of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.sin()

tvm.te.sqrt(x)¶

Take square root of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.sqrt()

tvm.te.rsqrt(x)¶

Take reciprocal of square root of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.rsqrt()

tvm.te.floor(x: tvm.tir.expr.PrimExprWithOp, span=None)¶

Take floor of float input x.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.floor()

tvm.te.ceil(x, span=None)¶

Take ceil of float input x.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.ceil()

tvm.te.sinh(x)¶

Take sinh of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.sinh()

tvm.te.cosh(x)¶

Take cosh of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.cosh()

tvm.te.log2(x)¶

Take log2 of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.log2()

tvm.te.log10(x)¶

Take log10 of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.log10()

tvm.te.asin(x)¶

Take asin of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.asin()

tvm.te.asinh(x)¶

Take asinh of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.asinh()

tvm.te.acos(x)¶

Take acos of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.acos()

tvm.te.acosh(x)¶

Take acosh of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.acosh()

tvm.te.atan(x)¶

Take atan of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.atan()

tvm.te.atanh(x)¶

Take atanh of input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.atanh()

tvm.te.trunc(x, span=None)¶

Get truncated value of the input.

The truncated value of the scalar x is the nearest integer i which is closer to zero than x is.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.trunc()
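The definition above can be illustrated with a plain-Python model. This is only a sketch of the rounding rule, it does not call TVM, and the helper name `trunc_model` is made up for the illustration.

```python
import math

def trunc_model(x: float) -> float:
    """Round toward zero: floor for non-negative x, ceil for negative x."""
    return float(math.floor(x)) if x >= 0 else float(math.ceil(x))

# Truncation and floor agree for positive inputs and differ for negative ones.
for x in (2.7, -2.7, 0.5, -0.5):
    print(x, trunc_model(x), math.floor(x))
```

For negative inputs the result is larger than the floor, which is the main difference to keep in mind when choosing between trunc and floor.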

tvm.te.abs(x, span=None)¶

Get absolute value of the input element-wise.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.abs()

tvm.te.round(x, span=None)¶

Round elements of the array to the nearest integer.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.round()

tvm.te.nearbyint(x, span=None)¶

Round elements of the array to the nearest integer. This intrinsic uses llvm.nearbyint instead of llvm.round, which is faster but can give results that differ from te.round. Notably, nearbyint rounds according to the current rounding mode, whereas te.round (llvm.round) ignores it. For the differences between the two, see: https://en.cppreference.com/w/cpp/numeric/math/round https://en.cppreference.com/w/cpp/numeric/math/nearbyint

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.nearbyint()
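The difference between the two rounding rules shows up on halfway cases. The sketch below models both in plain Python without calling TVM. It assumes the rounding mode is round-to-nearest-even, which is the usual default; both helper names are hypothetical.

```python
import math

def round_half_away(x: float) -> float:
    """Model of C round(): halfway cases are rounded away from zero."""
    return math.copysign(math.floor(abs(x) + 0.5), x)

def nearbyint_default_mode(x: float) -> float:
    """Model of nearbyint() under round-to-nearest-even (assumed default mode)."""
    # Python's built-in round() also rounds halfway cases to the even neighbour.
    return float(round(x))

for x in (0.5, 1.5, 2.5, -2.5):
    print(x, round_half_away(x), nearbyint_default_mode(x))
```

The two models agree everywhere except on exact halves such as 0.5 and 2.5, so code that depends on tie-breaking should not treat the two intrinsics as interchangeable.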

tvm.te.power(x, y, span=None)¶

x power y

Parameters

x (PrimExpr) – Input argument.
y (PrimExpr) – The exponent
span (Optional[Span]) – The location of this operator in the source code.

Returns

z – The result.

Return type

PrimExpr

Alias of tvm.tir.power()

tvm.te.popcount(x)¶

Count the number of set bits in input x.

Parameters: x (PrimExpr) – Input argument.
Returns: y – The result.
Return type: PrimExpr

Alias of tvm.tir.popcount()
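As a small illustration of what counting set bits means, here is a plain-Python model that does not call TVM. The fixed bit width is an assumption made so that negative values have a well-defined bit pattern; `popcount_model` is a hypothetical name.

```python
def popcount_model(x: int, bits: int = 32) -> int:
    """Count set bits of x viewed as an unsigned integer of the given width."""
    mask = (1 << bits) - 1          # keep only the low `bits` bits
    return bin(x & mask).count("1")

print(popcount_model(0b1011))       # three bits are set
print(popcount_model(-1))           # all 32 bits are set in two's complement
```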

tvm.te.fmod(x, y)¶

Return the remainder of x divided by y with the same sign as x.

Parameters

x (PrimExpr) – Input argument.
y (PrimExpr) – Input argument.

Returns

z – The result.

Return type

PrimExpr

Alias of tvm.tir.fmod()
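The sign convention described above is the one used by C's fmod, which Python exposes as `math.fmod`. The sketch below contrasts it with Python's `%` operator; it illustrates the convention only and does not call TVM.

```python
import math

x, y = -7.0, 3.0

c_style = math.fmod(x, y)   # remainder keeps the sign of x
py_style = x % y            # Python's % keeps the sign of y

print(c_style, py_style)
```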

tvm.te.if_then_else(cond, t, f, span=None)¶

Conditional selection expression.

Parameters

cond (PrimExpr) – The condition
t (PrimExpr) – The result expression if cond is true.
f (PrimExpr) – The result expression if cond is false.
span (Optional[Span]) – The location of this operator in the source.

Returns

result – The result of conditional expression.

Return type

Node

Note

Unlike Select, if_then_else will not execute the branch that does not satisfy the condition. You can use it to guard against out of bound access. Unlike Select, if_then_else cannot be vectorized if some lanes in the vector have different conditions.

Alias of tvm.tir.if_then_else()
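The note about guarding out-of-bounds accesses comes down to whether the untaken branch is evaluated. The following plain-Python model, which does not call TVM and uses hypothetical helper names, mimics the two behaviours with eager arguments versus deferred callables.

```python
def select_model(cond, true_value, false_value):
    """Both operands are already evaluated by the time this is called."""
    return true_value if cond else false_value

def if_then_else_model(cond, true_fn, false_fn):
    """Only the branch chosen by cond is evaluated."""
    return true_fn() if cond else false_fn()

data = [10, 20, 30]

def padded_read(i):
    # data[i] is only read when i is in bounds; otherwise 0 is used as padding.
    return if_then_else_model(0 <= i < len(data), lambda: data[i], lambda: 0)

def unguarded_read(i):
    # data[i] is evaluated before select_model runs, so i = 5 raises IndexError.
    return select_model(0 <= i < len(data), data[i], 0)

print(padded_read(1), padded_read(5))
```

Calling `unguarded_read(5)` fails even though the condition is false, which is the situation if_then_else is meant to avoid.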

tvm.te.isnan(x, span=None)¶

Check if input value is Nan.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.isnan()

tvm.te.isfinite(x, span=None)¶

Check if input value is finite.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.isfinite()

tvm.te.isinf(x, span=None)¶

Check if input value is infinite.

Parameters

x (PrimExpr) – Input argument.
span (Optional[Span]) – The location of this operator in the source code.

Returns

y – The result.

Return type

PrimExpr

Alias of tvm.tir.isinf()

tvm.te.div(a, b, span=None)¶

Compute a / b as in C/C++ semantics.

Parameters

a (PrimExpr) – The left hand operand, known to be non-negative.
b (PrimExpr) – The right hand operand, known to be non-negative.
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Note

When operands are integers, returns truncdiv(a, b, span).

Alias of tvm.tir.div()

tvm.te.indexdiv(a, b, span=None)¶

Compute floor(a / b) where a and b are non-negative.

Parameters

a (PrimExpr) – The left hand operand, known to be non-negative.
b (PrimExpr) – The right hand operand, known to be non-negative.
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Note

Use this function to split non-negative indices. This function may take advantage of operands’ non-negativeness.

Alias of tvm.tir.indexdiv()

tvm.te.indexmod(a, b, span=None)¶

Compute the remainder of indexdiv. a and b are non-negative.

Parameters

a (PrimExpr) – The left hand operand, known to be non-negative.
b (PrimExpr) – The right hand operand, known to be non-negative.
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Note

Use this function to split non-negative indices. This function may take advantage of operands’ non-negativeness.

Alias of tvm.tir.indexmod()
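A typical use of the indexdiv and indexmod pair is splitting a flat, non-negative index into two coordinates. The sketch below models that with Python's `//` and `%`, which match floor division for non-negative operands. It does not call TVM, and `split_index` is a hypothetical helper.

```python
def split_index(i: int, width: int):
    """Split a flat non-negative index into (row, column) for rows of `width` items."""
    assert i >= 0 and width > 0
    return i // width, i % width    # the roles of indexdiv and indexmod

row, col = split_index(14, 4)
flat = row * 4 + col                # recombining gives back the flat index
print(row, col, flat)
```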

tvm.te.truncdiv(a, b, span=None)¶

Compute the truncdiv of two expressions.

Parameters

a (PrimExpr) – The left hand operand
b (PrimExpr) – The right hand operand
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Note

This is the default integer division behavior in C.

Alias of tvm.tir.truncdiv()

tvm.te.truncmod(a, b, span=None)¶

Compute the truncmod of two expressions.

Parameters

a (PrimExpr) – The left hand operand
b (PrimExpr) – The right hand operand
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Note

This is the default integer division behavior in C.

Alias of tvm.tir.truncmod()

tvm.te.floordiv(a, b, span=None)¶

Compute the floordiv of two expressions.

Parameters

a (PrimExpr) – The left hand operand
b (PrimExpr) – The right hand operand
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Alias of tvm.tir.floordiv()

tvm.te.floormod(a, b, span=None)¶

Compute the floormod of two expressions.

Parameters

a (PrimExpr) – The left hand operand
b (PrimExpr) – The right hand operand
span (Optional[Span]) – The location of this operator in the source.

Returns

res – The result expression.

Return type

PrimExpr

Alias of tvm.tir.floormod()
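The truncating and flooring variants only differ when the operands have opposite signs. The following plain-Python model, written without TVM and with hypothetical helper names, makes the difference concrete. It assumes integer operands and a non-zero divisor.

```python
def truncdiv_model(a: int, b: int) -> int:
    """Integer division that rounds toward zero, as in C."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

def truncmod_model(a: int, b: int) -> int:
    """Remainder paired with truncdiv; it takes the sign of a."""
    return a - b * truncdiv_model(a, b)

def floordiv_model(a: int, b: int) -> int:
    """Integer division that rounds toward negative infinity."""
    return a // b

def floormod_model(a: int, b: int) -> int:
    """Remainder paired with floordiv; it takes the sign of b."""
    return a % b

for a, b in ((7, 2), (-7, 2), (7, -2)):
    print(a, b, truncdiv_model(a, b), truncmod_model(a, b),
          floordiv_model(a, b), floormod_model(a, b))
```

In both pairs the identity a == b * div(a, b) + mod(a, b) holds; only the rounding direction of the quotient changes.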

tvm.te.comm_reducer(fcombine, fidentity, name='reduce')¶

Create a commutative reducer for reduction.

Parameters

fcombine (function(Expr -> Expr -> Expr)) – A binary function which takes two Expr as input and returns an Expr.
fidentity (function(str -> Expr)) – A function which takes a type string as input and returns a const Expr.

Returns

reducer – A function which creates a reduce expression over axis. There are two ways to use it:

accept (expr, axis, where) to produce a Reduce Expr on the specified axis;
simply use it with multiple Exprs.

Return type

function

Example

import tvm
from tvm import te

n = te.var("n")
m = te.var("m")
# fcombine adds two values; fidentity returns the zero constant of dtype t.
mysum = te.comm_reducer(lambda x, y: x + y,
                        lambda t: tvm.tir.const(0, dtype=t), name="mysum")
A = te.placeholder((n, m), name="A")
k = te.reduce_axis((0, m), name="k")
B = te.compute((n,), lambda i: mysum(A[i, k], axis=k), name="B")

Alias of tvm.tir.comm_reducer()
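The roles of fcombine and fidentity can be seen in a plain-Python analogue built on `functools.reduce`. This is only a model of the idea, not how TVM implements reducers, and `comm_reducer_model` is a hypothetical name.

```python
from functools import reduce

def comm_reducer_model(fcombine, fidentity):
    """Return a function that folds values with fcombine, starting from fidentity(dtype)."""
    def reducer(values, dtype="int32"):
        return reduce(fcombine, values, fidentity(dtype))
    return reducer

# A sum reducer and a product reducer differ only in combine rule and identity.
mysum = comm_reducer_model(lambda x, y: x + y, lambda t: 0)
myprod = comm_reducer_model(lambda x, y: x * y, lambda t: 1)

A = [[1, 2, 3], [4, 5, 6]]
B = [mysum(row) for row in A]       # one reduction over k per row i
P = [myprod(row) for row in A]
print(B, P, mysum([]))
```

Reducing an empty range returns the identity element, which is why fidentity has to be the neutral value for fcombine.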

tvm.te.min(expr, axis, where=None, init=None, *args)¶

Create a min expression over axis.

Parameters

expr (PrimExpr) – The source expression.
axis (IterVar) – The reduction IterVar axis
where (optional, Expr) – Filtering predicate of the reduction.

Returns

value – The result value.

Return type

PrimExpr

Example

from tvm import te

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")

# There are two ways to use this min reducer:
# mode 1, accept (expr, axis, where) to produce a Reduce Expr
B = te.compute((m,), lambda i: te.min(A[i, k], axis=k), name="B")

# mode 2, simply use it with multiple Exprs:
min_res = te.min(m, n)

Alias of tvm.tir.min()

tvm.te.max(expr, axis, where=None, init=None, *args)¶

Create a max expression over axis.

Parameters

expr (PrimExpr) – The source expression.
axis (IterVar) – The reduction IterVar axis
where (optional, Expr) – Filtering predicate of the reduction.

Returns

value – The result value.

Return type

PrimExpr

Example

from tvm import te

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")

# There are two ways to use this max reducer:
# mode 1, accept (expr, axis, where) to produce a Reduce Expr
B = te.compute((m,), lambda i: te.max(A[i, k], axis=k), name="B")

# mode 2, simply use it with multiple Exprs:
max_res = te.max(m, n)

Alias of tvm.tir.max()

tvm.te.sum(expr, axis, where=None, init=None, *args)¶

Create a sum expression over axis.

Parameters

expr (PrimExpr) – The source expression.
axis (IterVar) – The reduction IterVar axis
where (optional, Expr) – Filtering predicate of the reduction.

Returns

value – The result value.

Return type

PrimExpr

Example

from tvm import te

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")

# There are two ways to use this sum reducer:
# mode 1, accept (expr, axis, where) to produce a Reduce Expr
B = te.compute((m,), lambda i: te.sum(A[i, k], axis=k), name="B")

# mode 2, simply use it with multiple Exprs:
sum_res = te.sum(m, n)

Alias of tvm.tir.sum()
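The effect of the where predicate is to restrict which points of the reduction domain contribute. The sketch below models this with nested Python comprehensions over a small matrix. It does not call TVM, and the predicate k <= i is chosen only for illustration.

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
n = len(A[0])

# Row sums over the full reduction domain k in [0, n).
B = [sum(A[i][k] for k in range(n)) for i in range(len(A))]

# Row sums restricted by a predicate, here k <= i (the lower triangle).
C = [sum(A[i][k] for k in range(n) if k <= i) for i in range(len(A))]

print(B, C)
```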

tvm.te.add(lhs, rhs, span=None)¶

Generic add operator.

Parameters

lhs (object) – The left operand.
rhs (object) – The right operand.
span (Optional[Span]) – The location of this operator in the source.

Returns

op – The result Expr of add operation.

Return type

tvm.Expr

Alias of tvm.tir.add()

tvm.te.subtract(lhs, rhs, span=None)¶

Generic subtract operator.

Parameters

lhs (object) – The left operand.
rhs (object) – The right operand.
span (Optional[Span]) – The location of this operator in the source.

Returns

op – The result Expr of subtract operation.

Return type

tvm.Expr

Alias of tvm.tir.subtract()

tvm.te.multiply(lhs, rhs, span=None)¶

Generic multiply operator.

Parameters

lhs (object) – The left operand.
rhs (object) – The right operand.
span (Optional[Span]) – The location of this operator in the source.

Returns

op – The result Expr of multiply operation.

Return type

tvm.Expr

Alias of tvm.tir.multiply()

class tvm.te.Schedule¶

Schedule for all the stages.

Methods:

`normalize`()	Build a normalized schedule from the current schedule.
`create_group`(outputs, inputs[, include_inputs])	Create stage group by giving output and input boundary.
`cache_read`(tensor, scope, readers)	Create a cache read of original tensor for readers.
`cache_write`(tensor, scope)	Create a cache write of original tensor, before storing into tensor.
`rfactor`(tensor, axis[, factor_axis])	Factor a reduction axis in tensor's schedule to be an explicit axis.

normalize()¶

Build a normalized schedule from the current schedule.

Insert the rebase operations needed to make certain iteration variables start from 0. This is required before bound inference and the steps that follow it.

Returns: sch – The normalized schedule.
Return type: Schedule

create_group(outputs, inputs, include_inputs=False)¶

Create stage group by giving output and input boundary.

The operators between outputs and inputs are placed as members of the group. The outputs are included in the group, while the inputs are not.

Parameters

outputs (list of Tensors) – The outputs of the group.
inputs (list of Tensors) – The inputs of the group.
include_inputs (boolean, optional) – Whether to include input operations in the group if they are used by outputs.

Returns

group – A virtual stage that represents the group. The user can call compute_at on it to move the attachment point of the group.

Return type

Stage

cache_read(tensor, scope, readers)¶

Create a cache read of original tensor for readers.

This will mutate the body of the readers. A new cache stage will be created for the tensor. Call this before doing any split/fuse schedule.

Parameters

tensor (Tensor) – The tensor to be cached.
scope (str) – The storage scope of the cache.
readers (list of Tensor or Operation) – The readers to read the cache.

Returns

cache – The created cache tensor.

Return type

Tensor

cache_write(tensor, scope)¶

Create a cache write of original tensor, before storing into tensor.

This will mutate the body of the tensor. A new cache stage will be created before the result is fed into the tensor.

This function can be used to support data layout transformation. If there is a split/fuse/reorder on the data parallel axis of the tensor before cache_write is called, the intermediate cache stores the data in the layout given by the iteration order of the leaf axes. The data will be transformed back to the original layout in the original tensor. The user can further call compute_inline to inline the original layout and keep the data stored in the transformed layout.

Parameters

tensor (Tensor, list or tuple) – The tensors to be fed into. All the tensors must be produced by one ComputeOp.
scope (str) – The storage scope of the cache.

Returns

cache – The created cache tensor.

Return type

Tensor

rfactor(tensor, axis, factor_axis=0)¶

Factor a reduction axis in tensor’s schedule to be an explicit axis.

This will create a new stage that generates the new tensor with axis as the first dimension. The tensor’s body will be rewritten as a reduction over the factored tensor.

Parameters

tensor (Tensor) – The tensor to be factored.
axis (IterVar) – The reduction axis in the schedule to be factored.
factor_axis (int) – The position where the new axis is placed.

Returns

tfactor – The created factored tensor.

Return type

Tensor or Array of Tensor
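The rewrite performed by rfactor can be pictured with plain Python lists. The sketch below does not call TVM. It assumes the reduction axis has already been split into an outer part ko and an inner part ki of extent 4, and that ki is the axis being factored; all names are illustrative.

```python
data = list(range(1, 13))      # 12 values to reduce
factor = 4                     # extent of the factored axis ki

# Original computation: one serial reduction over k.
total = sum(data[k] for k in range(len(data)))

# Factored tensor: ki becomes an explicit axis, and each entry reduces over ko.
partial = [
    sum(data[ko * factor + ki] for ko in range(len(data) // factor))
    for ki in range(factor)
]

# The original tensor is rewritten as a reduction over the factored tensor.
total_rf = sum(partial)
print(partial, total, total_rf)
```

Because the entries of the factored tensor are independent of each other, they can be computed in parallel before the final, much shorter reduction.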

class tvm.te.Stage¶

A Stage represents schedule for one operation.

Methods:

`split`(parent[, factor, nparts])	Split the stage either by factor or by nparts (the number of outer parts)
`fuse`(*args)	Fuse multiple consecutive iteration variables into a single iteration variable.
`set_scope`(scope)	Set the thread scope of this stage
`bind`(ivar, thread_ivar)	Bind ivar to thread index thread_ivar
`env_threads`(threads)	Mark threads to be launched at the outer scope of composed op.
`set_store_predicate`(predicate)	Set predicate under which store to the array can be performed.
`compute_at`(parent, scope)	Attach the stage at parent's scope
`compute_inline`()	Mark stage as inline
`compute_root`()	Attach the stage at parent, and mark it as root
`reorder`(*args)	Reorder the iteration variables in the specified order.
`tile`(x_parent, y_parent, x_factor, y_factor)	Perform tiling on two dimensions
`vectorize`(var)	Vectorize the iteration.
`tensorize`(var, tensor_intrin)	Tensorize the computation enclosed by var with tensor_intrin
`unroll`(var)	Unroll the iteration.
`parallel`(var)	Parallelize the iteration.
`pragma`(var, pragma_type[, pragma_value])	Annotate the iteration with pragma
`prefetch`(tensor, var, offset)	Prefetch the specified variable
`storage_align`(axis, factor, offset)	Set alignment requirement for specific axis
`double_buffer`()	Compute the current stage via double buffering.
`rolling_buffer`()	Compute the current stage via rolling buffering.
`transform_layout`(mapping_function)	Defines the layout transformation for the current stage's tensor.

split(parent, factor=None, nparts=None)¶

Split the stage either by factor or by nparts (the number of outer parts)

Parameters

parent (IterVar) – The parent iter var.
factor (Expr, optional) – The splitting factor
nparts (Expr, optional) – The number of outer parts.

Returns

outer (IterVar) – The outer variable of iteration.
inner (IterVar) – The inner variable of iteration.
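Splitting by factor turns one loop into an outer and an inner loop. The plain-Python sketch below models the resulting iteration, including a guard for the case where the extent is not a multiple of the factor. It does not call TVM, and `split_by_factor` is a hypothetical name.

```python
def split_by_factor(extent: int, factor: int):
    """Yield (outer, inner, original_index) for a loop of `extent` split by `factor`."""
    n_outer = (extent + factor - 1) // factor    # ceil(extent / factor)
    for outer in range(n_outer):
        for inner in range(factor):
            i = outer * factor + inner
            if i < extent:                       # guard for the ragged last block
                yield outer, inner, i

steps = list(split_by_factor(10, 4))
visited = [i for _, _, i in steps]
print(steps[-1], visited)
```

The split loop visits the same indices in the same order as the original loop; only the loop structure changes.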

fuse(*args)¶

Fuse multiple consecutive iteration variables into a single iteration variable.

fused = fuse(…fuse(fuse(args[0], args[1]), args[2]),…, args[-1])

The order is from outer to inner.

Parameters: args (list of IterVars) – IterVars that are consecutive, each one immediately following the previous one
Returns: fused – The fused variable of iteration.
Return type: IterVar
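Fusing two consecutive loops yields a single loop whose index encodes both original indices. The sketch below is a plain-Python model that does not call TVM; it recovers the pair from the fused index, and `fuse_model` is a hypothetical name.

```python
def fuse_model(n: int, m: int):
    """Iterate a fused loop of extent n * m and recover the (outer, inner) pair."""
    for fused in range(n * m):
        yield fused // m, fused % m

pairs = list(fuse_model(2, 3))
nested = [(i, j) for i in range(2) for j in range(3)]
print(pairs == nested)
```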

set_scope(scope)¶

Set the thread scope of this stage

Parameters: scope (str) – The thread scope of this stage

bind(ivar, thread_ivar)¶

Bind ivar to thread index thread_ivar

Parameters

ivar (IterVar) – The iteration to be bound to the thread.
thread_ivar (IterVar) – The thread to bind to.

env_threads(threads)¶

Mark threads to be launched at the outer scope of composed op.

Parameters: threads (list of threads) – The threads to be launched.

set_store_predicate(predicate)¶

Set predicate under which store to the array can be performed.

Use this when there are duplicated threads doing the same store and we only need one of them to do the store.

Parameters: predicate (Expr) – The guard condition for the store.

compute_at(parent, scope)¶

Attach the stage at parent’s scope

Parameters

parent (Stage) – The parent stage
scope (IterVar) – The loop scope to be attached to.

compute_inline()¶

Mark stage as inline

compute_root()¶

Attach the stage at parent, and mark it as root

reorder(*args)¶

Reorder the iteration variables in the specified order.

Parameters: args (list of IterVar) – The iteration variables, in the desired order

tile(x_parent, y_parent, x_factor, y_factor)¶

Perform tiling on two dimensions

The final loop order from outermost to innermost is [x_outer, y_outer, x_inner, y_inner]

Parameters

x_parent (IterVar) – The original x dimension
y_parent (IterVar) – The original y dimension
x_factor (Expr) – The stride factor on x axis
y_factor (Expr) – The stride factor on y axis

Returns

x_outer (IterVar) – Outer axis of x dimension
y_outer (IterVar) – Outer axis of y dimension
x_inner (IterVar) – Inner axis of x dimension
y_inner (IterVar) – Inner axis of y dimension
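The loop order produced by tiling can be traced with a plain-Python model. The sketch below does not call TVM. It uses a hypothetical 4 x 4 iteration space with 2 x 2 tiles and adds bounds guards so it also works when the extents are not multiples of the factors.

```python
def tile_model(x_extent, y_extent, x_factor, y_factor):
    """Visit (x, y) in the order x_outer, y_outer, x_inner, y_inner."""
    for x_outer in range(0, x_extent, x_factor):
        for y_outer in range(0, y_extent, y_factor):
            for x_inner in range(x_factor):
                for y_inner in range(y_factor):
                    x, y = x_outer + x_inner, y_outer + y_inner
                    if x < x_extent and y < y_extent:
                        yield x, y

order = list(tile_model(4, 4, 2, 2))
print(order[:8])
```

Each 2 x 2 tile is finished before the next one starts, which is what gives tiling its locality benefit.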

vectorize(var)¶

Vectorize the iteration.

Parameters: var (IterVar) – The iteration to be vectorized.

tensorize(var, tensor_intrin)¶

Tensorize the computation enclosed by var with tensor_intrin

Parameters

var (IterVar) – The iteration boundary of tensorization.
tensor_intrin (TensorIntrin) – The tensor intrinsic used for computation.

unroll(var)¶

Unroll the iteration.

Parameters: var (IterVar) – The iteration to be unrolled.

parallel(var)¶

Parallelize the iteration.

Parameters: var (IterVar) – The iteration to be parallelized.

pragma(var, pragma_type, pragma_value=None)¶

Annotate the iteration with pragma

This will translate to a pragma_scope surrounding the corresponding loop generated. Useful to support experimental features and extensions.

Parameters

var (IterVar) – The iteration to be annotated
pragma_type (str) – The pragma string to be annotated
pragma_value (Expr, optional) – The pragma value to pass along the pragma

Note

Most pragmas are advanced/experimental features and may be subject to change. List of supported pragmas:

debug_skip_region

Force skip the region marked by the axis and turn it into a no-op. This is useful for debugging.

parallel_launch_point

Specify that parallel threads are launched outside the specified iteration loop. By default the threads launch at the point of the parallel construct. This pragma moves the launch point to an outer scope. The threads are launched once and reused across multiple parallel constructs, as in a BSP-style program.

parallel_barrier_when_finish

Insert a synchronization barrier between working threads after the specified loop iteration finishes.

parallel_stride_pattern

Hint the parallel loop to execute in a strided pattern: for (int i = task_id; i < end; i += num_task)
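The strided pattern named by parallel_stride_pattern assigns every num_task-th iteration to the same task. The plain-Python sketch below models that assignment for a hypothetical loop of 8 iterations and 3 tasks. It does not call TVM.

```python
def strided_tasks(task_id: int, num_task: int, end: int):
    """Indices handled by one task: for (i = task_id; i < end; i += num_task)."""
    return list(range(task_id, end, num_task))

assignment = {t: strided_tasks(t, 3, 8) for t in range(3)}
print(assignment)
```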

prefetch(tensor, var, offset)¶

Prefetch the specified variable

Parameters

tensor (Tensor) – The tensor to be prefetched
var (IterVar) – The loop point at which the prefetching is applied
offset (Expr) – The number of iterations to be prefetched before actual execution

storage_align(axis, factor, offset)¶

Set alignment requirement for specific axis

This ensures that stride[axis] == k * factor + offset for some k. This is useful for choosing a memory layout with a friendlier memory access pattern. For example, we can set the alignment to factor=2, offset=1 to avoid bank conflicts for thread access on a higher dimension in GPU shared memory.

Parameters

axis (IterVar) – The axis dimension to be aligned.
factor (int) – The factor in alignment specification.
offset (int) – The offset in the alignment specification.
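The bank-conflict example can be checked numerically with a plain-Python model that does not call TVM. It makes two assumptions that are not stated above: the stride chosen is the smallest one satisfying the constraint, and shared memory has 32 banks indexed by address modulo 32. All names are hypothetical.

```python
def aligned_stride(min_stride: int, factor: int, offset: int) -> int:
    """Smallest stride >= min_stride with stride == k * factor + offset for some k."""
    assert factor > 0 and 0 <= offset < factor
    stride = min_stride
    while stride % factor != offset:
        stride += 1
    return stride

NUM_BANKS = 32  # assumed number of shared-memory banks

def banks_hit(stride: int, rows: int = NUM_BANKS):
    """Banks touched when `rows` threads read the same column of consecutive rows."""
    return {(r * stride) % NUM_BANKS for r in range(rows)}

unpadded = 32
padded = aligned_stride(unpadded, factor=2, offset=1)
print(padded, len(banks_hit(unpadded)), len(banks_hit(padded)))
```

With the unpadded stride every thread lands in the same bank, while the odd stride spreads the accesses across all banks.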

double_buffer()¶

Compute the current stage via double buffering.

This can only be applied to an intermediate stage. It doubles the storage cost of the current stage and can be useful for hiding load latency.

rolling_buffer()¶

Compute the current stage via rolling buffering.

This can only be applied to an intermediate stage. It changes the storage cost of the current stage.

transform_layout(mapping_function: Callable[[...], List[tvm.ir.expr.PrimExpr]])¶

Defines the layout transformation for the current stage’s tensor.

The map from initial_indices to final_indices must be an invertible affine transformation. This method may be called more than once for a given tensor, in which case each transformation is applied sequentially.

If the stage is a ComputeOp, then the iteration order of the compute stage is rewritten to be a row-major traversal of the tensor, and the new loop iteration variables are returned. For all other stages, the loop iteration order is unmodified, and the return value is None.

Parameters

mapping_function (Callable[..., List[tvm.tir.PrimExpr]]) – A callable that accepts N arguments of type tvm.tir.Var, and outputs a list of PrimExpr. The input arguments represent the location of a value in the current stage’s tensor, using the pre-transformation layout. The return value of the function gives the location of that value in the current stage’s tensor, using the post-transformation layout.

Returns

new_iter_vars – If the stage is a ComputeOp, then the return will be the updated loop iteration variables over the data array, in the same order as the output values from the mapping_function.

Otherwise, the return value is None.

Return type

Optional[List[tvm.tir.IterVar]]

Examples

# ``A`` is a tensor whose compute definition is in NHWC
# format, and should be transformed into NCHWc format.

s[A].transform_layout(
    lambda n,h,w,c: [n, c//4, h, w, c%4]
)

# ``A`` is a tensor whose compute definition is in an
# arbitrary format, and should be transformed such that
# the last index is split, with the slower-changing index
# of the split placed at the slowest changing dimension.

s[A].transform_layout(
    lambda *indices, i: [i//4, *indices, i%4]
)

# ``B`` is a tensor defined by te.compute to be a copy of
# ``A``, and should be transformed such that ``B``'s layout
# is a transpose of ``A``'s layout.  The loop iteration
# that computes ``B`` will correspond to ``B``'s memory
# layout.

A = te.placeholder([n,m])
B = te.compute(A.shape, lambda i,j: A[i,j])
s = te.create_schedule(B.op)

s[B].transform_layout(lambda i,j: [j,i])
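
The invertibility requirement can be checked by hand for the NHWC to NCHWc mapping from the first example. The following is a plain-Python sketch that does not use TVM; the function names and the small shape are assumptions chosen for illustration.

```python
from itertools import product


def nhwc_to_nchwc(n, h, w, c):
    # Same mapping as the first example: split the channel index by 4.
    return (n, c // 4, h, w, c % 4)


def nchwc_to_nhwc(n, c_outer, h, w, c_inner):
    # Inverse mapping: recombine the two channel indices.
    return (n, h, w, c_outer * 4 + c_inner)


shape = (1, 2, 3, 8)  # small NHWC shape, chosen only for this illustration
original = list(product(*(range(extent) for extent in shape)))
transformed = [nhwc_to_nchwc(*index) for index in original]

# Every source index lands on a distinct destination and round-trips back.
is_one_to_one = len(set(transformed)) == len(original)
round_trips = all(nchwc_to_nhwc(*t) == o for t, o in zip(transformed, original))
```

Both flags evaluate to true here because the channel extent (8) is a multiple of the split factor (4).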

tvm.te.create_schedule(ops)¶

Create a schedule for a list of ops.

Parameters: ops (list of Operations) – The source expression.
Returns: sch – The created schedule.
Return type: schedule.Schedule

class tvm.te.SpecializedCondition(conditions)¶

Specialized condition to enable op specialization.

Methods:

current()

Returns the current specialized condition

static current()¶: Returns the current specialized condition

class tvm.te.TensorSlice(tensor, indices)¶

Auxiliary data structure that enables slicing syntax on a tensor.

Methods:

asobject()

Convert slice to object.

Attributes:

dtype

Data type of the tensor.

asobject()¶: Convert slice to object.

property dtype¶: Data type of the tensor.

class tvm.te.Tensor¶

Tensor object. To construct one, see function.Tensor.

Attributes:

`ndim`	Dimension of the tensor.
`axis`	Axis of the tensor.
`op`	The corresponding `Operation`.
`value_index`	The output value index the tensor corresponds to.
`shape`	The output shape of the tensor.

property ndim¶: Dimension of the tensor.

property axis¶: Axis of the tensor.

property op¶: The corresponding Operation.

property value_index¶: The output value index the tensor corresponds to.

property shape¶: The output shape of the tensor.

tvm.te.decl_tensor_intrin(op, fcompute, name='tensor_intrin', binds=None, scalar_params=None, default_buffer_params=None)¶

Declare a tensor intrinsic function.

Parameters

op (Operation) – The symbolic description of the intrinsic operation
fcompute (lambda function of inputs, outputs -> stmt) – Specifies the IR statement to do the computation. See the following note for the function signature of fcompute.
Note

Parameters
- ins (list of tvm.tir.Buffer) - Placeholder for each input
- outs (list of tvm.tir.Buffer) - Placeholder for each output
Returns
- stmt (tvm.tir.Stmt, or tuple of three stmts)
- If a single stmt is returned, it represents the body.
- If a tuple of three stmts is returned, they correspond to body, reduce_init, and reduce_update.
name (str, optional) – The name of the intrinsic.
binds (dict of Tensor to tvm.tir.Buffer, optional) – Dictionary that maps each Tensor to the Buffer that specifies the data layout requirement of the function. By default, a new compact buffer is created for each tensor in the argument.
scalar_params (list of variables, optional) – A list of variables used by op, whose values will be passed as scalar_inputs when the tensor intrinsic is called.
default_buffer_params (Optional[dict]) – Dictionary of buffer arguments to be passed when constructing a buffer.

Returns

intrin – A TensorIntrin that can be used in tensorize schedule.

Return type

TensorIntrin

tvm.te.tag_scope(tag)¶

The operator tag scope.

Parameters: tag (str) – The tag name.
Returns: tag_scope – The tag scope object, which can be used as a decorator or a context manager.
Return type: TagScope

Example

n = te.var('n')
m = te.var('m')
l = te.var('l')
A = te.placeholder((n, l), name='A')
B = te.placeholder((m, l), name='B')
k = te.reduce_axis((0, l), name='k')

with tvm.te.tag_scope(tag='matmul'):
    C = te.compute((n, m), lambda i, j: te.sum(A[i, k] * B[j, k], axis=k))

# or use tag_scope as decorator
@tvm.te.tag_scope(tag="conv")
def compute_relu(data):
    return te.compute(data.shape, lambda *i: tvm.tir.Select(data(*i) < 0, 0.0, data(*i)))
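
How one object can serve as both a context manager and a decorator can be sketched in plain Python. The class `MiniTagScope` below is a hypothetical stand-in written for this illustration; it is not TVM's TagScope and makes no claim about how TVM implements it.

```python
import functools


class MiniTagScope:
    """Hypothetical stand-in for a tag scope; not TVM's TagScope class."""

    current_tag = ""  # tag seen by code running inside the scope

    def __init__(self, tag):
        self.tag = tag
        self._previous = ""

    def __enter__(self):
        # Remember the enclosing tag so it can be restored on exit.
        self._previous = MiniTagScope.current_tag
        MiniTagScope.current_tag = self.tag
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        MiniTagScope.current_tag = self._previous
        return False  # do not swallow exceptions

    def __call__(self, func):
        # Decorator form: run the wrapped function inside the scope.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with self:
                return func(*args, **kwargs)

        return wrapper


with MiniTagScope("matmul"):
    tag_in_block = MiniTagScope.current_tag


@MiniTagScope("conv")
def read_tag():
    return MiniTagScope.current_tag


tag_in_call = read_tag()
```

In both forms the tag is active only while the enclosed code runs and is restored afterwards.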

tvm.te.placeholder(shape, dtype=None, name='placeholder')¶

Construct an empty tensor object.

Parameters

shape (Tuple of Expr) – The shape of the tensor
dtype (str, optional) – The data type of the tensor
name (str, optional) – The name hint of the tensor

Returns

tensor – The created tensor

Return type

Tensor

tvm.te.compute(shape, fcompute, name='compute', tag='', attrs=None, varargs_names=None)¶

Construct a new tensor by computing over the shape domain.

The compute rule is result[axis] = fcompute(axis)

Parameters

shape (Tuple of Expr) – The shape of the tensor
fcompute (lambda function of indices-> value) – Specifies the input source expression
name (str, optional) – The name hint of the tensor
tag (str, optional) – Additional tag information about the compute.
attrs (dict, optional) – The additional auxiliary attributes about the compute.
varargs_names (list, optional) – The names to use for each of the varargs. If not supplied, the varargs will be called i1, i2, …

Returns

tensor – The created tensor

Return type

Tensor
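
The compute rule result[axis] = fcompute(axis) can be emulated eagerly in plain Python. This is only a sketch of the semantics: the real te.compute builds a symbolic expression rather than evaluating values, and the helper name `emulate_compute` is hypothetical.

```python
from itertools import product


def emulate_compute(shape, fcompute):
    """Evaluate fcompute at every index of `shape`; returns a dict keyed by index tuple."""
    return {index: fcompute(*index) for index in product(*(range(e) for e in shape))}


A = [[1, 2, 3], [4, 5, 6]]
# Analogue of te.compute((2, 3), lambda i, j: A[i, j] + 1)
B = emulate_compute((2, 3), lambda i, j: A[i][j] + 1)
```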

tvm.te.scan(init, update, state_placeholder, inputs=None, name='scan', tag='', attrs=None)¶

Construct new tensors by scanning over axis.

Parameters

init (Tensor or list of Tensor) – The initial condition for the first init.shape[0] timesteps.
update (Tensor or list of Tensor) – The update rule of the scan given by symbolic tensor.
state_placeholder (Tensor or list of Tensor) – The placeholder variables used by update.
inputs (Tensor or list of Tensor, optional) – The list of inputs to the scan. This is not required, but can be useful for the compiler to detect scan body faster.
name (str, optional) – The name hint of the tensor
tag (str, optional) – Additional tag information about the compute.
attrs (dict, optional) – The additional auxiliary attributes about the compute.

Returns

tensor – The created tensor, or a tuple of tensors if there are multiple outputs.

Return type

Tensor or list of Tensors

Example

# The following code is equivalent to numpy.cumsum
m = te.var("m")
n = te.var("n")
X = te.placeholder((m, n), name="X")
s_state = te.placeholder((m, n))
s_init = te.compute((1, n), lambda _, i: X[0, i])
s_update = te.compute((m, n), lambda t, i: s_state[t-1, i] + X[t, i])
res = tvm.te.scan(s_init, s_update, s_state, X)
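
The init and update rules in the example above can be traced with a plain-Python emulation. This sketch evaluates concrete lists instead of building symbolic tensors, and the helper name `emulate_scan` is hypothetical.

```python
def emulate_scan(X):
    """Cumulative sum along axis 0, mirroring the init/update rules of the scan example."""
    m, n = len(X), len(X[0])
    state = [[0] * n for _ in range(m)]
    # s_init: the first row of the state is the first row of X.
    for i in range(n):
        state[0][i] = X[0][i]
    # s_update: each later row adds X[t] to the previous state row.
    for t in range(1, m):
        for i in range(n):
            state[t][i] = state[t - 1][i] + X[t][i]
    return state


res = emulate_scan([[1, 2], [3, 4], [5, 6]])
```

Each row of `res` is the running column-wise total, which is what a cumulative sum along the first axis produces.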

tvm.te.extern(shape, inputs, fcompute, name='extern', dtype=None, in_buffers=None, out_buffers=None, tag='', attrs=None)¶

Compute several tensors via an extern function.

Parameters

shape (tuple or list of tuples.) – The shape of the outputs.
inputs (list of Tensor) – The inputs
fcompute (lambda function of inputs, outputs -> stmt) – Specifies the IR statement to do the computation. See the following note for the function signature of fcompute.
Note

Parameters
- ins (list of tvm.tir.Buffer) - Placeholder for each input
- outs (list of tvm.tir.Buffer) - Placeholder for each output
Returns
- stmt (tvm.tir.Stmt) - The statement that carries out array computation.
name (str, optional) – The name hint of the tensor
dtype (str or list of str, optional) – The data types of outputs, by default dtype will be same as inputs.
in_buffers (tvm.tir.Buffer or list of tvm.tir.Buffer, optional) – Input buffers.
out_buffers (tvm.tir.Buffer or list of tvm.tir.Buffer, optional) – Output buffers.

tag (str, optional) – Additional tag information about the compute.
attrs (dict, optional) – The additional auxiliary attributes about the compute.

Returns: tensor – The created tensor, or a tuple of tensors if there are multiple outputs.
Return type: Tensor or list of Tensors

Example

In the code below, C is generated by calling the external PackedFunc tvm.contrib.cblas.matmul.

A = te.placeholder((n, l), name="A")
B = te.placeholder((l, m), name="B")
C = te.extern((n, m), [A, B],
              lambda ins, outs: tvm.tir.call_packed(
                  "tvm.contrib.cblas.matmul",
                  ins[0], ins[1], outs[0], 0, 0), name="C")

tvm.te.var(name='tindex', dtype='int32', span=None)¶

Create a new variable with specified name and dtype

Parameters

name (str) – The name
dtype (str) – The data type
span (Optional[Span]) – The location of this variable in the source.

Returns

var – The result symbolic variable.

Return type

Var

tvm.te.size_var(name='size', dtype='int32', span=None)¶

Create a new variable that represents a tensor shape size, which is non-negative.

Parameters

name (str) – The name
dtype (str) – The data type
span (Optional[Span]) – The location of this variable in the source.

Returns

var – The result symbolic shape variable.

Return type

SizeVar

tvm.te.const(value, dtype='int32', span=None)¶

Create a new constant with specified value and dtype

Parameters

value (Union[bool, int, float, numpy.ndarray, tvm.nd.NDArray]) – The constant value.
dtype (str) – The data type
span (Optional[Span]) – The location of this variable in the source.

Returns

const – The result constant expr.

Return type

PrimExpr

tvm.te.thread_axis(dom=None, tag='', name='', span=None)¶

Create a new IterVar to represent thread index.

Parameters

dom (Range or str) – The domain of iteration. When a str is passed, dom is set to None and the str is used as the tag.
tag (str, optional) – The thread tag
name (str, optional) – The name of the var.
span (Optional[Span]) – The location of this variable in the source.

Returns

axis – The thread itervar.

Return type

IterVar

tvm.te.reduce_axis(dom, name='rv', thread_tag='', span=None)¶

Create a new IterVar for reduction.

Parameters

dom (Range) – The domain of iteration.
name (str) – The name of the variable.
thread_tag (Optional[str]) – The name of the thread_tag.
span (Optional[Span]) – The location of this variable in the source.

Returns

axis – An iteration variable representing the value.

Return type

IterVar

tvm.te.create_prim_func(ops: List[tvm.te.tensor.Tensor], index_dtype_override: Optional[str] = None) → tvm.tir.function.PrimFunc¶

Create a TensorIR PrimFunc from tensor expression

Parameters: ops (List[Tensor]) – The source expression.

Example

We define a matmul kernel using the following code:

import tvm
from tvm import te
from tvm.te import create_prim_func
import tvm.script

A = te.placeholder((128, 128), name="A")
B = te.placeholder((128, 128), name="B")
k = te.reduce_axis((0, 128), "k")
C = te.compute((128, 128), lambda x, y: te.sum(A[x, k] * B[y, k], axis=k), name="C")
func = create_prim_func([A, B, C])
print(func.script())

If we want to use a TensorIR schedule to transform such a kernel, we need to use create_prim_func([A, B, C]) to create a schedulable PrimFunc. The generated function looks like:

@T.prim_func
def tir_matmul(a: T.handle, b: T.handle, c: T.handle) -> None:
    A = T.match_buffer(a, (128, 128))
    B = T.match_buffer(b, (128, 128))
    C = T.match_buffer(c, (128, 128))

    for i, j, k in T.grid(128, 128, 128):
        with T.block():
            vi, vj, vk = T.axis.remap("SSR", [i, j, k])
            with T.init():
                C[vi, vj] = 0.0
            C[vi, vj] += A[vi, vk] * B[vj, vk]

Returns: func – The created function.
Return type: tir.PrimFunc
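
The loop nest of the generated function can be followed in plain Python to see what it computes. This is a sketch on small lists rather than 128x128 buffers; the helper name `emulate_tir_matmul` is hypothetical, and treating T.init() as "runs on the first reduction step" is a simplification made for illustration.

```python
def emulate_tir_matmul(A, B):
    """Follow the loop nest of the generated PrimFunc: C[i, j] = sum_k A[i, k] * B[j, k]."""
    n, m, depth = len(A), len(B), len(A[0])
    C = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for k in range(depth):
                if k == 0:
                    # Corresponds to T.init(): runs on the first reduction step.
                    C[i][j] = 0.0
                C[i][j] += A[i][k] * B[j][k]
    return C


C = emulate_tir_matmul([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
```

Because B is indexed as B[j, k], the result equals A multiplied by the transpose of B.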

tvm.te.extern_primfunc(input_tensors: List[tvm.te.tensor.Tensor], primfunc: tvm.tir.function.PrimFunc, **kwargs)¶

Compute tensors via a schedulable TIR PrimFunc

Parameters

input_tensors (list of Tensor) – Input tensors that map to the corresponding primfunc input params.
primfunc (PrimFunc) – The TIR PrimFunc

Returns

tensor – The created tensor or tuple of tensors if it contains multiple outputs.

Return type

Tensor or list of Tensors

Example

In the code below, a TVMScript defined TIR PrimFunc is inlined into a TE ExternOp. Applying te.create_prim_func on this

A = te.placeholder((128, 128), name="A")
B = te.placeholder((128, 128), name="B")

@T.prim_func
def before_split(a: T.handle, b: T.handle) -> None:
    A = T.match_buffer(a, (128, 128))
    B = T.match_buffer(b, (128, 128))
    for i, j in T.grid(128, 128):
        with T.block("B"):
            vi, vj = T.axis.remap("SS", [i, j])
            B[vi, vj] = A[vi, vj] * 2.0

C = te.extern_primfunc([A, B], before_split)

class tvm.te.PlaceholderOp¶: Placeholder operation.

class tvm.te.ComputeOp¶: Scalar operation.

class tvm.te.TensorComputeOp¶: Tensor operation.

class tvm.te.ScanOp¶

Scan operation.

Attributes:

scan_axis

Represent the scan axis, only defined when it is a ScanOp

property scan_axis¶: Represent the scan axis, only defined when it is a ScanOp

class tvm.te.ExternOp¶: External operation.

class tvm.te.HybridOp¶

Hybrid operation.

Attributes:

axis

Represent the IterVar axis, also defined when it is a HybridOp

property axis¶: Represent the IterVar axis, also defined when it is a HybridOp

tvm.te.gradient(output, inputs, head=None)¶

Perform reverse-mode automatic differentiation.

Parameters

output (Tensor) – The tensor to differentiate.
inputs (List[Tensor]) – The list of input tensors with respect to which output is differentiated.
head (Tensor) – The adjoint of the output, in other words, some tensor, by which the Jacobians will be multiplied. Its shape must be of the form prefix + output.shape. If None is passed, the identity tensor of shape output.shape + output.shape will be used.

Returns

tensors – The result gradient, in the same order as the inputs

Return type

List[Tensor]

Example

from tvm import te, topi

x = te.placeholder((32, 3, 28, 28), name='x')
w1 = te.placeholder((10, 3, 3, 3), name='w1')
w2 = te.placeholder((10, 10, 3, 3), name='w2')
z1 = topi.nn.conv2d(x, w1, 1, 1, 1)
z2 = topi.nn.conv2d(z1, w2, 1, 1, 1)
y = topi.sum(z2)

# produce gradients
[dw1, dw2] = te.gradient(y, [w1, w2])

# produce Jacobians
[jw1, jw2] = te.gradient(z2, [w1, w2])

# produce gradients, the head adjoint for z2 is provided manually
[dw1, dw2] = te.gradient(z2, [w1, w2], topi.full_like(z2, 1.0))
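
The role of the head adjoint can be illustrated on a tiny hand-written function, without TVM. The sketch below uses an analytic Jacobian for y = (x0 * x1, x1 + x2); the function names are hypothetical, and only the contraction of head with the Jacobian is meant to mirror the description above.

```python
def jacobian(x):
    """Analytic Jacobian of y = (x0 * x1, x1 + x2); rows index outputs, columns index inputs."""
    x0, x1, _ = x
    return [[x1, x0, 0.0], [0.0, 1.0, 1.0]]


def apply_head(head, jac):
    """Contract a head adjoint of shape output.shape with the Jacobian."""
    n_inputs = len(jac[0])
    return [sum(head[o] * jac[o][i] for o in range(len(jac))) for i in range(n_inputs)]


x = (2.0, 3.0, 4.0)
# With no head, the result is the full Jacobian: one row per output element.
full_jacobian = jacobian(x)
# A head of ones sums the rows, giving the gradient of sum(y).
grad_of_sum = apply_head([1.0, 1.0], full_jacobian)
```

This corresponds to the last two cases in the example: omitting head yields Jacobians, while a head of ones yields the gradient of the summed output.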

tvm.te.hybrid¶

Hybrid Programming APIs of TVM Python Package.

This package maps a subset of Python to HalideIR so that:

1. Users can write preliminary versions of computation patterns that are not yet supported, and verify them against both real execution and Python semantic emulation.
2. So far, it is a text format dedicated to HalideIR Phase 0. Refer to tvm.lower for more details. A larger ambition of this module is to support all levels of HalideIR.

Functions:

`decorate`(func, fwrapped)	A wrapper call of decorator package, differs to call time
`source_to_op`(src, args, symbols, closure_vars)	Another level of wrapper
`script`(pyfunc)	Decorate a python function as hybrid script.
`build`(sch, inputs, outputs[, name])	Dump the current schedule to hybrid module

Classes:

HybridModule([src, name])

The usage of HybridModule is very similar to that of a conventional TVM module, but a conventional TVM module requires a function body that is already fully lowered.

tvm.te.hybrid.decorate(func, fwrapped)¶

A wrapper call of decorator package, differs to call time

Parameters

func (function) – The original function
fwrapped (function) – The wrapped function

class tvm.te.hybrid.HybridModule(src=None, name=None)¶

The usage of HybridModule is very similar to that of a conventional TVM module, but a conventional TVM module requires a function body that is already fully lowered. This conflicts with the fact that HybridModule was originally a text format for Phase 0 HalideIR. Thus, a completely separate module is defined.

Methods:

load(path)

Load the module from a python file

load(path)¶

Load the module from a python file

Parameters: path (str) – Path to the given python file

tvm.te.hybrid.source_to_op(src, args, symbols, closure_vars)¶

Another level of wrapper

Parameters

src (ast.node or str) – If an ast.node, then directly lower it. If a str, then parse it to ast and lower it.
args (list of Tensors or Vars) – The argument list of the function. Writing a function without arguments is NOT encouraged. Writing a function with side effects is NOT encouraged.
symbols (list of str) – The symbol list of the global context of the function.
closure_vars (dict) – A dict of external name reference captured by this function.

Returns

res – The result of output tensors of the formed OpNode.

Return type

list of output tensors

tvm.te.hybrid.script(pyfunc)¶

Decorate a python function as hybrid script.

The hybrid function supports emulation mode and parsing to the internal language IR.

Returns: hybrid_func – A decorated hybrid script function.
Return type: function

tvm.te.hybrid.build(sch, inputs, outputs, name='hybrid_func')¶

Dump the current schedule to hybrid module

Parameters

sch (tvm.te.Schedule) – The schedule to be dumped
inputs (An array of Tensors or Vars) – The inputs of the function body
outputs (An array of Tensors) – The outputs of the function body

Returns

module – The built result is wrapped in a HybridModule. The usage of HybridModule is roughly the same as that of normal TVM-built modules.

Return type

HybridModule