transform.h File Reference

S-TIR specific transformation passes.

#include <tvm/ir/transform.h>
#include <tvm/target/target.h>
#include <tvm/tirx/transform.h>
#include <string>
#include <vector>

Namespaces

 tvm
 An object that builds and maintains block scope and StmtSref mapping for Dependence analysis.
 
 tvm::s_tir
 
 tvm::s_tir::transform
 

Functions

tirx::PrimFunc tvm::s_tir::RenewDefs (const tirx::PrimFunc &func)
 Renew the definition nodes of a TIR function, including Var, Buffer and IterVar. This pass works as a simple DeepCopy that duplicates a function with different Vars and Buffers but the same behavior.
 
Pass tvm::s_tir::transform::CanonicalizeLoop ()
 Canonicalize loops to start from zero.
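
 To illustrate what starting from zero means for a loop, here is a minimal standalone C++ sketch. It does not use the TVM API; the function names VisitOriginal and VisitCanonical are hypothetical, and the sketch only assumes that the rewrite shifts the loop variable by the original lower bound.

```cpp
#include <cassert>
#include <vector>

// Original form: the loop variable starts at `min`, which may be non-zero.
std::vector<int> VisitOriginal(int min, int extent) {
  assert(extent >= 0);
  std::vector<int> visited;
  for (int i = min; i < min + extent; ++i) {
    visited.push_back(i);
  }
  return visited;
}

// Canonical form: the loop variable starts at zero, and the body adds
// `min` back wherever the original loop variable was used.
std::vector<int> VisitCanonical(int min, int extent) {
  assert(extent >= 0);
  std::vector<int> visited;
  for (int j = 0; j < extent; ++j) {
    visited.push_back(j + min);
  }
  return visited;
}
```

 Both functions visit the same index values in the same order, which is the property such a canonicalization has to preserve.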
 
Pass tvm::s_tir::transform::LowerCrossThreadReduction ()
 Lower cross-thread reduction from thread bindings to intrinsic function calls.
 
Pass tvm::s_tir::transform::LowerInitBlock ()
 Lower block init stmt into IfThenElse stmts.
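
 The effect of turning an init statement into a conditional can be sketched in plain C++ on a row-sum reduction. This is a conceptual illustration, not TVM code: the function names are hypothetical, and it assumes the guard fires on the first iteration of the reduction loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduction with a separate init step: the accumulator is reset
// before the reduction loop over k begins.
std::vector<int> RowSumWithInit(const std::vector<std::vector<int>>& a) {
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    assert(!a[i].empty());
    out[i] = 0;  // init statement
    for (std::size_t k = 0; k < a[i].size(); ++k) {
      out[i] += a[i][k];
    }
  }
  return out;
}

// Lowered form: the init statement moves into the loop body and is
// guarded by an if that holds only on the first reduction iteration.
std::vector<int> RowSumLowered(const std::vector<std::vector<int>>& a) {
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    assert(!a[i].empty());
    for (std::size_t k = 0; k < a[i].size(); ++k) {
      if (k == 0) {
        out[i] = 0;  // guarded init
      }
      out[i] += a[i][k];
    }
  }
  return out;
}
```

 The two forms agree as long as every reduction loop runs at least once, which is why the sketch asserts that each row is non-empty.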
 
Pass tvm::s_tir::transform::PlanAndUpdateBufferAllocationLocation ()
 Move each buffer allocation to its exact position (usually the LCA of the buffer accesses). This pass injects an opaque block with alloc_buffers at the allocation site.
 
Pass tvm::s_tir::transform::ConvertBlocksToOpaque ()
 Substitute all the block vars with the PrimExprs they are bound to, as indicated by the corresponding iter_values in BlockRealize, and convert the blocks to opaque blocks by removing all the iter_values in BlockRealize and iter_vars in Block.
 
Pass tvm::s_tir::transform::LiftThreadBinding ()
 Lift the same thread bindings to their LCA loops.
 
Pass tvm::s_tir::transform::CompactBufferAllocation (bool is_strict=true)
 Compact the buffer access region by removing the buffer regions that are not accessed, i.e. narrowing the buffer shape and adjusting the access region if necessary.
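
 Narrowing a buffer shape and adjusting its access region can be pictured with the following standalone C++ sketch. It is an assumption-laden illustration rather than TVM code: ComputeFull and ComputeCompact are hypothetical names, and the scratch buffer b stands in for an intermediate buffer whose accesses in iteration i touch only row i.

```cpp
#include <cassert>
#include <vector>

// Before compaction: the scratch buffer `b` has the full n*m shape,
// although iteration i only reads and writes row i.
std::vector<int> ComputeFull(const std::vector<int>& a, int n, int m) {
  assert(static_cast<int>(a.size()) == n * m);
  std::vector<int> c(n * m);
  for (int i = 0; i < n; ++i) {
    std::vector<int> b(n * m);
    for (int j = 0; j < m; ++j) b[i * m + j] = a[i * m + j] + 1;
    for (int j = 0; j < m; ++j) c[i * m + j] = b[i * m + j] * 2;
  }
  return c;
}

// After compaction: `b` shrinks to the accessed region (one row of m
// elements), and each access subtracts the start of that region, so
// the row offset i * m disappears from the index.
std::vector<int> ComputeCompact(const std::vector<int>& a, int n, int m) {
  assert(static_cast<int>(a.size()) == n * m);
  std::vector<int> c(n * m);
  for (int i = 0; i < n; ++i) {
    std::vector<int> b(m);
    for (int j = 0; j < m; ++j) b[j] = a[i * m + j] + 1;
    for (int j = 0; j < m; ++j) c[i * m + j] = b[j] * 2;
  }
  return c;
}
```

 The results are identical, but the compact version allocates m elements per iteration instead of n * m.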
 
Pass tvm::s_tir::transform::LowerMatchBuffer ()
 Remove match buffers inside the block and validate the binding.
 
Pass tvm::s_tir::transform::InjectPermutedLayout ()
 Inject permuted layout for shared memory.
 
Pass tvm::s_tir::transform::TransformMmaBufferLayout ()
 Transform Mma scope (m16n8k8.matrixA/B/C) to local scope with layout transformation.
 
Pass tvm::s_tir::transform::LowerOpaqueBlock ()
 Remove the block to ensure that the TIR cannot be scheduled again.
 
Pass tvm::s_tir::transform::UnifyThreadBinding ()
 Unify all the thread bindings for "blockIdx.x/y/z", "threadIdx.x/y/z", and "vthread.x/y/z". Before the unification, two vars that are bound to a thread axis (e.g., "threadIdx.x") use different IterVars and variables in their AttrStmts. After the unification, we use a consolidated IterVar and a variable for them.
 
Pass tvm::s_tir::transform::InjectSoftwarePipeline ()
 Transform annotated loops into pipelined ones in which producers and consumers overlap, using the information provided in loop annotations. This enables optimization techniques such as prefetching and pipeline parallelism.
 
Pass tvm::s_tir::transform::LowerAutoCopy ()
 Automatically do memory optimizations for auto copy blocks.
 
Pass tvm::s_tir::transform::ManifestSharedMemoryLocalStage ()
 Add the explicit local stage for the shared memory access on GPU.
 
Pass tvm::s_tir::transform::AnnotateIrregularLoop ()
 Mark irregular loops with an annotation.
 
Pass tvm::s_tir::transform::LoopPartition ()
 Partition loops in the stmt.
 
Pass tvm::s_tir::transform::InjectVirtualThread ()
 Inject virtual thread loops.
 
Pass tvm::s_tir::transform::InjectDoubleBuffer ()
 Inject double buffer statements.
 
Pass tvm::s_tir::transform::HoistIfThenElse (tvm::ffi::String variant="")
 Hoist loop-invariant IfThenElse nodes to outside the eligible loops.
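
 Hoisting a loop-invariant conditional can be sketched in standalone C++ as follows. The function names are hypothetical and the code does not use the TVM API; it only shows the general shape of the rewrite, where a branch whose condition does not depend on the loop variable moves outside the loop.

```cpp
#include <cassert>
#include <vector>

// Before hoisting: `flag` does not depend on the loop variable,
// yet the branch is evaluated on every iteration.
int AccumulateBefore(const std::vector<int>& x, bool flag) {
  int sum = 0;
  for (int v : x) {
    if (flag) {
      sum += v;
    } else {
      sum -= v;
    }
  }
  return sum;
}

// After hoisting: the branch is evaluated once, and each arm
// contains its own copy of the loop with a branch-free body.
int AccumulateAfter(const std::vector<int>& x, bool flag) {
  int sum = 0;
  if (flag) {
    for (int v : x) sum += v;
  } else {
    for (int v : x) sum -= v;
  }
  return sum;
}
```

 The trade-off visible here is general: the loop body becomes simpler at the cost of duplicating the loop in each arm of the conditional.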
 
Pass tvm::s_tir::transform::HoistExpression ()
 Hoist loop-invariant expressions to outside the eligible loops.
 
Pass tvm::s_tir::transform::RenormalizeSplitPattern ()
 Renormalize the split pattern from floordiv(floormod()) to floormod(floordiv()).
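
 One arithmetic identity that fits this description is floordiv(floormod(x, outer * inner), inner) == floormod(floordiv(x, inner), outer) for positive outer and inner. The standalone C++ sketch below checks it numerically. Whether the pass applies exactly this rule is an assumption; the helper names FloorDiv, FloorMod, SplitBefore and SplitAfter are hypothetical and not part of the TVM API.

```cpp
#include <cassert>

// Floor division for a positive divisor. C++ `/` truncates toward
// zero, so negative dividends with a remainder need a correction.
int FloorDiv(int a, int b) {
  assert(b > 0);
  int q = a / b;
  return (a % b != 0 && a < 0) ? q - 1 : q;
}

// Floor modulo for a positive divisor; the result lies in [0, b).
int FloorMod(int a, int b) {
  assert(b > 0);
  int r = a % b;
  return r < 0 ? r + b : r;
}

// Pattern before renormalization: floordiv(floormod(x, outer * inner), inner).
int SplitBefore(int x, int outer, int inner) {
  return FloorDiv(FloorMod(x, outer * inner), inner);
}

// Pattern after renormalization: floormod(floordiv(x, inner), outer).
int SplitAfter(int x, int outer, int inner) {
  return FloorMod(FloorDiv(x, inner), outer);
}
```

 Writing x = q * (outer * inner) + r with 0 <= r < outer * inner shows why both forms agree: each one evaluates to floor(r / inner).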
 
Pass tvm::s_tir::transform::RewriteUnsafeSelect ()
 Detect and rewrite unsafe select that contains memory access.
 
Pass tvm::s_tir::transform::InstrumentBoundCheckers ()
 Instrument bound checkers.
 
Pass tvm::s_tir::transform::InjectPTXLDG32 (bool enable_inject=true)
 Rewrite global to local memory copy on CUDA with the ldg32 instruction.
 
Pass tvm::s_tir::transform::InstrumentProfileIntrinsics ()
 Insert intrinsic calls to instrument function and loop level profiling.
 
Pass tvm::s_tir::transform::LowerVtcmAlloc ()
 Lower VTCM allocations.
 
Pass tvm::s_tir::transform::ThreadSync (tvm::ffi::String storage_scope)
 Insert sync between parallel read/write of shared buffers.
 
Pass tvm::s_tir::transform::InferFragment ()
 Infer the TensorCore fragment information using tensor intrinsics.
 
Pass tvm::s_tir::transform::LowerThreadAllreduce ()
 Lower cross thread allreduce.
 
Pass tvm::s_tir::transform::LowerAsyncDMA ()
 Lower Async TIR primitives to DMA copy and wait builtins.
 
Pass tvm::s_tir::transform::InjectPTXAsyncCopy ()
 Rewrite global to shared memory copy on CUDA with asynchronous copy.
 
Pass tvm::s_tir::transform::MergeSharedMemoryAllocations ()
 Merge multiple TIR-level shared memory allocations into one.
 
Pass tvm::s_tir::transform::DefaultGPUSchedule ()
 Set default thread bindings for GPU PrimFuncs.
 
Pass tvm::s_tir::transform::RemoveWeightLayoutRewriteBlock (bool skip_tensor_rewrite=false)
 Remove the weight layout rewrite block before benchmarking.
 
Pass tvm::s_tir::transform::RemoveStoreUndef ()
 Remove stores of tirx::builtin::undef.
 
Pass tvm::s_tir::transform::DecorateDeviceScope ()
 Decorate the entire function body as a device function.
 
Pass tvm::s_tir::transform::UseAssumeToReduceBranches ()
 Eliminate branches by leveraging buffer assumptions (T.assume).
 

Detailed Description

S-TIR specific transformation passes.