Installation
There are two pieces:
the TIRx compiler (tvm.tirx), which ships inside Apache TVM and is all you need to write and compile kernels;
the optional kernel library (tirx-kernels), a set of ready-made GEMM and attention kernels built with TIRx.
Requirements
Python ≥ 3.10.
An NVIDIA GPU with a recent CUDA toolkit. The bundled kernels target Blackwell (sm_100a); the compiler itself targets GPUs and accelerators more broadly.
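The requirements above can be checked before installing anything. The following is a minimal sketch, not part of TIRx: the helper name check_requirements is hypothetical, and it assumes that a working driver puts nvidia-smi on PATH and that a CUDA toolkit install puts nvcc on PATH, which may not hold on every system.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum Python version from the requirements list


def check_requirements(version_info=sys.version_info):
    """Return a dict mapping each requirement to True/False (hypothetical helper)."""
    return {
        "python>=3.10": tuple(version_info[:2]) >= MIN_PYTHON,
        # Assumption: the NVIDIA driver install exposes nvidia-smi on PATH.
        "nvidia-smi on PATH": shutil.which("nvidia-smi") is not None,
        # Assumption: the CUDA toolkit install exposes nvcc on PATH.
        "nvcc on PATH": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{'ok' if ok else 'MISSING':8s}{name}")
```

A missing nvcc does not necessarily mean the toolkit is absent; it may simply be installed outside PATH.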
Install the TIRx compiler
Install the Apache TVM wheel (the TIRx compiler is the tvm.tirx module):
pip install apache-tvm==0.25.0
Verify:
python -c "import tvm, tvm.tirx; print(tvm.__version__)"
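If the one-liner fails, it helps to know whether tvm itself is missing or only the tvm.tirx submodule. A small sketch of that check using the standard library follows; module_available is a hypothetical helper name, and nothing here is a TVM API.

```python
import importlib.util


def module_available(name):
    """True if `name` can be located (parent packages are imported as a side effect)."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `tvm` for `tvm.tirx`) is missing.
        return False


if __name__ == "__main__":
    for name in ("tvm", "tvm.tirx"):
        print(f"{name}: {'found' if module_available(name) else 'not found'}")
```

If tvm is found but tvm.tirx is not, the installed wheel is likely a build that does not include the TIRx compiler.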
Install the kernel library (optional)
tirx-kernels provides prebuilt kernels (fp16_bf16_gemm,
fp8_blockwise_gemm, nvfp4_gemm, flash_attention4). It has no PyPI
wheel — install it from source:
git clone https://github.com/mlc-ai/tirx-kernels
cd tirx-kernels
pip install -e .
Its runtime dependencies are not pulled from PyPI and must be installed
separately. They are imported lazily, so import tirx_kernels and kernel
discovery work without them; they are needed only to compile or run a
kernel:
| Dependency | Needed by | Notes |
|---|---|---|
|  | all kernels | the TIRx compiler (installed above, or put a source checkout’s |
|  | all kernels | a CUDA build matching your GPU |
|  |  | optional: quantization helpers and the reference baseline |
|  |  | optional: quantization and the baseline |
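Lazy importing, as described above, means a missing dependency only surfaces when it is first used. The sketch below shows one common way to get that behavior in Python. It is an illustration of the general pattern, not the mechanism tirx-kernels actually uses, and LazyModule is a hypothetical name.

```python
import importlib


class LazyModule:
    """Defer importing `name` until one of its attributes is first accessed."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only for attributes not found on the instance itself,
        # i.e. for members of the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


if __name__ == "__main__":
    missing = LazyModule("some_missing_dependency")  # no error at this point
    math_mod = LazyModule("math")
    print(math_mod.sqrt(16.0))  # the real import happens on this line
```

With this pattern, constructing the wrapper never fails; a ModuleNotFoundError is raised only when code touches an attribute of a dependency that is not installed.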
Build from source
To develop TIRx or build the docs, build TVM from source and make it importable. See Install from Source for the full instructions; in short:
export TVM_HOME=/path/to/tvm
export TVM_LIBRARY_PATH=$TVM_HOME/build
export PYTHONPATH=$TVM_HOME/python:$PYTHONPATH
python -c "import tvm, tvm.tirx; print(tvm.__file__)"
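The printed path should point into the source checkout rather than an installed wheel. A minimal sketch of that comparison follows; is_inside and check_source_build are hypothetical helper names, and the check assumes the layout implied by the exports above, with the Python package under $TVM_HOME/python.

```python
import os
from pathlib import Path


def is_inside(path, root):
    """True if `path` is `root` or lies below it, after resolving symlinks."""
    path, root = Path(path).resolve(), Path(root).resolve()
    return path == root or root in path.parents


def check_source_build(tvm_file, env=os.environ):
    """Report whether `tvm_file` (i.e. tvm.__file__) comes from $TVM_HOME/python."""
    home = env.get("TVM_HOME")
    if not home:
        return "TVM_HOME is not set"
    if not is_inside(tvm_file, Path(home) / "python"):
        return f"tvm was imported from {tvm_file}, not from $TVM_HOME/python"
    return "ok"
```

After import tvm, calling check_source_build(tvm.__file__) returns "ok" when the source checkout is the one being imported; any other message usually means a previously installed wheel is shadowing it on the import path.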