tvm_ffi.cpp.nvrtc.nvrtc_compile


tvm_ffi.cpp.nvrtc.nvrtc_compile(source, *, name='kernel.cu', arch=None, extra_opts=None)

Compile CUDA source code to CUBIN using NVRTC.

This function uses the NVIDIA Runtime Compilation (NVRTC) library to compile CUDA C++ source code into a CUBIN binary that can be loaded and executed using the CUDA Driver API.

Parameters:
  • source (str) – The CUDA C++ source code to compile.

  • name (str, optional) – The name to use for the source file (for error messages). Default: "kernel.cu".

  • arch (str, optional) – The target GPU architecture (e.g., "sm_75", "sm_80", "sm_89"). If not specified, attempts to auto-detect from the current GPU.

  • extra_opts (Sequence[str], optional) – Additional compilation options to pass to NVRTC (e.g., ["-I/path/to/include", "-DDEFINE=1"]).
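
The keyword arguments above can be assembled programmatically. The following is a minimal sketch, not part of tvm_ffi: build_compile_kwargs and its include_dirs and defines parameters are hypothetical names introduced for illustration. It only relies on the -I and -D option forms and the parameter names shown in the list above.

```python
from typing import Mapping, Optional, Sequence


def build_compile_kwargs(
    arch: Optional[str] = None,
    include_dirs: Sequence[str] = (),
    defines: Optional[Mapping[str, object]] = None,
    name: str = "kernel.cu",
) -> dict:
    """Assemble keyword arguments for nvrtc_compile (hypothetical helper)."""
    # Turn include directories and macro definitions into NVRTC option strings.
    extra_opts = [f"-I{path}" for path in include_dirs]
    for key, value in (defines or {}).items():
        extra_opts.append(f"-D{key}={value}")

    kwargs = {"name": name}
    if arch is not None:
        # Leave arch out entirely to keep the documented auto-detection.
        kwargs["arch"] = arch
    if extra_opts:
        kwargs["extra_opts"] = extra_opts
    return kwargs


kwargs = build_compile_kwargs(
    arch="sm_80",
    include_dirs=["/path/to/include"],
    defines={"DEFINE": 1},
)
# Intended use, on a machine with tvm_ffi and the CUDA bindings installed:
#   from tvm_ffi.cpp import nvrtc
#   cubin_bytes = nvrtc.nvrtc_compile(cuda_source, **kwargs)
```

Omitting arch from the dictionary, rather than passing a guessed value, keeps the auto-detection behavior described for that parameter.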

Return type:

bytes

Returns:

bytes – The compiled CUBIN binary data.

Raises:

RuntimeError – If NVRTC compilation fails or CUDA bindings are not available.
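
Because failures surface as RuntimeError, callers may want to catch the exception and fall back gracefully. The sketch below is one possible pattern, not part of tvm_ffi: compile_or_none is a hypothetical wrapper that takes the compile function as an argument, so it can be exercised without a GPU.

```python
from typing import Callable, Optional


def compile_or_none(
    compile_fn: Callable[..., bytes], source: str, **kwargs
) -> Optional[bytes]:
    """Call compile_fn(source, **kwargs); return None if it raises RuntimeError."""
    try:
        return compile_fn(source, **kwargs)
    except RuntimeError as err:
        # Covers both failure modes listed above: a failed compilation
        # and unavailable CUDA bindings.
        print(f"NVRTC compilation failed: {err}")
        return None


# Intended use, on a machine with tvm_ffi and the CUDA bindings installed:
#   from tvm_ffi.cpp import nvrtc
#   cubin_bytes = compile_or_none(nvrtc.nvrtc_compile, cuda_source, arch="sm_80")
#   if cubin_bytes is None:
#       ...  # fall back to a CPU path or report the error
```

Passing the compile function in explicitly is a design choice for this sketch; calling nvrtc.nvrtc_compile directly inside a try/except block works equally well.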

Example

from tvm_ffi.cpp import nvrtc

cuda_source = '''
extern "C" __global__ void add_one(float* x, float* y, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        y[idx] = x[idx] + 1.0f;
    }
}
'''

cubin_bytes = nvrtc.nvrtc_compile(cuda_source)
# Pass cubin_bytes to tvm_ffi.cpp.load_inline through its embed_cubin parameter.