Define TVM_FFI_EMBED_CUBIN

Define Documentation

TVM_FFI_EMBED_CUBIN(name)

Macro to embed a CUBIN module with static initialization.

This macro declares external symbols for the embedded CUBIN data and creates a singleton struct that manages the CubinModule instance. The CUBIN data symbols must be named __tvm_ffi__cubin_<name> and __tvm_ffi__cubin_<name>_end; they are typically created with objcopy and ld.
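
As a rough sketch of the mechanism (illustrative only, not the actual macro expansion, which lives in <tvm/ffi/extra/cuda/cubin_launcher.h>), the two symbols delimit a byte range that the singleton hands to CubinModule; the helper functions below are hypothetical:

#include <cstddef>

// Hypothetical sketch for TVM_FFI_EMBED_CUBIN(my_kernels).
// These symbols are provided by the object file produced with objcopy/ld:
extern "C" const char __tvm_ffi__cubin_my_kernels[];      // first byte of the CUBIN
extern "C" const char __tvm_ffi__cubin_my_kernels_end[];  // one past the last byte

// The singleton wraps this byte range on first use (hypothetical helpers):
inline const void* EmbeddedCubinData() { return __tvm_ffi__cubin_my_kernels; }
inline std::size_t EmbeddedCubinSize() {
  return static_cast<std::size_t>(__tvm_ffi__cubin_my_kernels_end -
                                  __tvm_ffi__cubin_my_kernels);
}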

The TVM-FFI embedding utilities described below automatically handle:

  • Symbol renaming to __tvm_ffi__cubin_<name> format

  • Adding a .note.GNU-stack section for security

  • Symbol localization to prevent conflicts

Creating Embedded CUBIN with TVM-FFI Utilities

TVM-FFI provides utilities to simplify CUBIN embedding. You have two options:

Option 1: CMake Utility (Recommended)

Use the tvm_ffi_embed_cubin CMake function:

# Find tvm_ffi package (provides tvm_ffi_embed_cubin utility)
find_package(tvm_ffi CONFIG REQUIRED)
find_package(CUDAToolkit REQUIRED)

# Compile CUDA kernel to CUBIN
tvm_ffi_generate_cubin(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/kernel.cubin
  SOURCE src/kernel.cu
  ARCH native  # or sm_75, sm_80, etc.
)

# Embed CUBIN into C++ object file
tvm_ffi_embed_cubin(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/mycode_with_cubin.o
  SOURCE src/mycode.cc
  CUBIN ${CMAKE_CURRENT_BINARY_DIR}/kernel.cubin
  NAME my_kernels  # Must match TVM_FFI_EMBED_CUBIN(my_kernels) in code
)

# Link into shared library
add_library(mylib SHARED ${CMAKE_CURRENT_BINARY_DIR}/mycode_with_cubin.o)
target_link_libraries(mylib PRIVATE tvm_ffi_header CUDA::cudart)
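
For reference, the src/mycode.cc compiled above must invoke the macro with the matching name; a minimal skeleton (everything beyond the macro invocation is illustrative):

// src/mycode.cc -- minimal skeleton matching the CMake example
#include <tvm/ffi/extra/cuda/cubin_launcher.h>

// The name must match the NAME my_kernels argument of tvm_ffi_embed_cubin.
TVM_FFI_EMBED_CUBIN(my_kernels);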

Option 2: Python Utility

Use the tvm_ffi.utils.embed_cubin command-line tool:

# Step 1: Compile CUDA kernel to CUBIN
nvcc --cubin -arch=sm_75 kernel.cu -o kernel.cubin

# Step 2: Compile C++ source to object file
g++ -c -fPIC -std=c++17 -I/path/to/tvm-ffi/include mycode.cc -o mycode.o

# Step 3: Embed CUBIN using Python utility
python -m tvm_ffi.utils.embed_cubin \
    --output-obj mycode_with_cubin.o \
    --input-obj mycode.o \
    --cubin kernel.cubin \
    --name my_kernels

# Step 4: Link into shared library
g++ -o mylib.so -shared mycode_with_cubin.o -lcudart
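
The utility applies the post-processing listed earlier (renaming the data symbols to the __tvm_ffi__cubin_my_kernels convention, adding the .note.GNU-stack section, and localizing symbols), so the resulting object file links the same way as one produced by the CMake route.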

See also

CubinModule, CubinKernel

Usage in C++ Code

In your C++ source file, use the embedded CUBIN:

#include <tvm/ffi/extra/cuda/cubin_launcher.h>

// Declare the embedded CUBIN module (name must match CMake NAME parameter)
TVM_FFI_EMBED_CUBIN(my_kernels);

void MyFunction() {
  // Get kernel from embedded CUBIN (cached in static variable for efficiency)
  static auto kernel = TVM_FFI_EMBED_CUBIN_GET_KERNEL(my_kernels, "my_kernel");
  // Use kernel...
}
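
Because the kernel handle is cached in a function-local static, the CUBIN module is loaded and the kernel looked up only on the first call; later calls reuse the cached handle.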

Option 3: Python Integration with load_inline

When you pass the embed_cubin parameter to tvm_ffi.cpp.load_inline(), the CUBIN data is embedded automatically; the Python utility above is invoked internally:

from tvm_ffi import cpp
from tvm_ffi.cpp import nvrtc

# Compile CUDA source (cuda_source is a string of CUDA kernel code) to CUBIN bytes
cubin_bytes = nvrtc.nvrtc_compile(cuda_source)

# Load with embedded CUBIN - automatically handles embedding
mod = cpp.load_inline(
    "my_module",
    cuda_sources=cpp_code,
    embed_cubin={"my_kernels": cubin_bytes},
    extra_ldflags=["-lcudart"]
)
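
The key "my_kernels" in the embed_cubin dict must match the TVM_FFI_EMBED_CUBIN(my_kernels) declaration inside cpp_code, just as the NAME parameter must in the CMake flow.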

Note

CMake Setup: To use the utilities, add to your CMakeLists.txt:

find_package(tvm_ffi CONFIG REQUIRED)  # Provides tvm_ffi_embed_cubin utility

Parameters:
  • name – The identifier for this embedded CUBIN module. It must match the symbol names created with objcopy, the NAME argument of tvm_ffi_embed_cubin in CMake, or the key in the embed_cubin dict.