tvm_ffi.cpp.build


tvm_ffi.cpp.build(name, *, sources=None, cpp_files=None, cuda_files=None, extra_cflags=None, extra_cuda_cflags=None, extra_ldflags=None, extra_include_paths=None, build_directory=None, backend=None, output=None)

Compile and build a C/C++/CUDA module from source files.

This function compiles the given C, C++, and/or CUDA source files into a shared library or object file. The compiler is selected automatically based on file extension:

  • .c — compiled with the C compiler ($CC)

  • .cc, .cpp, .cxx — compiled with the C++ compiler ($CXX)

  • .cu — compiled with the CUDA/HIP compiler (nvcc or hipcc)

  • .o, .obj — pre-compiled objects, passed directly to the linker

When output is None (the default) or has a shared-library extension, object files are linked into a shared library. When output has an object-file extension (.o, .obj), linking is skipped and the path to the object file is returned.
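The extension-based dispatch described above can be sketched in plain Python (a simplified illustration, not the library's actual implementation):

```python
import os

# Simplified sketch of the extension-based compiler dispatch described
# above; the real build() implementation may differ in detail.
C_EXTS = {".c"}
CPP_EXTS = {".cc", ".cpp", ".cxx"}
CUDA_EXTS = {".cu"}
OBJ_EXTS = {".o", ".obj"}

def classify_source(path):
    """Return which tool handles a given source file."""
    ext = os.path.splitext(path)[1].lower()
    if ext in C_EXTS:
        return "cc"      # C compiler ($CC)
    if ext in CPP_EXTS:
        return "cxx"     # C++ compiler ($CXX)
    if ext in CUDA_EXTS:
        return "nvcc"    # CUDA/HIP compiler (nvcc or hipcc)
    if ext in OBJ_EXTS:
        return "linker"  # pre-compiled object, passed to the linker
    raise ValueError(f"unsupported source extension: {ext!r}")
```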

Note that this function does not automatically export functions to the tvm ffi module. You must explicitly use the TVM FFI export macros (e.g., TVM_FFI_DLL_EXPORT_TYPED_FUNC) in your source files to export functions. This gives you fine-grained control over which functions are exported and how.

Extra compiler and linker flags can be provided via the extra_cflags, extra_cuda_cflags, and extra_ldflags parameters. The default flags are sufficient for most builds, but additional flags can be supplied when needed.
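For instance, a hedged usage sketch passing extra flags (the flag values and the file my_ops.cpp are illustrative, not defaults; the call is guarded so the sketch is harmless when tvm_ffi or the source file is absent):

```python
import os

# Hypothetical extra flags for a build that uses OpenMP and a custom
# include directory; these are examples, not the documented defaults.
extra_cflags = ["-fopenmp"]
extra_ldflags = ["-fopenmp"]
extra_include_paths = ["third_party/include"]

try:
    import tvm_ffi.cpp
except ImportError:
    tvm_ffi = None  # sketch only; tvm_ffi is not installed here

if tvm_ffi is not None and os.path.exists("my_ops.cpp"):
    lib_path = tvm_ffi.cpp.build(
        name="my_ops",
        sources="my_ops.cpp",
        extra_cflags=extra_cflags,
        extra_ldflags=extra_ldflags,
        extra_include_paths=extra_include_paths,
    )
```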

The include directories of tvm ffi and dlpack are added to the compiler's header search path by default, so you can include any tvm ffi header in your source files. Additional include paths can be provided via the extra_include_paths parameter to include custom headers in your source code.

The compiled shared library is cached in a cache directory to avoid recompilation. The build_directory parameter specifies the build directory; if omitted, a default tvm ffi cache directory is used. The default cache directory is ~/.cache/tvm-ffi and can be overridden via the TVM_FFI_CACHE_DIR environment variable.
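The cache-directory resolution just described can be sketched as (a minimal sketch of the documented behavior; the real library may normalize paths differently):

```python
import os

# Hedged sketch of the documented cache-directory resolution:
# TVM_FFI_CACHE_DIR wins when set, else fall back to ~/.cache/tvm-ffi.
def default_cache_dir():
    return os.environ.get(
        "TVM_FFI_CACHE_DIR",
        os.path.join(os.path.expanduser("~"), ".cache", "tvm-ffi"),
    )
```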

The C compiler is controlled by the $CC environment variable (default: cc on Unix, cl on Windows). The C++ compiler is controlled by the $CXX environment variable (default: c++ on Unix, cl on Windows).
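The documented $CC / $CXX selection amounts to the following (a sketch of the stated defaults, not the library's actual code):

```python
import os
import sys

# Sketch of the documented compiler selection: $CC / $CXX override the
# platform defaults (cc / c++ on Unix, cl on Windows).
def pick_compilers():
    on_windows = sys.platform == "win32"
    cc = os.environ.get("CC", "cl" if on_windows else "cc")
    cxx = os.environ.get("CXX", "cl" if on_windows else "c++")
    return cc, cxx
```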

Parameters:
  • name (str) – The name of the tvm ffi module.

  • sources (Sequence[str] | str | None, default: None) –

    Source files to compile. The compiler is auto-detected from the file extension:

    • .c → C compiler ($CC)

    • .cc, .cpp, .cxx → C++ compiler ($CXX)

    • .cu → CUDA/HIP compiler (nvcc or hipcc)

    • .o, .obj → pre-compiled objects, passed directly to the linker

    It can be a list of file paths or a single file path.

  • cpp_files (Sequence[str] | str | None, default: None) – Alias for sources, kept for backward compatibility.

  • cuda_files (Sequence[str] | str | None, default: None) – Alias for sources, kept for backward compatibility.

  • extra_cflags (Sequence[str] | None, default: None) –

    Extra compiler flags applied to both C and C++ compilation. The C++ default flags are:

    • On Linux/macOS: ['-std=c++17', '-fPIC', '-O2']

    • On Windows: ['/std:c++17', '/MD', '/O2']

    The C default flags omit -std=c++17 and /EHsc.

  • extra_cuda_cflags (Sequence[str] | None, default: None) –

    The extra compiler flags for CUDA compilation. The default flags are:

    • ['-Xcompiler', '-fPIC', '-std=c++17', '-O2'] (Linux/macOS)

    • ['-Xcompiler', '/std:c++17', '/O2'] (Windows)

  • extra_ldflags (Sequence[str] | None, default: None) –

    The extra linker flags. The default flags are:

    • On Linux/macOS: ['-shared', '-L<tvm_ffi_lib_path>', '-ltvm_ffi']

    • On Windows: ['/DLL', '/LIBPATH:<tvm_ffi_lib_path>', '<tvm_ffi_lib_name>.lib']

  • extra_include_paths (Sequence[str] | None, default: None) – The extra include paths for header files. Both absolute and relative paths are supported.

  • build_directory (str | None, default: None) – The build directory. If not specified, a default tvm ffi cache directory will be used. By default, the cache directory is ~/.cache/tvm-ffi. You can also set the TVM_FFI_CACHE_DIR environment variable to specify the cache directory.

  • backend (str | None, default: None) – The GPU backend to use. It can be “cuda” or “hip”. If not specified, the backend will be automatically determined based on the available GPU and the provided source code.

  • output (str | None, default: None) – Output filename that determines the build type from its extension. When None (the default), builds a shared library (.so on Unix, .dll on Windows). Use an object-file extension (e.g., "my_ops.o") to skip linking and produce a relocatable object file. The file is placed in the build directory.

Return type:

str

Returns:

path (str) – The path to the built shared library or object file.

Example

import torch
from tvm_ffi import Module
import tvm_ffi.cpp

# Assume we have a C++ source file "my_ops.cpp" with the following content:
# ```cpp
# #include <tvm/ffi/container/tensor.h>
# #include <tvm/ffi/dtype.h>
# #include <tvm/ffi/error.h>
# #include <tvm/ffi/extra/c_env_api.h>
# #include <tvm/ffi/function.h>
#
# void add_one_cpu(tvm::ffi::TensorView x, tvm::ffi::TensorView y) {
#   TVM_FFI_ICHECK(x.ndim() == 1) << "x must be a 1D tensor";
#   DLDataType f32_dtype{kDLFloat, 32, 1};
#   TVM_FFI_ICHECK(x.dtype() == f32_dtype) << "x must be a float tensor";
#   TVM_FFI_ICHECK(y.ndim() == 1) << "y must be a 1D tensor";
#   TVM_FFI_ICHECK(y.dtype() == f32_dtype) << "y must be a float tensor";
#   TVM_FFI_ICHECK(x.size(0) == y.size(0)) << "x and y must have the same shape";
#   for (int i = 0; i < x.size(0); ++i) {
#     static_cast<float*>(y.data_ptr())[i] = static_cast<float*>(x.data_ptr())[i] + 1;
#   }
# }
#
# TVM_FFI_DLL_EXPORT_TYPED_FUNC(add_one_cpu, add_one_cpu);
# ```

# compile the cpp source file and get the library path
lib_path: str = tvm_ffi.cpp.build(
    name="my_ops",
    sources="my_ops.cpp",
)

# load the module
mod: Module = tvm_ffi.load_module(lib_path)

# use the function from the loaded module
x = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32)
y = torch.empty_like(x)
mod.add_one_cpu(x, y)
torch.testing.assert_close(x + 1, y)