#ifndef TVM_RUNTIME_DISCO_CUDA_IPC_MEMORY_H_
#define TVM_RUNTIME_DISCO_CUDA_IPC_MEMORY_H_
The CUDA IPC (interprocess communication) memory object, which internally contains data pointers to C...
Definition: cuda_ipc_memory.h:41
static constexpr const bool _type_mutable
Definition: cuda_ipc_memory.h:73
std::vector< void * > barrier_out
The pointers to output barrier signals of all workers for all-reduce. It has "num_workers" pointers,...
Definition: cuda_ipc_memory.h:69
std::vector< void * > barrier_in
The pointers to input barrier signals of all workers for all-reduce. It has "num_workers" pointers,...
Definition: cuda_ipc_memory.h:64
TVM_FFI_DECLARE_OBJECT_INFO("tvm.runtime.disco.cuda_ipc_memory", CUDAIPCMemoryObj, Object)
int barrier_flag
The integer buffer flag for all-reduce.
Definition: cuda_ipc_memory.h:71
int num_workers
The number of GPU workers.
Definition: cuda_ipc_memory.h:44
int worker_id
The worker id corresponding to this IPC memory object.
Definition: cuda_ipc_memory.h:46
std::vector< void * > remote_data
The data pointers of all all-reduce inputs. It has "num_workers" pointers. The i-th pointer is the da...
Definition: cuda_ipc_memory.h:53
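The member entries above describe one pointer table per purpose (`remote_data`, `barrier_in`, `barrier_out`), each holding "num_workers" pointers. The following is a minimal host-only sketch of that layout, not the real TVM class: `MockIPCMemory`, `HasConsistentLayout`, and `SumElementAcrossWorkers` are hypothetical names, plain host buffers stand in for CUDA IPC pointers, and it assumes that the i-th entry of `remote_data` refers to worker i's buffer (the description above is truncated, so that reading is an assumption).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in that mirrors the documented fields of CUDAIPCMemoryObj.
// It holds ordinary host pointers so the sketch runs without CUDA.
struct MockIPCMemory {
  int num_workers;                 // number of GPU workers
  int worker_id;                   // the worker this object belongs to
  std::vector<void*> remote_data;  // one data pointer per worker
  std::vector<void*> barrier_in;   // one input barrier pointer per worker
  std::vector<void*> barrier_out;  // one output barrier pointer per worker
  int barrier_flag;                // integer buffer flag for all-reduce
};

// Check the size relationship stated in the descriptions: every pointer
// table has exactly num_workers entries and worker_id is a valid index.
bool HasConsistentLayout(const MockIPCMemory& m) {
  if (m.num_workers <= 0) return false;
  if (m.worker_id < 0 || m.worker_id >= m.num_workers) return false;
  const std::size_t n = static_cast<std::size_t>(m.num_workers);
  return m.remote_data.size() == n && m.barrier_in.size() == n &&
         m.barrier_out.size() == n;
}

// Host-side illustration of why every worker holds pointers to all peers:
// a reduction can read element `index` from each worker's buffer.
// Assumes every buffer stores floats and is long enough for `index`.
float SumElementAcrossWorkers(const MockIPCMemory& m, std::size_t index) {
  float total = 0.0f;
  for (void* p : m.remote_data) {
    total += static_cast<const float*>(p)[index];
  }
  return total;
}
```

In a real multi-GPU setting the reduction would run in a CUDA kernel and the barrier pointers would be used to synchronize workers; the sketch only illustrates the indexing convention.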
Managed reference to CUDAIPCMemoryObj.
Definition: cuda_ipc_memory.h:81
TVM_FFI_DEFINE_OBJECT_REF_METHODS_NULLABLE(CUDAIPCMemory, ObjectRef, CUDAIPCMemoryObj)
static memory::Allocator * GlobalAllocator()
Get the global singleton CUDAIPCMemory allocator.
static CUDAIPCMemory GetIPCMemoryFromDevicePtr(void *ptr)
Given a local CUDA data pointer, return the CUDAIPCMemory object of the pointer.
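A lookup from a local data pointer back to its owning object, as `GetIPCMemoryFromDevicePtr` is described above, can be pictured as a table keyed by pointer. The sketch below is one possible shape under stated assumptions, not TVM's implementation: `MockIPCRecord` and `MockIPCRegistry` are hypothetical, host pointers stand in for CUDA pointers, and returning a null handle for an unknown pointer is an illustrative choice (this page does not say what the real function does in that case).

```cpp
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical record standing in for a CUDAIPCMemory object.
struct MockIPCRecord {
  int worker_id;    // worker that owns the local buffer
  void* local_ptr;  // the local data pointer used as the lookup key
};

// Hypothetical registry mapping a local data pointer to its record.
class MockIPCRegistry {
 public:
  // Remember a record under its local pointer.
  void Register(std::shared_ptr<MockIPCRecord> rec) {
    void* key = rec->local_ptr;
    by_ptr_[key] = std::move(rec);
  }

  // Return the record registered for `ptr`, or nullptr if there is none.
  std::shared_ptr<MockIPCRecord> Lookup(void* ptr) const {
    auto it = by_ptr_.find(ptr);
    if (it == by_ptr_.end()) return nullptr;
    return it->second;
  }

 private:
  std::unordered_map<void*, std::shared_ptr<MockIPCRecord>> by_ptr_;
};
```

Keying on the exact pointer value means only the base address of a registered buffer resolves; whether the real API accepts interior pointers is not stated here.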
Definition: memory_manager.h:58
Abstract device memory management API.
Performance counters for profiling via the PAPI library.
Definition: analyzer.h:37
A managed object in the TVM runtime.