The CUDA IPC (interprocess communication) memory object, which internally contains data pointers to CUDA IPC memory. It is useful for efficient all-reduce implementation.
#include <cuda_ipc_memory.h>
int num_workers
    The number of GPU workers.

int worker_id
    The worker id corresponding to this IPC memory object.

std::vector< void * > remote_data
    The data pointers of all all-reduce inputs. It has "num_workers" pointers. The i-th pointer is the data pointer on worker i. If "i != worker_id", the pointer is an IPC data pointer. Otherwise, the pointer is a local CUDA data pointer.

std::vector< void * > barrier_in
    The pointers to input barrier signals of all workers for all-reduce. It has "num_workers" pointers, and the pointer arrangement is the same as "remote_data".

std::vector< void * > barrier_out
    The pointers to output barrier signals of all workers for all-reduce. It has "num_workers" pointers, and the pointer arrangement is the same as "remote_data".

int barrier_flag
    The integer buffer flag for all-reduce.
The CUDA IPC (interprocess communication) memory object, which internally contains data pointers to CUDA IPC memory. It is useful for efficient all-reduce implementation.
- Note
- Right now the class members are closely tied to the customized all-reduce kernel. They may also be extended for other uses in the future.
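To make the member layout concrete, the following is a minimal sketch of how a one-shot all-reduce kernel could consume these members. It is illustrative only and is not the actual TVM kernel: the kernel name, the single-block launch assumption, and the omission of memory fences and of the exit barrier on "barrier_out" are simplifications made here for brevity.

#include <cuda_runtime.h>
#include <cstdint>

// Simplified one-shot all-reduce over float inputs. Launch with ONE thread
// block per worker; blockDim.x must be >= num_workers. The pointer arrays are
// the raw contents of remote_data and barrier_in, copied to device-visible
// memory on the host side before the launch.
__global__ void OneShotAllReduceSketch(float* const* data,      // remote_data
                                       uint32_t* const* bar_in, // barrier_in
                                       int num_workers, int worker_id,
                                       uint32_t flag,           // barrier_flag
                                       float* out, int64_t n) {
  // Entry barrier: post our flag into every peer's input-barrier buffer, then
  // spin until every peer has posted into ours. Production code also needs
  // __threadfence_system() for cross-device visibility and a matching exit
  // barrier on barrier_out before the input buffers are reused.
  if ((int)threadIdx.x < num_workers) {
    bar_in[threadIdx.x][worker_id] = flag;
    while (((volatile uint32_t*)bar_in[worker_id])[threadIdx.x] != flag) {
    }
  }
  __syncthreads();

  // data[w] is an IPC-mapped pointer for w != worker_id and a local CUDA
  // pointer for w == worker_id; either way it is directly dereferenceable.
  for (int64_t i = threadIdx.x; i < n; i += blockDim.x) {
    float acc = 0.f;
    for (int w = 0; w < num_workers; ++w) acc += data[w][i];
    out[i] = acc;
  }
}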
◆ TVM_DECLARE_BASE_OBJECT_INFO()
◆ _type_has_method_sequal_reduce
static constexpr const bool tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::_type_has_method_sequal_reduce = false
◆ _type_has_method_shash_reduce
static constexpr const bool tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::_type_has_method_shash_reduce = false
◆ _type_key
static constexpr const char* tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::_type_key = "tvm.runtime.disco.cuda_ipc_memory"
◆ barrier_flag
int tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::barrier_flag
The integer buffer flag for all-reduce.
◆ barrier_in
std::vector<void*> tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::barrier_in
The pointers to input barrier signals of all workers for all-reduce. It has "num_workers" pointers, and the pointer arrangement is the same as "remote_data".
◆ barrier_out
std::vector<void*> tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::barrier_out
The pointers to output barrier signals of all workers for all-reduce. It has "num_workers" pointers, and the pointer arrangement is the same as "remote_data".
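As an illustration of how the two barrier buffers and "barrier_flag" fit together, here is a host-side sketch that packs the pointer vectors for a kernel launch and bumps the flag once per call. The helper names (PackPointers, LaunchAllReduce), the per-call allocation, and the omitted error handling are assumptions made for the example; the actual TVM code would keep these arrays persistent rather than rebuilding them every call.

#include <cuda_ipc_memory.h>  // CUDAIPCMemoryObj
#include <cuda_runtime.h>
#include <cstdint>
#include <vector>

using tvm::runtime::cuda_ipc::CUDAIPCMemoryObj;

// Copy a host vector of pointers into device memory so a kernel can index it.
static void** PackPointers(const std::vector<void*>& ptrs) {
  void** dev = nullptr;
  cudaMalloc(&dev, ptrs.size() * sizeof(void*));
  cudaMemcpy(dev, ptrs.data(), ptrs.size() * sizeof(void*), cudaMemcpyHostToDevice);
  return dev;
}

void AllReduceOnce(CUDAIPCMemoryObj* ipc, float* out, int64_t n) {
  // barrier_in gates entry (all peers' inputs are ready to be read);
  // barrier_out gates exit (all peers are done reading, so the input buffers
  // may be overwritten by the next call).
  void** data = PackPointers(ipc->remote_data);
  void** bin = PackPointers(ipc->barrier_in);
  void** bout = PackPointers(ipc->barrier_out);

  // Bump the flag so this call's barrier writes cannot be mistaken for a
  // previous call's values; the barrier buffers never have to be cleared.
  ++ipc->barrier_flag;

  // Placeholder for the actual kernel launch using the packed pointers:
  // LaunchAllReduce(data, bin, bout, ipc->num_workers, ipc->worker_id,
  //                 ipc->barrier_flag, out, n);

  cudaFree(data);
  cudaFree(bin);
  cudaFree(bout);
}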
◆ num_workers
int tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::num_workers
The number of GPU workers.
◆ remote_data
std::vector<void*> tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::remote_data
The data pointers of all all-reduce inputs. It has "num_workers" pointers. The i-th pointer is the data pointer on worker i. If "i != worker_id", the pointer is an IPC data pointer. Otherwise, the pointer is a local CUDA data pointer.
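The distinction between the local entry and the IPC entries of "remote_data" affects how each pointer may be used. A brief host-side illustration follows; the function name is hypothetical and the loop only documents the pointer kinds.

#include <cuda_ipc_memory.h>  // CUDAIPCMemoryObj

// Walk the remote_data array and note what kind of pointer each slot holds.
void InspectRemoteData(const tvm::runtime::cuda_ipc::CUDAIPCMemoryObj& obj) {
  for (int i = 0; i < obj.num_workers; ++i) {
    void* ptr = obj.remote_data[i];
    if (i == obj.worker_id) {
      // Local CUDA allocation on this worker's own GPU: usable like any
      // ordinary device pointer.
    } else {
      // Mapping of worker i's allocation opened via cudaIpcOpenMemHandle:
      // valid for device access from this process, but it must not be passed
      // to cudaFree; the mapping is released with cudaIpcCloseMemHandle.
    }
    (void)ptr;  // silence unused-variable warnings in this sketch
  }
}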
◆ worker_id
int tvm::runtime::cuda_ipc::CUDAIPCMemoryObj::worker_id
The worker id corresponding to this IPC memory object.
The documentation for this class was generated from the following file: cuda_ipc_memory.h