tvm
Classes | |
class | LinearCongruentialEngine |
This linear congruential engine is a drop-in replacement for std::minstd_rand. It strictly corresponds to std::minstd_rand and is designed to be platform-independent. | |
class | Span |
A partial implementation of the C++20 std::span. | |
Typedefs | |
using | PartitionerFuncType = std::function< std::vector< std::vector< int > >(int, int, int, int)> |
Functions | |
std::vector< std::vector< int > > | rr_partitioner (int begin, int end, int step, int num_threads) |
A partitioner that splits the tasks among threads in a round-robin manner. | |
void | parallel_for (int begin, int end, const std::function< void(int)> &f, int step=1, const PartitionerFuncType partitioner=rr_partitioner) |
A runtime API to run the task function in parallel. For example, the loop for (int i = 0; i < 10; i++) { a[i] = i; } should behave the same as parallel_for(0, 10, [&a](int index) { a[index] = index; });. | |
void | parallel_for_dynamic (int begin, int end, int num_threads, const std::function< void(int thread_id, int task_id)> &f) |
An API to launch a fixed number of threads to run the given functor in parallel. Unlike parallel_for, the partition is determined dynamically on the fly, i.e. whenever a thread is idle, it fetches the next task to run. The behavior is similar to dynamic scheduling in OpenMP. | |
using tvm::support::PartitionerFuncType = std::function<std::vector<std::vector<int>>(int, int, int, int)>
void tvm::support::parallel_for(int begin, int end, const std::function<void(int)>& f, int step = 1, const PartitionerFuncType partitioner = rr_partitioner)
A runtime API to run the task function in parallel. For example, the loop for (int i = 0; i < 10; i++) { a[i] = i; } should behave the same as parallel_for(0, 10, [&a](int index) { a[index] = index; });.
begin | The start index of this parallel loop (inclusive). |
end | The end index of this parallel loop (exclusive). |
f | The task function to be executed. It is expected to take an int index as input and return nothing. |
step | The traversal step between consecutive indexes. |
partitioner | A partition function that splits the tasks among threads. The round-robin partitioner is used by default. |
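The documented behavior (partition the index range up front, then run each partition on its own thread) can be sketched roughly as follows. This is a minimal illustration, not TVM's implementation: parallel_for_sketch, rr_partition_sketch and PartitionerSketch are hypothetical names, and taking the thread count from std::thread::hardware_concurrency() is an assumption, since the text above does not say how the number of threads is chosen.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical alias with the same shape as PartitionerFuncType.
using PartitionerSketch =
    std::function<std::vector<std::vector<int>>(int, int, int, int)>;

// Hypothetical round-robin partitioner: the k-th visited index goes to
// thread k % num_threads.
std::vector<std::vector<int>> rr_partition_sketch(int begin, int end, int step,
                                                  int num_threads) {
  std::vector<std::vector<int>> parts(num_threads);
  int k = 0;
  for (int i = begin; i < end; i += step, ++k) {
    parts[k % num_threads].push_back(i);
  }
  return parts;
}

// Sketch only: split [begin, end) with the partitioner, run every non-empty
// partition on its own std::thread, then wait for all of them.
void parallel_for_sketch(int begin, int end, const std::function<void(int)>& f,
                         int step = 1,
                         const PartitionerSketch& partitioner = rr_partition_sketch) {
  assert(step > 0);  // assumption: only positive steps are handled here
  // Assumption: one worker per hardware thread; the call may return 0.
  int num_threads = static_cast<int>(std::thread::hardware_concurrency());
  if (num_threads < 1) num_threads = 1;
  std::vector<std::vector<int>> parts = partitioner(begin, end, step, num_threads);
  std::vector<std::thread> workers;
  for (const std::vector<int>& part : parts) {
    if (part.empty()) continue;
    workers.emplace_back([&f, &part] {
      for (int index : part) f(index);
    });
  }
  for (std::thread& t : workers) t.join();
}

// Usage mirroring the loop in the description: a[i] = i for i in [0, 10).
std::vector<int> fill_indices_demo() {
  std::vector<int> a(10, -1);
  parallel_for_sketch(0, 10, [&a](int index) { a[index] = index; });
  return a;
}
```

Each index is handed to exactly one thread, so a task function that only writes to its own element, as in fill_indices_demo, needs no extra synchronization.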
void tvm::support::parallel_for_dynamic(int begin, int end, int num_threads, const std::function<void(int thread_id, int task_id)>& f)
An API to launch a fixed number of threads to run the given functor in parallel. Unlike parallel_for, the partition is determined dynamically on the fly, i.e. whenever a thread is idle, it fetches the next task to run. The behavior is similar to dynamic scheduling in OpenMP:
#pragma omp parallel for schedule(dynamic) num_threads(num_threads)
for (int i = 0; i < 10; i++) {
  a[i] = i;
}
begin | The start index of this parallel loop (inclusive). |
end | The end index of this parallel loop (exclusive). |
num_threads | The number of threads to be used. |
f | The task function to be executed. Takes the thread index and the task index as input with no output. |
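One common way to get this "idle thread fetches the next task" behavior is a shared atomic counter. The sketch below assumes that approach; it is not taken from TVM's source, and parallel_for_dynamic_sketch and claim_counts_demo are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Sketch only: num_threads workers pull task ids from a shared counter until
// the range [begin, end) is exhausted.
void parallel_for_dynamic_sketch(int begin, int end, int num_threads,
                                 const std::function<void(int, int)>& f) {
  assert(num_threads > 0);
  std::atomic<int> next_task{begin};
  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (int thread_id = 0; thread_id < num_threads; ++thread_id) {
    workers.emplace_back([&next_task, &f, end, thread_id] {
      // fetch_add returns the previous value, so each task id is claimed by
      // exactly one worker; a worker that finishes early just claims more.
      for (int task_id = next_task.fetch_add(1); task_id < end;
           task_id = next_task.fetch_add(1)) {
        f(thread_id, task_id);
      }
    });
  }
  for (std::thread& t : workers) t.join();
}

// Usage: count how many times each task id in [0, n) was executed.
std::vector<int> claim_counts_demo(int n, int num_threads) {
  std::vector<int> counts(n, 0);
  parallel_for_dynamic_sketch(
      0, n, num_threads,
      [&counts](int /*thread_id*/, int task_id) { counts[task_id] += 1; });
  return counts;
}
```

Which thread runs which task depends on timing, so only the set of executed task ids is deterministic, not the thread_id paired with each one.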
step support is left for future work.
std::vector<std::vector<int>> tvm::support::rr_partitioner(int begin, int end, int step, int num_threads)
A partitioner that splits the tasks among threads in a round-robin manner.
begin | The start index of this parallel loop (inclusive). |
end | The end index of this parallel loop (exclusive). |
step | The traversal step between consecutive indexes. |
num_threads | The number of threads (the number of partitions the tasks are split into). |
Returns a list with num_threads elements, each of which is a list of integers indicating the loop indexes for the corresponding thread to process.
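As a rough illustration of round-robin assignment, the sketch below deals the visited indexes out to threads in turn. The name rr_partition_sketch is hypothetical, and the handling of edge cases (for example, fewer tasks than threads) is an assumption that may differ from the real rr_partitioner.

```cpp
#include <cassert>
#include <vector>

// Sketch only: the k-th index visited (begin, begin + step, ...) is assigned
// to thread k % num_threads. Always returns exactly num_threads lists, some
// of which may be empty when there are fewer tasks than threads.
std::vector<std::vector<int>> rr_partition_sketch(int begin, int end, int step,
                                                  int num_threads) {
  assert(step > 0 && num_threads > 0);
  std::vector<std::vector<int>> parts(num_threads);
  int k = 0;
  for (int i = begin; i < end; i += step, ++k) {
    parts[k % num_threads].push_back(i);
  }
  return parts;
}
```

With begin = 0, end = 10, step = 1 and num_threads = 3, this sketch yields {0, 3, 6, 9}, {1, 4, 7} and {2, 5, 8}.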