Classes
class	LinearCongruentialEngine
	This linear congruential engine is a drop-in replacement for std::minstd_rand. It strictly corresponds to std::minstd_rand and is designed to be platform-independent.

class	Span
	A partial implementation of the C++20 std::span.

Typedefs
using	PartitionerFuncType = std::function< std::vector< std::vector< int > >(int, int, int, int)>

Functions
std::vector< std::vector< int > >	rr_partitioner (int begin, int end, int step, int num_threads)
	A partitioner that splits the tasks across threads in round-robin fashion.

void	parallel_for (int begin, int end, const std::function< void(int)> &f, int step=1, const PartitionerFuncType partitioner=rr_partitioner)
	A runtime API for running a task function in parallel. For example, the loop for (int i = 0; i < 10; i++) { a[i] = i; } should behave the same as parallel_for(0, 10, [&a](int index) { a[index] = index; });.

void	parallel_for_dynamic (int begin, int end, int num_threads, const std::function< void(int thread_id, int task_id)> &f)
	An API that launches a fixed number of threads to run the given functor in parallel. Unlike `parallel_for`, the partition is determined dynamically: whenever a thread becomes idle, it fetches the next task to run. The behavior is similar to dynamic scheduling in OpenMP.

Typedef Documentation

◆ PartitionerFuncType

using tvm::support::PartitionerFuncType = std::function<std::vector<std::vector<int> >(int, int, int, int)>

Function Documentation

◆ parallel_for()

void tvm::support::parallel_for	(	int	begin,
		int	end,
		const std::function< void(int)> &	f,
		int	step = `1`,
		const PartitionerFuncType	partitioner = `rr_partitioner`
	)

A runtime API for running a task function in parallel. For example, the loop for (int i = 0; i < 10; i++) { a[i] = i; } should behave the same as parallel_for(0, 10, [&a](int index) { a[index] = index; });.

Parameters

begin	The start index of this parallel loop (inclusive).
end	The end index of this parallel loop (exclusive).
f	The task function to be executed. It must take an int index as input and return nothing.
step	The traversal step applied to the index.
partitioner	A partition function that splits the tasks across threads. Defaults to the round-robin partitioner.

Note: 1. Nested parallel_for calls are currently not supported. 2. The order of execution within each thread is not guaranteed, so the loop body must be independent across iterations and thread-safe.
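
The equivalence between the serial loop and the parallel call can be illustrated with a minimal standalone sketch. It does not use the TVM implementation: parallel_for_sketch, its num_threads parameter, and fill_with_index are hypothetical names introduced here, and the index assignment is assumed to be a simple round-robin split. Each iteration writes only its own element, which is the kind of thread-independent body the note above requires.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for illustration only; NOT tvm::support::parallel_for.
// Worker t handles indices begin + t*step, begin + (t + num_threads)*step, ...
void parallel_for_sketch(int begin, int end, const std::function<void(int)>& f,
                         int step = 1, int num_threads = 4) {
  assert(step > 0 && num_threads > 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([=, &f] {
      for (int i = begin + t * step; i < end; i += num_threads * step) {
        f(i);  // no ordering guarantee across workers
      }
    });
  }
  for (auto& w : workers) w.join();  // return only after every index has run
}

// Same effect as: for (int i = 0; i < n; i++) { a[i] = i; }
std::vector<int> fill_with_index(int n) {
  std::vector<int> a(n, -1);
  parallel_for_sketch(0, n, [&a](int index) { a[index] = index; });
  return a;
}
```

Because every index is visited exactly once and no two iterations touch the same element, the result does not depend on which thread runs which index.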

◆ parallel_for_dynamic()

void tvm::support::parallel_for_dynamic	(	int	begin,
		int	end,
		int	num_threads,
		const std::function< void(int thread_id, int task_id)> &	f
	)

An API that launches a fixed number of threads to run the given functor in parallel. Unlike parallel_for, the partition is determined dynamically: whenever a thread becomes idle, it fetches the next task to run. The behavior is similar to dynamic scheduling in OpenMP:

#pragma omp parallel for schedule(dynamic) num_threads(num_threads)
for (int i = 0; i < 10; i++) { a[i] = i; }

Parameters

begin	The start index of this parallel loop (inclusive).
end	The end index of this parallel loop (exclusive).
num_threads	The number of threads to be used.
f	The task function to be executed. Takes the thread index and the task index as input with no output.

Note: step support is left for future work.
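
One common way to get this "idle thread fetches the next task" behavior is a shared atomic counter, sketched below. This is an assumption about the general technique, not a description of how TVM implements it; parallel_for_dynamic_sketch and count_runs are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for illustration only; NOT tvm::support::parallel_for_dynamic.
void parallel_for_dynamic_sketch(int begin, int end, int num_threads,
                                 const std::function<void(int, int)>& f) {
  assert(num_threads > 0);
  std::atomic<int> next{begin};  // next unclaimed task index, shared by all workers
  std::vector<std::thread> workers;
  for (int thread_id = 0; thread_id < num_threads; ++thread_id) {
    workers.emplace_back([&next, &f, end, thread_id] {
      for (;;) {
        int task_id = next.fetch_add(1);  // atomically claim one task
        if (task_id >= end) break;        // nothing left to claim
        f(thread_id, task_id);
      }
    });
  }
  for (auto& w : workers) w.join();
}

// Counts how often each task index runs; every entry should end up as 1.
std::vector<int> count_runs(int n, int num_threads) {
  std::vector<int> runs(n, 0);
  parallel_for_dynamic_sketch(0, n, num_threads,
                              [&runs](int /*thread_id*/, int task_id) { runs[task_id] += 1; });
  return runs;
}
```

Which thread ends up running a given task varies from run to run, so the functor should not rely on any particular thread_id to task_id mapping.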

◆ rr_partitioner()

std::vector<std::vector<int> > tvm::support::rr_partitioner	(	int	begin,
		int	end,
		int	step,
		int	num_threads
	)

A partitioner that splits the tasks across threads in round-robin fashion.

Parameters

begin	The start index of this parallel loop (inclusive).
end	The end index of this parallel loop (exclusive).
step	The traversal step applied to the index.
num_threads	The number of threads (the number of partitions the tasks are split into).

Returns: A list with num_threads elements; each element is a list of the loop indices that the corresponding thread processes.
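
A minimal sketch of round-robin partitioning that follows the contract described above (one index list per thread). It is written from the documentation alone and is not the TVM source; rr_partition_sketch is a hypothetical name, and how the real function treats empty ranges or non-positive steps is not covered here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for illustration only; NOT tvm::support::rr_partitioner.
std::vector<std::vector<int>> rr_partition_sketch(int begin, int end, int step,
                                                  int num_threads) {
  assert(step > 0 && num_threads > 0);
  std::vector<std::vector<int>> parts(num_threads);  // one index list per thread
  int slot = 0;
  for (int i = begin; i < end; i += step) {
    parts[slot].push_back(i);         // deal indices out like cards
    slot = (slot + 1) % num_threads;  // wrap around to the first thread
  }
  return parts;
}
```

With begin = 0, end = 10, step = 1 and num_threads = 3, this sketch yields {0, 3, 6, 9}, {1, 4, 7} and {2, 5, 8}.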
