| tvm
    | 
| Classes | |
| struct | DeviceWrapperNode | 
| Wrapper for Device because Device is not passable across the ffi::Function interface.  More... | |
| class | DeviceWrapper | 
| Wrapper for Device.  More... | |
| class | ReportNode | 
| Data collected from a profiling run. Includes per-call metrics and per-device metrics.  More... | |
| class | Report | 
| class | MetricCollectorNode | 
| Interface for user defined profiling metric collection.  More... | |
| class | MetricCollector | 
| Wrapper for MetricCollectorNode.  More... | |
| struct | CallFrame | 
| class | Profiler | 
| class | DurationNode | 
| class | PercentNode | 
| class | CountNode | 
| class | RatioNode | 
| Functions | |
| MetricCollector | CreatePAPIMetricCollector (ffi::Map< DeviceWrapper, ffi::Array< ffi::String >> metrics) | 
| Construct a metric collector that collects data from hardware performance counters using the Performance Application Programming Interface (PAPI).  More... | |
| ffi::String | ShapeString (const std::vector< Tensor > &shapes) | 
| ffi::String representation of an array of Tensor shapes  More... | |
| ffi::String | ShapeString (Tensor shape, DLDataType dtype) | 
| ffi::String representation of a shape encoded as a Tensor  More... | |
| ffi::String | ShapeString (const std::vector< int64_t > &shape, DLDataType dtype) | 
| ffi::String representation of a shape encoded as a vector  More... | |
| ffi::Function | ProfileFunction (ffi::Module mod, std::string func_name, int device_type, int device_id, int warmup_iters, ffi::Array< MetricCollector > collectors) | 
| Collect performance information of a function execution. Usually used with a compiled PrimFunc (via tvm.compile).  More... | |
| ffi::Function | WrapTimeEvaluator (ffi::Function f, Device dev, int number, int repeat, int min_repeat_ms, int limit_zero_time_iterations, int cooldown_interval_ms, int repeats_to_cooldown, int cache_flush_bytes=0, ffi::Function f_preproc=nullptr) | 
| Wrap a timer function to measure the time cost of a given packed function.  More... | |
| MetricCollector tvm::runtime::profiling::CreatePAPIMetricCollector | ( | ffi::Map< DeviceWrapper, ffi::Array< ffi::String >> | metrics | ) | 
Construct a metric collector that collects data from hardware performance counters using the Performance Application Programming Interface (PAPI).
| metrics | A mapping from a device type to the metrics that should be collected on that device. You can find the names of available metrics by running papi_native_avail. | 
| ffi::Function tvm::runtime::profiling::ProfileFunction | ( | ffi::Module | mod, | 
| std::string | func_name, | ||
| int | device_type, | ||
| int | device_id, | ||
| int | warmup_iters, | ||
| ffi::Array< MetricCollector > | collectors | ||
| ) | 
Collect performance information of a function execution. Usually used with a compiled PrimFunc (via tvm.compile).
This information can include performance counters like cache hits and FLOPs that are useful in debugging performance issues of individual PrimFuncs. Different metrics can be collected depending on which MetricCollector is used.
Example usage:
| mod | Module to profile. Usually a PrimFunc that has been compiled to machine code. | 
| func_name | Name of function to run in the module. | 
| device_type | Device type to run on. Profiling will include performance metrics specific to this device type. | 
| device_id | Id of device to run on. | 
| warmup_iters | Number of iterations of the function to run before collecting performance information. Setting this to a value larger than 0 is recommended so that cache effects are consistent. | 
| collectors | List of different ways to collect metrics. See MetricCollector. | 
Returns a ffi::Function that runs mod[func_name] and returns performance metrics as a ffi::Map<ffi::String, ffi::Any> where values can be CountNode, DurationNode, or PercentNode.
| ffi::String tvm::runtime::profiling::ShapeString | ( | const std::vector< int64_t > & | shape, | 
| DLDataType | dtype | ||
| ) | 
ffi::String representation of a shape encoded as a vector
| shape | Shape as a vector of integers. | 
| dtype | The dtype of the shape. | 
Returns a textual representation of the shape, for example float32[2].
| ffi::String tvm::runtime::profiling::ShapeString | ( | const std::vector< Tensor > & | shapes | ) | 
ffi::String representation of an array of Tensor shapes
| shapes | Array of Tensors to get the shapes of. | 
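Returns a textual representation of the shapes, for example float32[2], int64[1, 2].
| ffi::String tvm::runtime::profiling::ShapeString | ( | Tensor | shape, | 
| DLDataType | dtype | ||
| ) | 
ffi::String representation of a shape encoded as a Tensor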
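To illustrate the string format that the ShapeString overloads produce (for example float32[2] and float32[2], int64[1, 2]), here is a minimal standalone C++ sketch. It does not use TVM or DLPack: `ShapeStringSketch` and `JoinShapeStrings` are hypothetical helper names, and the dtype is passed as a plain string rather than a DLDataType. The output format is inferred only from the examples on this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for ShapeString(const std::vector<int64_t>&, DLDataType).
// Assumption: the dtype is given as a ready-made name such as "float32".
std::string ShapeStringSketch(const std::vector<int64_t>& shape,
                              const std::string& dtype_name) {
  std::ostringstream os;
  os << dtype_name << "[";
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (i != 0) os << ", ";  // dimensions are separated by ", "
    os << shape[i];
  }
  os << "]";
  return os.str();
}

// Hypothetical helper that joins several per-tensor strings with ", ",
// mirroring the multi-tensor example "float32[2], int64[1, 2]".
std::string JoinShapeStrings(const std::vector<std::string>& parts) {
  std::ostringstream os;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (i != 0) os << ", ";
    os << parts[i];
  }
  return os.str();
}
```

Under these assumptions, `ShapeStringSketch({2}, "float32")` yields `float32[2]`, and joining it with the string for an int64 tensor of shape {1, 2} yields `float32[2], int64[1, 2]`.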
| ffi::Function tvm::runtime::profiling::WrapTimeEvaluator | ( | ffi::Function | f, | 
| Device | dev, | ||
| int | number, | ||
| int | repeat, | ||
| int | min_repeat_ms, | ||
| int | limit_zero_time_iterations, | ||
| int | cooldown_interval_ms, | ||
| int | repeats_to_cooldown, | ||
| int | cache_flush_bytes = 0, | ||
| ffi::Function | f_preproc = nullptr | ||
| ) | 
Wrap a timer function to measure the time cost of a given packed function.
Approximate implementation:
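The following is a minimal standalone C++ sketch of the measurement loop that the parameters below describe. It is not TVM's code: `TimeEvalOptions` and `TimeEvaluatorSketch` are hypothetical names, the default option values are made up, and the rule for growing `number` is an assumption. The cooldown is assumed to happen after every `repeats_to_cooldown` repeats, and cache flushing (`cache_flush_bytes`) is omitted.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical options struct mirroring the documented parameter names.
struct TimeEvalOptions {
  int number = 1;
  int repeat = 1;
  int min_repeat_ms = 0;
  int limit_zero_time_iterations = 100;
  int cooldown_interval_ms = 0;
  int repeats_to_cooldown = 1;
};

// Returns one average cost (in seconds) per repeat.
std::vector<double> TimeEvaluatorSketch(const std::function<void()>& f,
                                        const std::function<void()>& f_preproc,
                                        TimeEvalOptions opt) {
  using Clock = std::chrono::steady_clock;
  std::vector<double> costs;
  f();  // warm-up run; its time is discarded
  for (int i = 0; i < opt.repeat; ++i) {
    if (f_preproc) f_preproc();  // optional preprocessing before each repeat
    int number = opt.number;
    int zero_time_iters = 0;
    double duration_ms = 0.0;
    while (true) {
      auto start = Clock::now();
      for (int j = 0; j < number; ++j) f();
      duration_ms =
          std::chrono::duration<double, std::milli>(Clock::now() - start).count();
      if (duration_ms >= opt.min_repeat_ms) break;  // repeat was long enough
      if (duration_ms == 0.0) {
        // Give up after too many zero-time measurements to avoid hanging.
        if (++zero_time_iters >= opt.limit_zero_time_iterations) break;
        continue;
      }
      // Assumed growth rule: estimate the runs needed to reach min_repeat_ms,
      // and at least double the current count.
      double needed = opt.min_repeat_ms / (duration_ms / number) + 1.0;
      number = static_cast<int>(std::min(std::max(needed, number * 2.0), 1e9));
    }
    costs.push_back(duration_ms / 1e3 / number);  // average seconds per run
    if (opt.cooldown_interval_ms > 0 && opt.repeats_to_cooldown > 0 &&
        (i + 1) % opt.repeats_to_cooldown == 0) {
      std::this_thread::sleep_for(
          std::chrono::milliseconds(opt.cooldown_interval_ms));
    }
  }
  return costs;
}
```

With `min_repeat_ms` left at 0, the sketch calls `f` exactly 1 + number x repeat times and returns `repeat` averaged costs, which matches the invocation count stated for the `repeat` parameter.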
| f | The function argument. | 
| dev | The device. | 
| number | The number of times to run this function for taking an average. These runs together are called one repeat of measurement. | 
| repeat | The number of times to repeat the measurement. In total, the function will be invoked (1 + number x repeat) times, where the first invocation is a warm-up and is discarded. The returned result contains repeat costs, each of which is an average of number costs. | 
| min_repeat_ms | The minimum duration of one repeat in milliseconds. By default, one repeat contains number runs. If this parameter is set, the parameter number will be dynamically adjusted to meet the minimum duration requirement of one repeat, i.e., when the run time of one repeat falls below this time, the number parameter will be automatically increased. | 
| limit_zero_time_iterations | The maximum number of repeats when measured time is equal to 0. It helps to avoid hanging during measurements. | 
| cooldown_interval_ms | The cooldown interval in milliseconds between the number of repeats defined by repeats_to_cooldown. | 
| repeats_to_cooldown | The number of repeats before the cooldown is activated. | 
| cache_flush_bytes | The number of bytes to flush from cache before | 
| f_preproc | The function to be executed before we execute the time evaluator. |