tvm.relay.image

Image network related operators.

Functions

affine_grid(data[, target_shape])

affine_grid operator that generates 2D sampling grid.

crop_and_resize(data, boxes, box_indices, …)

Crop input images and resize them.

dilation2d(data, weight[, strides, padding, …])

Morphological Dilation 2D.

grid_sample(data, grid[, method, layout])

Applies bilinear sampling to input feature map.

resize(data, size[, layout, method, …])

Image resize operator.

resize3d(data, size[, layout, method, …])

Image resize 3D operator.

tvm.relay.image.affine_grid(data, target_shape=None)

affine_grid operator that generates 2D sampling grid.

This operation is described in https://arxiv.org/pdf/1506.02025.pdf. It generates a uniform sampling grid within the target shape and normalizes it to [-1, 1]. The provided affine transformation is then applied on the sampling grid.

Parameters
  • data (tvm.Tensor) – 3-D with shape [batch, 2, 3]. The affine matrix.

  • target_shape (list/tuple of two int) – Specifies the output shape (H, W).

Returns

Output – 4-D with shape [batch, 2, target_height, target_width]

Return type

tvm.Tensor
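The grid construction described above can be sketched in plain Python. This is an illustrative reference, not TVM's implementation: the helper name affine_grid_ref is hypothetical, and it assumes that -1 and 1 correspond to the first and last sample along each axis and that row 0 of the affine matrix produces x coordinates while row 1 produces y coordinates.

```python
def affine_grid_ref(theta, target_shape):
    """Build a sampling grid from a batch of 2x3 affine matrices.

    theta: nested list with shape [batch][2][3].
    target_shape: (height, width) of the output grid.
    Returns a nested list with shape [batch][2][height][width].
    """
    height, width = target_shape

    def linspace(n):
        # Assumption: endpoints -1 and 1 are included; a single sample sits at 0.
        if n == 1:
            return [0.0]
        return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

    ys = linspace(height)
    xs = linspace(width)
    grid = []
    for matrix in theta:
        planes = []
        for a, b, c in matrix:
            # Apply one row of the affine matrix to every (x, y, 1) grid point.
            planes.append([[a * x + b * y + c for x in xs] for y in ys])
        grid.append(planes)
    return grid
```

With the identity transform [[1, 0, 0], [0, 1, 0]], the result is simply the normalized grid itself.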

tvm.relay.image.crop_and_resize(data, boxes, box_indices, crop_size, layout, method='bilinear', extrapolation_value=0, out_dtype=None)

Crop input images and resize them.

method indicates the algorithm used to calculate the output values; it can be either “bilinear” or “nearest_neighbor”.

Parameters
  • data (relay.Expr) – The input data to the operator.

  • boxes (relay.Expr) – A 2-D tensor of shape [num_boxes, 4]. Each row of the tensor specifies the coordinates of a box.

  • box_indices (relay.Expr) – A 1-D tensor of shape [num_boxes]; box_indices[i] specifies the data that the i-th box refers to.

  • crop_size (Tuple of Expr) – The target size to which each box will be resized.

  • layout (str) – Layout of the input.

  • method (str, optional) – Scale method, it can be either “nearest_neighbor” or “bilinear”.

  • extrapolation_value (float, optional) – Value used for extrapolation, when applicable.

  • out_dtype (str, optional) – Type to return. If left None returns the same type as input.

Returns

result – The computed result.

Return type

relay.Expr
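A rough sketch of the cropping logic with nearest-neighbor sampling follows. Several details are assumptions not stated above: boxes are taken to be [y1, x1, y2, x2] normalized so that 0 and 1 map to the first and last pixel (the TensorFlow convention), each image is a single-channel 2-D list, and the helper name crop_and_resize_ref is hypothetical.

```python
import math


def crop_and_resize_ref(images, boxes, box_indices, crop_size,
                        extrapolation_value=0.0):
    """Nearest-neighbor crop-and-resize on single-channel images.

    images: nested list with shape [batch][height][width].
    boxes: list of [y1, x1, y2, x2] in normalized coordinates (assumption).
    box_indices: box_indices[i] selects the image used by the i-th box.
    crop_size: (crop_height, crop_width).
    """
    crop_h, crop_w = crop_size

    def source_coords(start, end, in_size, out_size):
        # Map each output index to a (fractional) source pixel coordinate.
        if out_size == 1:
            return [0.5 * (start + end) * (in_size - 1)]
        step = (end - start) * (in_size - 1) / (out_size - 1)
        return [start * (in_size - 1) + i * step for i in range(out_size)]

    crops = []
    for (y1, x1, y2, x2), index in zip(boxes, box_indices):
        image = images[index]
        in_h, in_w = len(image), len(image[0])
        ys = source_coords(y1, y2, in_h, crop_h)
        xs = source_coords(x1, x2, in_w, crop_w)
        crop = []
        for y in ys:
            row = []
            for x in xs:
                if 0 <= y <= in_h - 1 and 0 <= x <= in_w - 1:
                    # Round half up to pick the nearest source pixel.
                    row.append(image[math.floor(y + 0.5)][math.floor(x + 0.5)])
                else:
                    # Samples outside the image use the extrapolation value.
                    row.append(extrapolation_value)
            crop.append(row)
        crops.append(crop)
    return crops
```

Under these assumptions, a box of [0, 0, 1, 1] covers the whole image, and a box extending past 1 yields extrapolation_value for the samples that fall outside.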

tvm.relay.image.dilation2d(data, weight, strides=(1, 1), padding=(0, 0), dilations=(1, 1), data_layout='NCHW', kernel_layout='IHW', out_dtype='')

Morphological Dilation 2D. This operator takes the weight as the dilation kernel and dilates the data with it to produce an output. In the default case, where the data_layout is NCHW and kernel_layout is IHW, dilation2d takes in a data Tensor with shape (batch_size, in_channels, height, width), and a weight Tensor with shape (channels, kernel_height, kernel_width) to produce an output Tensor with the following rule:

\[\mbox{out}[b, c, y, x] = \max_{dy, dx} \mbox{data}[b, c, \mbox{strides}[0] * y + dy, \mbox{strides}[1] * x + dx] + \mbox{weight}[c, dy, dx]\]

Padding and dilation are applied to data and weight respectively before the computation. This operator accepts data layout specification. Semantically, the operator will convert the layout to the canonical layout (NCHW for data and IHW for weight) and perform the computation.

Parameters
  • data (tvm.relay.Expr) – The input data to the operator.

  • weight (tvm.relay.Expr) – The weight expressions.

  • strides (Optional[Tuple[int]]) – The strides of convolution.

  • padding (Optional[Tuple[int]]) – The padding of convolution on both sides of inputs before convolution.

  • dilations (Optional[Tuple[int]]) – Specifies the dilation rate to be used for dilated convolution.

  • data_layout (Optional[str]) – Layout of the input.

  • kernel_layout (Optional[str]) – Layout of the weight.

  • out_dtype (Optional[str]) – Specifies the output data type.

Returns

result – The computed result.

Return type

tvm.relay.Expr
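The dilation rule given for dilation2d can be written out directly as a plain-Python reference. This is a minimal sketch with a hypothetical helper name (dilation2d_ref); it covers only the canonical NCHW/IHW layouts and assumes no padding and a dilation rate of 1, so only strides are handled.

```python
def dilation2d_ref(data, weight, strides=(1, 1)):
    """Morphological dilation: out = max over the window of (data + weight).

    data: nested list with shape [batch][channel][height][width].
    weight: nested list with shape [channel][kernel_height][kernel_width].
    Assumes no padding and dilation rate 1.
    """
    stride_h, stride_w = strides
    out = []
    for image in data:
        out_image = []
        # Each channel is dilated with its own kernel.
        for plane, kernel in zip(image, weight):
            kernel_h, kernel_w = len(kernel), len(kernel[0])
            out_h = (len(plane) - kernel_h) // stride_h + 1
            out_w = (len(plane[0]) - kernel_w) // stride_w + 1
            out_plane = [
                [
                    max(
                        plane[stride_h * y + dy][stride_w * x + dx] + kernel[dy][dx]
                        for dy in range(kernel_h)
                        for dx in range(kernel_w)
                    )
                    for x in range(out_w)
                ]
                for y in range(out_h)
            ]
            out_image.append(out_plane)
        out.append(out_image)
    return out
```

Note that the maximum is taken over the sum of the data value and the kernel value, which is what distinguishes dilation from a plain max-pooling window.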

tvm.relay.image.grid_sample(data, grid, method='bilinear', layout='NCHW')

Applies bilinear sampling to input feature map.

Given \(data\) and \(grid\), then the output is computed by

\[\begin{aligned} x_{src} &= grid[batch, 0, y_{dst}, x_{dst}] \\ y_{src} &= grid[batch, 1, y_{dst}, x_{dst}] \\ output[batch, channel, y_{dst}, x_{dst}] &= G(data[batch, channel, y_{src}, x_{src}]) \end{aligned}\]

\(x_{dst}\), \(y_{dst}\) enumerate all spatial locations in \(output\), and \(G()\) denotes the interpolation function. Out-of-bound points are padded with zeros. The shape of the output will be (data.shape[0], data.shape[1], grid.shape[2], grid.shape[3]).

The operator assumes that \(grid\) has been normalized to [-1, 1].

grid_sample is often used together with affine_grid, which generates sampling grids for grid_sample.

Parameters
  • data (tvm.Tensor) – 4-D with shape [batch, in_channel, in_height, in_width]

  • grid (tvm.Tensor) – 4-D with shape [batch, 2, out_height, out_width]

  • method (str) – The interpolation method. Only ‘bilinear’ is supported.

  • layout (str) – The layout of input data and the output.

Returns

Output – 4-D with shape [batch, in_channel, out_height, out_width]

Return type

tvm.Tensor
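The sampling rule for grid_sample can be illustrated with a small plain-Python reference. It is a sketch under stated assumptions rather than TVM's code: the helper name grid_sample_ref is hypothetical, the layout is NCHW, and normalized coordinates -1 and 1 are assumed to map to the centers of the first and last pixels.

```python
import math


def grid_sample_ref(data, grid):
    """Bilinear sampling of `data` at the locations given by `grid`.

    data: nested list with shape [batch][channel][in_height][in_width].
    grid: nested list with shape [batch][2][out_height][out_width], where
          plane 0 holds x coordinates and plane 1 holds y coordinates,
          both normalized to [-1, 1].
    Returns shape [batch][channel][out_height][out_width].
    """
    out = []
    for image, (grid_x, grid_y) in zip(data, grid):
        in_h, in_w = len(image[0]), len(image[0][0])

        def pixel(plane, y, x):
            # Out-of-bound reads contribute zero.
            if 0 <= y < in_h and 0 <= x < in_w:
                return plane[y][x]
            return 0.0

        out_image = []
        for plane in image:
            out_plane = []
            for row_x, row_y in zip(grid_x, grid_y):
                out_row = []
                for gx, gy in zip(row_x, row_y):
                    # Assumption: -1 -> pixel 0, 1 -> last pixel.
                    x = (gx + 1.0) * (in_w - 1) / 2.0
                    y = (gy + 1.0) * (in_h - 1) / 2.0
                    x0, y0 = math.floor(x), math.floor(y)
                    wx, wy = x - x0, y - y0
                    out_row.append(
                        pixel(plane, y0, x0) * (1 - wy) * (1 - wx)
                        + pixel(plane, y0, x0 + 1) * (1 - wy) * wx
                        + pixel(plane, y0 + 1, x0) * wy * (1 - wx)
                        + pixel(plane, y0 + 1, x0 + 1) * wy * wx
                    )
                out_plane.append(out_row)
            out_image.append(out_plane)
        out.append(out_image)
    return out
```

The output keeps the batch and channel dimensions of the data and takes its spatial dimensions from the grid, matching the shape rule stated above.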

tvm.relay.image.resize(data, size, layout='NCHW', method='bilinear', coordinate_transformation_mode='half_pixel', out_dtype=None)

Image resize operator.

This operator takes data as input and scales it in 2D to the given size. In the default case, where the layout is NCHW with data of shape (n, c, h, w), out will have a shape (n, c, size[0], size[1]).

method indicates the algorithm used to calculate the output values; it can be one of (“bilinear”, “nearest_neighbor”, “bicubic”).

Parameters
  • data (relay.Expr) – The input data to the operator.

  • size (Tuple of Expr) – The out size to which the image will be resized.

  • layout (str, optional) – Layout of the input.

  • method (str, optional) – Scale method to use [nearest_neighbor, bilinear, bicubic].

  • coordinate_transformation_mode (string, optional) – Describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor. Refer to the ONNX Resize operator specification for details. [half_pixel, align_corners, asymmetric]

  • out_dtype (str, optional) – Type to return. If left None returns the same type as input.

Returns

result – The resized result.

Return type

relay.Expr
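The three coordinate_transformation_mode values can be compared with a short sketch that follows the formulas in the ONNX Resize operator specification. The helper name source_coordinate is hypothetical, and the handling of an output size of 1 under align_corners is an assumption made to avoid a division by zero.

```python
def source_coordinate(dst_index, in_size, out_size, mode="half_pixel"):
    """Map an index in the resized axis to a coordinate in the original axis."""
    if mode == "half_pixel":
        # Pixel centers are aligned: treat index i as the point i + 0.5.
        return (dst_index + 0.5) * in_size / out_size - 0.5
    if mode == "align_corners":
        # First and last samples of input and output coincide.
        if out_size == 1:
            return 0.0  # assumption: a single output sample maps to index 0
        return dst_index * (in_size - 1) / (out_size - 1)
    if mode == "asymmetric":
        # Plain scaling from the origin, no center offset.
        return dst_index * in_size / out_size
    raise ValueError("unknown coordinate_transformation_mode: " + mode)


# Upscaling an axis of length 4 to length 8: where does the last output sample land?
for mode in ("half_pixel", "align_corners", "asymmetric"):
    print(mode, source_coordinate(7, 4, 8, mode))
```

The resulting fractional coordinate is then passed to the chosen interpolation method, which is why the same method can produce visibly different edges under different modes.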

tvm.relay.image.resize3d(data, size, layout='NCDHW', method='trilinear', coordinate_transformation_mode='half_pixel', out_dtype=None)

Image resize 3D operator.

This operator takes data as input and scales it in 3D to the given size. In the default case, where the layout is NCDHW with data of shape (n, c, d, h, w), out will have a shape (n, c, size[0], size[1], size[2]).

method indicates the algorithm used to calculate the output values; it can be one of (“trilinear”, “nearest_neighbor”).

Parameters
  • data (relay.Expr) – The input data to the operator.

  • size (Tuple of Expr) – The out size to which the image will be resized.

  • layout (str, optional) – Layout of the input.

  • method (str, optional) – Scale method to use [nearest_neighbor, trilinear].

  • coordinate_transformation_mode (string, optional) – Describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor. [half_pixel, align_corners, asymmetric]

  • out_dtype (str, optional) – Type to return. If left None returns the same type as input.

Returns

result – The resized result.

Return type

relay.Expr