.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "how_to/tune_with_autotvm/tune_conv2d_cuda.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_how_to_tune_with_autotvm_tune_conv2d_cuda.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_how_to_tune_with_autotvm_tune_conv2d_cuda.py:


Tuning High Performance Convolution on NVIDIA GPUs
=========================================================================
**Author**: `Lianmin Zheng <https://github.com/merrymercy>`_

This is an advanced tutorial on writing a high-performance tunable template for
NVIDIA GPUs. By running the auto-tuner on this template, we can outperform the
vendor-provided library cuDNN in many cases.

Note that this tutorial will not run as-is on Windows or recent versions of macOS. To
get it to run there, wrap the body of this tutorial in an :code:`if
__name__ == "__main__":` block.
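
The guard mentioned above can be sketched as follows. This is a minimal
illustration, not part of the original script: ``run_tutorial`` is a
hypothetical placeholder for the tutorial body, and the comment about the
"spawn" start method is an assumption about why the guard is needed on those
platforms.

.. code-block:: python

    import multiprocessing


    def run_tutorial():
        """Hypothetical placeholder for the tutorial body (template, task, tuner)."""
        # In a real script, the code from the following sections would go here.
        return "done"


    if __name__ == "__main__":
        # Assumption: child processes started with the "spawn" method re-import
        # this module, so the guard keeps them from re-running the tuning code.
        print(multiprocessing.get_start_method(), run_tutorial())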

.. GENERATED FROM PYTHON SOURCE LINES 32-50

Install dependencies
--------------------
To use the autotvm package in TVM, we need to install some extra dependencies
(change "3" to "2" if you use Python 2):

.. code-block:: bash

  pip3 install --user psutil xgboost tornado cloudpickle

To make TVM run faster during tuning, it is recommended to use Cython
as the FFI of TVM. In the root directory of TVM, execute:

.. code-block:: bash

  pip3 install --user cython
  sudo make cython3

Now return to the Python code and import the packages.

.. GENERATED FROM PYTHON SOURCE LINES 50-62

.. code-block:: default


    import logging
    import sys
    import numpy as np

    import tvm
    from tvm import te, topi, testing
    from tvm.topi.testing import conv2d_nchw_python
    import tvm.testing

    from tvm import autotvm


.. GENERATED FROM PYTHON SOURCE LINES 63-85

Step 1:  Define the search space
--------------------------------
There are plenty of useful schedule primitives in TVM. Several tutorials
describe them in more detail, such as:

1. :ref:`opt-conv-gpu`
2. `Optimizing DepthwiseConv on NVIDIA GPU <https://tvm.apache.org/2017/08/22/Optimize-Deep-Learning-GPU-Operators-with-TVM-A-Depthwise-Convolution-Example>`_

However, their implementations are manually tuned for some special input
shapes. In this section, we build a large enough space to cover
the techniques used in these tutorials. Then we rely on the efficient auto-tuner
to search through this space and pick some good configurations.

If you are familiar with writing CUDA schedules, you will find that the following
template is very general. In fact, this template can be easily modified
to tune other operators such as depthwise convolution and GEMM.
To fully understand this template, you should be familiar with
the schedule primitives and the auto-tuning API. You can refer to the above
tutorials and the :ref:`autotvm tutorial <tutorial-autotvm-matmul-x86>`.

It is worth noting that the search space for a conv2d operator
can be very large (on the order of 10^9 for some input shapes).
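
To get a feel for where such numbers come from, the sketch below counts the
candidates for the layer tuned in Step 2. It assumes that each split knob
enumerates every ordered factorization of the axis extent into
``num_outputs`` factors; ``prime_exponents`` and ``count_splits`` are
illustrative helpers, not part of the TVM API.

.. code-block:: python

    from math import comb


    def prime_exponents(n):
        """Return the exponents in the prime factorization of n."""
        exponents = []
        p = 2
        while p * p <= n:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if e:
                exponents.append(e)
            p += 1
        if n > 1:
            exponents.append(1)
        return exponents


    def count_splits(extent, num_outputs):
        """Count ordered ways to write extent as a product of num_outputs factors."""
        total = 1
        for e in prime_exponents(extent):
            # Stars and bars: distribute e copies of a prime over num_outputs slots.
            total *= comb(e + num_outputs - 1, num_outputs - 1)
        return total


    # Shapes of the layer tuned in Step 2: CO = CI = 512, H = W = 7, KH = KW = 3.
    space_size = (
        count_splits(512, 4)  # tile_f
        * count_splits(7, 4)  # tile_y
        * count_splits(7, 4)  # tile_x
        * count_splits(512, 3)  # tile_rc
        * count_splits(3, 3)  # tile_ry
        * count_splits(3, 3)  # tile_rx
        * 3  # auto_unroll_max_step
        * 2  # unroll_explicit
    )
    print(space_size)  # 10454400

For this small 7x7 layer the space has about 10^7 points; axes with more
divisors grow the count quickly.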


.. GENERATED FROM PYTHON SOURCE LINES 85-175

.. code-block:: default


    @autotvm.template("tutorial/conv2d_no_batching")
    def conv2d_no_batching(N, H, W, CO, CI, KH, KW, stride, padding):
        assert N == 1, "Only consider batch_size = 1 in this template"

        data = te.placeholder((N, CI, H, W), name="data")
        kernel = te.placeholder((CO, CI, KH, KW), name="kernel")
        conv = topi.nn.conv2d_nchw(data, kernel, stride, padding, dilation=1, out_dtype="float32")
        s = te.create_schedule([conv.op])

        ##### space definition begin #####
        n, f, y, x = s[conv].op.axis
        rc, ry, rx = s[conv].op.reduce_axis

        cfg = autotvm.get_config()
        cfg.define_split("tile_f", f, num_outputs=4)
        cfg.define_split("tile_y", y, num_outputs=4)
        cfg.define_split("tile_x", x, num_outputs=4)
        cfg.define_split("tile_rc", rc, num_outputs=3)
        cfg.define_split("tile_ry", ry, num_outputs=3)
        cfg.define_split("tile_rx", rx, num_outputs=3)
        cfg.define_knob("auto_unroll_max_step", [0, 512, 1500])
        cfg.define_knob("unroll_explicit", [0, 1])
        ##### space definition end #####

        # inline padding
        pad_data = s[conv].op.input_tensors[0]
        s[pad_data].compute_inline()
        data, raw_data = pad_data, data

        output = conv
        OL = s.cache_write(conv, "local")

        # create cache stage
        AA = s.cache_read(data, "shared", [OL])
        WW = s.cache_read(kernel, "shared", [OL])
        AL = s.cache_read(AA, "local", [OL])
        WL = s.cache_read(WW, "local", [OL])

        # tile and bind spatial axes
        n, f, y, x = s[output].op.axis
        bf, vf, tf, fi = cfg["tile_f"].apply(s, output, f)
        by, vy, ty, yi = cfg["tile_y"].apply(s, output, y)
        bx, vx, tx, xi = cfg["tile_x"].apply(s, output, x)
        kernel_scope = n  # this is the scope to attach global config inside this kernel

        s[output].bind(bf, te.thread_axis("blockIdx.z"))
        s[output].bind(by, te.thread_axis("blockIdx.y"))
        s[output].bind(bx, te.thread_axis("blockIdx.x"))
        s[output].bind(vf, te.thread_axis("vthread"))
        s[output].bind(vy, te.thread_axis("vthread"))
        s[output].bind(vx, te.thread_axis("vthread"))
        s[output].bind(tf, te.thread_axis("threadIdx.z"))
        s[output].bind(ty, te.thread_axis("threadIdx.y"))
        s[output].bind(tx, te.thread_axis("threadIdx.x"))
        s[output].reorder(n, bf, by, bx, vf, vy, vx, tf, ty, tx, fi, yi, xi)
        s[OL].compute_at(s[output], tx)

        # tile reduction axes
        n, f, y, x = s[OL].op.axis
        rc, ry, rx = s[OL].op.reduce_axis
        rco, rcm, rci = cfg["tile_rc"].apply(s, OL, rc)
        ryo, rym, ryi = cfg["tile_ry"].apply(s, OL, ry)
        rxo, rxm, rxi = cfg["tile_rx"].apply(s, OL, rx)
        s[OL].reorder(rco, ryo, rxo, rcm, rym, rxm, rci, ryi, rxi, n, f, y, x)

        s[AA].compute_at(s[OL], rxo)
        s[WW].compute_at(s[OL], rxo)
        s[AL].compute_at(s[OL], rxm)
        s[WL].compute_at(s[OL], rxm)

        # cooperative fetching
        for load in [AA, WW]:
            n, f, y, x = s[load].op.axis
            fused = s[load].fuse(n, f, y, x)
            tz, fused = s[load].split(fused, nparts=cfg["tile_f"].size[2])
            ty, fused = s[load].split(fused, nparts=cfg["tile_y"].size[2])
            tx, fused = s[load].split(fused, nparts=cfg["tile_x"].size[2])
            s[load].bind(tz, te.thread_axis("threadIdx.z"))
            s[load].bind(ty, te.thread_axis("threadIdx.y"))
            s[load].bind(tx, te.thread_axis("threadIdx.x"))

        # tune unroll
        s[output].pragma(kernel_scope, "auto_unroll_max_step", cfg["auto_unroll_max_step"].val)
        s[output].pragma(kernel_scope, "unroll_explicit", cfg["unroll_explicit"].val)

        return s, [raw_data, kernel, conv]

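The four-way splits of the spatial axes in the template can be pictured in
plain Python. The sketch below is only an illustration under the assumption
that a leading ``-1`` in a split stands for the factor implied by the other
sizes; ``resolve_sizes`` and ``decompose`` are hypothetical helpers, and
``[-1, 4, 4, 2]`` is just an example value for ``tile_f``.

.. code-block:: python

    def resolve_sizes(extent, sizes):
        """Replace the leading -1 by the factor implied by the remaining sizes."""
        inner = 1
        for s in sizes[1:]:
            inner *= s
        assert extent % inner == 0, "tile sizes must divide the axis extent"
        return [extent // inner] + list(sizes[1:])


    def decompose(index, sizes):
        """Map a flat index to (block, vthread, thread, inner) coordinates."""
        coords = []
        for s in reversed(sizes):
            coords.append(index % s)
            index //= s
        return tuple(reversed(coords))


    sizes = resolve_sizes(512, [-1, 4, 4, 2])  # example tile_f value
    print(sizes)  # [16, 4, 4, 2]
    print(decompose(300, sizes))  # (9, 1, 2, 0)

Here output channel 300 would be handled by block 9, virtual thread 1,
thread 2, at inner position 0 along the ``f`` axis.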

.. GENERATED FROM PYTHON SOURCE LINES 176-183

Step 2:  Search through the space
---------------------------------
We pick the last layer of ResNet as the test case.
Since our space is very large, :code:`XGBTuner` is the most suitable tuner
for our case. Here we only do 20 trials for demonstration.
In practice, running 1000 trials usually finds some good kernels
for this template.
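
The tuner reports the throughput of each trial in GFLOPS. As a rough guide to
reading those numbers, the sketch below estimates the work in this layer using
the common convention of two FLOPs per multiply-accumulate. AutoTVM's own
counter may differ slightly; ``conv2d_flops`` is an illustrative helper and the
kernel time is a made-up value.

.. code-block:: python

    def conv2d_flops(N, H, W, CO, CI, KH, KW, strides, padding):
        """Approximate FLOPs of a direct NCHW conv2d (2 per multiply-accumulate)."""
        out_h = (H + 2 * padding[0] - KH) // strides[0] + 1
        out_w = (W + 2 * padding[1] - KW) // strides[1] + 1
        return 2 * N * CO * out_h * out_w * CI * KH * KW


    flops = conv2d_flops(1, 7, 7, 512, 512, 3, 3, (1, 1), (1, 1))
    print(flops)  # 231211008

    # Hypothetical measured kernel time, for illustration only.
    seconds = 1e-4
    print(flops / seconds / 1e9)  # throughput in GFLOPS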

.. GENERATED FROM PYTHON SOURCE LINES 183-212

.. code-block:: default


    # logging config (for printing tuning log to screen)
    logging.getLogger("autotvm").setLevel(logging.DEBUG)
    logging.getLogger("autotvm").addHandler(logging.StreamHandler(sys.stdout))

    # the last layer in resnet
    N, H, W, CO, CI, KH, KW, strides, padding = 1, 7, 7, 512, 512, 3, 3, (1, 1), (1, 1)
    task = autotvm.task.create(
        "tutorial/conv2d_no_batching", args=(N, H, W, CO, CI, KH, KW, strides, padding), target="cuda"
    )
    print(task.config_space)

    # Use the local GPU and repeat each measurement 3 times to reduce variance.
    # The timeout for compiling a program is 10 seconds; the timeout for running is 4 seconds.
    measure_option = autotvm.measure_option(
        builder=autotvm.LocalBuilder(),
        runner=autotvm.LocalRunner(repeat=3, min_repeat_ms=100, timeout=4),
    )

    # Begin tuning, log records to file `conv2d.log`
    # During tuning we will also try many invalid configs, so you are expected to
    # see many error reports. As long as you can see non-zero GFLOPS, it is okay.
    tuner = autotvm.tuner.XGBTuner(task)
    tuner.tune(
        n_trial=20,
        measure_option=measure_option,
        callbacks=[autotvm.callback.log_to_file("conv2d.log")],
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ConfigSpace (len=10454400, space_map=
       0 tile_f: Split(policy=factors, product=512, num_outputs=4) len=220
       1 tile_y: Split(policy=factors, product=7, num_outputs=4) len=4
       2 tile_x: Split(policy=factors, product=7, num_outputs=4) len=4
       3 tile_rc: Split(policy=factors, product=512, num_outputs=3) len=55
       4 tile_ry: Split(policy=factors, product=3, num_outputs=3) len=3
       5 tile_rx: Split(policy=factors, product=3, num_outputs=3) len=3
       6 auto_unroll_max_step: OtherOption([0, 512, 1500]) len=3
       7 unroll_explicit: OtherOption([0, 1]) len=2
    )
    waiting for device...
    device available
    Get devices for measurement successfully!
    No: 1   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 4, 2]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 128, 2]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 1)],None,7999494
    No: 2   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 8, 2]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 1, 64]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,5194279
    No: 3   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 8, 32, 1]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 2, 64]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9069983
    No: 4   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 16, 16, 1]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 16, 32]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,736818
    No: 5   GFLOPS: 0.00/0.00       result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 4, 32]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 1, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2885496
    No: 6   GFLOPS: 42.35/42.35     result: MeasureResult(costs=(0.005466644736842105,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.599562644958496, timestamp=1658865455.9685578)        [('tile_f', [-1, 1, 1, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 4, 4]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,3754080
    No: 7   GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 16, 32]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 256, 1]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6225319
    No: 8   GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 1, 32]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 8, 64]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,943546
    No: 9   GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 4, 16, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 16, 32]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2868708
    No: 10  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 142, in build
        res = future.result()
      File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
        return self.__get_result()
      File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
        raise self._exception
      File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/workspace/python/tvm/contrib/popen_pool.py", line 404, in <lambda>
        worker = lambda *args: self._worker_run(*args)
      File "/workspace/python/tvm/contrib/popen_pool.py", line 373, in _worker_run
        return proc.recv()
      File "/workspace/python/tvm/contrib/popen_pool.py", line 297, in recv
        raise TimeoutError()
    TimeoutError

            [('tile_f', [-1, 32, 2, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 4, 2]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4691833
    No: 11  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 2, 64]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 4, 4]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,1042124
    No: 12  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 32, 1, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 32, 16]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,10013405
    No: 13  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 8, 8, 2]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 1, 7, 1]), ('tile_rc', [-1, 4, 32]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6732082
    No: 14  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 4, 32]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 1, 1, 1]), ('tile_rc', [-1, 4, 128]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 1)],None,7536735
    No: 15  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 1, 4]), ('tile_y', [-1, 1, 1, 7]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 128, 4]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],None,482121
    No: 16  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 2, 1, 16]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 32, 8]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],None,2824525
    No: 17  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 64, 1, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 8, 8]), ('tile_ry', [-1, 1, 3]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,4559286
    No: 18  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 588, in __call__
        func, arg_info = _build_func_common(measure_input, self.runtime, **kwargs)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 540, in _build_func_common
        func = build(s, args, target_host=task.target_host, runtime=runtime)
      File "/workspace/python/tvm/driver/build_module.py", line 228, in build
        input_mod = lower(inputs, args, name=name, binds=binds)
      File "/workspace/python/tvm/driver/build_module.py", line 134, in lower
        return ffi.lower_schedule(inp, args, name, binds, simple_mode)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 276, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel

    Traceback (most recent call last):
      24: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      23: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      22: Call
            at ../include/tvm/runtime/packed_func.h:1213
      21: operator()
            at ../include/tvm/runtime/packed_func.h:1731
      20: unpack_call<tvm::IRModule, 5, tvm::<lambda(tvm::te::Schedule, const tvm::runtime::Array<tvm::runtime::ObjectRef>&, const tvm::runtime::String&, const tvm::runtime::Map<tvm::te::Tensor, tvm::tir::Buffer>&, bool)> >
            at ../include/tvm/runtime/packed_func.h:1671
      19: run<>
            at ../include/tvm/runtime/packed_func.h:1631
      18: run<tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      17: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      16: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      15: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1631
      14: run<tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_, tvm::runtime::TVMMovableArgValueWithContext_>
            at ../include/tvm/runtime/packed_func.h:1646
      13: operator()
            at ../src/driver/driver_api.cc:391
      12: tvm::LowerSchedule(tvm::te::Schedule, tvm::runtime::Array<tvm::runtime::ObjectRef, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<tvm::te::Tensor, tvm::tir::Buffer, std::hash<tvm::te::Tensor>, std::equal_to<tvm::te::Tensor>, std::allocator<std::pair<tvm::te::Tensor const, tvm::tir::Buffer> > > const&, bool)
            at ../src/driver/driver_api.cc:378
      11: tvm::LowerWithPassList(tvm::IRModule, tvm::runtime::Array<tvm::transform::Pass, void>)
            at ../src/driver/driver_api.cc:278
      10: tvm::transform::Pass::operator()(tvm::IRModule) const
            at ../src/ir/transform.cc:258
      9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      8: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:453
      7: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/ir/transform.cc:274
      6: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
            at ../src/tir/ir/transform.cc:100
      5: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::operator()(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext) const
            at ../include/tvm/runtime/packed_func.h:1750
      4: tvm::tir::PrimFunc tvm::runtime::detail::typed_packed_call_dispatcher<tvm::tir::PrimFunc>::run<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::runtime::PackedFunc const&, tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&)
            at ../include/tvm/runtime/packed_func.h:1694
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext>(tvm::tir::PrimFunc&&, tvm::IRModule&&, tvm::transform::PassContext&&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/c_runtime_api.cc:534
      File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 871, in verify_pass
        raise InstantiationError("Skipped because of invalid gpu kernel")
    tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel        [('tile_f', [-1, 1, 32, 16]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 1, 512]), ('tile_ry', [-1, 3, 1]), ('tile_rx', [-1, 3, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9677544
    No: 19  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 738, in __call__
        yield remote, remote.load_module(os.path.split(build_result.filename)[1])
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 702, in run_through_rpc
        costs = time_f(*args).results
      File "/workspace/python/tvm/runtime/module.py", line 351, in evaluator
        blob = feval(*args)
      File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
      File "tvm/_ffi/_cython/./packed_func.pxi", line 262, in tvm._ffi._cy3.core.FuncCall
      File "tvm/_ffi/_cython/./packed_func.pxi", line 251, in tvm._ffi._cy3.core.FuncCall3
      File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      4: TVMFuncCall
            at ../src/runtime/c_runtime_api.cc:477
      3: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      2: tvm::runtime::RPCWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../src/runtime/rpc/rpc_module.cc:129
      1: tvm::runtime::RPCClientSession::CallFunc(void*, TVMValue const*, int const*, int, std::function<void (tvm::runtime::TVMArgs)> const&)
            at ../src/runtime/rpc/rpc_endpoint.cc:1009
      0: tvm::runtime::RPCEndpoint::CallFunc(void*, TVMValue const*, int const*, int, std::function<void (tvm::runtime::TVMArgs)>)
            at ../src/runtime/rpc/rpc_endpoint.cc:801
      File "../src/runtime/rpc/rpc_endpoint.cc", line 801
    TVMError: 
    ---------------------------------------------------------------
    An error occurred during the execution of TVM.
    For more information, please see: https://tvm.apache.org/docs/errors.html
    ---------------------------------------------------------------
      Check failed: (code == RPCCode::kReturn) is false: code=kShutdown

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 702, in run_through_rpc
        costs = time_f(*args).results
      File "/usr/lib/python3.7/contextlib.py", line 130, in __exit__
        self.gen.throw(type, value, traceback)
      File "/workspace/python/tvm/autotvm/measure/measure_methods.py", line 742, in __call__
        remote.remove(build_result.filename)
      File "/workspace/python/tvm/rpc/client.py", line 143, in remove
        self._remote_funcs["remove"] = self.get_function("tvm.rpc.server.remove")
      File "/workspace/python/tvm/rpc/client.py", line 71, in get_function
        return self._sess.get_function(name)
      File "/workspace/python/tvm/runtime/module.py", line 171, in get_function
        self.handle, c_str(name), ctypes.c_int(query_imports), ctypes.byref(ret_handle)
      File "/workspace/python/tvm/_ffi/base.py", line 348, in check_call
        raise get_last_ffi_error()
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      52: 0xffffffffffffffff
      51: _start
      50: __libc_start_main
      49: _Py_UnixMain
      48: 0x0000000000650da0
      47: 0x0000000000650afa
      46: _PyFunction_FastCallDict
      45: _PyEval_EvalCodeWithName
      44: _PyEval_EvalFrameDefault
      43: _PyFunction_FastCallKeywords
      42: _PyEval_EvalCodeWithName
      41: _PyEval_EvalFrameDefault
      40: _PyMethodDef_RawFastCallKeywords
      39: 0x0000000000546369
      38: _PyEval_EvalCodeWithName
      37: _PyEval_EvalFrameDefault
      36: _PyFunction_FastCallKeywords
      35: _PyEval_EvalCodeWithName
      34: _PyEval_EvalFrameDefault
      33: _PyFunction_FastCallDict
      32: _PyEval_EvalCodeWithName
      31: _PyEval_EvalFrameDefault
      30: _PyObject_FastCallDict
      29: 0x00000000004c06e1
      28: _PyFunction_FastCallDict
      27: _PyEval_EvalFrameDefault
      26: _PyMethodDescr_FastCallKeywords
      25: 0x00000000005dcb58
      24: 0x00000000005dc83f
      23: 0x00000000004ba127
      22: _PyEval_EvalFrameDefault
      21: _PyFunction_FastCallKeywords
      20: _PyEval_EvalFrameDefault
      19: _PyFunction_FastCallKeywords
      18: _PyEval_EvalFrameDefault
      17: _PyFunction_FastCallKeywords
      16: _PyEval_EvalCodeWithName
      15: _PyEval_EvalFrameDefault
      14: 0x0000000000537c30
      13: _PyObject_FastCallKeywords
      12: 0x00007f517fe40fa2
      11: _ctypes_callproc
      10: ffi_call
      9: ffi_call_unix64
      8: TVMModGetFunction
            at ../src/runtime/c_runtime_api.cc:408
      7: tvm::runtime::ModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
            at ../src/runtime/module.cc:66
      6: tvm::runtime::RPCModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)
            at ../src/runtime/rpc/rpc_module.cc:181
      5: tvm::runtime::RPCClientSession::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
            at ../src/runtime/rpc/rpc_endpoint.cc:1004
      4: tvm::runtime::TVMRetValue tvm::runtime::RPCEndpoint::SysCallRemote<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(tvm::runtime::RPCCode, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
            at ../src/runtime/rpc/rpc_endpoint.h:211
      3: tvm::runtime::TVMRetValue tvm::runtime::PackedFunc::operator()<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(int&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const
            at ../include/tvm/runtime/packed_func.h:1618
      2: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
            at ../include/tvm/runtime/packed_func.h:1217
      1: Call
            at ../include/tvm/runtime/packed_func.h:1213
      0: operator()
            at ../src/runtime/rpc/rpc_endpoint.cc:681
      File "../src/runtime/rpc/rpc_endpoint.cc", line 681
    TVMError: 
    ---------------------------------------------------------------
    An error occurred during the execution of TVM.
    For more information, please see: https://tvm.apache.org/docs/errors.html
    ---------------------------------------------------------------
      Check failed: (code == RPCCode::kReturn) is false: code=1

    Traceback (most recent call last):
      52: 0xffffffffffffffff
      51: _start
      50: __libc_start_main
      49: _Py_UnixMain
      48: 0x0000000000650da0
      47: 0x0000000000650afa
      46: _PyFunction_FastCallDict
      45: _PyEval_EvalCodeWithName
      44: _PyEval_EvalFrameDefault
      43: _PyFunction_FastCallKeywords
      42: _PyEval_EvalCodeWithName
      41: _PyEval_EvalFrameDefault
      40: _PyMethodDef_RawFastCallKeywords
      39: 0x0000000000546369
      38: _PyEval_EvalCodeWithName
      37: _PyEval_EvalFrameDefault
      36: _PyFunction_FastCallKeywords
      35: _PyEval_EvalCodeWithName
      34: _PyEval_EvalFrameDefault
      33: _PyFunction_FastCallDict
      32: _PyEval_EvalCodeWithName
      31: _PyEval_EvalFrameDefault
      30: _PyObject_FastCallDict
      29: 0x00000000004c06e1
      28: _PyFunction_FastCallDict
      27: _PyEval_EvalFrameDefault
      26: _PyMethodDescr_FastCallKeywords
      25: 0x00000000005dcb58
      24: 0x00000000005dc83f
      23: 0x00000000004ba127
      22: _PyEval_EvalFrameDefault
      21: _PyFunction_FastCallKeywords
      20: _PyEval_EvalFrameDefault
      19: _PyFunction_FastCall      [('tile_f', [-1, 8, 2, 16]), ('tile_y', [-1, 7, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 1, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 0), ('unroll_explicit', 1)],None,6390073
    No: 20  GFLOPS: 144.22/144.22   result: MeasureResult(costs=(0.00160524616,), error_no=MeasureErrorNo.NO_ERROR, all_cost=1.4239659309387207, timestamp=1658865481.9063) [('tile_f', [-1, 1, 4, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9881539

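Each trial in the output above starts with a progress prefix of the form
``No: <trial>  GFLOPS: <current>/<best>``. A trial reported at 0.00 GFLOPS
produced no usable measurement, for example because the sampled configuration
was skipped as an invalid GPU kernel. As a rough sketch, the progress lines
can be summarized with plain Python. The helper name ``summarize_trials`` and
the regular expression are illustrative assumptions and are not part of AutoTVM.

.. code-block:: python

    import re

    # Matches the progress prefix printed for each trial,
    # e.g. "No: 20  GFLOPS: 144.22/144.22".
    TRIAL_RE = re.compile(r"No:\s*(\d+)\s+GFLOPS:\s*([\d.]+)/([\d.]+)")


    def summarize_trials(log_text):
        """Return (trial_no, gflops) pairs for every progress line found."""
        trials = []
        for match in TRIAL_RE.finditer(log_text):
            trials.append((int(match.group(1)), float(match.group(2))))
        return trials


    # Sample lines abbreviated from the output above.
    sample = """
    No: 17  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
    No: 18  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
    No: 19  GFLOPS: 0.00/42.35      result: Traceback (most recent call last):
    No: 20  GFLOPS: 144.22/144.22   result: MeasureResult(...)
    """

    trials = summarize_trials(sample)
    # Keep only the trials that produced a non-zero measurement.
    succeeded = [(no, gflops) for no, gflops in trials if gflops > 0]
    print("trials parsed:", len(trials))
    print("successful trials:", succeeded)

In this excerpt only trial 20 yields a valid measurement, and it is also the
configuration reported as the best config later on.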

.. GENERATED FROM PYTHON SOURCE LINES 213-215

Finally, we can inspect the best config from the log file, check correctness,
and measure the running time.

.. GENERATED FROM PYTHON SOURCE LINES 215-245

.. code-block:: default


    # inspect the best config
    dispatch_context = autotvm.apply_history_best("conv2d.log")
    best_config = dispatch_context.query(task.target, task.workload)
    print("\nBest config:")
    print(best_config)

    # apply history best from log file
    with autotvm.apply_history_best("conv2d.log"):
        with tvm.target.Target("cuda"):
            s, arg_bufs = conv2d_no_batching(N, H, W, CO, CI, KH, KW, strides, padding)
            func = tvm.build(s, arg_bufs)

    # check correctness
    a_np = np.random.uniform(size=(N, CI, H, W)).astype(np.float32)
    w_np = np.random.uniform(size=(CO, CI, KH, KW)).astype(np.float32)
    c_np = conv2d_nchw_python(a_np, w_np, strides, padding)

    dev = tvm.cuda()
    a_tvm = tvm.nd.array(a_np, device=dev)
    w_tvm = tvm.nd.array(w_np, device=dev)
    c_tvm = tvm.nd.empty(c_np.shape, device=dev)
    func(a_tvm, w_tvm, c_tvm)

    tvm.testing.assert_allclose(c_np, c_tvm.numpy(), rtol=1e-2)

    # Evaluate running time. Here we average over a large number of runs (number=400) to
    # reduce the noise and the overhead of kernel launch. You can also use nvprof to validate the result.
    evaluator = func.time_evaluator(func.entry_name, dev, number=400)
    print("Time cost of this operator: %f" % evaluator(a_tvm, w_tvm, c_tvm).mean)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Finish loading 20 records

    Best config:
    [('tile_f', [-1, 1, 4, 1]), ('tile_y', [-1, 1, 1, 1]), ('tile_x', [-1, 7, 1, 1]), ('tile_rc', [-1, 4, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 3]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 1)],None,9881539
    Finish loading 20 records
    Time cost of this operator: 0.002014

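To relate the measured time to the GFLOPS figures in the tuning log, the
operation count of the convolution can be divided by the running time. The
sketch below assumes the workload is batch 1, a 7x7 input, 512 input and
output channels, a 3x3 kernel, stride 1, and padding 1, and it counts one
multiply and one add per multiply-accumulate. ``conv2d_flops`` is an
illustrative helper, not a TVM function, and AutoTVM's own operation count
may differ slightly.

.. code-block:: python

    def conv2d_flops(n, h, w, co, ci, kh, kw, stride, padding):
        """Count multiply and add operations of an NCHW conv2d (assumed formula)."""
        oh = (h + 2 * padding - kh) // stride + 1
        ow = (w + 2 * padding - kw) // stride + 1
        return 2 * n * co * oh * ow * ci * kh * kw


    # Assumed workload shape; adjust it to match your own task.
    flops = conv2d_flops(1, 7, 7, 512, 512, 3, 3, stride=1, padding=1)

    # Mean running time in seconds, copied from the output above.
    mean_time = 0.002014

    gflops = flops / mean_time / 1e9
    print("GFLOPS: %.2f" % gflops)

The value obtained this way can be lower than the best GFLOPS in the tuning
log, because the two numbers come from separate measurements.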

.. _sphx_glr_download_how_to_tune_with_autotvm_tune_conv2d_cuda.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: tune_conv2d_cuda.py <tune_conv2d_cuda.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: tune_conv2d_cuda.ipynb <tune_conv2d_cuda.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_