- 17 Jun, 2022 6 commits
-
-
Yifan Zhao authored
-
Yifan Zhao authored
InitThreadBind size check failed as extents are symbolic; we print warnings instead
-
Yifan Zhao authored
-
Yifan Zhao authored
feature.cc is not fixed; testing codegen now
-
Yifan Zhao authored
-
Yifan Zhao authored
-
- 16 Jun, 2022 2 commits
-
-
Yifan Zhao authored
-
Yifan Zhao authored
-
- 04 Feb, 2022 5 commits
-
-
Masahiro Masuda authored
* works on resnet18 and deeplabv3 * yolo5 conversion worked * fixed sigmoid * [Torch] Support clamp_min, clamp_max * fixed clamp_min * fixed quantize for 1 dim input * cleanup * improve inline_qparam impl * add clamp_min/max test * add fx quant test * cleanup * skip build in testing * black * improve clamp conversion * leave TODO on inf handling
-
Hongyi Jin authored
-
AndrewZhaoLuo authored
* initial commit * initial commit * update test * jostle
-
Matthew Brookhart authored
* fix static strided slice shape func for out-of-bounds negative stride slicing * Trigger CI * Trigger CI
-
Leo-arm authored
There is some clash with the int8 support that went in at the same time.
-
- 03 Feb, 2022 13 commits
-
-
Elen Kalda authored
Currently the network tests run on only one NPU variant; this patch enables them on all currently supported variants.
Change-Id: Ic18054fb4ba19eb28b99ff4439e5a36e57199763
-
Siva authored
-
Yuanjing Shi authored
-
Siyuan Feng authored
-
lhutton1 authored
Fixes the layout optimizer incorrectly assigning layouts for graphs with more complex topologies than previously considered. Specifically, this commit now ensures that intermediate layouts match (e.g. parent output = child input) and that all consumers are taken into account when altering the output layout, something not done previously due to an incorrect traversal order.
Previously, the input layout was always altered if the producer was an NPU operation, without regard to the output layout of that operation. Additionally, it was possible for the output layout to be set incorrectly: the graph was traversed in depth-first post-order, so not all consumers were necessarily taken into account when altering the layout.
Now the `AnalyzeConsumers` pass is run before `LayoutOptimization`. It determines a mapping from each NPU operation to a list of boolean values that represent whether or not each consumer is an NPU operation. Since this is completed before `LayoutOptimization`, all consumers are guaranteed to be taken into account when altering the output layout. In turn, the input layouts can correctly be determined by checking whether the output of the producer will be altered.
Change-Id: I04e9605da65fa9f12801109dd50c5e3f08cbc73c
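The consumer analysis described in this commit message can be illustrated with a small standalone sketch. It does not use TVM and is not the actual pass: the graph representation (a dict from op name to consumer names), the function names, and the rule "alter the output layout only when every consumer is an NPU op" are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def analyze_consumers(
    consumers: Dict[str, List[str]], npu_ops: Set[str]
) -> Dict[str, List[bool]]:
    """Map each NPU op to flags saying whether each of its consumers is an NPU op."""
    return {op: [c in npu_ops for c in consumers.get(op, [])] for op in npu_ops}


def can_alter_output(op: str, flags: Dict[str, List[bool]]) -> bool:
    """Assumed rule: alter an op's output layout only if it has consumers
    and every one of them is an NPU op."""
    op_flags = flags.get(op, [])
    return bool(op_flags) and all(op_flags)


# Hypothetical graph: conv1 feeds an NPU op and a CPU op; conv2 feeds only pool.
consumers = {"conv1": ["conv2", "cpu_add"], "conv2": ["pool"], "pool": []}
npu_ops = {"conv1", "conv2", "pool"}

flags = analyze_consumers(consumers, npu_ops)

# An op's input layout follows its producer's output decision,
# so intermediate layouts always match.
alter_conv2_input = can_alter_output("conv1", flags)  # producer has a CPU consumer
alter_pool_input = can_alter_output("conv2", flags)   # producer feeds only NPU ops
```

Because the mapping is built for the whole graph before any layout is changed, the decision for each op does not depend on the order in which the graph is later traversed.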
-
Christopher Sidebottom authored
This includes sccache, and these are the less troublesome images. Also see #10120 for update issue.
-
Eric Lunderberg authored
* [TVMScript] Added unit tests demonstrating desired functionality
* [TVMScript] Implemented parsing of T.Ptr[...]
  These can be generated when exporting to TVMScript, but were not parsable after being generated.
* [TVMScript] Updated buffer_var printing
  LetStmt and AllocateNode can both be used to generate handles that are used in Buffer objects. In these cases, the Buffer declarations must go after the handle declaration, not in the function header.
* Moved printing of var and buffer_decl into separate statements.
* Updated following @shingjan's review comments.
-
Matthew Barrett authored
* [microNPU][3] Plan generation for the cascader
  The cascader creates 'Plans' which describe how to schedule subgraphs. As part of the cascading algorithm, it's necessary to explore a large variety of Plans which are Pareto optimal (in terms of memory usage and performance). This is done by the Plan generation algorithm.
  This commit adds the TensorConfig and Plan data structures which hold information on how to schedule the tensors/operators. Additionally, it includes functions to calculate Pareto frontiers which are used to cull sub-optimal Plans.
  Change-Id: Ia358b2a1b29bd810df4441027752ced75812ad4e
* Fixes to lint/test
  Change-Id: If4e083a3c96af75a8ffa72510704818d21a477d9
* Improve python docs
  Change-Id: I831137f8235665bc20ab4c060cc7049ffd48088a
* Fix enum hashing issue with old gcc
  Change-Id: Ifbe97eb33b1ef313710f24c687a8155421a3c195
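The idea of culling sub-optimal Plans with a Pareto frontier can be shown with a minimal sketch. This is not the cascader's implementation: the function names and the `(memory_usage, cycles)` tuple layout are assumptions for illustration, and both values are treated as lower-is-better.

```python
from typing import List, Tuple

# Assumed cost layout: (memory_usage, cycles), both lower-is-better.
Cost = Tuple[int, int]


def dominates(a: Cost, b: Cost) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_cull(costs: List[Cost]) -> List[Cost]:
    """Keep only the costs that no other cost dominates (the Pareto frontier)."""
    return [c for c in costs if not any(dominates(o, c) for o in costs)]


# Hypothetical plan costs.
plans = [(100, 50), (80, 70), (120, 40), (110, 60), (80, 90)]
frontier = pareto_cull(plans)
```

Here `(110, 60)` is dropped because `(100, 50)` is better on both axes, and `(80, 90)` is dropped because `(80, 70)` uses the same memory with fewer cycles. The pairwise check is quadratic in the number of candidates, which is acceptable for a sketch but may not reflect how a production implementation would do it.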
-
Kirill Snezhko authored
* Remove javah support * Remove unused compiler option * Osx pom.xml update
-
cyx666 authored
-
Mehrdad Hessar authored
* Update to zephyr2.7 and Refactor * Temporary for testing * Update cmake version * fix import path and format * Fix test script * address comments * fix path * fix image name
-
wrongtest authored
-
Ashutosh Parkhi authored
-
- 02 Feb, 2022 9 commits
-
-
lhutton1 authored
Adds support for legalizing transpose convolution to a microNPU conv2d operation for the case when strides==(2, 2), dilation==(1, 1) and no padding of the output is required.
Change-Id: I485e2571913b3dcd7c75c46304f2f9a82f630ee0
-
Junru Shao authored
-
Jinkun Lin authored
* Fix layout pass * add unit test * fix lint * fix lint * fix lint
-
David Riazati authored
This was set to 1 day instead of 1 week. cc @areusch
Co-authored-by: driazati <driazati@users.noreply.github.com>
-
David Riazati authored
* Add bot to ping reviewers after no activity * Address comments
Co-authored-by: driazati <driazati@users.noreply.github.com>
-
Jinkun Lin authored
* fix onnx where bcast * jostle ci * jostle ci * jostle ci
-
Siyuan Feng authored
-
Ashutosh Parkhi authored
-
Ligeng Zhu authored
* Update transform.cc * fix capitalize * fix lint * fix lint
-
- 01 Feb, 2022 5 commits
-
-
Margaret Qian authored
* add relay pass to collect fake quantized ops * add more tests * more tests * lint * lint * remove unused imports * update comment * lint * reuse SubgraphExtractor and update test assertions * remove print * lint * remove unneeded comment
Co-authored-by: Margaret Qian <mqian@octoml.ai>
-
Josh Fromm authored
* Changed the python api to support device. * Finished implementation and updated tests. * Fix typo.
-
Matthew Barrett authored
Update from ethos-u-vela 2.1.1 -> 3.2.0
-
Tristan Konolige authored
* [FIX,AUTOTVM] Add backtraces to tuning errors
  Collects tracebacks in LocalBuilder and LocalRunner and adds them to the error messages.
* formatting
* correctly unpack traceback and exception
* add assert
* fix?
* one remaining measureresult
* formatting
* fixed
-
Masahiro Masuda authored
* add conv2d transpose nhwc cudnn test * support conv2d transpose nhwc direct offload to cudnn * add cutlass dgrad support * remove unused arg * allow target none * fix beta initialization condition * disable dynamic dense fp16 test since it fails on cuda 11.6
-