Tengine

Tengine is a lite, high-performance, modular inference engine for embedded devices

APACHE-2.0 License

Stars
4.6K

Tengine - Tengine Lite release v1.5 for NVDLA (latest release)

Published by BUG1989 about 3 years ago

Release v1.5 for NVDLA

Baseline version

  • lite-v1.5

Hardware backend support

  • Zynq UltraScale+ MPSoC ZCU102

Software dependencies

  • Ubuntu 20.04
  • OpenCV 4.2
  • gcc 9.3.0
  • cmake 3.16.3

NVDLA type support

  • Small

NVDLA Operator support

  • Batchnorm
  • Concat
  • Convolution
  • Deconvolution
  • Eltwise
  • FC
  • Pooling
  • ReLU
  • Scale
  • Split

NVDLA Network support

Models | Input Size | Inference Time on ZCU102+NVDLA (ms)
ResNet18 | 3x32x32 | 12.6
YOLOv3-Tiny-ReLU | 3x416x416 | 630.5
YOLOX-Nano-ReLU | 3x416x416 | 1138.8

Tengine NVDLA example support

Reference Documents

The Ubuntu image for ZCU102

Tengine - Tengine Lite release v1.5

Published by BUG1989 about 3 years ago

Release v1.5

New Demos and Examples

  • Pipeline Demos
    • face enroll
    • pedestrian distance estimation
    • arcface
    • centerface
    • scrfd
    • yolo
  • Examples
    • YOLOX
    • Segformer
    • Seghuman
    • Scrfd

New hardware backend support

  • Support NVDLA by OpenDLA

New Tools support

  • Align tool
    • ONNX align tool for comparing the original ONNX model with the tmfile model
  • Convert tools
    • ONNX
    • Caffe
    • MXNet
    • Darknet
    • TensorFlow (WIP)
    • TFLite (WIP)
  • Optimize tools
    • segformer-opt
  • Quantization tools
    • ACIQ
    • DFQ
    • EasyQuant

New Online Documents

  • Move the Markdown files of the online documents into the master branch

New Feature

  • Refactor the Python API

CI/CD

  • Add model test module in CI action
  • Add operator test module in CI action
  • Add Backend devices runner in CI action
    • Khadas VIM3
    • Jetson AGX

P.S.

  • NV GPU: tested on the following devices
    • GeForce RTX 3090
    • GeForce GTX 1080Ti
    • QUADRO RTX 8000
    • Jetson AGX/NX/NANO
  • VeriSilicon NPU: tested on the following devices
    • A311D
    • S905D3
    • RV1109
    • RV1126
    • i.MX 8M Plus
    • JA310
  • NVDLA: tested on the following devices
    • ZCU102

Tengine - Tengine Lite release v1.4 for SuperEdge

Published by BUG1989 about 3 years ago

Release v1.4 for SuperEdge

Baseline version

  • lite-v1.4

Hardware backend support

  • Khadas VIM3 (A311D)

Software dependencies

  • Ubuntu 20.04
  • OpenCV 4.2
  • gcc 9.3.0
  • cmake 3.16.3

NPU Network support

Models | Inference Time on A311D (ms)
MobileNet v1 | 4.3
MobileNet v2 | 5.2
ResNet18 | 5.5
ResNet50 | 14.6
SqueezeNet v1.1 | 2.6
VGG16 | 18.7
YOLOv3 | 78.6
YOLOv5s | 68.9
YOLOX-S | 55.2

Tengine - Tengine Lite release v1.4 for Amlogic

Published by BUG1989 over 3 years ago

Release v1.4 for Amlogic

Baseline version

  • lite-v1.4

Hardware backend support

  • A311D
  • S905D3

NPU Network support

Models | Inference Time on A311D (ms)
MobileNet v1 | 4.3
MobileNet v2 | 5.2
ResNet18 | 5.5
ResNet50 | 14.6
SqueezeNet v1.1 | 2.6
VGG16 | 18.7
YOLOv3 | 78.6
YOLOv5s | 68.9

Tengine - Tengine Lite release v1.4 for Allwinner

Published by BUG1989 over 3 years ago

Release v1.4 for Allwinner

Baseline version

  • lite-v1.4

Hardware backend support

  • D1(RISC-V C906)

CPU Network support

Models
MobileNet v1
MobileNet v2
ResNet18
SqueezeNet v1.1
YOLO-Fastest

Tengine - Tengine Lite release v1.4 for NXP

Published by BUG1989 over 3 years ago

Release v1.4 for NXP

Baseline version

  • lite-v1.4

Hardware backend support

  • i.MX 8M Plus

NPU Network support

Models | Inference Time (ms)
MobileNet v1 | 2.3
MobileNet v2 | 5.1
ResNet18 | 4.5
ResNet50 | 11.7
SqueezeNet v1.1 | 2.5
VGG16 | 22.8
YOLOv3 | 78.2

Tengine - Tengine Lite release v1.4

Published by BUG1989 over 3 years ago

Release v1.4

New hardware backend support

  • Support RISC-V CPU for C906/C910
  • Support NV/AMD/Mali GPU by OpenCL

New training framework model support

  • The tengine-convert-tool now supports PaddlePaddle 2.0 format models; support will be extended based on user requests (please leave your requests in our GitHub issues).

Bug fixes

  • Refactor the register module code
  • Refactor the compile module code to support Visual Studio

CI/CD

  • Add code quality module in CI action

P.S.

  • NV GPU: tested on the following devices
    • GeForce RTX 3090
    • GeForce GTX 1080Ti
    • QUADRO RTX 8000
    • Jetson AGX/NX/NANO
  • VeriSilicon NPU: tested on the following devices
    • A311D
    • S905D3
    • i.MX 8M Plus
    • JA310

Tengine - Tengine Lite release v1.3

Published by BUG1989 over 3 years ago

Release v1.3

New hardware backend support

  • Support NV GPU by CUDA and cuDNN
  • Support NV GPU by TensorRT Plugin
  • Support VeriSilicon NPU by TIM-VX Plugin

New training framework model support

  • The tengine-convert-tool adds tentative support for OneFlow models

Bug fixes

  • Refactor the ACL plugin code to fix compilation and inference bugs on Mali GPU

CI/CD

  • Add code coverage mode in CI action
  • Add model test in CI action, such as classification, detection, recognition and segmentation

P.S.

  • NV GPU: tested on the following devices
    • GeForce RTX 3090
    • GeForce GTX 1080Ti
    • QUADRO RTX 8000
    • Jetson AGX/NX/NANO
  • VeriSilicon NPU: tested on the following devices
    • Khadas VIM3

Tengine - Tengine Lite release v1.2

Published by BUG1989 almost 4 years ago

Release v1.2

New feature

  • CPU affinity API
  • CPU profile tool
  • Inference mode supports Int8 (symmetric, per-channel)
  • Release quantization tools (Int8, UInt8)
  • Support compiling with HarmonyOS
  • Support compiling with Visual Studio 2019
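The "symmetric, per-channel" Int8 mode named above can be illustrated with a small sketch. This is not Tengine's implementation or API: the function names `quantize_per_channel` and `dequantize_per_channel` and the sample weights are hypothetical, and the [-127, 127] clamp is one common convention assumed here. Each output channel gets its own scale, and the zero point is fixed at 0:

```python
def quantize_per_channel(weights):
    """Symmetric int8 quantization with one scale per channel.

    `weights` is a list of channels, each a list of floats.
    Returns (quantized values per channel, scale per channel).
    """
    quantized, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        # Symmetric: the zero point stays at 0 and the range is [-127, 127].
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = [max(-127, min(127, round(w / scale))) for w in channel]
        quantized.append(q)
        scales.append(scale)
    return quantized, scales


def dequantize_per_channel(quantized, scales):
    """Map int8 values back to floats using each channel's scale."""
    return [[v * s for v in ch] for ch, s in zip(quantized, scales)]


# Hypothetical weights: two channels with very different magnitudes.
weights = [[0.5, -1.0, 0.25], [0.01, 0.02, -0.04]]
q, scales = quantize_per_channel(weights)
restored = dequantize_per_channel(q, scales)
```

Because the second channel has its own, much smaller scale, its values still use the full int8 range. With a single per-tensor scale they would collapse to a handful of integer levels.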

New network support

  • alphapose
  • crnn
  • yolov4_tiny

New operator support

  • int8 reference op (experimental)

Performance

  • int8 performance op with armv7/v8 (experimental)
  • int8 performance op with x86-64 (experimental)

Tengine - Tengine Lite release v1.2-pre

Published by BUG1989 almost 4 years ago

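  • Tengine-Lite is opening its first NPU version for developers to try out
    • This pre-release targets the A311D, an Amlogic chip with an NPU. As part of the collaboration, the trial version comes preinstalled on the Khadas VIM3 single-board computer (SBC) (link);
    • To accompany the preinstalled version, the corresponding model conversion and model quantization tools need to be made available to the open-source community;
    • Because third-party intellectual property is involved, the related source code cannot be open-sourced for now;
    • You are welcome to try the latest NPU features supported by Tengine Lite on the Khadas VIM3 (A311D);
    • A true heterogeneous-computing setup built by community expert 闲来, who even made a custom 3D-printed case for it (link);
    • Suggestions are welcome; join our QQ group to discuss!
  • We are also working with more open-source development board and SBC companies, and more boards are on the way, so stay tuned;
  • We are also working with more NPU vendors; anyone interested is welcome to get in touch or join us in building the Tengine open-source ecosystem!

Tengine - Tengine Lite release v1.0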

Published by BUG1989 about 4 years ago

Release v1.0

New feature

  • Dynamic graph segmentation
  • C++ API (experimental)
  • Python API (experimental)
  • Support ARM Mali GPU with ACL
  • Support other GPUs with Vulkan (experimental)
  • Support fp16 inference with armv8.2 (experimental)

New network support

  • landmark
  • yolact
  • openpose
  • yolov4

New operator support

  • uint8 reference op (experimental)
  • mish activation op
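For reference, the mish activation is commonly defined as mish(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x). A minimal sketch of that formula follows; it is an illustration only, not Tengine's operator code, and the helper names `softplus` and `mish` are hypothetical:

```python
import math


def softplus(x):
    # Numerically stable ln(1 + e^x): avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    # mish(x) = x * tanh(softplus(x))
    return x * math.tanh(softplus(x))


samples = [-2.0, 0.0, 1.0, 3.0]
outputs = [mish(x) for x in samples]
```

Mish is smooth and non-monotonic: it passes large positive inputs through almost unchanged, while negative inputs produce small negative outputs rather than a hard zero as in ReLU.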

Performance

  • Improve OpenMP performance

Tengine - lite-v0.1

Published by BUG1989 over 4 years ago

Initial Tengine Lite release v0.1

Tengine -

Published by satosa-z over 4 years ago

Tengine - Separate tengine serializer into tengine-module

Published by satosa-z almost 5 years ago

Tengine - v1.3.2

Published by cyberfire over 5 years ago

Separate the CPU operator implementation and the framework into two shared libraries (.so).
Add a serializer for TFLite and a reference implementation of TFLite ops.
Add RNN/GRU/LSTM reference implementations.

Tengine - Release 1.0.0

Published by cyberfire almost 6 years ago

With the new API 2.0 and a few new features and bug fixes.

Tengine - v0.8.0

Published by cyberfire almost 6 years ago

Android build to run ACL
MSSD can use GPU to accelerate
Android build with c++_shared instead of gnustl_shared

Tengine - v0.7.2

Published by cyberfire almost 6 years ago

Support GPU fp16. Only works with ACL 18.05.
More TensorFlow and ONNX model support.

Tengine - Release 0.5

Published by cyberfire over 6 years ago

This is the first version, which implements many basic features of an inference engine.