armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn

MIT License


armnn - Release 20.02

Published by nikraj01 over 4 years ago

New Features:

  • Added per-channel quantization support for Convolution2d, DepthwiseConvolution2d on the CpuAcc and GpuAcc backends.
  • Added post-optimized network structure for external profiling support.
  • Added inference timeline trace for external profiling support.
  • Added further support to Quantize CpuRef workload to make use of Decoder/Encoder types. This allows for requantize operations.
  • Added NEON support for:
    • SpaceToBatchNd
    • Division
  • Added ElementwiseUnaryLayer with ops for Abs, Exp, Sqrt, RSqrt and Neg. Standalone layers for Abs and RSqrt are now deprecated.
  • Added support for signed quantized data types (QSymmS8 and QAsymmS8).
  • Added sample of standalone dynamic backend.
  • Added DynamicSample as a basic example of using the ArmNN SDK API with the standalone sample dynamic backend.
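
The new ElementwiseUnaryLayer replaces the standalone Abs and RSqrt layers with a single layer parameterised by the operation. A minimal sketch of the reference semantics in plain C++ (illustrative only, not the Arm NN API):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Unary operations covered by the new ElementwiseUnaryLayer.
enum class UnaryOperation { Abs, Exp, Sqrt, Rsqrt, Neg };

// Apply one unary operation to a single element.
float ApplyUnary(UnaryOperation op, float x)
{
    switch (op)
    {
        case UnaryOperation::Abs:   return std::fabs(x);
        case UnaryOperation::Exp:   return std::exp(x);
        case UnaryOperation::Sqrt:  return std::sqrt(x);
        case UnaryOperation::Rsqrt: return 1.0f / std::sqrt(x);
        case UnaryOperation::Neg:   return -x;
    }
    return x;
}

// Reference semantics: apply the operation to every element of the input tensor.
std::vector<float> ElementwiseUnary(UnaryOperation op, const std::vector<float>& input)
{
    std::vector<float> output;
    output.reserve(input.size());
    for (float x : input) { output.push_back(ApplyUnary(op, x)); }
    return output;
}
```

The deprecated standalone Abs and RSqrt layers map onto the same semantics with the corresponding operation selected.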

TfLite Parser:

  • Added support for RESIZE_NEAREST_NEIGHBOR.
  • Added support for DEQUANTIZE.
  • Added support for QUANTIZE.
  • Updated required TfLite version to 1.15.
  • Added support for new quantized data types introduced in TfLite version 1.15.
  • Added support for variable Flatbuffer libs for debug and release.

Tensorflow Parser:

  • Added support for StridedSlice.
  • Added support for Pack/Stack.

ArmNN Serializer/Deserializer:

  • Added support for deserialization to the following ArmNN layers:

    • ElementwiseUnary
    • Resize
    • ResizeBilinear
  • Added support for serialization to the following ArmNN layers:

    • ResizeBilinear

Public API Changes:

  • Added a new API to IRuntime::CreationOptions for passing parameters directly to the backends.

Backend API Changes:

  • Added WorkloadFactoryBase class with default empty implementations. (Analogous to the existing LayerSupportBase class).

Bug Fixes:

  • Fixed ONNX Parser bug where segmentation fault was occurring.
  • Fixed SendCounterPacket hanging for indefinite time.
  • Fixed compilation error when building for Linux (non Android).
  • Fixed issues due to #include Windows.h.
  • Fixed build error on gcc 7+ for implicit switch statement fallthroughs.
  • Fixed issue in Serializer where models with multiple inputs/outputs could be serialized with incorrect binding ids.

Other changes:

  • ~15% reduction in binary size by replacing boost logging with custom lightweight logger.
  • ArmNN can now be built without warnings with -Wextra compile flag.
  • Deprecated DataType::QuantizedSymm8PerAxis. Instead this behaviour can be selected by setting the data type to DataType::QSymmS8 and setting multiple scale values on the TensorInfo quantization parameters.
  • Fixed crash when running ArmNN on a system without OpenCL drivers and a GpuAcc backend is present.
  • Initial documentation implemented via Doxygen.
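
Per-axis symmetric quantization, as selected via DataType::QSymmS8 with multiple scale values on the TensorInfo quantization parameters, keeps one scale per channel with the zero point fixed at 0. The underlying arithmetic can be sketched as follows (plain C++ with hypothetical helper names, not the Arm NN implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Symmetric int8 quantization: zero point is fixed at 0.
// q = clamp(round(x / scale), -128, 127)
int8_t QuantizeSymmS8(float value, float scale)
{
    const float q = std::round(value / scale);
    return static_cast<int8_t>(std::min(127.0f, std::max(-128.0f, q)));
}

// Quantize a [channels x elementsPerChannel] tensor using one scale per channel,
// as per-axis quantization does for convolution weights.
std::vector<int8_t> QuantizePerAxis(const std::vector<float>& values,
                                    const std::vector<float>& scales,
                                    std::size_t elementsPerChannel)
{
    std::vector<int8_t> out(values.size());
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        out[i] = QuantizeSymmS8(values[i], scales[i / elementsPerChannel]);
    }
    return out;
}
```

One scale per output channel lets each filter use the full int8 range, which is why per-channel quantization usually improves accuracy over a single per-tensor scale.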

Known issues:

  • External profiling for ArmNN on the Raspberry Pi platform is currently not fully supported and will result in some External Profiling unit tests failing.

armnn - Release 19.11.1

Published by nikraj01 over 4 years ago

ArmNN 19.11.1 Release Notes

This is an incremental release of ArmNN 19.11 to fix CTS issues.

ArmNN SDK

New Features:

TfLite Parser:

Public API Changes:

  • The IsReshapeSupported(const TensorInfo& input, const ReshapeDescriptor& descriptor, Optional<std::string>& reasonIfUnsupported = EmptyOptional()) overload in ILayerSupport has been deprecated.
  • IsReshapeSupported(const TensorInfo& input, const TensorInfo& output, const ReshapeDescriptor& descriptor, Optional<std::string>& reasonIfUnsupported = EmptyOptional()) has been added to ILayerSupport.

Backend API Changes:

Other changes:

Known issues:

armnn - Release 19.08.01

Published by nikraj01 over 4 years ago

ArmNN 19.08.01 Release Notes

This is an incremental release of ArmNN 19.08 to fix CTS issues.

ArmNN SDK

New Features:

TfLite Parser:

Public API Changes:

Backend API Changes:

Other changes:

Known issues:

Android NNAPI driver

Deprecated features:

New Features:

Other changes:

All errors and crashes occurring on the 19.08 release when running the Android Compliance Test Suite (CTS) R2 on Android 10 (Android Q) have been fixed, including:

  • Driver termination during TestRandomGraph when using GPU acceleration (i.e. ARMNN_COMPUTE_CL_ENABLE:=1).
  • Some TestRandomGraph/RandomGraphTest tests which include CONCATENATION and L2_POOLING_2D operators.
  • Some TestRandomGraph/RandomGraphTest tests which include operators taking the optional data layout argument if the argument is present and set to NCHW.
  • Some TestRandomGraph/RandomGraphTest tests which include operators using FLOAT16 input.
  • Some TestRandomGraph/RandomGraphTest tests which include RESIZE_BILINEAR operators.
  • Some TestRandomGraph/RandomGraphTest tests which include RESIZE operators.
  • Some TestRandomGraph/RandomGraphTest tests which include RESIZE_NEAREST_NEIGHBOR operators.
  • Some TestRandomGraph/RandomGraphTest tests which include SPACE_TO_DEPTH operators.
  • TestRandomGraph/SingleOperationTest#ADD_V1_0/31
  • TestRandomGraph/SingleOperationTest#MUL_V1_0/31
  • TestRandomGraph/SingleOperationTest#SUB_V1_2/31
  • TestRandomGraph/SingleOperationTest#STRIDED_SLICE_V1_2/17
  • TestRandomGraph/SingleOperationTest#PRELU_V1_2/14
  • Several other errors occurring in Activations when debug.nn.partition is set to 2

Backend API Changes:

Known Issues:

armnn - Release 19.11

Published by nikraj01 almost 5 years ago

New Features:

  • Added Abs support to CpuRef, CpuAcc and GpuAcc backend.
  • Added Comparison support to CpuRef, covering the following operations: Equal, Greater, GreaterOrEqual, Less, LessOrEqual, NotEqual. The existing Equal and Greater layers have been refactored in terms of the new Comparison layer.
  • Added Rsqrt support to CpuAcc and GpuAcc backend.
  • Added ArgMinMax support to CpuRef, CpuAcc and GpuAcc backend.
  • Added InstanceNormalization support to CpuRef, CpuAcc and GpuAcc backend.
  • Added LogSoftmax support to CpuRef backend.
  • Added Slice support to CpuAcc backend.
  • Added DepthToSpace support to CpuRef, CpuAcc and GpuAcc backend.
  • Added the StandIn layer, which represents "unknown" or "unsupported" operations in the input graph. The StandIn layer has a configurable number of input and output slots; no workloads are created for it.
  • Added QSymm8PerAxis support for Encoder and Decoder.
  • Added per-channel quantization support for Convolution2d, DepthwiseConvolution2d and TransposeConvolution2d on CpuRef backend.
  • Added FSRCNN support to CpuRef (fp32 and uint8), CpuAcc (fp32 and uint8) and GpuAcc (fp32) backend.
  • Added initial external profiling support. A new ProfilingService class connects to an external profiling service, exchanges an initial set of counter metadata (such as advertising a list of counters the client can select from), and periodically sends the values of the selected counters to the client. The profiling support is compatible with DS-5 and Streamline clients and relies on gatord to forward the packets to the external profiling server.
  • Added utility functions for creating Timeline Packets:
    • Timeline Label Binary Packet
    • Timeline Entity Binary Packet
    • Timeline Event Class Binary Packet
    • Timeline Message Directory Package
    • Timeline Event Binary Packet
  • Added SendTimelinePacket implementation to send Timeline Packets:
    • Timeline Label Binary Packet
    • Timeline Entity Binary Packet
    • Timeline Event Class Binary Packet
    • Timeline Message Directory Package
    • Timeline Event Binary Packet
  • Added TimelineUtilityMethods class to manage profiling entities
  • Added utility function to create a named typed entity
  • Added utility function to create a named typed child entity
  • Added utility function to create a typed label
  • Added utility function to declare a label
  • Added utility function to record an event
  • Added Timeline Decoder
  • Added ITimelineDecoder C interface
  • Added an example implementation of ITimelineDecoder
  • Added command handlers for the timeline directory and objects
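
The Comparison layer introduced in this release subsumes the old Equal and Greater layers. Its per-element reference semantics can be sketched like this (plain C++, illustrative only, not the Arm NN API):

```cpp
#include <cassert>
#include <cstdint>

// Operations covered by the new Comparison layer.
enum class ComparisonOperation { Equal, Greater, GreaterOrEqual, Less, LessOrEqual, NotEqual };

// Reference semantics for one element pair. Arm NN boolean tensors are 8-bit
// unsigned integers: 0 is false, any non-zero value is true.
uint8_t Compare(ComparisonOperation op, float a, float b)
{
    switch (op)
    {
        case ComparisonOperation::Equal:          return a == b ? 1 : 0;
        case ComparisonOperation::Greater:        return a >  b ? 1 : 0;
        case ComparisonOperation::GreaterOrEqual: return a >= b ? 1 : 0;
        case ComparisonOperation::Less:           return a <  b ? 1 : 0;
        case ComparisonOperation::LessOrEqual:    return a <= b ? 1 : 0;
        case ComparisonOperation::NotEqual:       return a != b ? 1 : 0;
    }
    return 0;
}
```

A full Comparison workload applies this element-wise over two (broadcast-compatible) input tensors and writes the boolean result tensor.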

TfLite Parser:

  • Added support for Transpose.
  • Added support for parsing unsupported layers by representing them as a placeholder StandInLayer in the resulting Arm NN network. Please note that such networks will not be executable, as there are no workloads for StandInLayer – its only purpose is to maintain the original network topology.
  • Fixed a bug in parsing custom layers that caused the TfLiteParser to attempt to parse all custom layers as a DetectionPostProcess layer. Now unsupported custom layers are parsed as a StandInLayer – similarly to unsupported built-in layers.
  • Added support for Slice.

Public API Changes:

Backend API Changes:

  • New CreateTensorHandle functions have been added to ITensorHandleFactory to allow for the creation of TensorHandles with unmanaged memory.

Other changes:

  • Modified ExecuteNetwork so that it can generate dummy input data if no input data files are specified. This can be useful when the user is not interested in inference results, but in performance metrics or if they only wish to see whether Arm NN can execute a certain network.
  • CTS bug fix in pooling layers when assessing whether the kernel is solely over padding values.
  • Changed the algorithm for calculating the subgraphs submitted to backends for optimisation, removing dependency cycles and unwanted subgraph splitting.
  • Added Encoder and Decoder support to Dequantize layer.

Known issues:

armnn - Release 19.08

Published by Surmeh about 5 years ago

armnn - Release 19.05

Published by Surmeh over 5 years ago

armnn - Release 19.02

Published by TelmoARM over 5 years ago

New Features:

  • Maximum operator support for CpuRef and CpuAcc backend.
  • Minimum operator support for CpuRef, CpuAcc and GpuAcc backend.
  • Maximum operator support for TensorFlow parser.
  • Pad operator support for TensorFlow parser.
  • ExpandDims operator support for TensorFlow parser.
  • Sub operator support for TensorFlow parser.
  • BatchToSpace operator support for GpuAcc backend.
  • StridedSlice operator support for CpuRef, GpuAcc and CpuAcc backend.
  • SpaceToBatchNd operator support for GpuAcc backend. Some padding configurations are currently not interpreted correctly.
  • Greater operator support for CpuRef, GpuAcc and CpuAcc backend.
  • Greater operator support for TensorFlow parser.
  • Equal operator support for CpuRef backend.
  • Equal operator support for TensorFlow parser.
  • AddN operator support for TensorFlow parser.
  • Split operator support for TensorFlow parser.
  • Reciprocal of square root (Rsqrt) operator support for CpuRef backend.
  • Mean operator support for TensorFlow parser.
  • ResizeBilinear operator support for CpuAcc backend.
  • Logistic support for TensorFlow Lite parser.
  • Logistic support for GpuAcc backend.
  • Gather operator support for CpuRef backend.
  • Gather operator support for TensorFlow parser.
  • TensorFlow Lite parser support for BatchToSpace operator.
  • TensorFlow Lite parser support for Maximum operator.
  • TensorFlow Lite parser support for Minimum operator.
  • TensorFlow Lite parser support for ResizeBilinear operator.
  • TensorFlow Lite parser support for SpaceToBatch operator.
  • TensorFlow Lite parser support for StridedSlice operator.
  • TensorFlow Lite parser support for Sub operator.
  • TensorFlow Lite parser support for concatenation on tensors with rank other than 4
  • TensorFlow Lite parser support for Detection Post Process.
  • TensorFlow Lite parser support for Reciprocal of square root (Rsqrt).
  • Detection Post Process custom operator Reference implementation added.
  • Support for Serialization / Deserialization of the following ArmNN layers:
    • Activation
    • Addition
    • Constant
    • Convolution2d
    • DepthwiseConvolution2d
    • FullyConnected
    • Multiplication
    • Permute
    • Pooling2d
    • Reshape
    • Softmax
    • SpaceToBatchNd
  • New executable to convert network from TensorFlow Protocol Buffers to ArmNN format
  • New C++ Quantization API, supported layers are:
    • Input
    • Output
    • Addition
    • Activation
    • BatchNormalization
    • FullyConnected
    • Convolution2d
    • DepthwiseConvolution2d
    • Softmax
    • Permute
    • Constant
    • StridedSlice
    • Splitter
    • Pooling2d
    • Reshape
    • Merger
    • SpaceToBatch
    • ResizeBilinear

Public API Changes:

  • Support for the boolean data type. Booleans are specified as 8-bit unsigned integers, where zero (all bits off) represents false and any non-zero value (any bits on) represents true.
  • AddRsqrtLayer() method added to the graph builder API.
  • The profiling event now uses BackendId instead of Compute to identify the backend. BackendId is a wrapper class for the string that identifies a backend, and it is provided by the backend itself, rather than being statically enumerated like Compute.
  • Added the new method OptimizeSubGraph to the backend interface, allowing backends to apply their specific optimizations to a given sub-graph.
  • The old mechanism by which backends provided a list of optimizations to the Optimizer (through the GetOptimizations method) is still in place for backward compatibility, but it is now deprecated and will be removed in a future release.
  • Added the new interface class INetworkQuantizer for the Quantization API, exposing two methods:
    • OverrideInputRange: allows the caller to replace the quantization range for a specific input layer.
    • ExportNetwork: returns the quantized version of the loaded network.
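
For illustration, a quantizer derives 8-bit asymmetric parameters from a (min, max) range such as the one supplied via OverrideInputRange roughly as follows. This is the common convention for asymmetric uint8 quantization, not the exact Arm NN implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Derive asymmetric uint8 quantization parameters (scale, zero point) from a
// float range. The range is first extended to include 0 so that zero is
// exactly representable. Illustrative only; the real quantizer may differ.
std::pair<float, uint8_t> ComputeQAsymmU8Params(float min, float max)
{
    min = std::min(min, 0.0f);
    max = std::max(max, 0.0f);
    const float scale = (max - min) / 255.0f;
    const uint8_t zeroPoint = static_cast<uint8_t>(std::round(-min / scale));
    return { scale, zeroPoint };
}
```

Overriding the input range therefore directly changes the scale and zero point that the quantized network uses for that input tensor.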

Known issues:

  • Large graphs with many branches and joins can take an excessive time to load, or cause a software hang while loading into ArmNN. This issue affects versions of ArmNN from 18.11 onwards. We are continuing to investigate and will fix the problem in a future release. Models known to be affected include Inception v4 and Resnet V2 101.

  • Merge layer with 8-bit quantized data where the tensors to be merged have different quantization parameters does not work on the GpuAcc or CpuAcc backends. This is known to affect quantised Mobilenet-SSD models, and some quantized Mobilenet v2 models.

armnn - Release 18.11

Published by Surmeh almost 6 years ago

New Features:

• Addition support for 8-bit tensors on the GpuAcc backend
• FullyConnected support for 8-bit tensors on the GpuAcc backend
• Division support for the GpuAcc backend.
• Subtraction support for the GpuAcc and CpuAcc backends.
• Arithmetic Mean operator support for the GpuAcc backend.
• Pad operator support for GpuAcc and CpuRef backends.
• SpaceToBatchNd operator support for CpuRef backend.
• BatchToSpaceNd operator support for CpuRef backend.
• Added support for NHWC Normalization with 'cross channels' method, including CpuRef backend support. NHWC data layout is not yet supported for 'Within channels' normalization method on any backend.
• Added support for NHWC ResizeBilinear for the CpuRef and GpuAcc backends
• Added support for NHWC Convolution2d for the CpuRef and GpuAcc backends.
• Added support for NHWC DepthwiseConvolution.
• Added support for NHWC Pooling2d for the CpuRef, GpuAcc and Neon backends
• Added support for NHWC L2Normalization.
• Added support for NHWC BatchNormalization.
• Added support for Float32 LSTM for CpuRef backend.
• Added CONCATENATION, FULLY_CONNECTED, MAX_POOL_2D, RELU, RELU6, RESHAPE operators support to the TfLite Parser.
• Added FullyConnected support for 8-bit tensors on the CpuAcc backend.
• Added arbitrary axis support for the Merger Layer.

Public API Changes:

• armnn::Optional helper class was introduced and used in the IsDepthwiseConvolutionSupported(...) and IsConvolution2dSupported(...) functions to represent optional biases
• The IsXXXSupported(...) free functions now take a BackendId instead of the Compute enum. Backward compatibility is maintained through the automatic conversion from the Compute to the BackendId type.
• The Compute enum and the IsXXXSupported(...) free functions are being deprecated in favor of the IBackend and ILayerSupport interfaces, which provide the same functionality in a more flexible and extensible manner. The deprecated functions will be removed in a future release.

Other changes:

• An issue has been fixed where Profiler JSON output would report units of milliseconds but the data was actually in microseconds.

armnn - Release 18.08

Published by TelmoARM about 6 years ago

This release of Arm NN integrates the latest Compute Library and adds improvements to thread-safety, memory consumption and overall performance.

New Features:

  • The amount of system memory needed for a loaded network has been reduced compared to Release 18.05.
  • Support for LSTM operator.
  • Support for 16-bit floating point, including:
    • 16-bit floating point weights and bias tensors in the ModelBuilder (INetwork) API.
    • An optimiser option to automatically convert 32-bit floating point models to 16-bit floating point where supported.
    • Computing inference in 16-bit floating point precision.
  • Support for the TensorFlow Lite parser, including additional operator support for:
    • AVERAGE_POOL_2D
    • CONV_2D
    • DEPTHWISE_CONV_2D
    • SOFTMAX
    • SQUEEZE
  • Support for ONNX parser including additional layer support for:
    • Addition
    • Convolution
    • MatMul
    • Max Pool
    • Constant
    • Relu
    • Reshape
  • More detailed profiling with JSON output format support, capturing CL and Neon kernel-level events.
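
The optimiser's fp32-to-fp16 conversion reduces values to IEEE 754 half precision. For zero and normal values the bit-level conversion looks like this (simplified sketch: subnormals, infinities, NaNs and overflow are not handled, and the mantissa is truncated rather than rounded):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert a 32-bit float to IEEE 754 half precision (binary16).
// Simplified: handles zero and normal values only; mantissa is truncated.
uint16_t FloatToHalf(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));

    const uint16_t sign = static_cast<uint16_t>((bits >> 16) & 0x8000u);
    const int32_t exponent = static_cast<int32_t>((bits >> 23) & 0xFFu);
    if (exponent == 0) { return sign; } // zero (subnormal inputs not handled)

    // Re-bias the exponent from float32 (bias 127) to float16 (bias 15)
    // and keep the top 10 mantissa bits.
    const uint16_t halfExponent = static_cast<uint16_t>(exponent - 127 + 15);
    const uint16_t mantissa = static_cast<uint16_t>((bits >> 13) & 0x3FFu);
    return sign | static_cast<uint16_t>(halfExponent << 10) | mantissa;
}
```

Halving the storage of weights and activations is what enables the memory and bandwidth savings of fp16 inference.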

Public API Changes:

  • API for creating a Runtime object has changed. It no longer takes an armnn::Compute argument but instead requires a CreationOptions object. (See include/armnn/IRuntime.hpp)
  • The Optimize function now takes two additional parameters (see include/armnn/INetwork.hpp):
    • backendPreferences: a vector of compute devices, in preference order, on which the user wants to execute the workloads. Optimize will attempt to use the first backend in the list, falling back to subsequent backends only if the first does not support a given layer. For example, a preference list of GpuAcc, CpuAcc will attempt to execute on the Mali GPU, falling back to a v7/v8 Arm CPU for any workload the GPU does not support.
    • An optional OptimizerOptions parameter, which contains the flag to automatically convert a 32-bit floating point model to 16-bit floating point.
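
The backend fallback behaviour described above amounts to a first-match search over the preference list, which can be sketched as follows (hypothetical stand-in types, not the Arm NN optimizer):

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Pick the first backend in the preference list that supports a given layer,
// mirroring the fallback behaviour of the Optimize function. Returns an
// empty string if no backend in the list supports it.
std::string SelectBackend(const std::vector<std::string>& backendPreferences,
                          const std::function<bool(const std::string&)>& isSupported)
{
    for (const std::string& backend : backendPreferences)
    {
        if (isSupported(backend)) { return backend; }
    }
    return "";
}
```

With the preference list {"GpuAcc", "CpuAcc"}, a layer the GPU cannot run is assigned to CpuAcc instead, which is exactly the Mali-to-CPU fallback described above.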

Other changes:

  • This release of ArmNN requires at least release 18.08 of the Compute Library.
  • Fixed an issue where a 4D softmax caused the entire network to fail conversion.
  • Fixed ParseFlatbuffersFixture to pass quantized input/output properly.
  • Fixed thread-safety of the runtime.
  • Fixed the Mobilenet Caffe model crashing when GpuAcc is selected as the compute device.
  • Fixed failing NetworkTests when CL support is on but Neon support is off.

armnn - Release 18.05.02

Published by Surmeh over 6 years ago

This patch release updates the Arm NN makefiles to allow it to build on both Android O and P.

armnn - Release 18.05.01

Published by Surmeh over 6 years ago

  • Fixed broken links for developer.arm guides
  • Added a build guide for Arm NN using the Android NDK.

armnn - Release 18.05

Published by TelmoARM over 6 years ago

This release of Arm NN integrates the latest Compute Library and adds improvements to thread-safety, memory consumption and overall performance.

New Features:

  • In general, the amount of RAM needed for a loaded network has been reduced by 20-30% compared to release 18.03.
  • The latest 8-bit quantized operations from Compute Library have been integrated. In testing, 8-bit quantized mobilenets models are 3x faster compared to release 18.03.
  • Graphs can now be loaded and unloaded simultaneously from multiple threads. In other words, the methods IRuntime::LoadNetwork() and IRuntime::UnloadNetwork() are thread-safe.
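
The thread-safety contract can be pictured with a stand-in runtime that guards its network table with a lock (hypothetical types, not the Arm NN implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>

// Stand-in illustrating the contract of IRuntime::LoadNetwork/UnloadNetwork:
// concurrent calls from multiple threads must not corrupt the runtime's
// table of loaded networks. A mutex serialises all mutations.
class Runtime
{
public:
    int LoadNetwork()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        const int id = m_NextId++;
        m_Networks[id] = true;
        return id;
    }

    void UnloadNetwork(int id)
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Networks.erase(id);
    }

    std::size_t NumLoaded()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Networks.size();
    }

private:
    std::mutex m_Mutex;
    std::map<int, bool> m_Networks;
    int m_NextId = 0;
};
```

Because each call takes the lock for the duration of the table update, two threads may load and unload different networks at the same time without external synchronisation.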

Public API Changes:

  • IsConvolution2dSupported requires additional TensorInfo arguments describing the output and bias tensors.

Other changes:

  • This release of ArmNN requires at least release 18.05 of the Compute Library.
  • Fixed an issue where pooling operations with different pooling width and height would produce the wrong output.
  • Fixed an issue in the Caffe parser where BatchNormalization would return the wrong results when the rolling average factor was non-zero.
  • Fixed the known issue in 18.03 where the multiplication layer could not support tensors of different shapes.
armnn - Release 18.03

Published by Surmeh over 6 years ago

This release of Arm NN adds armnnTfParser, a library for loading TensorFlow protobuf files into Arm NN. See src/armnnTfParser.

Other changes:

  • Fixed an issue where memory used by OpenCL objects would increase on every call to LoadNetwork
  • Fixed an issue where the output from a Merger layer being used as input to another Merger layer would cause a runtime error.

Known issues:

  • A MultiplicationLayer with inputs of different sizes will not execute on the CpuAcc (NEON) backend. This is due to a limitation of Compute Library v18.03 which will be fixed in a future release.
armnn - Release 18.02

Published by TelmoARM over 6 years ago

Release 18.02