MIOpen

AMD's Machine Intelligence Library

OTHER License

MIOpen - MIOpen v2.18.0

Published by JehandadKhan about 2 years ago

Notes

  • This release announces the deprecation of MIOpen's OpenCL backend and updates the distribution mechanism for MIOpen kernels along with a few other documentation changes.

Changes

  • Deprecate MIOpen's OpenCL backend
  • Integrate Kernel DB files into MIOpen using Git LFS
  • Add and update MIOpen porting guide
  • Fix an issue in the pooling kernels
  • Fix address calculation issue in Image to Column kernel
  • Various performance tuning and updates
  • Fix an issue in the Winograd kernels
  • Enable MIOpen to restrict to deterministic kernels
  • Various other internal improvements and fixes
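
For readers unfamiliar with the Image to Column (im2col) transform named in the changes above, here is a minimal single-channel sketch in plain Python. It is not MIOpen's kernel: the function name `im2col_2d` and its parameters are hypothetical, and it only illustrates the kind of address arithmetic such a kernel performs, where each output column gathers one filter-sized patch of the input.

```python
def im2col_2d(image, kh, kw, stride=1, pad=0):
    """Unfold a 2D image (list of rows) into columns of kh*kw patch values.

    Illustrative only: single channel, zero padding, no dilation.
    """
    h, w = len(image), len(image[0])
    out_h = (h + 2 * pad - kh) // stride + 1
    out_w = (w + 2 * pad - kw) // stride + 1
    columns = []
    for oy in range(out_h):
        for ox in range(out_w):
            patch = []
            for ky in range(kh):
                for kx in range(kw):
                    # Map the output position and filter offset to an input coordinate.
                    iy = oy * stride - pad + ky
                    ix = ox * stride - pad + kx
                    inside = 0 <= iy < h and 0 <= ix < w
                    # Coordinates that fall in the padding region read as zero.
                    patch.append(image[iy][ix] if inside else 0)
            columns.append(patch)
    return columns
```

An off-by-one in `iy` or `ix` is the typical way an address calculation of this kind goes wrong, which is why small hand-checkable inputs are useful when testing it.
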
MIOpen - https://github.com/ROCm/MIOpen/releases/tag/rocm-5.3.0

Published by rocm-ci about 2 years ago

ROCm release v5.3.0

MIOpen - MIOpen v2.16.0

Published by JehandadKhan over 2 years ago

Notes

  • This release includes enhanced support for MI210 and MI250 and various other improvements.

Changes

  • This release consists of various bug fixes and performance improvements
  • Improved support for Navi21
  • Performance improvements via performance database updates
  • Fix various issues in convolution kernels specific to certain ASICs
  • Fix an accuracy issue in reduction kernels
  • Fix an accuracy issue in batch normalization kernels
MIOpen - MIOpen v2.14.0

Published by JehandadKhan almost 3 years ago

Notes

  • This release consists of various bug fixes and performance improvements

Changes

  • Improved support for Navi21

  • Performance improvements via performance database updates

  • Fix various issues in convolution kernels specific to certain ASICs

  • Fix an accuracy issue in reduction kernels

  • Fix an accuracy issue in batch normalization kernels

MIOpen - MIOpen v2.12.0

Published by JehandadKhan about 3 years ago

Notes

  • This release includes support for Navi21 and various other bug fixes and performance improvements

Changes

  • MIOpen now supports Navi21!! (via MIOpen PRs 973, 780, 764, 740, 739, 677, 660, 653, 493, 498)
  • Fixed a correctness issue with ImplicitGemm algorithm
  • Updated the performance data for new kernel versions
  • Improved MIOpen build time by splitting large kernel header files
  • Fixed an issue in reduction kernels for padded tensors
  • Various other bug fixes and performance improvements
MIOpen - MIOpen v2.11.0

Published by JehandadKhan over 3 years ago

Notes

  • This release contains various bug fixes and performance improvements.

Changes

  • Updates for Target ID features in ROCm stack
  • Correctness fix in Batchnorm kernels
  • Various bug fixes for MIOpenGEMM on the OpenCL backend
  • Various bug fixes in 3x3 assembly kernels
MIOpen - MIOpen v2.10.0

Published by JehandadKhan over 3 years ago

Notes

  • This release contains new reduction operations, Winograd algorithm performance improvements as well as bug fixes. Various host side performance improvements have been added as well.

Changes

  • Added a GPU reference kernel implementation for faster testing.
  • Add TargetID support for new AMD GPU architectures.
  • Implementation of four additional generic tensor reduction operations (AVG, AMAX, NORM1, NORM2).
  • Fixed a bug where Batchnorm would give incorrect results when the product of image height and image width is not a multiple of four.
  • Various host side improvements for better find and tuning performance.
  • Added support for AMD Code Object V4.
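
As a reading aid for the four reduction operations listed above, the sketch below gives their conventional mathematical definitions in plain Python. This is an assumption based on the operation names, not MIOpen code: the function names are hypothetical, and the sketch reduces a flat list of values rather than selected dimensions of a tensor.

```python
import math


def reduce_avg(values):
    """AVG: arithmetic mean of the values."""
    return sum(values) / len(values)


def reduce_amax(values):
    """AMAX: largest absolute value."""
    return max(abs(v) for v in values)


def reduce_norm1(values):
    """NORM1: sum of absolute values (L1 norm)."""
    return sum(abs(v) for v in values)


def reduce_norm2(values):
    """NORM2: square root of the sum of squares (L2 norm)."""
    return math.sqrt(sum(v * v for v in values))
```
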
MIOpen - MIOpen v2.9.0

Published by daniellowell almost 4 years ago

Notes:

  • This release contains implicit GEMM algorithm performance updates and bug fixes. Additional performance improvements have been implemented for batch normalization.

Changes:

  • Added new assembly implicit GEMM kernels
  • Added batch normalization optimizations
  • Fixed issue where miopen-hip backend install would not search for rocBLAS dependency
  • Removed missing tunings from previous release cycle
  • Removed deprecated implicit GEMM xDLOPs solvers
  • Removed incorrect error messages from implicit GEMM solvers
  • Disabled ConvAsmBwdWrW3x3 solver for stride > 1 cases
  • Disabled bidirectional multi-pass kernels due to stability issues
MIOpen - MIOpen v2.8.0

Published by daniellowell almost 4 years ago

Notes:

  • This release provides additional bug fixes and support for embedded builds using MIOpen as a static library.

Changes:

  • Fixed workspace size calculation for GEMM group convolutions
  • Fixed performance regression for M/N
  • Fixed issue with faulty compiler option
  • Fixed typo in components dependency variable in CMakeLists.txt
  • Fixed issues with COMgr-backed online compilation for HIP kernels
  • Added cmake flag for embedding system databases when building a static library
  • Added a way to disable building MIOpenDriver when building a static library
  • Added CC compiler detection in ROCm environment
  • Known issue: This release may show warnings for "obsolete configs" in the performance database. This can be fixed by rerunning tuning on a specific network; see tuning documentation
MIOpen - MIOpen v2.7.0

Published by daniellowell about 4 years ago

Notes:

  • This release contains a new reduction API; see the API documentation for more information. Additional features for embedded builds have been added, along with further support for 3D convolutional networks.

Changes:

  • Added additional tunings into performance database
  • Added general reduction API
  • Added cmake flag for embedding binary database into a static MIOpen build
  • Added cmake flag for embedding system find-db text files into static MIOpen build
  • Fixed issue with GEMM workspace size calculation for backwards data convolutions #381
  • Fixed issue with 3D pooling indexing #365
MIOpen - MIOpen v2.6.0

Published by daniellowell about 4 years ago

Notes:

  • This release contains convolution performance improvements, improved multi-threading behavior, and improved stability for half precision convolutions. Initial iteration time has been reduced with the introduction of hybrid find mode. Builds for a static library have been refined for this release.

Changes:

  • Added MIOPEN_FIND_MODE=3 as the new default convolution Find mode; see the documentation for details
  • Added a more runtime-parameterized version of pooling to reduce the number of online compilations
  • Improved the performance of backwards spatial batch normalization for small images
  • Fixed issue with std::logic_error in SQLite deleter #306
  • Fixed issues with half precision stability for convolutions
  • Fixed issues with multi-threaded SQLite database accesses
  • Fixed issues with 3-D convolutions and incorrect parameters
  • Fixed various issues with implicit GEMM static assert failures
  • Removed inactive implicit GEMM convolution solvers
  • Removed SCGEMM convolutional algorithm from MIOpen
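
The Find mode above is selected through the `MIOPEN_FIND_MODE` environment variable. The sketch below shows one way to pin it for a child process from Python. The helper name `miopen_env` is hypothetical, and only the value 3 is taken from these notes; consult the MIOpen documentation for the meaning of other values.

```python
import os


def miopen_env(find_mode=3, base_env=None):
    """Return a copy of the environment with MIOPEN_FIND_MODE set.

    The base environment is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MIOPEN_FIND_MODE"] = str(find_mode)
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching an application that uses MIOpen. Because 3 is already the default in this release, setting it explicitly only matters when a surrounding environment overrides it.
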
MIOpen - MIOpen v2.5.0

Published by daniellowell over 4 years ago

Notes:

  • This release contains convolution performance improvements, various minor fixes and documentation updates.

Changes:

  • Added a script to detect and install appropriate precompiled kernels
  • Added 3D convolution backwards weights implicit GEMM implementation
  • Improved performance of convolution implicit GEMM algorithm
  • Improved database coverage for batch size 1
  • Improved logging and error reporting
  • Improved documentation for debugging with numeric checks
  • Fixed issue with potential infinities and NaNs appearing during low precision training on CNNs
MIOpen - MIOpen v2.4.0

Published by daniellowell over 4 years ago

Notes:

  • This release contains new implementations of 3D convolutions using implicitGEMM, general performance improvements for convolutions, bug fixes, better versioning in directories, integration with the new rocclr, and dropout support in RNNs.

Changes:

  • Added 3D convolutions for the implicitGEMM algorithm in the forward and backward-data passes
  • Added dropout support for RNN layer; e.g., RNN-vanilla, GRU, and LSTM
  • Added support for AMD's rocclr runtime and compiler
  • Improved performance for implicitGEMM and Winograd algorithms
  • Improved database locking
  • Fixed issue with GPU memory segmentation fault on asymmetric padding #142
MIOpen - MIOpen v2.3.0

Published by daniellowell over 4 years ago

Notes:

  • This release contains new implementations of the implicitGEMM and Winograd algorithms, performance improvements for convolutions, further support for 3D convolutional networks, and various bug fixes.

Changes:

  • Added 3D Pooling layers
  • Added backwards data algorithm for implicitGEMM
  • Added GEMM performance improvements via relaxed constraints in rocBLAS-Tensile
  • Added full CO v3 support for all kernels in MIOpen
  • Added new Winograd group convolution kernels
  • Added an API to query MIOpen's version
  • Added parallel compilation in initial convolutional algorithm search; partial solution to #130
  • Added SQLite binary program cache
  • Improved logging across all layers
  • Improved MIOpen's internal design for calling convolutional solvers
  • Fixed various bugs for the implicitGEMM algorithm
MIOpen - MIOpen v2.2.1

Published by daniellowell over 4 years ago

Notes:

  • This release contains bug fixes, documentation updates, and further code object version 3 support

Changes:

  • Added support for multiple ROCm installations
  • Added additional support for code object v3
  • Fixed issue with incorrect LRN calculation #127
  • Fixed incorrect performance database documentation
  • Fixed issue with incorrect workspace calculation in group convolutions
  • Fixed issue with unsupported hardware instructions used with inline assembly
MIOpen - MIOpen v2.2.0

Published by daniellowell almost 5 years ago

Notes:

  • This release contains bug fixes, performance improvements, and expanded applicability for specific convolutional algorithms.
  • A citable MIOpen paper has been posted on arXiv.
  • An SQLite database has been added to replace the text-based performance database. While the text file still exists, by default SQLite is used over the text-based performance database; see the documentation for more details.

Changes:

  • Added per-solution algorithm filtering environment variable for debugging
  • Added SQLite3 database and build dependency. The text-based performance database support is deprecated and will be removed in the next release.
  • Added citation page to documentation pointing to MIOpen's paper
  • Added to the overall documentation
  • Fixed fusion compilation check issue
  • Fixed fusion group convolution warning
  • Improved performance of forward pooling
  • Improved performance of convolutions
  • Improved performance of spatial training batch normalization for some large batch size input configurations
  • Improved applicability of implicit GEMM convolution algorithm
  • Improved performance of calls to miopenConvolutionXXXGetWorkSpaceSize() functions
  • Improved conformance to code object version 3
  • Disabled SCGEMM convolution algorithm by default; this algorithm is deprecated and will be removed in future releases
  • Changed "hip_hcc" to "hip-hcc" for the MIOpen package requirements in CMakeLists.txt
MIOpen - MIOpen v2.1.0

Published by daniellowell about 5 years ago

Notes:

  • This release contains new layers, bug fixes, and a new convolution algorithm.

Changes:

  • Added a dropout layer API for training
  • Added a new SCGEMM algorithm for convolutions
  • Added further support for bfp16 in convolutions
  • Added a docker hub link for MIOpen docker images.
  • Fixed issue with NaN appearing on batch normalization backwards pass in fp16
  • Fixed softmax kernel bug in log mode #112
  • Fixed gfx803 support issue #869
  • Fixed gfx803 kernel issue #117
  • Fixed issue with disabled GEMM #119
  • Improved performance of batch normalization fp16 forward training layers
  • Improved performance of convolution layers
  • Removed MIOpenGEMM as a requirement for the HIP backend. It is now optional.
MIOpen - MIOpen v2.0.1

Published by daniellowell about 5 years ago

Notes:

  • This release contains bug fixes and performance improvements.
  • Additionally, the convolution algorithm Implicit GEMM is now enabled by default
  • Known issues:
    • Backward propagation for batch normalization in fp16 mode may trigger NaN in some cases
    • Softmax Log mode may produce an incorrect result in back propagation

Changes:

  • Added Winograd multi-pass convolution kernel
  • Fixed issue with hip compiler paths
  • Fixed immediate mode behavior with auto-tuning environment variable
  • Fixed issue with system find-db in-memory cache; the fix enables the cache by default
  • Improved logging
  • Improved how symbols are hidden in the library
  • Updated default behavior to enable implicit GEMM
MIOpen - MIOpen v2.0.0

Published by daniellowell over 5 years ago

Notes:

  • This release contains several new features including an immediate mode for selecting convolutions, bfloat16 support, new layers, modes, and algorithms.
  • MIOpenDriver, a tool for benchmarking and developing kernels, is now shipped with MIOpen.
  • BFloat16 is now supported in HIP; it requires an updated rocBLAS as the GEMM backend.
  • Immediate mode API now provides the ability to quickly obtain a convolution kernel.
  • MIOpen now contains HIP source kernels and implements the ImplicitGEMM kernels. This is a new feature and is currently disabled by default. Use the environment variable "MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=1" to activate this feature. ImplicitGEMM requires an up-to-date HIP version of at least 1.5.9211.
  • A new "loss" category of layers has been added, of which CTC loss is the first. See the API reference for more details.
  • 2.0 is the last release of active support for gfx803 architectures. In future releases, MIOpen will not actively debug and develop new features specifically for gfx803.
  • System Find-Db in memory cache is disabled by default. Please see build instructions to enable this feature.

Changes:

  • Added support for bfloat16 datatype in convolutions
  • Added softmax channel mode and new softmax version 2 API
  • Added fast / accurate / log softmax algorithms
  • Added new implicit GEMM convolution algorithm for forward and backwards data passes, disabled by default
  • Added int32 datatype support for output tensors in int8 convolutions
  • Added immediate mode for finding the best convolution kernel for a given configuration
  • Added a Find-Db infrastructure which stashes results of find on a user's system
  • Added a shipped System Find-Db containing offline run Find() results
  • Added an additional, faster batch norm assembly kernel for fp16
  • Added CTC loss layer
  • Added MIOpenDriver as a default component in MIOpen's build #34
  • Fixed C compatibility for boolean types in C API #103
  • Fixed incorrect calculation in per-activation batch norm backwards pass #104
  • Fixed bug #95 with asm batch norm ISA
  • Fixed IsApplicable bug in Conv3x3Asm for group convolutions
  • Improved performance of 1x1 stride 2 fp32 convolutions in the forward and backwards data passes
  • Improved 3-D convolution stability
  • Improved applicability of direct convolution backwards weights for 2x2, 5x10, and 5x20 filter sizes
  • Improved maintainability in kernels and cpp code
  • Updated rocBLAS minimum version to branch master-rocm-2.6
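
The fast / accurate / log softmax algorithms added above can be illustrated with a short plain-Python sketch. It assumes the three names carry their conventional meanings (a direct exponential form, a max-subtracted form for numerical stability, and log-softmax); the function names are hypothetical and MIOpen's kernels may differ in detail.

```python
import math


def softmax_fast(xs):
    """Direct form: exp(x_i) / sum(exp(x_j)). Can overflow for large inputs."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_accurate(xs):
    """Stable form: subtract the maximum before exponentiating."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_log(xs):
    """Log-softmax: x_i - log(sum(exp(x_j))), computed stably."""
    m = max(xs)
    log_total = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_total for x in xs]
```

For moderate inputs the fast and accurate forms agree to rounding error; the difference shows up with large magnitudes, where the direct form overflows and the max-subtracted form does not.
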
MIOpen - MIOpen v1.8.1

Published by daniellowell over 5 years ago

Notes:

  • This release contains minor bug fixes and additional performance database improvements.

Changes:

  • Fixed accuracy issue with backwards weights
  • Fixed issue with name parsing for newer architectures
  • Added narrow workaround for 5x10 and 5x20 filter performance regression
  • Improved support in performance database for Radeon VII