Finally, the Version 1.0 release is here! The software has been stable and ready for production use for quite some time now, and after about half a year in beta, we are confident that the current version deserves to mark the first major release of Kernel Tuner.
Version 1.0 integrates a lot of new functionality, including blazing fast search space construction, support for tuning HIP kernels on AMD GPUs, new functionality for mixed precision and accuracy tuning, experimental support for tuning OpenACC programs, a conda package installer for Kernel Tuner, and many more changes and additions.
I would like to thank everyone involved in the development of Kernel Tuner over the past years! Special thanks to the Kernel Tuner developers team for their continued support of the project!
PySMT and ATF for searchspace building
setup.py and setup.cfg to pyproject.toml for centralized metadata, added relevant tests
pyproject.toml metadata, minor fixes and changes to be compatible with updated dependencies
OrderedDict, as all dictionaries in the Python versions used are already ordered
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/0.4.5...1.0
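The OrderedDict note in the list above relies on a language guarantee: since Python 3.7, the built-in dict preserves insertion order. A minimal sketch, assuming Python 3.7 or newer; the parameter names are hypothetical and not taken from Kernel Tuner:

```python
# A plain dict keeps keys in the order they were inserted (Python 3.7+),
# which is the main property OrderedDict used to be needed for.
# The parameter names below are made up for illustration.
params = {}
params["block_size_x"] = [32, 64, 128]
params["tile_size"] = [1, 2, 4]
params["use_shared_mem"] = [0, 1]

names = list(params)
print(names)
```

Code that depends on the order in which tunable parameters were declared can therefore iterate over a regular dictionary.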
Published by fjwillemsen 11 months ago
This is a beta release for early access to the new features. Not intended for production use.
The release contains:
Published by fjwillemsen 12 months ago
This is a beta release for early access to the new features. Not intended for production use.
The release contains:
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/1.0.0b4...1.0.0b5
Published by fjwillemsen 12 months ago
This is a beta release for early access to the new features. Not intended for production use.
This release contains several improvements:
nvidia-ml-py added to tutorial extra dependencies.
Published by fjwillemsen about 1 year ago
This is a beta release for early access to the new features. Not intended for production use.
This version contains several bugfixes:
check_restrictions function.
bayes_opt would not handle pruned parameters correctly.
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/1.0.0b2...1.0.0b3
Published by fjwillemsen about 1 year ago
This is a beta release for early access to the new features. Not intended for production use.
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/1.0.0b1...1.0.0b2
Published by fjwillemsen about 1 year ago
This is a beta release for early access to the new features. Not intended for production use.
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/0.4.5...1.0.0b1
Published by benvanwerkhoven over 1 year ago
Version 0.4.5 adds support for using PMT in combination with Kernel Tuner, enabling power and energy measurements on a wide range of devices. In addition, we have worked extensively on the internals of Kernel Tuner and the interfaces of the separate components that together make up Kernel Tuner. The release also includes a few bugfixes and fixes for small errors in examples and documentation.
Published by benvanwerkhoven over 1 year ago
Version 0.4.4 adds extended support for energy efficiency tuning, in particular the new capability to fit a performance model to the target GPU's power-frequency curve. How to use these features is demonstrated in:
https://github.com/KernelTuner/kernel_tuner/blob/master/examples/cuda/going_green_performance_model.py
And described in the paper:
Going green: optimizing GPUs for energy efficiency through model-steered auto-tuning
R. Schoonhoven, B. Veenboer, B. van Werkhoven, K. J. Batenburg
International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) at Supercomputing (SC22) 2022
https://arxiv.org/abs/2211.07260
Other than that, we've implemented a new output and metadata JSON format that adheres to the 'T4' auto-tuning schema created by the auto-tuning community at the Lorentz Center workshop in March 2022.
From the changelog:
Published by benvanwerkhoven about 2 years ago
The version 0.4.3 release consists of a large number of changes to the internals of Kernel Tuner, including a new backend based on Nvidia's official Python bindings for CUDA and improved functionality for tuning energy efficiency, such as measuring core voltages. The measurement of power and the interface with NVML have also improved a lot.
Some of the changes are also in the "externals" of Kernel Tuner, in the sense that we have migrated from https://github.com/benvanwerkhoven/ to https://github.com/KernelTuner. The goal of this move is to bring the collection of repositories belonging to the larger Kernel Tuner project under one organization.
Published by benvanwerkhoven over 2 years ago
Version 0.4.2 includes a lot of work on the search space representation, application of restrictions, and optimization strategies. Several new optimization strategies have been added, and most optimization strategies should see improved performance, both in terms of the number of evaluated kernel configurations and execution time.
Published by benvanwerkhoven about 3 years ago
This version adds a brand new Bayesian Optimization strategy, as well as some smaller features and fixes.
Published by benvanwerkhoven over 3 years ago
This version adds a great deal of new functionality and gives the user extra flexibility and control over what is being benchmarked and when. From the CHANGELOG:
Published by benvanwerkhoven almost 4 years ago
This version adds several new and recent features. Most important is the new feature to specify user-defined metrics for Kernel Tuner to compute along with the benchmarking results. User-defined metrics are composable, so you can define metrics that build upon other metrics. The documentation pages have also been updated to include this new feature and other recent changes.
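To illustrate what composable metrics mean in practice, here is a minimal, library-free sketch. It does not call Kernel Tuner; the helper `apply_metrics`, the result dictionary, and the metric names are assumptions made up for this example. Each metric is a function of the result so far, evaluated in insertion order, so a later metric can read the value of an earlier one.

```python
# Hypothetical benchmark result for one kernel configuration
# (values are made up; "time" is assumed to be in milliseconds).
result = {"block_size_x": 128, "time": 2.0}

problem_size = 4096  # assumed number of elements processed by the kernel

# Metrics are kept in insertion order, so later entries may use earlier ones.
metrics = {}
metrics["time_s"] = lambda p: p["time"] / 1e3
metrics["elements/s"] = lambda p: problem_size / p["time_s"]  # builds on time_s


def apply_metrics(result, metrics):
    """Return a copy of result extended with every metric, evaluated in order."""
    extended = dict(result)
    for name, fn in metrics.items():
        extended[name] = fn(extended)
    return extended


extended = apply_metrics(result, metrics)
print(extended)
```

Evaluating in insertion order is what makes composition work in this sketch: swapping the two metric definitions would raise a KeyError, because `time_s` would not exist yet when `elements/s` is computed.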
An important change that might influence benchmark results reported by Kernel Tuner is the fact that the runner will now do a warm up of the device using the first kernel in the parameter space. This is to remove any startup or cold start delays that were significantly slowing down the first benchmarked kernel on many devices.
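The warm-up idea described above can be sketched in plain Python. This is not Kernel Tuner's runner; `benchmark` and `fake_kernel` are hypothetical stand-ins, and the start-up delay is simulated with a sleep. The first configuration is run once without timing, so the one-time start-up cost does not end up in its measurement.

```python
import time


def benchmark(configs, run, warm_up=True):
    """Time run(config) for each config, optionally after one untimed warm-up run."""
    if warm_up and configs:
        run(configs[0])  # untimed: absorbs one-time start-up cost
    timings = []
    for config in configs:
        start = time.perf_counter()
        run(config)
        timings.append((config, time.perf_counter() - start))
    return timings


calls = []


def fake_kernel(config):
    # Stand-in for a kernel launch; only the very first call pays a start-up delay.
    if not calls:
        time.sleep(0.01)
    calls.append(config)


configs = [{"block_size_x": 32}, {"block_size_x": 64}]
timings = benchmark(configs, fake_kernel)
print(len(calls), "runs,", len(timings), "timed")
```

With the warm-up enabled, the first configuration is executed twice but timed only once, which is why results for that configuration can differ from those reported by earlier versions.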
From the changelog:
Published by benvanwerkhoven over 4 years ago
A small release for 2 small new features and a bugfix for older GPUs.
Published by benvanwerkhoven almost 5 years ago
This is the release of version 0.3.0 of Kernel Tuner. We have done a lot of work on the internals of Kernel Tuner. This release fixes several issues, adds and extends new features, and simplifies the user interface.
Published by benvanwerkhoven almost 6 years ago
Version 0.2.0 adds a large number of search optimization algorithms and basic support for testing and tuning Fortran kernels.
Published by benvanwerkhoven over 6 years ago
Published by benvanwerkhoven almost 7 years ago
Version 0.1.8 brings many improvements, mostly focused on user friendliness. The installation process of optional dependencies is simplified, as you can now use extras with pip. For example, pip install kernel_tuner[cuda] can be used to install both Kernel Tuner and the optional dependency PyCuda. In addition, version 0.1.8 introduces many more checks on the user input that you pass to tune_kernel and run_kernel. For example, the kernel source code is parsed to see if the signature matches the argument list. The additional checks on input should make it easier to use and debug programs using Kernel Tuner. For a more detailed overview of the changes, see below:
Published by benvanwerkhoven almost 7 years ago