Scientific machine learning (SciML) benchmarks, AI for science, and (differential) equation solvers. Covers Julia, Python (PyTorch, Jax), MATLAB, R
MIT License
SciMLBenchmarks.jl holds webpages, PDFs, and notebooks showing the benchmarks for the SciML Scientific Machine Learning Software ecosystem.
The SciML Bench suite is made to be a comprehensive open source benchmark from the ground up, covering the methods of computational science and scientific computing all the way to AI for science.
These benchmarks are meant to represent good, optimized coding style. Benchmarks are preferably run on the provided open benchmarking hardware for full reproducibility (though in some cases, such as with language barriers, this can be difficult). Each benchmark is documented with the compute devices used, along with the package versions needed for reproduction. These benchmarks attempt to measure work-precision efficiency, either by timing solvers at approximately matching error or by building work-precision diagrams for direct comparison of speed at given error tolerances.
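As a rough illustration of what a work-precision measurement records, the sketch below pairs the error of a solve with the time it took. It is a dependency-free toy and not the code the benchmarks run: the fixed-step RK4 integrator, the test problem `u' = -u`, and the names `rk4_solve` and `work_precision` are all assumptions made for this example.

```julia
# Hypothetical, dependency-free illustration of work-precision points.
# Test problem (an assumption for this sketch): u' = -u, u(0) = 1 on [0, 1],
# whose exact value at t = 1 is exp(-1).

# Classic fixed-step RK4: integrate u' = f(u, t) from t0 to t1 in n steps.
function rk4_solve(f, u0, t0, t1, n)
    h = (t1 - t0) / n
    u, t = u0, t0
    for _ in 1:n
        k1 = f(u, t)
        k2 = f(u + h * k1 / 2, t + h / 2)
        k3 = f(u + h * k2 / 2, t + h / 2)
        k4 = f(u + h * k3, t + h)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    end
    return u
end

decay(u, t) = -u
exact = exp(-1.0)

# Return one (step count, error, seconds) record per entry of `ns`.
function work_precision(ns)
    rk4_solve(decay, 1.0, 0.0, 1.0, 1)  # warm-up call so compilation is not timed
    map(ns) do n
        start = time_ns()
        u = rk4_solve(decay, 1.0, 0.0, 1.0, n)
        seconds = (time_ns() - start) / 1e9
        (n = n, err = abs(u - exact), time = seconds)
    end
end

points = work_precision([10, 100, 1000])
for p in points
    println("n = ", p.n, "  error = ", p.err, "  time = ", p.time, " s")
end
```

Plotting `err` against `time` on log-log axes for several methods gives the kind of diagram described above: at a given error, the method further toward lower time is the more efficient one.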
If any of the code from any of the languages can be improved, please open a pull request.
To view the results of the SciML Benchmarks, go to benchmarks.sciml.ai. By default, this will lead to the latest tagged version of the benchmarks. To see the in-development version of the benchmarks, go to https://benchmarks.sciml.ai/dev/.
Static outputs in PDF, Markdown, and HTML reside in SciMLBenchmarksOutput.
To cite the SciML Benchmarks, please cite the following:
@article{rackauckas2019confederated,
title={Confederated modular differential equation APIs for accelerated algorithm development and benchmarking},
author={Rackauckas, Christopher and Nie, Qing},
journal={Advances in Engineering Software},
volume={132},
pages={1--6},
year={2019},
publisher={Elsevier}
}
@article{DifferentialEquations.jl-2017,
author = {Rackauckas, Christopher and Nie, Qing},
doi = {10.5334/jors.151},
journal = {The Journal of Open Research Software},
keywords = {Applied Mathematics},
note = {Exported from https://app.dimensions.ai on 2019/05/05},
number = {1},
pages = {},
title = {DifferentialEquations.jl – A Performant and Feature-Rich Ecosystem for Solving Differential Equations in Julia},
url = {https://app.dimensions.ai/details/publication/pub.1085583166 and http://openresearchsoftware.metajnl.com/articles/10.5334/jors.151/galley/245/download/},
volume = {5},
year = {2017}
}
The following is a quick summary of the benchmarks. These paint broad strokes over the set of tested equations and some specific examples may differ.
Vern methods tend to do the best in every benchmark of this category.
Tsit5 does well consistently.
dopri5/dop853 perform very similarly, but are both [...] Vern methods.
CVODE_Adams and lsoda tend to not do very well.
ddeabm does not do as well as the other [...]
Rosenbrock23, lsoda, and TRBDF2 tend to be the most efficient at high tolerances.
Rodas4P and Rodas5P tend to be the most efficient at low tolerances.
FBDF and QNDF do the best at all normal tolerances.
TRBDF2 and KenCarp4 can come close.
radau is always the most efficient when tolerances go to the low extreme (1e-13).
[...] (KenCarp4) are much more efficient in most [...]
DPRKN methods are by far the most efficient. The Vern [...]
EM and RKMil methods [...]
SRA and SRI methods both are very similar within-class on the simple [...]
SRA3 is the most efficient when applicable and the tolerances are low.
[...] (SRIW1) generally do well on stiff problems.
ImplicitEM and ImplicitRK do [...]
Tsit5 does well in a large class of problems here.
Vern methods do well in low tolerance cases.
[...] Rodas5P, perform well.
To generate the interactive notebooks, first install SciMLBenchmarks, instantiate the environment, and then run SciMLBenchmarks.open_notebooks(). This looks as follows:
]add SciMLBenchmarks#master
]activate SciMLBenchmarks
]instantiate
using SciMLBenchmarks
SciMLBenchmarks.open_notebooks()
The benchmarks will be generated at your pwd() in a folder called generated_notebooks.
Note that when running the benchmarks, the packages are not automatically added. You will need to add the packages manually or use the Project.toml and Manifest.toml files inside each benchmark folder to instantiate the correct packages. This can be done by activating the folder of the benchmarks. For example,
using Pkg
Pkg.activate(joinpath(pkgdir(SciMLBenchmarks),"benchmarks","NonStiffODE"))
Pkg.instantiate()
will add all of the packages required to run any benchmark in the NonStiffODE folder.
All of the files are generated from the Weave.jl files in the benchmarks folder of the SciMLBenchmarks.jl repository. The generation process runs automatically, so you do not necessarily need to test the Weave process locally. Instead, simply open a PR that adds or updates a file in the benchmarks folder and the PR will generate the benchmark on demand. Its artifacts can then be inspected in Buildkite, as described below, before merging. Note that it will use the Project.toml and Manifest.toml of the subfolder, so any change to dependencies requires those files to be updated.
Report any bugs or issues at the SciMLBenchmarks repository.
To see benchmark results before merging, open the Buildkite build, click Artifacts, and then inspect the generated results.
All of the files are generated from the Weave.jl files in the benchmarks folder. To run the generation process, do for example:
]activate SciMLBenchmarks # Get all of the packages
using SciMLBenchmarks
SciMLBenchmarks.weave_file(joinpath(pkgdir(SciMLBenchmarks),"benchmarks","NonStiffODE"),"linear_wpd.jmd")
To generate all of the files in a folder, for example, run:
SciMLBenchmarks.weave_folder(joinpath(pkgdir(SciMLBenchmarks),"benchmarks","NonStiffODE"))
To generate all of the notebooks, do:
SciMLBenchmarks.weave_all()
Each benchmark displays the characteristics of the computer it ran on at the bottom of the page. Since performance-critical computations are normally performed on compute clusters, the official benchmarks use a workstation with an AMD EPYC 7502 32-Core Processor @ 2.50GHz to match the performance characteristics of a standard node in a high performance computing (HPC) cluster or cloud computing setup.
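If you run a benchmark on your own machine, it helps to record comparable information next to your timings. The following is a minimal sketch using only Julia's standard library. It is not necessarily how the repository produces its footer, and the helper name `machine_summary` and the fields it collects are assumptions for this example.

```julia
using InteractiveUtils  # standard library; provides versioninfo()

# Collect a small machine summary. The choice of fields is illustrative only.
function machine_summary()
    cpus = Sys.cpu_info()
    return (
        julia = string(VERSION),
        os = string(Sys.KERNEL),
        arch = string(Sys.ARCH),
        cpu_model = isempty(cpus) ? "unknown" : cpus[1].model,
        threads = Sys.CPU_THREADS,
        memory_gib = round(Sys.total_memory() / 2^30; digits = 1),
    )
end

info = machine_summary()
println(info)

# Julia's built-in report of version, platform, and CPU details.
versioninfo()
```

Running `Pkg.status()` in the activated benchmark environment lists the package versions alongside this.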
For almost all equations, there is no analytical solution. A low tolerance reference solution is required in order to compute the error. However, there are many questions as to the potential of biasing the results via a reference computed from a given program. If we use a reference solution from Julia, does that make our errors lower?
The answer is no because all of the equation solvers should be convergent to the same solution. Because of this, it does not matter which solver is used to generate the reference solution. However, caution is required to ensure that the reference solution is sufficiently accurate.
Thankfully, there's a very clear indicator of when a reference solution is not sufficiently accurate. Because all other methods will be converging to a different solution, there will be a digit of accuracy at which all other solutions stop converging to the reference. If this occurs, all solutions will give a straight line, as you can see here:
In this image (taken from the TransistorAmplifierDAE benchmark), the second Rodas5P and Rodas4 are from a different problem implementation, and you can see they hit lower errors. But all of the others use the same reference solution and seem to "hit a wall" at around 1e-5. This is because the chosen reference solution was only 1e-5 accurate. Changing to a different reference solution makes them all converge:
This shows that all that truly matters is that the chosen reference is sufficiently accurate, and any walling behavior is an indicator that some method in the benchmark set is more accurate than the reference (in which case the benchmark should be updated to use the more accurate reference).
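The walling behavior can be reproduced in a few lines. The sketch below is a dependency-free toy rather than the TransistorAmplifierDAE benchmark: the test problem `u' = -u`, the fixed-step RK4 integrator, and the names `rk4_final`, `good_reference`, and `poor_reference` are assumptions chosen for illustration. It measures the same set of solves against an accurate reference and against a deliberately coarse one.

```julia
# Hypothetical illustration of how an inaccurate reference creates an error "wall".
# Test problem (an assumption for this sketch): u' = -u, u(0) = 1 on [0, 1].

# Classic fixed-step RK4, returning the solution value at t1.
function rk4_final(f, u0, t0, t1, n)
    h = (t1 - t0) / n
    u, t = u0, t0
    for _ in 1:n
        k1 = f(u, t)
        k2 = f(u + h * k1 / 2, t + h / 2)
        k3 = f(u + h * k2 / 2, t + h / 2)
        k4 = f(u + h * k3, t + h)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    end
    return u
end

decay(u, t) = -u

good_reference = exp(-1.0)                           # analytical value
poor_reference = rk4_final(decay, 1.0, 0.0, 1.0, 4)  # coarse solve, roughly 1e-5 accurate

step_counts = [10, 100, 1000]
finals = [rk4_final(decay, 1.0, 0.0, 1.0, n) for n in step_counts]

err_vs_good = abs.(finals .- good_reference)  # keeps shrinking as n grows
err_vs_poor = abs.(finals .- poor_reference)  # stalls near the reference's own error

for (n, eg, ep) in zip(step_counts, err_vs_good, err_vs_poor)
    println("n = ", n, "  error vs good = ", eg, "  error vs poor = ", ep)
end
```

Against the coarse reference, the measured error stops improving once the solves become more accurate than the reference itself, which is the same signature as the wall in the diagram.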