WebAssembly Performance

Explore how to achieve maximum performance in WebAssembly.

Setup

We benchmark performance by measuring the time to complete 10,000 multiplications of two 64 x 64 matrices with various implementations and flags:

Implementation:

Flags:

Environment:

Hardware
- CPU: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
- RAM: 16 GB
Software
- Linux Kernel: 5.19.1-3-MANJARO
- NodeJS: v16.16.0
- Emscripten: 3.1.18
- GCC: 12.2.0
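
As a rough illustration of the workload described in the setup (10,000 multiplications of two 64 x 64 matrices), here is a minimal C sketch. The function names `mul_mats` and `time_mul_mats`, the `float` element type, the row-major layout, and the input values are all assumptions made for this example and are not taken from the repository.

```c
#include <assert.h>
#include <time.h>

#define N 64          /* matrix dimension used in the benchmark */
#define ROUNDS 10000  /* number of multiplications used in the benchmark */

/* Keeps the last result observable so the compiler cannot drop the loop. */
volatile float sink;

/* out = a * b for row-major N x N matrices (element type is an assumption). */
void mul_mats(const float *a, const float *b, float *out) {
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
      float sum = 0.0f;
      for (int k = 0; k < N; k++) {
        sum += a[i * N + k] * b[k * N + j];
      }
      out[i * N + j] = sum;
    }
  }
}

/* Runs the multiplication `rounds` times and returns elapsed CPU seconds. */
double time_mul_mats(int rounds) {
  static float a[N * N], b[N * N], out[N * N];
  assert(rounds > 0);

  /* Arbitrary small inputs; the real benchmark data may differ. */
  for (int i = 0; i < N * N; i++) {
    a[i] = (float)(i % 7);
    b[i] = (float)(i % 5);
  }

  clock_t start = clock();
  for (int r = 0; r < rounds; r++) {
    mul_mats(a, b, out);
  }
  clock_t end = clock();

  sink = out[0];
  return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_mul_mats(ROUNDS)` gives one timing sample for this implementation; comparing such samples across implementations and flags is the idea behind the benchmark.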

Here we use mul_mats.js as the baseline for comparing speed / time.

Using SIMD intrinsics with the -O3 and -msimd128 flags can be 95.8% faster than the pure JavaScript implementation. 🎉
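
To give a feel for what a SIMD-intrinsics version can look like, here is a minimal sketch built on the `wasm_simd128.h` header that ships with Emscripten. It is not the repository's actual implementation: the function names, the `float` element type, and the row-major layout are assumptions. The intrinsic path is only compiled when `-msimd128` is passed (the compiler then defines `__wasm_simd128__`); otherwise the function falls back to the scalar loop, so the same file also builds with a native compiler.

```c
#include <assert.h>
#ifdef __wasm_simd128__
#include <wasm_simd128.h>
#endif

#define N 64  /* matrix dimension used in the benchmark */

/* Scalar reference: out = a * b for row-major N x N matrices. */
void mul_mats_scalar(const float *a, const float *b, float *out) {
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
      float sum = 0.0f;
      for (int k = 0; k < N; k++) {
        sum += a[i * N + k] * b[k * N + j];
      }
      out[i * N + j] = sum;
    }
  }
}

/* Computes four adjacent output columns at a time with 128-bit vectors. */
void mul_mats_simd(const float *a, const float *b, float *out) {
  assert(N % 4 == 0);  /* each vector holds four floats */
#ifdef __wasm_simd128__
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j += 4) {
      v128_t acc = wasm_f32x4_splat(0.0f);
      for (int k = 0; k < N; k++) {
        /* Broadcast a[i][k] and multiply it with b[k][j..j+3]. */
        v128_t a_ik = wasm_f32x4_splat(a[i * N + k]);
        v128_t b_row = wasm_v128_load(&b[k * N + j]);
        acc = wasm_f32x4_add(acc, wasm_f32x4_mul(a_ik, b_row));
      }
      wasm_v128_store(&out[i * N + j], acc);
    }
  }
#else
  /* No WebAssembly SIMD available: fall back to the scalar loop. */
  mul_mats_scalar(a, b, out);
#endif
}
```

Both functions accumulate over `k` in the same order, so their results can be compared directly when checking the SIMD path against the scalar one.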

To build binaries, you need to install Docker 19.03+ and run:

make

You should find all binaries in the dist/ folder.

To run all of them, simply run:

make run-all

Some of the runs might fail if you are NOT using Linux; check the Makefile to see how to run a specific case.

exploring SIMD optimization

How to write a very simple JIT compiler

🚀 A fast WebAssembly interpreter and the most universal WASM runtime