A lightweight JIT compiler based on MIR (Medium Internal Representation) and C11 JIT compiler and interpreter based on MIR
MIT License
#define Size 819000
int sieve (int N) {
int64_t i, k, prime, count, n; char flags[Size];
for (n = 0; n < N; n++) {
count = 0;
for (i = 0; i < Size; i++)
flags[i] = 1;
for (i = 0; i < Size; i++)
if (flags[i]) {
prime = i + i + 3;
for (k = i + prime; k < Size; k += prime)
flags[k] = 0;
count++;
}
}
return count;
}
void ex100 (void) {
printf ("sieve (100) = %d\", sieve (100));
}
m_sieve: module
export sieve
sieve: func i32, i32:N
local i64:iter, i64:count, i64:i, i64:k, i64:prime, i64:temp, i64:flags
alloca flags, 819000
mov iter, 0
loop: bge fin, iter, N
mov count, 0; mov i, 0
loop2: bge fin2, i, 819000
mov u8:(flags, i), 1; add i, i, 1
jmp loop2
fin2: mov i, 0
loop3: bge fin3, i, 819000
beq cont3, u8:(flags,i), 0
add temp, i, i; add prime, temp, 3; add k, i, prime
loop4: bge fin4, k, 819000
mov u8:(flags, k), 0; add k, k, prime
jmp loop4
fin4: add count, count, 1
cont3: add i, i, 1
jmp loop3
fin3: add iter, iter, 1
jmp loop
fin: ret count
endfunc
endmodule
m_ex100: module
format: string "sieve (10) = %d\n"
p_printf: proto p:fmt, i32:result
p_sieve: proto i32, i32:iter
export ex100
import sieve, printf
ex100: func v, 0
local i64:r
call p_sieve, sieve, r, 100
call p_printf, printf, format, r
endfunc
endmodule
func
describes signature of the function (taking 32-bit signedN
which will be local;
string
describes data in form of C string
export
describes the module functions or data which are visible outside the current moduleimport
describes the module functions or data which should be defined in other MIR modulesproto
describes function prototypes. Its syntax is the same as func
syntaxcall
are MIR instruction to call functionsMIR_load_external
m1
and m2
are modulesm_sieve
and m_e100
, func
is function ex100
, sieve
is function sieve
): /* ctx is a context created by MIR_init / MIR_init2 */
MIR_load_module (ctx, m1); MIR_load_module (ctx, m2);
MIR_load_external (ctx, "printf", printf);
MIR_link (ctx, MIR_set_interp_interface, import_resolver);
/* or use MIR_set_gen_interface to generate and use the machine code */
/* or use MIR_set_lazy_gen_interface to generate function code on its 1st call */
/* use MIR_gen (ctx, func) to explicitly generate the function machine code */
MIR_interp (ctx, func, &result, 0); /* zero here is arguments number */
/* or ((void (*) (void)) func->addr) (); to call interpr. or gen. code through the interface */
binfmt_misc
The mir-bin-run
binary is prepared to be used from binfmt_misc
with the
following line (example):
line=:mir:M::MIR::/usr/local/bin/mir-bin-run:P
echo $line > /proc/sys/fs/binfmt_misc/register
Do adapt the mir-bin-run binary path to your system, that is the default one
And run with
c2m your-file.c -o your-file
chmod +x your-file
./your-file your args
The executable is "configurable" with environment variables:
MIR_TYPE
sets the interface for code execution: interp
(for interpretation),jit
(for generation) and lazy
(for lazy generation, default);MIR_LIBS
(colon separated list) defines a list of extra libraries to load;MIR_LIB_DIRS
or LD_LIBRARY_PATH
(colon separated list) defines an extra listDue to the tied nature of
mir-bin-run
withbinfmt_misc
, it may be a bit weird to callmir-bin-run
directly. TheP
flag on the binfmt_misc passes an extra argument with the full path to the MIR binary.
Very short optimization pipeline for speed and light-weight
Only the most valuable optimization usage:
Different optimization levels to tune compilation speed vs generated code performance
SSA form of MIR is used before register allocation
Simplicity of optimizations implementation over extreme generated code performance
More details about full JIT compiler pipeline:
Simplify: lowering MIR
Inline: inlining MIR calls
Build CFG: building Control Flow Graph (basic blocks and CFG edges)
Build SSA: Building Single Static Assignment Form by adding phi nodes and SSA edges to operands
Address Transformation: remove or change MIR ADDR instructions
Global Value Numbering: removing redundant insns through GVN. This includes constant propagation and redundant load eliminations
Copy Propagation: SSA copy propagation and removing redundant extension instructions
Dead store elimination: removing redundant stores
Dead Code Elimination: removing insns with unused outputs
Pressure relief: moving insns to decrease register pressure
SSA combine: combining addresses and compare and branch instruction pairs
Out of SSA: Removing phi nodes and SSA edges
Jump opts: Different jump optimizations
Machinize: run machine-dependent code transforming MIR for calls ABI, 2-op insns, etc
Find Loops: finding natural loops and building loop tree
Build Live Info: calculating live in and live out for the basic blocks
Build Register Conflicts: building conflict matrix for registers involved in moves. It is used for register coalescing
Coalesce: aggressive register coalescing
Register Allocator (RA): priority-based linear scan RA with live range splitting
Build Live Ranges: calculating program point ranges for registers
Assign: fast RA for -O0
or priority-based linear scan RA for -O1
and above
Rewrite: transform MIR according to the assign using reserved hard regs
Combine (code selection): merging data-depended insns into one
Dead Code Elimination: removing insns with unused outputs
Generate Machine Insns: run machine-dependent code creating machine insns
c2m
.mir.h
and mir.c
contain major API code including input/output of MIR binarymir-dlist.h
, mir-mp.h
, mir-varr.h
, mir-bitmap.h
, mir-hash.h
, mir-htab.h
, mir-reduce.h
mir-hash.h
is a general, simple,mir-interp.c
contains code for interpretation of MIR code. It is included in mir.c
mir-gen.h
, mir-gen.c
, mir-gen-x86_64.c
, mir-gen-aarch64.c
, mir-gen-ppc64.c
, mir-gen-s390x.c
,mir-gen-riscv64.c
contain code for MIR JIT compiler
mir-gen-x86_64.c
, mir-gen-aarch64.c
, mir-gen-ppc64.c
, mir-gen-s390x.c
,mir-gen-riscv64.c
is machine dependent code of JIT compilermir-<target>.c
contain simple machine dependent code common for interpreter andmir-<target>.h
contain declarations common for interpreter and JIT compilermir2c/mir2c.h
and mir2c/mir2c.c
contain code for MIR to C compiler. The generated code might be not portablec2mir/c2mir.h
, c2mir/c2mir.c
, c2mir/c2mir-driver.c
, and c2mir/mirc.h
contain code forc2mir/x86_64
and c2mir/aarch64
, c2mir/ppc64
, c2mir/s390x
,c2mir/riscv64
contain correspondingly x86_64, aarch64, ppc64le, s390x, and riscv machine-dependentmir-bin-run.c
contains code for mir-bin-run
described abovemir-bin-driver.c
with b2ctab
utility can be used for portable way to generate binary from MIR binary filesmir-utils
contains different utilities to work with MIR,adt-tests
, mir-tests
, c-tests
, and c-benchmarks
contains code for testing and benchmarking MIR and c2m
make bench
and make test
Intel i5-13600K with 64GB memory under FC37 with GCC-12.3.1
MIR-generator | MIR-interpreter | gcc -O2 | gcc -O0 | |
---|---|---|---|---|
compilation [1] | 1.0 (249us) | 0.09 (22us) | 109 (27.1ms) | 105 (26.1ms) |
execution [2] | 1.0 (1.74s) | 13.7 (23.8s) | 0.92 (1.6s) | 2.28 (3.97s) |
code size [3] | 1.0 (557KB) | 0.43 (240KB) | 58 (32.2MB) | 58 (32.2MB) |
LOC [4] | 1.0 (23.4K) | 0.48 (11.3K) | 103 (2420K) | 103 (2402K) |
[1] is based on wall time of compilation of C sieve code (w/o any include file and with using memory file system for GCC) and the corresponding MIR sieve code by MIR-interpreter and MIR-generator with optimization level 2
[2] is based on the best wall time of 10 runs with used MIR-generator optimization level 2
[3] is based on stripped sizes of cc1 for GCC and MIR core and interpreter or generator for MIR
[4] my estimation based only on files required for x86-64 GNU C compiler and MIR files for minimal program to create and run MIR code
Intel i5-13600K with 64GB memory under FC37 with GCC-12.3.1
c2m -O2 -eg (generator) | c2m -ei (interpreter) | gcc -O2 | gcc -O0 | |
---|---|---|---|---|
compilation [1] | 1.0 (336us) | 1.0 (337us) | 80 (27.1ms) | 77 (26.1ms) |
execution [2] | 1.0 (1.74s) | 13.7 (23.8s) | 0.92 (1.6s) | 2.28 (3.97s) |
code size [3] | 1.0 (961KB) | 1.0 (961KB) | 34 (32.2MB) | 34 (32.2MB) |
LOC [4] | 1.0 (54.8K) | 1.0 (54.8K) | 44 (2420K) | 44 (2420K) |
[1] is based on wall time of compilation of C sieve code (w/o any include file and with using memory file system for GCC)
[2] is based on the best wall time of 10 runs with used MIR-generator optimization level 2
[3] is based on stripped sizes of cc1 for GCC and C2MIR, MIR core, interpreter, and generator for MIR
[4] is based on all source files excluding tests
Here is generated code performance related to GCC -O2 for different C compilers on 15 small C benchmarks (from directory c-benchmarks
) on the same machine where
Average | Geomean | |
---|---|---|
gcc -O2 | 1.00 | 1.00 |
gcc -O0 | 0.63 | 0.57 |
c2m -eg | 0.96 | 0.91 |
c2m -eb | 0.92 | 0.85 |
chibicc | 0.38 | 0.30 |
clang -O2 | 1.12 | 1.09 |
cparser -O3 | 1.02 | 0.98 |
cproc | 0.68 | 0.65 |
lacc -O3 | 0.47 | 0.39 |
pcc -O | 0.80 | 0.78 |
tcc | 0.54 | 0.50 |
emcc -O2/wasmer | 0.60 | 0.55 |
wasi -O2/wasmer cranelift | 0.60 | 0.54 |
wasi -O2/wasmer LLVM | 0.78 | 0.72 |
wasi -O2/wasmer singlepass | 0.45 | 0.36 |
wasi -O2/wasmtime | 0.92 | 0.87 |
c2m
) to another target for 1-2 months