Rogue-Mx is a compiler from Mx** to RISC-V, 32-bit, integer Extended, the project for Compiler Design and Implementation at SJTU. It's implemented mainly in Kotlin/JVM, with the exception that the lexer and parser part contains Java, since antlr 4 is used.
The language reference for Mx** may be found in the project assignment.
Description | Status |
---|---|
ANTLR4 | Completed |
AST | Completed |
Semantic | Completed, pass given test suite |
IR | Completed, pass given test suite using naive codegen |
Codegen | Completed, pass given test suite |
Optimize | Completed, outperform O1, nearly O2 |
GC | I'm thinking peach, it will not be implemented unless I'm spare enough to plant some peaches |
toString()
The test-tool
contains some legacy test tools before the whole pipeline is built.
Refer to project assignment for test cases.
This section keeps track of implemented optimizations
Mem2Reg
was originally implemented for LLVM IR,
and started from the workaround alloca
-load
-store
in LLVM IR format.
Current IR form differs from LLVM IR only in the discard of type system,
so it also starts from alloca
-load
-store
format.
SSA is performed for all and only for all local variables,
thus load
-store
of anything like global variables, class members, etc.
is not optimized in this step.
Also, note that alloca
is not supported in RVTranslator
,
and any other optimization is based on SSA form,
so this step must be executed.
This implementation uses the algorithm from Chapter 19 of Modern Compiler Implementation in C.
Current implementation never eliminates control statements.
Force small function to be inline.
Small means that the number of instruction the result function contains is no more than a specified figure, roughly several thousand.
Non-recursive calls are processed by their instruction number from smaller to bigger. Self-recursive calls are then processed at most three times. Recurse involving multiple functions are not optimized.
Only performed on local variables.
Including operations on string literals.
Requiring DCE after this.
Localize global variables so that other optimizations can have effect on them.
Requiring Mem2Reg again after this.
TODO: load with DomTree, store with liveness analysis
Not enabled, as this is not a commmon method in compilers, ex., do not work for multi-file situation
I finally decide to implement this.
Only performed on ICalc
and Load
.
Actually, it seems to optimize ICmp
and Branch
, together with Phi
,
but it seems non-trivial, so it is not currently implemented.
Performed on Phi
, ICalc
, Call
and Load
.
To build this compiler, just execute
sh gradlew installDist
or, on windows,
gradlew.bat installDist
The result will be installed in build/install/Rogue-Mx
.
JDK 11 is used for development. JDK >= 1.8 should be okay for build.
There is also another version in submodule submit, which can be built offline with local resources in the judge docker. To build it offline, just execute:
sh gradlew installDist --offline
Library of built-in functions currently can only be generated by executing
sh gradlew generateBuiltin
with riscv-gnu-toolchain configured with:
./configure --prefix=/opt/riscv --with-arch=rv32ima --with-abi=ilp32
Exit code | description |
---|---|
1 | Nothing to execute or internal error |
2 | IO failure |
3 | Parser does not accept the code |
4 | AST builder does not accept the code |
5 | Semantic does not accept the code |
The type system of LLVM IR is somehow annoying, so it is deprecated. A new IR is designed, which is also in CFG + SSA form, just because SSA has already been implemented when the decision is made.
It's also tried very hard to keep the compatibility to output IR in LLVM IR form, but it's not easy to add a fake type system, and it's not worth it.
This part contains notes for future development.
grammar.Variable
Variable
constructed for each declaration,equals()
and hashCode()
Variable.def.init
semantic.SemanticMain
return
the main
function inoperator fun invoke(ASTNode.Program)
in the future,TopLevelTranslator
ast.ASTNode
ast.ASTType
ASTNode
cannot be separated due to the limitation of sealed class
(updated 2 years later, in Kotlin 1.4, why Kotlin does not evolve faster)ASTType
is separated from ASTNode
ir.translator.ExpressionTranslator
load
is added for every assignment,riscv.RVInstruction
RVBlock
rather than only the nameJ
and Branch
in the future