compare emulators against each other
GPL-2.0 License
This project connects several emulators of the 6502 CPU and tests them against each other.
The basic idea here is to initialize the CPU to some random state, and then execute a single, randomly generated instruction. If we do this the same way on several emulators, we can see whether they agree or disagree. If they disagree, then we may have found a bug, along with a test case that reproduces it.
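The idea can be sketched in a few lines of Python. This is only an illustration and not solid65's code: the two "emulators" are toy functions that know just INX (0xE8) and DEX (0xCA), the state layout (x, z, n) is made up for the example, and emu_buggy contains a deliberate mistake so that there is something to find.

```python
import random

OPCODES = [0xE8, 0xCA]  # INX and DEX, the only instructions this toy knows


def emu_reference(state, opcode):
    """Toy emulator: apply INX or DEX to a copy of the state."""
    s = dict(state)
    delta = 1 if opcode == 0xE8 else -1
    s["x"] = (s["x"] + delta) & 0xFF   # X wraps around at 8 bits
    s["z"] = s["x"] == 0               # zero flag
    s["n"] = bool(s["x"] & 0x80)       # negative flag (bit 7)
    return s


def emu_buggy(state, opcode):
    """Same as above, but with a deliberate bug: DEX leaves Z untouched."""
    s = emu_reference(state, opcode)
    if opcode == 0xCA:
        s["z"] = state["z"]
    return s


def random_case(rng):
    """Pick a random initial state and a random instruction."""
    state = {
        "x": rng.randrange(256),
        "z": rng.random() < 0.5,
        "n": rng.random() < 0.5,
    }
    return state, rng.choice(OPCODES)


def disagreements(state, opcode, emulators):
    """Run every emulator on the same case; list the fields that differ."""
    results = {name: emu(state, opcode) for name, emu in emulators.items()}
    names = sorted(results)
    first = results[names[0]]
    return [
        (key, {name: results[name][key] for name in names})
        for key in first
        if any(results[name][key] != first[key] for name in names)
    ]


def search(emulators, seed=0, attempts=1000):
    """Try random cases until the emulators disagree, or give up."""
    rng = random.Random(seed)
    for _ in range(attempts):
        state, opcode = random_case(rng)
        diff = disagreements(state, opcode, emulators)
        if diff:
            return state, opcode, diff
    return None
```

Calling search({"ref": emu_reference, "buggy": emu_buggy}) returns the first failing case, that is, the initial state, the opcode and the fields the two functions disagree on. The real project records memory accesses as well, which this sketch leaves out.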
But first, of course, we need to clone the repository. This step downloads solid65 itself, and (some of) the emulators it tests. To do this using git:
git clone --recursive https://github.com/omarandlorraine/solid65.git
You may of course get it from some other repository, but don't forget --recursive.
Inside the project directory, you can use ./run_test to generate a random test case and run it. The test case appears under the tests/ directory, as a directory of its own containing the results of the test. Each tested emulator has a corresponding file in that directory, detailing the initial state and each memory access the emulator makes while executing the random instruction.
To analyse the test results, you can use the compare.py script. Its arguments are a path to the test and the names of the two emulators you want to compare; its output is a list of the disagreements it found.
For example, run the following command line to see how the rubbermallet emulator compares with mre's in their execution of the RTS instruction:
./compare.py tests/rts/ rubbermallet mre_mos6502
The exit status of the compare.py script corresponds to the severity of the mismatch; 6 or lower is good if cycle accuracy is not important.
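If you want to act on that exit status from a script of your own, something along these lines might do. It is a sketch, not part of solid65: the helper names are invented here, the threshold of 6 is taken from the sentence above, and a negative return code (the process was killed by a signal) is treated as a failure by assumption.

```python
import subprocess


def compare_command(test_dir, emu_a, emu_b, script="./compare.py"):
    """Build the argument list for one compare.py invocation."""
    return [script, test_dir, emu_a, emu_b]


def is_acceptable(exit_status, threshold=6):
    """True if the severity is at or below the threshold.

    Negative values mean the process died from a signal, so they are
    rejected (an assumption of this sketch).
    """
    return 0 <= exit_status <= threshold


def run_compare(test_dir, emu_a, emu_b):
    """Run compare.py and return its exit status (the mismatch severity)."""
    return subprocess.run(compare_command(test_dir, emu_a, emu_b)).returncode
```

For example, is_acceptable(run_compare("tests/rts/", "rubbermallet", "mre_mos6502")) tells you whether the RTS test above passes when cycle accuracy is ignored.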
These two steps, ./run_test followed by ./compare.py with its arguments, may of course be tedious, and may not even find a bug. So a convenience script that compares two emulators exists:
./rubbermallet_vs_mre # searches for discrepancies between fake6502 and mre_mos6502, not considering cycle accuracy
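The general shape of such a search loop is roughly the following. This is a guess at the pattern and not the script's actual contents: generate and compare are hypothetical callables you would supply yourself, wrapping ./run_test and ./compare.py respectively, and how a new test's directory is discovered is left to generate because this README does not say.

```python
def find_discrepancy(generate, compare, threshold=6, max_tries=100):
    """Generate test cases until one compares worse than the threshold.

    generate() -> path of a freshly created test directory
    compare(path) -> severity (the exit status of the comparison)

    Returns (attempt, path, severity) for the first bad case, or None
    if max_tries cases all stayed at or below the threshold.
    """
    for attempt in range(1, max_tries + 1):
        test_dir = generate()
        severity = compare(test_dir)
        if severity > threshold:
            return attempt, test_dir, severity
    return None
```

Keeping the two steps as parameters makes the loop easy to try out with stand-in functions before pointing it at real emulators.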
Known issues with the tester:
An emulator written in Rust.
Known issues with the tester: