Photesthesis test framework
APACHE-2.0 License
This is a small, experimental parameterized-testing tool.
It is intended to be used in concert with another unit-testing framework (e.g. Catch2) for providing a main() test harness and for reporting assertion failures.
It combines a mixture of techniques from grammar-based fuzzing, coverage-guided fuzzing, property-based testing, and approval testing.
Construct a Grammar and call Grammar::addRule to model possible inputs to your test.
Subclass photesthesis::Test and give it a method that contains your parameterized test.
In the test, call Test::getParam() to extract abstract parameters and Value::match() those parameters to extract them into concrete values.
Call Test::check(), Test::trace() or Test::track() to observe variables during the test, and Test::invariant() to check invariants.
Construct a Corpus, which is an object that manages a file that will hold the test's transcripts.
Instantiate your Test subclass given the provided corpus and grammar, then call Test::administer().
Optionally compile with -fsanitize-coverage=inline-8bit-counters: this will cause path-coverage counters from the SUT to be incorporated into trajectories.
See test/test_photesthesis.cpp.
Photesthesis has a main entrypoint, Test::administer, that will perform some mixture of three tasks:
Initializing a corpus. If you don't have a corpus yet (or rather: any entries in the corpus for the named test you're running) it will build one using a K-path-covering set (https://doi.org/10.1109/ASE.2019.00027) generated from the grammar, run the test on each element of the set, and save the corpus for reuse in the future.
Checking an existing corpus. This re-runs only the (small) set of transcripts stored in the corpus and compares the run's checked (or tracked) variables to those stored in the transcript. Re-run transcripts are written back to the corpus file on disk if anything changed, in a deterministic position and in human-readable, textual form, so that any changed observations can be inspected and approved.
Expanding a corpus by randomly generating new entries from the grammar (up to some depth limit) to attempt to find and record parameter values that trigger new trajectories in the SUT, recording the associated transcripts to the corpus.
The Test::administer function takes 3 arguments, all optional:
The number of expansion steps, which is 0 by default, and can also be overridden via PHOTESTHESIS_EXPANSION_STEPS. So by default no expansion is performed.
The path length K for the K-paths covering set, which is 3 by default, and can be overridden via PHOTESTHESIS_KPATH_LENGTH.
The depth limit for random generation, which is 3 by default, and can be overridden via PHOTESTHESIS_RANDOM_DEPTH.
The expected usage is to run with the initial K-paths corpus while designing a unit test, and then run it once with a fairly large expansion-step count to establish a good extended corpus, which you save. After that, mostly re-run the saved corpus (quickly) with a zero expansion-step count as part of your typical unit-test runs. Periodically (as a test maintenance task, or in response to major edits) you can re-run with a nonzero expansion-step count to see if there are new uncovered trajectories.
When a parameterized test runs, photesthesis makes observations and records two separate sets of values about the test:
The transcript of the run, which includes the test name, the input parameter values chosen, and the canonical reference-values of any checked or tracked variables (see below). In other words, the values in the transcript indicate expected outcomes of a given run.
The trajectory of the run, which is a value used to group test runs together into equivalence classes in the corpus. Only one transcript is maintained per trajectory when expanding the corpus. In other words, the values in the trajectory indicate which transcripts are meaningfully different.
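The "one transcript per trajectory" rule can be pictured as a map keyed by the trajectory hash. The following is only a sketch under stated assumptions: SketchTranscript and SketchCorpus are hypothetical names invented for illustration, and photesthesis's real Corpus and transcript types are not shown in this document.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a transcript; not the library's own type.
struct SketchTranscript {
    std::string testName;
    std::string params;    // textual form of the chosen input parameters
    std::string observed;  // textual form of the checked/tracked values
};

// Hypothetical corpus that keeps at most one transcript per trajectory.
class SketchCorpus {
public:
    // Returns true if the trajectory was new and the transcript was kept;
    // returns false (and keeps the existing entry) otherwise.
    bool addIfNew(std::uint64_t trajectory, SketchTranscript t) {
        return byTrajectory_.emplace(trajectory, std::move(t)).second;
    }

    std::size_t size() const { return byTrajectory_.size(); }

private:
    // An ordered map gives a deterministic iteration order when saving.
    std::map<std::uint64_t, SketchTranscript> byTrajectory_;
};
```

With this shape, a randomly generated input only grows the corpus when its run lands in an equivalence class that has no representative yet.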
Trajectories are hashed into a single uint64_t, which incorporates any variable marked as trajectory-relevant by the test using the Test::trace (or Test::track) methods, as well as (optionally, using LLVM instrumentation) an approximate measure of variation in path-coverage counters from the SUT.
Note: if you're using path-coverage, you may not need to record any values with track or trace; but they can still be useful to subdivide trajectory classes.
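As a rough illustration of folding traced values into a single uint64_t, the sketch below hashes (name, value) pairs in order. It uses 64-bit FNV-1a purely for brevity, which is a swapped-in hash and not the bundled xxhash implementation; the function names and the textual encoding of traced values are assumptions made for this example, and the coverage-counter contribution is left out.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// 64-bit FNV-1a step: a simple stand-in hash, chosen only for brevity.
inline std::uint64_t fnv1a(std::uint64_t h, const std::string& bytes) {
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 0x100000001b3ULL;  // FNV-1a 64-bit prime
    }
    return h;
}

// Hypothetical helper: fold traced (name, value) pairs, in order,
// into one trajectory hash.
inline std::uint64_t sketchTrajectory(
    const std::vector<std::pair<std::string, std::string>>& traced) {
    std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
    for (const auto& [name, value] : traced) {
        h = fnv1a(h, name);
        h = fnv1a(h, "=");
        h = fnv1a(h, value);
        h = fnv1a(h, ";");
    }
    return h;
}
```

Two runs that trace the same names with the same values get the same hash and so fall into the same trajectory class; tracing an extra variable is one way to split a class further.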
Observations made by the test of its own state thus fall into 4 natural categories:
Invariants are those values (like properties in property testing) that you expect to be invariant over all executions. They are not recorded in transcripts and are not considered part of trajectories, but if an expected invariant is violated photesthesis will call Test::handleInvariantFailure with the parameters that caused the violation, and report the triggering inputs as failures.
Traced values are those that contribute to the trajectory, but that have specific values that aren't of interest to the transcript. For example, the path-coverage approximation is neither informative to a reader nor does it define correctness of a run, but it does differentiate runs. Similarly various hit-rates or other performance metrics might be worth tracing to differentiate runs, but don't define "correctness" per se.
Checked values are those that contribute to the transcript, but not the trajectory. They are values which represent a correctness check but which should not cause equivalence-class splitting for each possible value. For example, if an arithmetic identity holds, it might be worth checking but not tracing, to avoid retaining separate transcripts for every possible value of the identity.
Tracked values are those that are both checked and traced.
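The four categories reduce to two yes/no questions: does the observation go into the transcript, and does it go into the trajectory? The sketch below encodes that table; the enum and function names are hypothetical and exist only to restate the text, they are not part of the library.

```cpp
// Hypothetical names that mirror the four categories described above.
enum class Observation { Invariant, Traced, Checked, Tracked };

// Is the observed value recorded in the transcript (and compared on re-run)?
constexpr bool inTranscript(Observation o) {
    return o == Observation::Checked || o == Observation::Tracked;
}

// Does the observed value contribute to the trajectory hash
// (and so split equivalence classes)?
constexpr bool inTrajectory(Observation o) {
    return o == Observation::Traced || o == Observation::Tracked;
}

// Invariants appear in neither; tracked values appear in both.
static_assert(!inTranscript(Observation::Invariant) &&
              !inTrajectory(Observation::Invariant));
static_assert(inTranscript(Observation::Tracked) &&
              inTrajectory(Observation::Tracked));
```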
Photesthesis is based on abstract grammars. That is, it generates parameters and observes variables of a single C++ type Value, which is an S-expression-like type with a few subtypes that represent booleans, numbers, symbols, and lists.
Convenience match functions exist for pattern-matching various C++ types against Value in order to extract concrete information used to parameterize your tests, and reusable composite matching rules can be written by extending the Matcher<T> type and overriding one of its match member functions.
Similarly, any concrete value you wish to observe as a variable (see below) will need to be injected into the Value abstract domain. Again, there are convenience methods provided, but you might need to write a few of your own for structured types you observe in multiple tests.
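To make the round trip between concrete C++ values and an S-expression-like domain more tangible, here is a minimal standalone sketch. SketchValue, injectPair and matchPair are hypothetical names invented for this example; the actual Value class, its match functions and Matcher<T> have their own interfaces, which this document does not spell out.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical S-expression-like value: bool, number, symbol, or list.
struct SketchSymbol { std::string name; };
struct SketchValue;
using SketchList = std::vector<SketchValue>;

struct SketchValue {
    std::variant<bool, std::int64_t, SketchSymbol, SketchList> data;
};

// Inject a concrete pair of numbers into the abstract domain as a list.
inline SketchValue injectPair(std::int64_t a, std::int64_t b) {
    return SketchValue{SketchList{SketchValue{a}, SketchValue{b}}};
}

// Match an abstract value against the shape "list of exactly two numbers".
// Returns the concrete pair on success, or nullopt if the shape differs.
inline std::optional<std::pair<std::int64_t, std::int64_t>>
matchPair(const SketchValue& v) {
    const auto* list = std::get_if<SketchList>(&v.data);
    if (!list || list->size() != 2) return std::nullopt;
    const auto* a = std::get_if<std::int64_t>(&(*list)[0].data);
    const auto* b = std::get_if<std::int64_t>(&(*list)[1].data);
    if (!a || !b) return std::nullopt;
    return std::make_pair(*a, *b);
}
```

Because a failed match simply yields no result, a test body can try several shapes in turn and branch on whichever one matches, which is the kind of pattern-matching conditional a grammar with alternative productions calls for.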
There are two reasons for working in the abstract Value domain:
Additionally, this approach produces a smaller and simpler-to-debug library than one that rests on a lot of complex C++ type-level trickery, and it also produces a natural pattern-matching-conditional idiom for alternating between different variants of a test's behaviour -- this would have to be accomplished with separate booleans or test-feature flags otherwise.
Photesthesis offers no single new technology, but rather a new (and hopefully useful) combination of existing techniques that have not (to this author's knowledge) previously been combined:
Compared to many structured and random input-generation libraries (eg. property-based testers) it incorporates fuzzer-style path-coverage measurement of the SUT.
Compared to many fuzzers and property-based testers, it is grammar driven and therefore can more likely find inputs with syntactic and even deep semantic validity conditions.
Compared to concrete grammar testers, it performs generation in an abstract grammar (for reasons discussed above).
Compared to most fuzzers, it is designed to be embedded in unit test suites and to support a test-writer making fine-grained assertions and observations, not just opaque whole-program crash-fail semantics.
The closest system I am aware of in design is Nautilus but it differs in a few key ways:
This should build on any newish C++17-speaking C++ compiler, but it is currently only tested on clang 11. It depends on nothing outside its own source tree and the C++ standard library.
Photesthesis is Copyright 2021 Stellar Development Foundation, licensed under the Apache 2.0 license (ASL-2).
It includes a bundled copy of an xxhash implementation by Stephan Brumme, licensed under ASL-2-compatible, zlib-like terms. The original can be found here: https://github.com/stbrumme/xxhash/blob/master/xxhash64.h
"Photesthesis" means "light sensitivity", which is somewhat thematically related to the streetlight effect in test coverage: "people only search for something where it is easiest to look".
It also contains the substring "test", and is a similar-sounding word to "hypothesis", which is the name of one of the better property-based test tools.