REGex in Rust with EcmaScript Syntax
APACHE-2.0 License
oh no why
regress is a backtracking regular expression engine implemented in Rust, which targets JavaScript regular expression syntax. See the crate documentation for more.
It's fast, Unicode-aware, has few dependencies, and has a big test suite. It makes fewer guarantees than the regex crate, but it enables more syntactic features, such as backreferences and lookaround assertions.
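To make the backreference point concrete, here is a minimal hand-written sketch of what a pattern like ^(.+)\1 asks an engine to do. It is not how regress is implemented and uses only the standard library. The function name match_doubled_prefix is made up for illustration, and the sketch assumes the input contains no line terminators (which a JavaScript . would refuse to match).

```rust
/// Hand-written equivalent of the anchored pattern `^(.+)\1`.
/// Returns the text captured by group 1 if the match succeeds.
/// Hypothetical helper for illustration; assumes `text` has no line terminators.
fn match_doubled_prefix(text: &str) -> Option<&str> {
    // Candidate end offsets for group 1, always on char boundaries.
    let ends: Vec<usize> = text
        .char_indices()
        .map(|(start, ch)| start + ch.len_utf8())
        .collect();

    // `.+` is greedy: try the longest capture first, then backtrack to shorter ones.
    for &end in ends.iter().rev() {
        let (group, rest) = text.split_at(end);
        // `\1` must match exactly the text that group 1 captured.
        if rest.starts_with(group) {
            return Some(group);
        }
    }
    None
}
```

Calling match_doubled_prefix("abcabcx") tries the captures "abcabcx", "abcabc", "abcab", and "abca" before settling on "abc". That retrying of earlier choices is the backtracking that lets an engine support backreferences.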
The regress-tool binary can be used for some fun.
You can see how things get compiled with the --dump-phases CLI flag:
> cargo run 'x{3,4}' 'i' --dump-phases
You can run a little benchmark too, for example:
> cargo run --release -- 'abcd' 'i' --bench ~/3200.txt
This was my first Rust program so no doubt there is room for improvement.
There's lots of stuff still missing, maybe you want to contribute?
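Missing features include replacement strings that reference capture groups, such as $1, and an implementation of std::str::pattern::Pattern.
Anchored patterns like ^abc still perform a string search. We should compute whether the whole regex is anchored, and optimize matching if so.
Non-greedy loops like .*? will eagerly compute their maximum match. This doesn't affect correctness but it does mean they may match more than they should.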
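As a rough illustration of the anchored-match idea, the sketch below models a pattern as a plain literal with an optional leading ^. Everything here (LiteralPattern, parse, find) is hypothetical and unrelated to regress's internals. It also assumes the multiline flag is off, so ^ can only match at offset 0.

```rust
/// Toy model of a compiled pattern: a literal, optionally anchored with `^`.
/// Hypothetical type for illustration only.
struct LiteralPattern<'p> {
    literal: &'p str,
    anchored: bool,
}

impl<'p> LiteralPattern<'p> {
    /// Treats a leading `^` as a start anchor; the rest is matched literally.
    fn parse(source: &'p str) -> Self {
        match source.strip_prefix('^') {
            Some(rest) => LiteralPattern { literal: rest, anchored: true },
            None => LiteralPattern { literal: source, anchored: false },
        }
    }

    /// Returns the byte offset of the first match, if any.
    fn find(&self, text: &str) -> Option<usize> {
        if self.anchored {
            // Only offset 0 can match, so one prefix check replaces the search.
            text.starts_with(self.literal).then_some(0)
        } else {
            // Unanchored: every start position has to be considered.
            text.find(self.literal)
        }
    }
}
```

With this model, LiteralPattern::parse("^abc").find("xxabc") returns None after a single prefix comparison, while the unanchored "abc" scans forward and returns Some(2). A real engine would need to prove that every alternative of the regex begins with an anchor before taking the fast path.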