re-dfa-pm

Perl module re::DFA

Stars
8

Description

This is a study project for regular expression implementation.
Only the most basic regular expression syntax is supported, i.e.,
the closure (*) operator, the alternation (|) operator, and the
concatenation operator.

Installation

make -f meta.mk
perl Makefile.PL
make

Usage:

  • As an NFA generator:

    perl -Ilib script/re2nfa.pl '(a|b)(aa|bb)(a|b)'

    It will generate one image file, NFA.png:

    http://agentzh.org/misc/re/NFA.png

  • As a DFA generator:

    perl -Ilib script/re2dfa.pl '(a|b)(aa|bb)(a|b)'

    It will generate two image files, DFA.png and DFA.min.png, showing the DFA automata and the minimized DFA automata:

    http://agentzh.org/misc/re/DFA.png

    http://agentzh.org/misc/re/DFA.min.png

  • As a regular expression compiler:

    perl -Ilib script/re2pl.pl '(a|b)(aa|bb)(a|b)'

    will generate a Perl subroutine's source code to implement the corresponding minized DFA automata:

    http://agentzh.org/misc/re/sample.pl

    On the other hand, we can also compile regular expressions down to C code:

    perl -Ilib script/re2c.pl '(a|b)(aa|bb)(a|b)'

    The generated C code can be seen here:

    http://agentzh.org/misc/re/sample.c

  • As a regular expression parser:

    perl -Ilib script/re2xml.pl '(a|b)(aa|bb)(a|b)'

    It will output the XML representation for the parsed regular expression's AST (Abstract Syntax Tree) structure:

    http://agentzh.org/misc/re/sample.xml

  • As a regular expression interpreter:

    perl -Ilib script/evalre.pl -c '(a|b)(aa|bb)(a|b)' "abbbf"

    This command will run the regular expression against the text "abbbf" using the C backend (specified by the -c option). It will output

    match: 'abbb'

    The evalre.pl script uses the Inline::C Perl module to run the C code generated by the regular expression compiler with the C backend.

    Also, it's possible to use the Perl backend to run the regular expression, by specifying the -p option:

    perl -Ilib script/evalre.pl -p '(a|b)(aa|bb)(a|b)' "abbbf"