granicus.if.org Git - re2c/commit
Simplified [-Wmatch-empty-rule] analyses.
author Ulya Trofimovich <skvadrik@gmail.com>
Wed, 17 Feb 2016 14:46:43 +0000 (14:46 +0000)
committer Ulya Trofimovich <skvadrik@gmail.com>
Wed, 17 Feb 2016 14:46:43 +0000 (14:46 +0000)
commit 7e5550bc2eb5e789432c00fc0b898b020ae14cd6
tree e5f33341232615bf976a4143a4d2aee66bfeb8ab
parent 3124a7994a677ddb3d0d0272e802e2ce643a828b
Simplified [-Wmatch-empty-rule] analyses.

Before this patch, [-Wmatch-empty-rule] was based on:
    - DFA structural analysis (skeleton phase)
    - rule reachability analysis (skeleton phase)

Now it is based on:
    - NFA structural analysis (NFA phase)
    - rule reachability analysis (skeleton phase)

It's much easier to find nullable rules in an NFA than in a DFA.
The problem with the DFA lies in rules with trailing context,
both dynamic and especially static (as static context leaves no
trace in DFA states). re2c currently treats static context as
dynamic, but this will change soon.

On the other hand, the NFA may give some false positives because
of unreachable rules:
    [^] {}
    ""  {}
infinite rules:
    [^]* {}
or self-shadowing rules:
    [^]? {}
Reachability analysis in the skeleton helps to filter out
unreachable and infinite rules, but not self-shadowing ones.
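
To illustrate why the structural check is simple at the regexp/NFA level, here is a minimal, self-contained sketch of a recursive nullability test. It is not the code from this commit: the `Re` type and the helpers `eps`, `sym`, `alt`, `cat`, `star` and `opt` are hypothetical names chosen for the example, and trailing context is left out entirely.

```cpp
#include <cassert>
#include <memory>

// Hypothetical regexp tree for illustration only; these are NOT re2c's classes.
struct Re {
    enum Kind { EMPTY, SYM, ALT, CAT, STAR } kind;
    std::shared_ptr<Re> a, b; // children: b is unused for STAR, both for leaves
};
typedef std::shared_ptr<Re> ReP;

static ReP mk(Re::Kind k, ReP a = ReP(), ReP b = ReP()) {
    return std::make_shared<Re>(Re{k, a, b});
}

ReP eps() { return mk(Re::EMPTY); }                 // empty string, like ""
ReP sym() { return mk(Re::SYM); }                   // one symbol, like [^]
ReP alt(ReP x, ReP y) { return mk(Re::ALT, x, y); } // x | y
ReP cat(ReP x, ReP y) { return mk(Re::CAT, x, y); } // x y
ReP star(ReP x) { return mk(Re::STAR, x); }         // x*
ReP opt(ReP x) { return alt(x, eps()); }            // x? written as x | ""

// A regexp is nullable if it can match the empty string.
bool nullable(const ReP &r) {
    assert(r && "null regexp node");
    switch (r->kind) {
        case Re::EMPTY: return true;
        case Re::SYM:   return false;
        case Re::ALT:   return nullable(r->a) || nullable(r->b);
        case Re::CAT:   return nullable(r->a) && nullable(r->b);
        case Re::STAR:  return true; // zero repetitions are allowed
    }
    return false;
}
```

In this sketch `nullable(star(sym()))` and `nullable(opt(sym()))` are both true, which corresponds to the `[^]*` and `[^]?` examples above: a purely structural test flags them, so a separate reachability pass is still needed to discard the cases that are not real empty matches.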
16 files changed:
re2c/Makefile.am
re2c/src/codegen/emit_dfa.cc
re2c/src/ir/compile.cc
re2c/src/ir/regexp/nullable.cc [new file with mode: 0644]
re2c/src/ir/regexp/regexp.h
re2c/src/ir/regexp/regexp_alt.h
re2c/src/ir/regexp/regexp_cat.h
re2c/src/ir/regexp/regexp_close.h
re2c/src/ir/regexp/regexp_match.h
re2c/src/ir/regexp/regexp_null.h
re2c/src/ir/regexp/regexp_rule.h
re2c/src/ir/skeleton/match_empty.cc [deleted file]
re2c/src/ir/skeleton/skeleton.cc
re2c/src/ir/skeleton/skeleton.h
re2c/src/ir/skeleton/unreachable_nullable.cc [moved from re2c/src/ir/skeleton/unreachable.cc with 55% similarity]
re2c/test/bug57_original.bi--case-insensitive.c