From: Ulya Trofimovich
Date: Thu, 19 Nov 2015 13:53:20 +0000 (+0000)
Subject: Completed '--skeleton' description.
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=cf0741bcf1e364c492f9b672d1adafbea0a23337;p=re2c

Completed '--skeleton' description.
---

diff --git a/src/manual/features/skeleton/failures.rst b/src/manual/features/skeleton/failures.rst
index 3275b0bd..50ba6ede 100644
--- a/src/manual/features/skeleton/failures.rst
+++ b/src/manual/features/skeleton/failures.rst
@@ -59,7 +59,7 @@ And so on.
 Of course, some errors won't be captured by skeleton program: it's not feasible to cover all possible inputs.
 For example, of all the hex digits ``[0-9a-fA-F]`` re2c uses only ``[09afAF]``:
 we can mangle the lexer to not to recognize ``[1-8b-eB-E]`` as hex digits and the program won't notice.
-However, the chosen values are *edge* values they are tested *extensively*
+However, the chosen values are the boundaries of disjoint character ranges and they are tested extensively
 (see the section about data generation for details).


diff --git a/src/manual/features/skeleton/use.rst b/src/manual/features/skeleton/use.rst
index d332c20b..c72951ce 100644
--- a/src/manual/features/skeleton/use.rst
+++ b/src/manual/features/skeleton/use.rst
@@ -4,12 +4,43 @@ Use
 Code verification
 .................

+When re2c transforms regular expressions to code, it uses several internal representations:
+
+1. Regular expressions are parsed and transformed into abstract syntax tree (AST).
+2. AST is compiled to bytecode.
+3. Bytecode is used to construct deterministic finite automaton (DFA).
+4. DFA undergoes some transformations.
+5. Transformed DFA is compiled to C/C++ code.
+
+Skeleton is constructed right after stage 3, before any additional changes have been made to DFA.
+Skeleton is in fact a simplified copy of DFA (better suited for deriving data and freed of all irrelevant information).
+It is used only to generate ``.input`` and ``.keys`` files.
+The rest of skeleton program is the usual re2c-generated code:
+``-S, --skeleton`` option only resets environment bindings, but does not affect any code generation decisions.
+
+Skeleton programs are therefore capable of catching various errors in code generation.
+In other words, it is much safer to rebalance nested ``if`` statements,
+add some novel dispatch mechanism, apply various code deduplication or inlining optimizations
+and otherwise reorganize and tweak the generated code.
+

 Benchmarks
 ..........

+Another direct application of skeleton is benchmarking.
+There is no need to construct benchmarks by hand: re2c automatically converts any real-world program
+to a ready benchmark and provides input data.
+One only needs to measure execution time of the generated program
+(and perhaps adjust it by running the main loop multiple times).
+

 Executable specs
 ................

+Because of the fact that skeleton ignores all non-re2c code (including actions),
+skeleton programs don't need any additional code.
+One can just sketch a regular expression specification
+and immediately get an executable program.
+
+
