granicus.if.org Git - re2c/log
re2c
9 years agoFixed '#include's (applied most of 'include-what-you-use' suggestions).
Ulya Trofimovich [Tue, 1 Dec 2015 16:14:49 +0000 (16:14 +0000)]
Fixed '#include's (applied most of 'include-what-you-use' suggestions).

The worst dependency which 'include-what-you-use' fails to see
(and rightly so) is 'src/parse/lex.re' -> 'src/parse/parser.h'.
This dependency is caused by '#include "y.tab.h"' in 'src/parse/lex.re'.

Another ubiquitous issue is 'src/util/c99_stdint.h' ('include-what-you-use'
suggests replacing it with '<stdint.h>').

There are also a couple of other dependencies that 'include-what-you-use' fails to see.

9 years agoPrefixed all tokens with 'TOKEN_'.
Ulya Trofimovich [Tue, 1 Dec 2015 12:59:54 +0000 (12:59 +0000)]
Prefixed all tokens with 'TOKEN_'.

Inspired by commit c172f266b4b611cb69bde3b46e4be350819cde73.

9 years agoMakefile.am: use 'AM_V_GEN' prefix to report custom rules.
Ulya Trofimovich [Tue, 1 Dec 2015 12:42:42 +0000 (12:42 +0000)]
Makefile.am: use 'AM_V_GEN' prefix to report custom rules.

9 years agorun_tests.sh (with '--skeleton'): clarified message, use generic CC rather than ...
Ulya Trofimovich [Tue, 1 Dec 2015 12:10:49 +0000 (12:10 +0000)]
run_tests.sh (with '--skeleton'): clarified message, use generic CC rather than 'gcc'.

9 years agoRenamed tests that contained uppercase letters in file extension.
Ulya Trofimovich [Mon, 30 Nov 2015 22:50:23 +0000 (22:50 +0000)]
Renamed tests that contained uppercase letters in file extension.

We use file extensions to encode re2c options.
Some (short) options are uppercase letters: e.g. '-D', '-F', '-S'.
There are also short options for the same lowercase letters: '-d', '-f', '-s'.
This can cause filename collisions on platforms with case-insensitive
file extensions (e.g. Windows and OS X).

See bug #125: "[OS X] git reports changes not staged for commit
in newly cloned repository".

Fix: use long versions of options whose short form is an uppercase letter.
Disallowed uppercase options in 'run_tests.sh'.

9 years agoconfigure.ac: suppress some warnings with '-Weverything'.
Ulya Trofimovich [Mon, 30 Nov 2015 15:22:13 +0000 (15:22 +0000)]
configure.ac: suppress some warnings with '-Weverything'.

9 years ago'-Wundefined-control-flow': fixed patterns ordering, reduced memory consumption.
Ulya Trofimovich [Mon, 30 Nov 2015 12:12:44 +0000 (12:12 +0000)]
'-Wundefined-control-flow': fixed patterns ordering, reduced memory consumption.

The problem with pattern ordering first emerged on FreeBSD-10.2
(I was able to reproduce it with 'CXXFLAGS=-fsanitize=address').
Some tests failed because patterns reported by '-Wundefined-control-flow'
were sorted in different order than expected. This is because
patterns ordering was inconsistent: patterns were compared by length only
(which doesn't work for patterns of equal length). Now the first ordering
criterion is length, and the second criterion is lexicographical order.
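
A minimal sketch of the two-level ordering described above, assuming patterns
are plain strings. The names 'pattern_less' and 'sort_patterns' are
hypothetical and are not taken from the re2c sources:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical comparator: shorter patterns come first; patterns of equal
// length are ordered lexicographically, so the result is a strict total order.
bool pattern_less(const std::string &a, const std::string &b)
{
    if (a.size() != b.size()) {
        return a.size() < b.size();
    }
    return a < b;
}

// Sorts patterns so that the reported order does not depend on the platform
// or on the order in which patterns were discovered.
void sort_patterns(std::vector<std::string> &patterns)
{
    std::sort(patterns.begin(), patterns.end(), pattern_less);
}
```

With a length-only comparison, patterns of equal length compare as equivalent,
and 'std::sort' may leave them in any relative order.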

This commit reduces the amount of memory consumed by '-Wundefined-control-flow':
re2c no longer allocates vectors on stack during depth-first search of the skeleton.

This commit also reduces the limit of memory for '-Wundefined-control-flow'
(64Mb edges -> 1Kb edges). Real-world programs rarely need that much.
The limit was so high to accommodate a few artificial tests (with a lower
limit these tests cannot find shortest patterns).

This commit also removes the upper bound for the number of faulty patterns
reported by '-Wundefined-control-flow'. This bound was needed by the
artificial tests mentioned above: they produce lots of patterns.
Now these tests are limited with 1Kb of edges anyway.

Note that 1Kb limit is checked after each new pattern is added, so that
at least one pattern will fit in (even if it takes more than 1Kb).
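
A rough sketch of a "check after insertion" limit, assuming the cost of a
pattern is its length in edges. 'PatternCollector' and its members are
hypothetical names used only for illustration:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical collector: the limit is tested only after a pattern has been
// stored, so the first pattern is always kept, even if it alone exceeds the limit.
struct PatternCollector
{
    static const std::size_t LIMIT = 1024; // "1Kb edges", as in the message above

    std::vector<std::string> patterns;
    std::size_t edges;

    PatternCollector() : patterns(), edges(0) {}

    // Stores the pattern, then returns true if the caller should stop searching.
    bool add(const std::string &pattern)
    {
        patterns.push_back(pattern);
        edges += pattern.size();
        return edges > LIMIT;
    }
};
```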

9 years agoRemoved one particularly fat test from test collection.
Ulya Trofimovich [Wed, 25 Nov 2015 07:04:32 +0000 (07:04 +0000)]
Removed one particularly fat test from test collection.

9 years agoSubstitute template class with non-template, as only one specialization is used.
Ulya Trofimovich [Wed, 25 Nov 2015 06:49:29 +0000 (06:49 +0000)]
Substitute template class with non-template, as only one specialization is used.

9 years agoSkeleton data generation: suffix should be multipath as well as prefix.
Ulya Trofimovich [Tue, 24 Nov 2015 17:51:25 +0000 (17:51 +0000)]
Skeleton data generation: suffix should be multipath as well as prefix.

Prefix of current path under construction is a multipath, because prefix
arcs have not been covered yet. Suffix can be a simple path (that is, a
multipath of width 1), because all alternative suffix arcs have already
been covered.

prefix       suffix
_________   _________
...      \ /
--------- o
_________/

But nothing prevents us from alternating suffix arcs also, as long as
suffix remains a single multipath:

_________   _________
...      \ / ...
--------- o ---------
_________/ \_________

The resulting path's width is the maximum of prefix and suffix width
(hence the growth in size of those tests in which suffix is wider
than prefix), but it only makes a small difference. And the generated
paths are more "variable".

9 years agoSkeleton data generation: cover all edges in 1-byte range (not only range bounds).
Ulya Trofimovich [Tue, 24 Nov 2015 16:36:14 +0000 (16:36 +0000)]
Skeleton data generation: cover all edges in 1-byte range (not only range bounds).

If code units occupy 1 byte, then the generated path cover covers
*all* edges in the original DFA. If the size of code unit exceeds 1 byte,
then only some ~0x100 (or fewer) range values will be chosen
(including range bounds).
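
One possible way to pick such values, sketched under the assumption that
samples are spread evenly over the range. The function 'sample_range' is
hypothetical and this is not necessarily how re2c selects them:

```cpp
#include <stdint.h>
#include <vector>

// Hypothetical sampler for a closed range [lo, hi] with lo <= hi.
// Ranges of at most 0x100 values are covered completely; wider ranges yield
// exactly 0x100 evenly spaced values that include both bounds.
std::vector<uint32_t> sample_range(uint32_t lo, uint32_t hi)
{
    const uint32_t MAX_SAMPLES = 0x100;
    std::vector<uint32_t> values;
    const uint64_t span = uint64_t(hi) - lo; // number of values minus one

    if (span < MAX_SAMPLES) {
        for (uint64_t v = lo; v <= hi; ++v) {
            values.push_back(uint32_t(v));
        }
    } else {
        for (uint64_t i = 0; i < MAX_SAMPLES; ++i) {
            // i == 0 gives lo, i == MAX_SAMPLES - 1 gives hi
            values.push_back(uint32_t(lo + span * i / (MAX_SAMPLES - 1)));
        }
    }
    return values;
}
```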

9 years agoSkeleton data generation: dropped exponential algorithm, always use path cover.
Ulya Trofimovich [Tue, 24 Nov 2015 16:09:15 +0000 (16:09 +0000)]
Skeleton data generation: dropped exponential algorithm, always use path cover.

9 years agoRemoved obsolete '__STDC_LIMIT_MACROS' and '__STDC_CONSTANT_MACROS' defines.
Ulya Trofimovich [Sun, 29 Nov 2015 11:38:04 +0000 (11:38 +0000)]
Removed obsolete '__STDC_LIMIT_MACROS' and '__STDC_CONSTANT_MACROS' defines.

These defines were necessary to enable numeric limits definitions
(such as 'UINT32_MAX') in our local version of 'stdint.h' (which is
used on platforms that don't have system header 'stdint.h').

As noted by commit b237daed2095c1e138761fb94a01d53ba2c80c95, this
workaround doesn't work on FreeBSD, so re2c now uses 'numeric_limits.h'.

9 years agoFixed [-Wconversion] warning.
Ulya Trofimovich [Sun, 29 Nov 2015 11:24:48 +0000 (11:24 +0000)]
Fixed [-Wconversion] warning.

Warning was introduced in commit b237daed2095c1e138761fb94a01d53ba2c80c95:
compiler fails to recognise (or deliberately chooses not to recognise)
'std::numeric_limits<...>::max()' as a special constant.

9 years agorun_tests.sh: use '--no-version --no-generation-date' instead of sed hack.
Ulya Trofimovich [Sun, 29 Nov 2015 11:04:56 +0000 (11:04 +0000)]
run_tests.sh: use '--no-version --no-generation-date' instead of sed hack.

These options make re2c omit version and date info and thus produce
stable test results.

9 years agoAdded option '--no-version' that omits version in fingerprint.
Ulya Trofimovich [Sun, 29 Nov 2015 10:57:47 +0000 (10:57 +0000)]
Added option '--no-version' that omits version in fingerprint.

9 years agoGet rid of UINT32_MAX and friends 124/head
Sergei Trofimovich [Sat, 28 Nov 2015 18:11:58 +0000 (18:11 +0000)]
Get rid of UINT32_MAX and friends

UINT32_MAX is conditionally defined only
for C compiler on FreeBSD but not for C++.

Stop using __STDC_LIMIT_MACROS workaround
as it does not work on FreeBSD.

Use std::numeric_limits<> from C++98 instead.
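
For illustration, a small bounds check written without the 'UINT32_MAX' macro.
The helper name 'fits_u32' is hypothetical:

```cpp
#include <limits>
#include <stdint.h>

// Hypothetical helper: returns true if 'x' can be stored in 32 bits.
// std::numeric_limits is available in C++98 and does not depend on
// __STDC_LIMIT_MACROS being defined before <stdint.h> is included.
bool fits_u32(uint64_t x)
{
    return x <= std::numeric_limits<uint32_t>::max();
}
```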

Signed-off-by: Sergei Trofimovich <siarheit@google.com>
9 years agoFixed crashes of 'ostream& operator<< (ostream& os, const char* s)' on NULL.
Ulya Trofimovich [Sat, 28 Nov 2015 17:31:56 +0000 (17:31 +0000)]
Fixed crashes of 'ostream& operator<< (ostream& os, const char* s)' on NULL.

Crashes observed on platforms OS X (clang-7.0.0) and FreeBSD-10.2 (clang-3.4).
First reported in bug #122 "clang does not compile re2c 0.15.x".

What caused NULL passed to 'operator <<': re2c always generates content of
header file (regardless of '-t --type-header' option), but the content is
dumped to file (and header filename initialized to non-NULL) only if the
option was enabled.

Fix: always initialize header filename to non-NULL string.
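
A sketch of the idea, assuming the filename is kept as a 'const char *'.
'HeaderOptions', 'header_banner' and the empty-string placeholder are
hypothetical; the actual default used by re2c is not stated here:

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the option that holds the header filename.
struct HeaderOptions
{
    const char *header_file;

    // Always start with a non-NULL string: streaming a NULL 'const char *'
    // into an ostream is undefined behaviour.
    HeaderOptions() : header_file("") {}
};

// Formats a comment line mentioning the header file; safe even if the
// type-header option was never enabled.
std::string header_banner(const HeaderOptions &opts)
{
    std::ostringstream os;
    os << "/* header: '" << opts.header_file << "' */";
    return os.str();
}
```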

9 years agorun_tests.sh: use '/usr/bin/env bash' to locate bash.
Ulya Trofimovich [Sat, 28 Nov 2015 15:44:04 +0000 (15:44 +0000)]
run_tests.sh: use '/usr/bin/env bash' to locate bash.

9 years agoMakefile.am: use '=' instead of '==' to compare strings.
Ulya Trofimovich [Sat, 28 Nov 2015 15:39:56 +0000 (15:39 +0000)]
Makefile.am: use '=' instead of '==' to compare strings.

'==' appears to be a bash feature.

9 years agoDon't use overloaded constructors with integral types.
Ulya Trofimovich [Sat, 28 Nov 2015 11:36:41 +0000 (11:36 +0000)]
Don't use overloaded constructors with integral types.

This causes ambiguity in overload resolution on OS X:

    src/codegen/skeleton/generate_data.cc:308:30: error: ambiguous conversion for functional-style cast from 'const size_t' (aka 'const unsigned long') to 'Node::covers_t'
          (aka 'u32lim_t<1024 * 1024 * 1024>')
            const Node::covers_t size = Node::covers_t (len) * Node::covers_t (count);
                                        ^~~~~~~~~~~~~~~~~~~
    ./src/util/u32lim.h:20:11: note: candidate constructor
            explicit u32lim_t (uint32_t x)
                     ^
    ./src/util/u32lim.h:23:11: note: candidate constructor
            explicit u32lim_t (uint64_t x)

Use static constructor-like methods with explicit names.
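
A simplified sketch of such named constructors. The method names 'from32',
'from64', 'uint32' and 'overflow' are assumptions made for this example and
may differ from 'src/util/u32lim.h':

```cpp
#include <stdint.h>

// Simplified saturating 32-bit counter with an upper limit.
template <uint32_t LIMIT>
class u32lim_t
{
    uint32_t value;

    // The only constructor is private, so callers cannot trigger overload
    // resolution between 32-bit and 64-bit constructors.
    explicit u32lim_t(uint32_t x) : value(x < LIMIT ? x : LIMIT) {}

public:
    // Named "constructors": the caller states the intended width explicitly,
    // and any integral argument converts to the single parameter type.
    static u32lim_t from32(uint32_t x) { return u32lim_t(x); }
    static u32lim_t from64(uint64_t x)
    {
        return u32lim_t(x < LIMIT ? uint32_t(x) : LIMIT);
    }

    uint32_t uint32() const { return value; }
    bool overflow() const { return value == LIMIT; }
};
```

A 'size_t' argument passed to 'from64' is converted to 'uint64_t' without
ambiguity, because there is only one candidate with that name.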

9 years agoFix "CODE" symbol collision on OS X (see #122)
Oleksii Taran [Sat, 28 Nov 2015 04:08:09 +0000 (20:08 -0800)]
Fix "CODE" symbol collision on OS X (see #122)

On OS X bison generates token enums as CPP macro
constants (y.tab.h):
    #define CODE 260
while on my box it's
   enum yytokentype {
     ...
     CODE = 260,
     ...
   };

That #define causes symbol collision as:

    ../src/parse/lex.re:169:38: error: expected unqualified-id
                                            else if (opts->target == opt_t::CODE)
                                                                            ^
    src/parse/y.tab.h:58:14: note: expanded from macro 'CODE'
    #define CODE 260

Renamed enum entry to TOKEN_CODE.

9 years agoAllowed chaining for all 'OutputFile' methods; renamed them in a uniform way.
Ulya Trofimovich [Fri, 27 Nov 2015 14:29:16 +0000 (14:29 +0000)]
Allowed chaining for all 'OutputFile' methods; renamed them in a uniform way.

9 years agoUse local re2c (in '$(top_builddir)') rather than system re2c for 'make bootstrap'.
Ulya Trofimovich [Fri, 27 Nov 2015 13:58:29 +0000 (13:58 +0000)]
Use local re2c (in '$(top_builddir)') rather than system re2c for 'make bootstrap'.

Correct behaviour was broken by commit 38f526d04415adb7b5e6bca228fc26409833f5c3.

9 years agoDon't use 'operator <<' overloads with integral types: resolution is platform-dependent.
Ulya Trofimovich [Fri, 27 Nov 2015 13:41:42 +0000 (13:41 +0000)]
Don't use 'operator <<' overloads with integral types: resolution is platform-dependent.

See bug #122 "clang does not compile re2c 0.15.x".

Example of error on Mac OS X:
    src/codegen/emit_dfa.cc:250:65: error: use of overloaded operator '<<' is ambiguous (with operand types 're2c::OutputFile' and 'const size_t'
          (aka 'const unsigned long'))
            o << indent(ind++) << "static void *" << opts->yyctable << "[" << conds << "] = {\n";
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~
    ./src/codegen/output.h:84:22: note: candidate function
            friend OutputFile & operator << (OutputFile & o, char c);
                                ^
    ./src/codegen/output.h:85:22: note: candidate function
            friend OutputFile & operator << (OutputFile & o, uint32_t n);
                                ^
    ./src/codegen/output.h:86:22: note: candidate function
            friend OutputFile & operator << (OutputFile & o, uint64_t n);
                            ^

On OS X 'size_t' is neither 'uint32_t' nor 'uint64_t'; resolution is therefore ambiguous.
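
One way to avoid the ambiguity is to replace the overloads with distinctly
named methods that return a reference for chaining. The sketch below uses a
hypothetical 'Output' class with methods 'ws', 'wu32' and 'wu64'; these names
are assumptions, not a description of 'src/codegen/output.h':

```cpp
#include <sstream>
#include <stdint.h>
#include <string>

// Hypothetical output wrapper: one method name per argument type, so an
// integral argument of any width converts to exactly one candidate.
class Output
{
    std::ostringstream stream;

public:
    Output &ws(const char *s) { stream << s; return *this; }
    Output &wu32(uint32_t n) { stream << n; return *this; }
    Output &wu64(uint64_t n) { stream << n; return *this; }

    std::string str() const { return stream.str(); }
};
```

For example, with 'unsigned long conds = 3', the call chain
'o.ws("[").wu64(conds).ws("]")' compiles regardless of how the platform
defines 'size_t' and 'uint64_t'.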

9 years agoRelease 0.15.2. 0.15.2
Ulya Trofimovich [Mon, 23 Nov 2015 21:20:12 +0000 (21:20 +0000)]
Release 0.15.2.

9 years agoPrepare release 0.15.2: updated CHANGELOG.
Ulya Trofimovich [Mon, 23 Nov 2015 21:15:38 +0000 (21:15 +0000)]
Prepare release 0.15.2: updated CHANGELOG.

9 years agoMakefile.am: lexer depends on bison-generated parser; fixed rule order.
Ulya Trofimovich [Mon, 23 Nov 2015 21:11:19 +0000 (21:11 +0000)]
Makefile.am: lexer depends on bison-generated parser; fixed rule order.

9 years agoRelease 0.15.1. 0.15.1
Ulya Trofimovich [Sun, 22 Nov 2015 21:03:29 +0000 (21:03 +0000)]
Release 0.15.1.

9 years agoPrepare release 0.15.1: updated CHANGELOG.
Ulya Trofimovich [Sun, 22 Nov 2015 20:59:29 +0000 (20:59 +0000)]
Prepare release 0.15.1: updated CHANGELOG.

9 years agorun_tests.sh: fix the order of files in test results.
Ulya Trofimovich [Sun, 22 Nov 2015 20:55:04 +0000 (20:55 +0000)]
run_tests.sh: fix the order of files in test results.

'sort' behavior depends on current locale; set 'LC_ALL=C LANG=C'
before doing locale-sensitive things. Updated test results.

9 years agorelease.sh: don't forget to push tags.
Ulya Trofimovich [Sun, 22 Nov 2015 20:50:15 +0000 (20:50 +0000)]
release.sh: don't forget to push tags.

9 years agoRelease 0.15. 0.15
Ulya Trofimovich [Sun, 22 Nov 2015 19:53:04 +0000 (19:53 +0000)]
Release 0.15.

9 years agoPrepare release 0.15: updated release instructions.
Ulya Trofimovich [Sun, 22 Nov 2015 19:48:37 +0000 (19:48 +0000)]
Prepare release 0.15: updated release instructions.

9 years agoPrepare release 0.15: updated CHANGELOG.
Ulya Trofimovich [Sun, 22 Nov 2015 19:46:45 +0000 (19:46 +0000)]
Prepare release 0.15: updated CHANGELOG.

9 years agoUse 'rst2man.py' to build manpage; updated manpage.
Ulya Trofimovich [Sun, 22 Nov 2015 19:42:21 +0000 (19:42 +0000)]
Use 'rst2man.py' to build manpage; updated manpage.

9 years agoMerge branch 'master' into simplified_codegen.
Ulya Trofimovich [Sat, 21 Nov 2015 20:03:10 +0000 (20:03 +0000)]
Merge branch 'master' into simplified_codegen.

* master:
  Updated version to 0.14.4.dev
  Release 0.14.3.
  Added simple test for yacc-style brackets (see patch #27)
  Fixed '#27 re2c crashes reading files containing %{ %}' (patch by Rui)
  Makefile.am: dropped distfiles for MSVC (they are broken anyway)
  Added full another test for bug #57.
  Updated version to 0.14.3.dev
  Release 0.14.2.
  Fixed bug #57: Wrong result only if another rule is present
  Updated version to 0.14.2.dev
  Release 0.14.1.
  Pad version with '0' instead of nulls

9 years agoSkeleton: data generation (linear): don't forget to dump path in end nodes.
Ulya Trofimovich [Wed, 18 Nov 2015 14:45:49 +0000 (14:45 +0000)]
Skeleton: data generation (linear): don't forget to dump path in end nodes.

9 years agoSkeleton: changed formatting of the generated code (no significant changes).
Ulya Trofimovich [Mon, 16 Nov 2015 14:48:53 +0000 (14:48 +0000)]
Skeleton: changed formatting of the generated code (no significant changes).

9 years agoSkeleton: disregard default rule when estimating maximum rule size (in bytes).
Ulya Trofimovich [Mon, 16 Nov 2015 14:10:49 +0000 (14:10 +0000)]
Skeleton: disregard default rule when estimating maximum rule size (in bytes).

Default rule '*' (not to be confused with 'none' rule) used to have
a normal number just like other rules. Now that re2c has to distinguish
default rule from other rules (because of [-Wunreachable-rules]),
it reserves a special number (UINT32_MAX - 1) for it.

9 years agoLex strings and character classes in a more elegant way.
Ulya Trofimovich [Tue, 10 Nov 2015 15:28:28 +0000 (15:28 +0000)]
Lex strings and character classes in a more elegant way.

9 years agoRecognize escaped dash '\-' in character class.
Ulya Trofimovich [Mon, 9 Nov 2015 16:06:40 +0000 (16:06 +0000)]
Recognize escaped dash '\-' in character class.

9 years agoFixed tests for bug #119: "-f with -b/-g generates incorrect dispatch on fill labels".
Ulya Trofimovich [Fri, 16 Oct 2015 12:21:50 +0000 (13:21 +0100)]
Fixed tests for bug #119: "-f with -b/-g generates incorrect dispatch on fill labels".

Somehow configuration 're2c:state:abort = 1;' was present in all the
tests; it was meant to be only in half of them.

9 years agorun_tests.sh: tried to clarify regexp that splits options from filename.
Ulya Trofimovich [Thu, 15 Oct 2015 13:54:17 +0000 (14:54 +0100)]
run_tests.sh: tried to clarify regexp that splits options from filename.

Note: should keep to POSIX, so no '+' or '?' is allowed.

9 years agorun_tests.sh: run each test in a separate directory and paste all generated files...
Ulya Trofimovich [Wed, 14 Oct 2015 22:11:01 +0000 (23:11 +0100)]
run_tests.sh: run each test in a separate directory and paste all generated files into one.

Updated tests: changes are insignificant (the order in which multiple
generated files are concatenated has changed).

9 years agorun_tests.sh: don't change filenames to '<stdout>'.
Ulya Trofimovich [Wed, 14 Oct 2015 14:09:41 +0000 (15:09 +0100)]
run_tests.sh: don't change filenames to '<stdout>'.

Updated test. Used the following shell script to validate changes:

    #!/bin/bash

    for f2 in *.temp
    do
        f1=${f2%.temp}

        diff1=`diff $f1 $f2 | grep '^< ' | wc -l`
        diff1_fname=`diff $f1 $f2 | grep '^<\( #line [0-9]\+ "<stdout>"\|[ ]\+("<stdout>[^"]\+"\)' | wc -l`
        diff2=`diff $f1 $f2 | grep '^> ' | wc -l`
        diff2_fname=`diff $f1 $f2 | grep '^>\( #line [0-9]\+ "[^"]\+"\|[ ]\+("[^"]\+"\)' | wc -l`

        # missing: only changed filenames
        [ $diff1 -ne $diff1_fname ] && echo "FAIL1: $f1" && exit 1

        # added: only changed filenames
        [ $diff2 -ne $diff2_fname ] && echo "FAIL2: $f1" && exit 1

        # the number of missing changed filenames
        # equals to the number of added changed filenames
        [ $diff1_fname -ne $diff2_fname ] && echo "FAIL4: $f1" && exit 1
    done

    echo "OK"

9 years agorun_tests.sh: paste type headers into source file and diff all at once.
Ulya Trofimovich [Wed, 14 Oct 2015 13:19:55 +0000 (14:19 +0100)]
run_tests.sh: paste type headers into source file and diff all at once.

9 years agorun_tests.sh: use '-o' option. Added tests for '--skeleton' option.
Ulya Trofimovich [Wed, 14 Oct 2015 12:04:19 +0000 (13:04 +0100)]
run_tests.sh: use '-o' option. Added tests for '--skeleton' option.

9 years agoOmit unnecessary null pointer check (suggested by Markus Elfring).
Ulya Trofimovich [Tue, 13 Oct 2015 20:20:22 +0000 (21:20 +0100)]
Omit unnecessary null pointer check (suggested by Markus Elfring).

9 years agorun_tests.sh: added option '--keep-tmp-files'.
Ulya Trofimovich [Tue, 13 Oct 2015 14:36:44 +0000 (15:36 +0100)]
run_tests.sh: added option '--keep-tmp-files'.

9 years agorun_tests.sh: added '--skeleton' option.
Ulya Trofimovich [Tue, 13 Oct 2015 13:26:33 +0000 (14:26 +0100)]
run_tests.sh: added '--skeleton' option.

With this option script runs re2c with '--skeleton' and
'-Werror-undefined-control-flow' and instead of comparing results with
reference test results, it compiles the generated skeleton programs and
runs them. If C compiler or binary return nonzero error status, script
reports an error. Note that cases when re2c failed to generate code are
not considered errors (re2c has lots of test cases for its errors).

9 years agoSplit main lexer and configuration lexer in two separate files.
Ulya Trofimovich [Mon, 12 Oct 2015 14:14:16 +0000 (15:14 +0100)]
Split main lexer and configuration lexer in two separate files.

9 years agoFactored out some common lexing pieces into separate routines.
Ulya Trofimovich [Mon, 12 Oct 2015 13:12:11 +0000 (14:12 +0100)]
Factored out some common lexing pieces into separate routines.

re2c lacks submatch extraction; it would be much more convenient
to memorize input positions for some parts of regular expressions
than break each regexp in the middle and move parts to separate blocks.

Submatch extraction is difficult to implement in general, but supporting
submatch in some simple cases (like the case where trailing context is
allowed) would be not so difficult and most helpful.

9 years agoParse inplace configurations in lexer; don't pass them to parser.
Ulya Trofimovich [Mon, 12 Oct 2015 12:32:45 +0000 (13:32 +0100)]
Parse inplace configurations in lexer; don't pass them to parser.

This removes a lot of copy-pasting.

The change of error location in test is insignificant: the reported
location was incorrect and it still remains imprecise.

9 years agoImproved '-Wmatch-empty-string' warning.
Ulya Trofimovich [Thu, 8 Oct 2015 13:38:19 +0000 (14:38 +0100)]
Improved '-Wmatch-empty-string' warning.

- recognize empty match with nonempty trailing context
- don't report unreachable empty match

9 years agoAdded '-Wunreachable-rules' warning.
Ulya Trofimovich [Thu, 8 Oct 2015 09:51:08 +0000 (10:51 +0100)]
Added '-Wunreachable-rules' warning.

Warns about unreachable rules:
  - rules that are shadowed by other rules, e.g. rule '[a]' is shadowed by
    '[a] [^]'
  - infinite rules that consume infinitely many characters and fail on
    YYFILL, e.g. '[^]*'
  - rules that contain never-matching link, e.g. '[]' with option
    '--empty-class match-none'
Default rule '*' should not be reported.

9 years agoFixed memleaks and grouped options in one big macro.
Ulya Trofimovich [Wed, 7 Oct 2015 17:30:19 +0000 (18:30 +0100)]
Fixed memleaks and grouped options in one big macro.

9 years agoMerge default rules on the fly, assign them the same lowest priority.
Ulya Trofimovich [Wed, 7 Oct 2015 15:20:38 +0000 (16:20 +0100)]
Merge default rules on the fly, assign them the same lowest priority.

re2c used to postpone merging default rules because rank counter could
only assign consecutive ranks to rules, and default rules must have
the lowest priority. Now rank counter has been modified to return
a special value as default rule rank.
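
A minimal sketch of such a counter, assuming that a larger rank means lower
priority and that the reserved value is 'UINT32_MAX - 1' (the number mentioned
for the default rule elsewhere in this log). 'RankCounter' and its methods are
hypothetical names:

```cpp
#include <limits>
#include <stdint.h>

// Hypothetical rank counter: ordinary rules get consecutive ranks starting
// from zero, while every default rule gets the same reserved rank.
class RankCounter
{
    uint32_t next;

public:
    RankCounter() : next(0) {}

    // Rank for the next ordinary rule.
    uint32_t next_rank() { return next++; }

    // Reserved rank shared by all default rules (assumed value).
    static uint32_t default_rank()
    {
        return std::numeric_limits<uint32_t>::max() - 1;
    }
};
```

Because the default rank does not depend on the counter state, default rules
can be merged as soon as they are seen.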

9 years agoAutogenerated configuration tests: added default rule to each test.
Ulya Trofimovich [Mon, 5 Oct 2015 20:20:23 +0000 (21:20 +0100)]
Autogenerated configuration tests: added default rule to each test.

It's not a bunch of unnecessary warnings I want to avoid, it's a bunch of
unnecessary runtime failures in programs generated with '--skeleton'
(failures caused by undefined control flow; re2c recognizes such cases
and the generated program reports a warning before failing).

9 years agoSupport trailing context with '--skeleton'.
Ulya Trofimovich [Mon, 5 Oct 2015 14:44:50 +0000 (15:44 +0100)]
Support trailing context with '--skeleton'.

Trailing contexts are currently broken (overlapping trailing contexts
cannot be tracked with a single 'YYCTXMARKER'). For now, re2c with
'--skeleton' mimics this incorrect behaviour: information about context
is lost by the time DFA is constructed, so skeleton has no way to
figure out the right order of things.

9 years agoMoved path-combining magic closer to path definition.
Ulya Trofimovich [Sun, 4 Oct 2015 18:46:34 +0000 (19:46 +0100)]
Moved path-combining magic closer to path definition.

9 years agoFixed bug #116: "empty string with non-empty trailing context consumes code units".
Ulya Trofimovich [Sun, 4 Oct 2015 18:25:00 +0000 (19:25 +0100)]
Fixed bug #116: "empty string with non-empty trailing context consumes code units".

Prior to this commit backup of trailing context position was done
before advancing input position and re2c either had to emit
    YYCTXMARKER = YYCURSOR + 1;
(with default input API), or
    YYRESTORECTX ();
    YYSKIP ();
(with custom input API).

The problem is that sometimes initial state doesn't advance input position
at all. Now re2c emits context backup after advancing input position and it
no longer needs '+1' or 'YYSKIP' hacks. It always backs up the correct position.

9 years ago'--skeleton': don't forget to jump to start label when needed.
Ulya Trofimovich [Wed, 30 Sep 2015 16:33:58 +0000 (17:33 +0100)]
'--skeleton': don't forget to jump to start label when needed.

9 years ago'--skeleton': give more info when reporting unused data and keys.
Ulya Trofimovich [Wed, 30 Sep 2015 16:04:29 +0000 (17:04 +0100)]
'--skeleton': give more info when reporting unused data and keys.

9 years ago'--skeleton': fixed codegen error with '-b' (don't forget last bitmap element).
Ulya Trofimovich [Wed, 30 Sep 2015 15:12:27 +0000 (16:12 +0100)]
'--skeleton': fixed codegen error with '-b' (don't forget last bitmap element).

9 years ago'--skeleton': added missing newline in the generated code.
Ulya Trofimovich [Wed, 30 Sep 2015 15:11:45 +0000 (16:11 +0100)]
'--skeleton': added missing newline in the generated code.

9 years ago'--skeleton': tell function name when reporting errors and warnings.
Ulya Trofimovich [Wed, 30 Sep 2015 15:10:38 +0000 (16:10 +0100)]
'--skeleton': tell function name when reporting errors and warnings.

9 years ago'--skeleton': respect empty string match.
Ulya Trofimovich [Tue, 29 Sep 2015 15:47:25 +0000 (16:47 +0100)]
'--skeleton': respect empty string match.

9 years agoFixed skeleton generation in '-r' mode.
Ulya Trofimovich [Tue, 29 Sep 2015 15:04:27 +0000 (16:04 +0100)]
Fixed skeleton generation in '-r' mode.

'-r' is different from normal mode in two aspects:
    - single DFA may be used multiple times (unchanged, we only
      need a single copy for skeleton)
    - DFA may be generated but not used at all

9 years agoSplit skeleton arc count limits for permutations, cover and default paths.
Ulya Trofimovich [Mon, 28 Sep 2015 21:34:06 +0000 (22:34 +0100)]
Split skeleton arc count limits for permutations, cover and default paths.

9 years agoDocs: updated descriptions of some inplace configurations.
Ulya Trofimovich [Mon, 28 Sep 2015 14:30:20 +0000 (15:30 +0100)]
Docs: updated descriptions of some inplace configurations.

9 years agoUnified meaning and mutual relations of some inplace configurations.
Ulya Trofimovich [Mon, 28 Sep 2015 12:54:26 +0000 (13:54 +0100)]
Unified meaning and mutual relations of some inplace configurations.

This commit changes the behaviour of three groups of options:

    re2c:define:YYSETCONDITION
    re2c:define:YYSETCONDITION@cond
    re2c:define:YYSETCONDITION:naked

    re2c:define:YYSETSTATE
    re2c:define:YYSETSTATE@state
    re2c:define:YYSETSTATE:naked (added by this commit)

    re2c:define:YYFILL
    re2c:define:YYFILL@len
    re2c:yyfill:parameter
    re2c:define:YYFILL:naked

The changes should be backwards compatible (meaning that old code that
compiled should still compile), but it may add empty statements or statements
with no effect for some configurations, e.g.:
    YYSETCONDITION(0);(0);
These changes were necessary to unify re2c behaviour, remove counter-intuitive
cases and make it possible to write comprehensible option descriptions.

In short, the changes are:
    - 'naked' triggers generation of argument-in-braces and semicolon;
    - 'parameter' triggers generation of argument-in-braces (when applicable,
      'naked' has priority over 'parameter');
    - argument templates ('@cond', '@state', '@len') don't force other
      configurations, and they don't affect argument-in-braces.

Added test generator and autogenerated tests.

9 years agorun_tests.sh: preserve nested directory structure when dumping errors.
Ulya Trofimovich [Mon, 28 Sep 2015 12:49:05 +0000 (13:49 +0100)]
run_tests.sh: preserve nested directory structure when dumping errors.

9 years agoDon't hang forever trying to replace empty configuration arguments.
Ulya Trofimovich [Sun, 27 Sep 2015 11:03:03 +0000 (12:03 +0100)]
Don't hang forever trying to replace empty configuration arguments.

9 years agoDefault options should be synchronized as well.
Ulya Trofimovich [Sun, 27 Sep 2015 10:58:40 +0000 (11:58 +0100)]
Default options should be synchronized as well.

9 years agoReduced redundant global flags.
Ulya Trofimovich [Thu, 24 Sep 2015 14:21:45 +0000 (15:21 +0100)]
Reduced redundant global flags.

9 years agoHandle all inplace configurations in a uniform way.
Ulya Trofimovich [Thu, 24 Sep 2015 13:52:24 +0000 (14:52 +0100)]
Handle all inplace configurations in a uniform way.

This commit removes check (and error) for overwritten configurations
(like setting 're2c:define:YYCURSOR' twice in the same block).
This check was in principle useful, but it was applied to a somewhat
arbitrary set of parameters. If in future we feel a need
for such a check, it should respect all options equally and report
a warning rather than an error.

9 years agoAutomatically resync options on read access (if they have been updated).
Ulya Trofimovich [Wed, 23 Sep 2015 21:08:39 +0000 (22:08 +0100)]
Automatically resync options on read access (if they have been updated).

9 years agoMerged 'DFlag' and 'flag_skeleton' into one option 'target'.
Ulya Trofimovich [Wed, 23 Sep 2015 16:32:36 +0000 (17:32 +0100)]
Merged 'DFlag' and 'flag_skeleton' into one option 'target'.

The nature of these options makes them mutually exclusive; so instead
of checking that they are not both set just make them a single option.

9 years agoSeparated user config and effective config.
Ulya Trofimovich [Wed, 23 Sep 2015 14:45:54 +0000 (15:45 +0100)]
Separated user config and effective config.

9 years agoPrepare to separate user config and effective config.
Ulya Trofimovich [Tue, 22 Sep 2015 12:15:31 +0000 (13:15 +0100)]
Prepare to separate user config and effective config.

9 years agoKeep name table together with other options.
Ulya Trofimovich [Mon, 21 Sep 2015 21:14:45 +0000 (22:14 +0100)]
Keep name table together with other options.

9 years agoGrouped options together in a struct.
Ulya Trofimovich [Mon, 21 Sep 2015 20:50:55 +0000 (21:50 +0100)]
Grouped options together in a struct.

9 years agoDocumentation: added warning descriptions to manpage and online manual.
Ulya Trofimovich [Thu, 17 Sep 2015 14:16:38 +0000 (15:16 +0100)]
Documentation: added warning descriptions to manpage and online manual.

9 years agoDocumentation: added warning descriptions to '-h, -?, --help' option.
Ulya Trofimovich [Thu, 17 Sep 2015 14:03:24 +0000 (15:03 +0100)]
Documentation: added warning descriptions to '-h, -?, --help' option.

9 years agoRemoved unused method declaration.
Ulya Trofimovich [Thu, 17 Sep 2015 12:55:46 +0000 (13:55 +0100)]
Removed unused method declaration.

9 years agoGather some DFA statistics and use it to omit unused code with '--skeleton'.
Ulya Trofimovich [Thu, 17 Sep 2015 12:52:09 +0000 (13:52 +0100)]
Gather some DFA statistics and use it to omit unused code with '--skeleton'.

9 years agoOmit useless 'yyaccept' variable in '--skeleton' programs.
Ulya Trofimovich [Thu, 17 Sep 2015 09:53:35 +0000 (10:53 +0100)]
Omit useless 'yyaccept' variable in '--skeleton' programs.

Normally re2c generates single 'yyaccept' variable for all conditions.
With '--skeleton' re2c handles conditions separately, so each condition
needs (or needs not) its own 'yyaccept'.

Prior to this commit re2c used the same criterion to determine if
'yyaccept' is needed with '--skeleton' as it uses generally: whether
'yyaccept' was used in any of conditions. Now re2c looks if 'yyaccept'
was used with this particular condition.

9 years agoGenerate better code with '--skeleton': C90-compliant, free resources on errors.
Ulya Trofimovich [Thu, 17 Sep 2015 09:27:57 +0000 (10:27 +0100)]
Generate better code with '--skeleton': C90-compliant, free resources on errors.

9 years agoCheck 'fread' return value in program generated with '--skeleton'.
Ulya Trofimovich [Wed, 16 Sep 2015 11:19:28 +0000 (12:19 +0100)]
Check 'fread' return value in program generated with '--skeleton'.

9 years agoSupport '--skeleton' with conditions and multiple re2c blocks.
Ulya Trofimovich [Wed, 16 Sep 2015 10:25:29 +0000 (11:25 +0100)]
Support '--skeleton' with conditions and multiple re2c blocks.

9 years agoMake 'filesize' function for '--skeleton' preserve original file position.
Ulya Trofimovich [Tue, 15 Sep 2015 12:35:27 +0000 (13:35 +0100)]
Make 'filesize' function for '--skeleton' preserve original file position.
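
A sketch of a position-preserving size query using only standard C I/O. This
illustrates the idea and is not the code that re2c generates:

```cpp
#include <cstdio>

// Returns the size of the file in bytes, or -1 on error. The current file
// position is saved first and restored before returning.
long filesize(std::FILE *f)
{
    const long pos = std::ftell(f);
    if (pos == -1L) {
        return -1L;
    }
    if (std::fseek(f, 0, SEEK_END) != 0) {
        return -1L;
    }
    const long size = std::ftell(f);
    if (std::fseek(f, pos, SEEK_SET) != 0) {
        return -1L;
    }
    return size;
}
```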

9 years agoFixed MINGW builds where 'sizeof (int)' is equal to 'sizeof (long)'.
Ulya Trofimovich [Tue, 15 Sep 2015 12:08:10 +0000 (13:08 +0100)]
Fixed MINGW builds where 'sizeof (int)' is equal to 'sizeof (long)'.

9 years agoChanged '-Wcondition-order' to warn even if 'YYSETCONDITION' is used.
Ulya Trofimovich [Tue, 15 Sep 2015 10:51:36 +0000 (11:51 +0100)]
Changed '-Wcondition-order' to warn even if 'YYSETCONDITION' is used.

Tests 'condtype_yysetcondition.c{s,g}.re' show the reason why I changed
how '-Wcondition-order' works in presence of 'YYSETCONDITION' calls:
programs generated from these tests work differently depending on
condition numbering. Explicit use of condition names cannot guarantee
that these explicit names were generated by re2c (and not hardcoded as
in these examples).

9 years agoMore accurate handling of default rule for '--skeleton'.
Ulya Trofimovich [Tue, 15 Sep 2015 09:45:19 +0000 (10:45 +0100)]
More accurate handling of default rule for '--skeleton'.

9 years agoThe generated '--skeleton' program now warns about undefined control flow.
Ulya Trofimovich [Mon, 14 Sep 2015 22:11:51 +0000 (23:11 +0100)]
The generated '--skeleton' program now warns about undefined control flow.

9 years agoFixed error in calculation of maximal path length in skeleton.
Ulya Trofimovich [Mon, 14 Sep 2015 14:28:52 +0000 (15:28 +0100)]
Fixed error in calculation of maximal path length in skeleton.

9 years agoCompacted keys representation (with '--skeleton').
Ulya Trofimovich [Mon, 14 Sep 2015 12:19:15 +0000 (13:19 +0100)]
Compacted keys representation (with '--skeleton').

Determine maximal path length and maximal rule number while constructing
skeleton; take maximum of these two values; choose unsigned integer type
of minimal width capable of holding maximum.

Note: re2c operates on exact-width integers, but the generated program
doesn't (it might not have <stdint.h>). When generating the program,
re2c chooses one of unsigned 'char', 'short', 'int' and 'long' types
(the one whose 'sizeof' is equal to the desired key size). re2c makes
some implicit assumptions (generated program is run on the same platform
as re2c, byte consists of 8 bits, etc.). Perhaps re2c should hardcode
these assumptions in the generated program and check them on start.
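
The selection can be sketched as follows, under the same assumptions (8-bit
bytes, generator and generated program share a platform). The function
'key_type_name' is a hypothetical illustration:

```cpp
#include <cstddef>
#include <stdint.h>
#include <string>

// Hypothetical helper: picks the name of a built-in unsigned type whose
// size equals the minimal key width (1, 2 or 4 bytes) able to hold
// 'max_value'. Returns an empty string if no such type exists.
std::string key_type_name(uint32_t max_value)
{
    const std::size_t width = max_value <= 0xFFu ? 1
        : max_value <= 0xFFFFu ? 2
        : 4;

    if (sizeof(unsigned char) == width) return "unsigned char";
    if (sizeof(unsigned short) == width) return "unsigned short";
    if (sizeof(unsigned int) == width) return "unsigned int";
    if (sizeof(unsigned long) == width) return "unsigned long";
    return std::string();
}
```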

9 years agoStore keys for '--skeleton' in binary.
Ulya Trofimovich [Fri, 11 Sep 2015 17:48:35 +0000 (18:48 +0100)]
Store keys for '--skeleton' in binary.

A single key is formed of three values:
    1. the length of string
    2. the length of matched part of string
    3. the number of matched rule
All these values are guaranteed to fit in 32 bits, so for now we just
dump them as 'uint32_t' and read as 'unsigned int'. re2c asserts that
'sizeof (uint32_t) == sizeof (unsigned int)'.

Avoid structs, as they cause padding issues.
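
A sketch of the key layout described above: three 32-bit values written back
to back in native byte order, with no struct involved. The function names
'write_key' and 'read_key' are hypothetical:

```cpp
#include <cstdio>
#include <stdint.h>

// Writes one key as three consecutive 32-bit values. A plain array has no
// padding between elements. Returns true on success.
bool write_key(std::FILE *f, uint32_t length, uint32_t matched_length,
    uint32_t rule)
{
    const uint32_t key[3] = { length, matched_length, rule };
    return std::fwrite(key, sizeof(uint32_t), 3, f) == 3;
}

// Reads one key back as 'unsigned int', the way a program without
// <stdint.h> would. This is only valid if both types have the same size,
// which is checked before reading.
bool read_key(std::FILE *f, unsigned int key[3])
{
    return sizeof(unsigned int) == sizeof(uint32_t)
        && std::fread(key, sizeof(unsigned int), 3, f) == 3;
}
```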

9 years agoEstimate maximal path length in skeleton and abort if it overflows.
Ulya Trofimovich [Wed, 9 Sep 2015 21:43:31 +0000 (22:43 +0100)]
Estimate maximal path length in skeleton and abort if it overflows.

Maximal skeleton path length is a bit different from YYMAXFILL:
it assumes that loops are iterated once (unlike YYMAXFILL calculation,
which disregards loops) and returns zero for empty regexp.

We need to know it in order:
    - to be sure it won't overflow
    - to store keys in a compact form (yet to be done)

This commit also makes DFA and skeleton store condition name and
source file line corresponding to current condition: it gets quite
annoying to pass these things around. This change caused another
change of test results (line numbers in error messages changed
for tests that use '-r' and reuse old DFA, that is, don't reconstruct
DFA in 'use:re2c' blocks).