granicus.if.org Git - re2c/log
re2c
5 years agoAdded tests for debug options.
Ulya Trofimovich [Thu, 3 Jan 2019 22:37:54 +0000 (22:37 +0000)]
Added tests for debug options.

5 years agoAdded debug options --dump-cfg and --dump-interf.
Ulya Trofimovich [Thu, 3 Jan 2019 21:58:53 +0000 (21:58 +0000)]
Added debug options --dump-cfg and --dump-interf.

5 years agoMoved debug stuff to a separate subdirectory.
Ulya Trofimovich [Thu, 3 Jan 2019 21:17:22 +0000 (21:17 +0000)]
Moved debug stuff to a separate subdirectory.

5 years agoDisable --dump-* options in non-debug mode.
Ulya Trofimovich [Thu, 3 Jan 2019 20:26:36 +0000 (20:26 +0000)]
Disable --dump-* options in non-debug mode.

5 years agoAdding --enable-debug to Travis CI configuration.
Ulya Trofimovich [Thu, 3 Jan 2019 11:33:04 +0000 (11:33 +0000)]
Adding --enable-debug to Travis CI configuration.

5 years agoAdded some tests for GTOP closure algorithm.
Ulya Trofimovich [Thu, 3 Jan 2019 11:26:51 +0000 (11:26 +0000)]
Added some tests for GTOP closure algorithm.

5 years agoAdded test comparing POSIX closure statistics with GOR1 and GTOP.
Ulya Trofimovich [Thu, 3 Jan 2019 11:13:15 +0000 (11:13 +0000)]
Added test comparing POSIX closure statistics with GOR1 and GTOP.

5 years agoIn debug build, add " (debug)" to version string.
Ulya Trofimovich [Thu, 3 Jan 2019 10:44:24 +0000 (10:44 +0000)]
In debug build, add " (debug)" to version string.

5 years agoSmall simplifications in GOR1 initialization phase.
Ulya Trofimovich [Thu, 3 Jan 2019 09:53:57 +0000 (09:53 +0000)]
Small simplifications in GOR1 initialization phase.

Instead of using two stacks to weed out low-precedence initial
configurations with a duplicate target state, let them remain at the
bottom of the topsort stack, which effectively makes them ignored.

5 years agoPaper: simplifying GOR1 initialization pseudocode.
Ulya Trofimovich [Wed, 2 Jan 2019 22:57:51 +0000 (22:57 +0000)]
Paper: simplifying GOR1 initialization pseudocode.

5 years agoSmall cosmetic simplifications in POSIX precedence.
Ulya Trofimovich [Wed, 2 Jan 2019 17:54:11 +0000 (17:54 +0000)]
Small cosmetic simplifications in POSIX precedence.

5 years agoUse a specialized simplified version of POSIX comparison for initial configurations.
Ulya Trofimovich [Wed, 2 Jan 2019 17:43:17 +0000 (17:43 +0000)]
Use a specialized simplified version of POSIX comparison for initial configurations.

5 years agoIntrinsic options that do not have configurations should be immutable.
Ulya Trofimovich [Tue, 1 Jan 2019 13:24:10 +0000 (13:24 +0000)]
Intrinsic options that do not have configurations should be immutable.

5 years agoAdded debug option --dump-closure-stats.
Ulya Trofimovich [Tue, 1 Jan 2019 13:13:14 +0000 (13:13 +0000)]
Added debug option --dump-closure-stats.

5 years agoAdjust TNFA construction for bounded repetition to the needs of GOR1.
Ulya Trofimovich [Mon, 31 Dec 2018 20:55:01 +0000 (20:55 +0000)]
Adjust TNFA construction for bounded repetition to the needs of GOR1.

5 years agoOrder initial configurations by POSIX precedence in GOR1.
Ulya Trofimovich [Mon, 31 Dec 2018 12:52:43 +0000 (12:52 +0000)]
Order initial configurations by POSIX precedence in GOR1.

5 years agoAdded option --posix-closure <gor1|gtop>. Removed configurations for internal options.
Ulya Trofimovich [Mon, 31 Dec 2018 12:37:12 +0000 (12:37 +0000)]
Added option --posix-closure <gor1|gtop>. Removed configurations for internal options.

5 years agoAdded GTOP SSSP algorithm for computing epsilon-closure with POSIX disambiguation.
Ulya Trofimovich [Mon, 31 Dec 2018 12:08:42 +0000 (12:08 +0000)]
Added GTOP SSSP algorithm for computing epsilon-closure with POSIX disambiguation.

GTOP SSSP means "Global Topsort Single Source Shortest Path".

It is well known that SSSP can be solved in linear time on DAGs (directed
acyclic graphs) by exploring graph nodes in topological order. In our case
TNFA is not a DAG (it may have cycles), but it is possible to compute fake
topological order by ignoring back edges.

The algorithm works by having a priority queue of nodes, where priorities
are indices of nodes in fake topological ordering. At each step, the node
with the minimal priority is popped from queue and explored. All nodes
reachable from it on admissible arcs are enqueued, unless they are already
on queue.

The resulting algorithm is of course not optimal: it can get stuck on
graphs with loops, because it will give priority to some of the loop nodes
compared to others for no good reason.

However, the algorithm is simple and optimal for DAGs, so we keep it.
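
A minimal sketch of the scheme described above, not the code from this commit. It assumes a plain graph with non-negative integer arc weights in place of POSIX precedence comparison, and treats an arc as admissible when it improves the target's distance. The names `Arc`, `Graph`, `postorder` and `gtop_sssp` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <vector>

struct Arc { std::size_t to; int weight; };   // hypothetical arc type
using Graph = std::vector<std::vector<Arc>>;  // adjacency lists

// Depth-first postorder from `u`. Arcs into already visited nodes
// (back edges among them) are skipped, which yields the "fake" order.
static void postorder(const Graph &g, std::size_t u,
                      std::vector<bool> &seen, std::vector<std::size_t> &out)
{
    seen[u] = true;
    for (const Arc &a : g[u]) {
        if (!seen[a.to]) postorder(g, a.to, seen, out);
    }
    out.push_back(u);
}

// Distances from `source`; unreachable nodes keep the maximal int value.
std::vector<int> gtop_sssp(const Graph &g, std::size_t source)
{
    assert(source < g.size());
    const std::size_t n = g.size();

    std::vector<bool> seen(n, false);
    std::vector<std::size_t> post;
    postorder(g, source, seen, post);

    // Reverse postorder: node_at[i] is the node with topological index i.
    const std::vector<std::size_t> node_at(post.rbegin(), post.rend());
    std::vector<std::size_t> index(n, n);
    for (std::size_t i = 0; i < node_at.size(); ++i) index[node_at[i]] = i;

    // Min-heap of topological indices, plus a flag to avoid duplicates.
    std::priority_queue<std::size_t, std::vector<std::size_t>,
                        std::greater<std::size_t>> queue;
    std::vector<bool> on_queue(n, false);
    std::vector<int> dist(n, std::numeric_limits<int>::max());

    dist[source] = 0;
    queue.push(index[source]);
    on_queue[source] = true;

    while (!queue.empty()) {
        const std::size_t u = node_at[queue.top()];
        queue.pop();
        on_queue[u] = false;
        for (const Arc &a : g[u]) {
            if (dist[u] + a.weight < dist[a.to]) {  // admissible arc
                dist[a.to] = dist[u] + a.weight;
                if (!on_queue[a.to]) {
                    queue.push(index[a.to]);
                    on_queue[a.to] = true;
                }
            }
        }
    }
    return dist;
}
```

On a DAG every node is popped at most once, because all its predecessors have smaller indices. With cycles a node may be enqueued again after it has been explored, which is the repeated work mentioned above.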

5 years agorun_tests.sh: cleanup .inc files.
Ulya Trofimovich [Mon, 31 Dec 2018 10:57:47 +0000 (10:57 +0000)]
run_tests.sh: cleanup .inc files.

5 years agoConverting tabs to spaces (cosmetic).
Ulya Trofimovich [Sun, 30 Dec 2018 23:35:29 +0000 (23:35 +0000)]
Converting tabs to spaces (cosmetic).

5 years agoEh, some of the asserts were doing useful work and affecting control flow.
Ulya Trofimovich [Sun, 30 Dec 2018 23:30:28 +0000 (23:30 +0000)]
Eh, some of the asserts were doing useful work and affecting control flow.

5 years agoAdded assert wrapper that is turned on/off with --enable-debug configure option.
Ulya Trofimovich [Sun, 30 Dec 2018 19:53:46 +0000 (19:53 +0000)]
Added assert wrapper that is turned on/off with --enable-debug configure option.

5 years agoFixed errors in packing of signed integers.
Ulya Trofimovich [Sun, 30 Dec 2018 17:21:39 +0000 (17:21 +0000)]
Fixed errors in packing of signed integers.

5 years agoReduce boilerplate in option parser with a few macros.
Ulya Trofimovich [Sun, 30 Dec 2018 11:42:38 +0000 (11:42 +0000)]
Reduce boilerplate in option parser with a few macros.

5 years agoUse a couple of helper functions to make string construction easier.
Ulya Trofimovich [Sun, 30 Dec 2018 10:59:28 +0000 (10:59 +0000)]
Use a couple of helper functions to make string construction easier.

5 years agoCorrectly parse -I option with or without space before the argument.
Ulya Trofimovich [Sun, 30 Dec 2018 10:30:19 +0000 (10:30 +0000)]
Correctly parse -I option with or without space before the argument.

5 years agorun_tests.sh: use paths relative to build directory, not to source directory.
Ulya Trofimovich [Sat, 29 Dec 2018 21:40:08 +0000 (21:40 +0000)]
run_tests.sh: use paths relative to build directory, not to source directory.

Otherwise test results aren't reproducible: they depend on the
location of build directory relative to source directory.

5 years agoAdding forgotten files to git.
Ulya Trofimovich [Sat, 29 Dec 2018 14:24:32 +0000 (14:24 +0000)]
Adding forgotten files to git.

5 years agoMore tests for include directive.
Ulya Trofimovich [Sat, 29 Dec 2018 14:17:31 +0000 (14:17 +0000)]
More tests for include directive.

5 years agoResolve names of included files relative to including file, not to the main file.
Ulya Trofimovich [Sat, 29 Dec 2018 00:16:53 +0000 (00:16 +0000)]
Resolve names of included files relative to including file, not to the main file.

5 years agoUse correct order when unreading files from lexer buffer.
Ulya Trofimovich [Sat, 29 Dec 2018 00:06:28 +0000 (00:06 +0000)]
Use correct order when unreading files from lexer buffer.

In lexer buffer nested files come before outer files. In lexer file
stack, however, outer files go before nested files (nested are at the
top). We want to break from the unreading cycle early, therefore we
proceed in reverse order of file offsets in buffer and break as soon
as the end offset is less than cursor (current position).

5 years agoSimplified lexing of include directive.
Ulya Trofimovich [Thu, 27 Dec 2018 22:49:25 +0000 (22:49 +0000)]
Simplified lexing of include directive.

5 years agoTweaking a couple of labels in lexer to simplify updating of token pointer.
Ulya Trofimovich [Thu, 27 Dec 2018 22:38:59 +0000 (22:38 +0000)]
Tweaking a couple of labels in lexer to simplify updating of token pointer.

5 years agoRemoved unused struct field.
Ulya Trofimovich [Thu, 27 Dec 2018 22:17:52 +0000 (22:17 +0000)]
Removed unused struct field.

5 years agoAdded test for EOF rule.
Ulya Trofimovich [Thu, 27 Dec 2018 22:10:49 +0000 (22:10 +0000)]
Added test for EOF rule.

5 years agoTrack current line of each input file separately.
Ulya Trofimovich [Thu, 27 Dec 2018 22:08:38 +0000 (22:08 +0000)]
Track current line of each input file separately.

5 years agoCorrectly handle current directory '.' in include paths.
Ulya Trofimovich [Wed, 26 Dec 2018 20:12:51 +0000 (20:12 +0000)]
Correctly handle current directory '.' in include paths.

6 years agoAdded -I option (paths to include directories).
Ulya Trofimovich [Wed, 26 Dec 2018 11:37:30 +0000 (11:37 +0000)]
Added -I option (paths to include directories).

6 years agoAdded /*!include:re2c ... */ directive.
Ulya Trofimovich [Tue, 25 Dec 2018 19:53:23 +0000 (19:53 +0000)]
Added /*!include:re2c ... */ directive.

6 years agoPreparations to support #include: keep input files in a stack.
Ulya Trofimovich [Sun, 23 Dec 2018 19:32:29 +0000 (19:32 +0000)]
Preparations to support #include: keep input files in a stack.

6 years agoconfigure.ac: set -Wreturn-type to error.
Ulya Trofimovich [Sun, 23 Dec 2018 19:16:02 +0000 (19:16 +0000)]
configure.ac: set -Wreturn-type to error.

6 years agoInitial support of EOF rule.
Ulya Trofimovich [Sat, 22 Dec 2018 23:34:41 +0000 (23:34 +0000)]
Initial support of EOF rule.

6 years agoUpdated unicode tests and test generators for newer versions of unicode.
Ulya Trofimovich [Sat, 22 Dec 2018 11:48:12 +0000 (11:48 +0000)]
Updated unicode tests and test generators for newer versions of unicode.

6 years agoPaper: added two output() functions that convert t-string to parse tree and offsets.
Ulya Trofimovich [Thu, 20 Dec 2018 00:06:42 +0000 (00:06 +0000)]
Paper: added two output() functions that convert t-string to parse tree and offsets.

6 years agoPaper: tweaked TNFA construction.
Ulya Trofimovich [Tue, 18 Dec 2018 23:56:04 +0000 (23:56 +0000)]
Paper: tweaked TNFA construction.

6 years agoLexer: use YYMAXFILL padding and don't forget to shift tag variables in YYFILL.
Ulya Trofimovich [Thu, 6 Dec 2018 22:03:34 +0000 (22:03 +0000)]
Lexer: use YYMAXFILL padding and don't forget to shift tag variables in YYFILL.

This fixes bugs #232, #233 and #234.
Found by american fuzzy lop (thanks to Henri Salo).

6 years agoCorrectly identify mapped TDFA state with --dump-dfa-raw option.
Ulya Trofimovich [Thu, 6 Dec 2018 22:01:25 +0000 (22:01 +0000)]
Correctly identify mapped TDFA state with --dump-dfa-raw option.

6 years agoMakefile.am: enable RE2C warnings (-W option).
Ulya Trofimovich [Thu, 29 Nov 2018 22:21:43 +0000 (22:21 +0000)]
Makefile.am: enable RE2C warnings (-W option).

6 years agoFixed read past the end of buffer in configuration parser.
Ulya Trofimovich [Thu, 29 Nov 2018 22:15:18 +0000 (22:15 +0000)]
Fixed read past the end of buffer in configuration parser.

This fixes bug #231.
Found by american fuzzy lop (thanks to Henri Salo).
Also reported by re2c -W (shame on me for not using it all this time!).

6 years agoPaper: tweaking TNFA construction.
Ulya Trofimovich [Mon, 26 Nov 2018 22:58:14 +0000 (22:58 +0000)]
Paper: tweaking TNFA construction.

6 years agoMakefile.am: build autogenerated files before other targets (they may create headers).
Ulya Trofimovich [Thu, 22 Nov 2018 01:00:57 +0000 (01:00 +0000)]
Makefile.am: build autogenerated files before other targets (they may create headers).

Note: I used $(@:cc=*) construct as bmake doesn't understand $*.* .

6 years agoUse tags to lex condition goto.
Ulya Trofimovich [Wed, 21 Nov 2018 22:03:12 +0000 (22:03 +0000)]
Use tags to lex condition goto.

6 years agoStarted using tags in re2c own lexer.
Ulya Trofimovich [Wed, 21 Nov 2018 00:15:15 +0000 (00:15 +0000)]
Started using tags in re2c own lexer.

6 years agoRemoved redundant wrapper around output file struct.
Ulya Trofimovich [Mon, 19 Nov 2018 23:38:07 +0000 (23:38 +0000)]
Removed redundant wrapper around output file struct.

6 years agoDump header on stdout if filename is not set, but /*!header:re2c:on*/ is used.
Ulya Trofimovich [Mon, 19 Nov 2018 23:22:33 +0000 (23:22 +0000)]
Dump header on stdout if filename is not set, but /*!header:re2c:on*/ is used.

6 years agoAdded configurations for -o, --output and -t, --type-header options.
Ulya Trofimovich [Fri, 16 Nov 2018 23:43:41 +0000 (23:43 +0000)]
Added configurations for -o, --output and -t, --type-header options.

6 years agoAdded missing #line info after /*!header:re2c: ... */ directive.
Ulya Trofimovich [Sun, 18 Nov 2018 12:19:27 +0000 (12:19 +0000)]
Added missing #line info after /*!header:re2c: ... */ directive.

Renamed:
    /*!header:re2c:1*/ -> /*!header:re2c:on*/
    /*!header:re2c:0*/ -> /*!header:re2c:off*/

6 years agoAdded /*!header:re2c:0*/ and /*!header:re2c:1*/ directives.
Ulya Trofimovich [Sun, 18 Nov 2018 10:50:00 +0000 (10:50 +0000)]
Added /*!header:re2c:0*/ and /*!header:re2c:1*/ directives.

Combined with the -t, --type-header option, this allows putting arbitrary
parts of the generated output in a header file.

6 years agoTweaking condition list lexer.
Ulya Trofimovich [Fri, 16 Nov 2018 00:36:11 +0000 (00:36 +0000)]
Tweaking condition list lexer.

6 years agoMerge pull request #230 from sergeyklay/patch-1
Ulya Trofimovich [Wed, 21 Nov 2018 21:54:58 +0000 (21:54 +0000)]
Merge pull request #230 from sergeyklay/patch-1

Changes for upcoming Travis' infra migration

6 years agoChanges for upcoming Travis' infra migration 230/head
Serghei Iakovlev [Wed, 21 Nov 2018 20:23:09 +0000 (22:23 +0200)]
Changes for upcoming Travis' infra migration

See: https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration

6 years agoFixed segfault caused by out of bounds access.
Ulya Trofimovich [Thu, 15 Nov 2018 07:33:25 +0000 (07:33 +0000)]
Fixed segfault caused by out of bounds access.

This fixes bug #227.
Found by american fuzzy lop (thanks to Henri Salo).

6 years agoMoved tests into subdirectories.
Ulya Trofimovich [Wed, 14 Nov 2018 22:58:47 +0000 (22:58 +0000)]
Moved tests into subdirectories.

6 years agoFixed a couple of lexer/parser errors in flex mode (-F option).
Ulya Trofimovich [Tue, 13 Nov 2018 23:42:11 +0000 (23:42 +0000)]
Fixed a couple of lexer/parser errors in flex mode (-F option).

This fixes bug #229: re2c option -F (flex syntax) broken,
reported by Robert van Engelen.

A well-formed example that caused a syntax error (flex-style raw literal
followed by one or more spaces and a curly brace):

/*!re2c
    a {}
*/

The faulty behaviour goes back as far as re2c-0.13.6 (and supposedly
before that): in flex mode, raw literal may occur in various contexts
both as a regexp (string literal) and an identifier (named definition,
condition name). RE2C uses lookahead to infer the context and determine
the appropriate type of lexer token, but it missed some cases.

The fix has two sides. First, it reduces the number of contexts where
the general lexer may encounter raw literal (by using specialized lexers
for condition lists <x,y,...,z> and condition goto => and :=>). Second,
it fixes the lookahead regexps used for context inference.

Also added a bunch of tests (generated by a script).

6 years agoSuppress -Wnullable warning on <> condition (it has no regexp -- always empty).
Ulya Trofimovich [Tue, 13 Nov 2018 23:38:02 +0000 (23:38 +0000)]
Suppress -Wnullable warning on <> condition (it has no regexp -- always empty).

6 years agoAdjusting formatting (cosmetic).
Ulya Trofimovich [Mon, 5 Nov 2018 23:35:11 +0000 (23:35 +0000)]
Adjusting formatting (cosmetic).

6 years agoFixed out of bounds read in configuration lexer (not handling EOF in configuration...
Ulya Trofimovich [Sun, 4 Nov 2018 22:38:56 +0000 (22:38 +0000)]
Fixed out of bounds read in configuration lexer (not handling EOF in configuration value).

Found by american fuzzy lop (thanks to Henri Salo).

6 years agoSmall tweaks in lexer subroutines for semantic actions.
Ulya Trofimovich [Thu, 1 Nov 2018 00:01:25 +0000 (00:01 +0000)]
Small tweaks in lexer subroutines for semantic actions.

6 years agoFixed yet another out of bounds read in lexer due to not handling EOF after escape.
Ulya Trofimovich [Wed, 31 Oct 2018 23:15:28 +0000 (23:15 +0000)]
Fixed yet another out of bounds read in lexer due to not handling EOF after escape.

Found by american fuzzy lop (thanks to Henri Salo).

6 years agoFixed some more out of bounds reads in lexer due to not handling EOF properly.
Ulya Trofimovich [Tue, 30 Oct 2018 22:11:32 +0000 (22:11 +0000)]
Fixed some more out of bounds reads in lexer due to not handling EOF properly.

Found by american fuzzy lop (thanks to Henri Salo).

6 years agoAdjusted formatting in the lexer (cosmetic).
Ulya Trofimovich [Mon, 29 Oct 2018 23:32:22 +0000 (23:32 +0000)]
Adjusted formatting in the lexer (cosmetic).

6 years agoFixed out of bounds read in lexer.
Ulya Trofimovich [Mon, 29 Oct 2018 23:00:50 +0000 (23:00 +0000)]
Fixed out of bounds read in lexer.

The error was caused by assuming that a sequence of zeroes (used for
padding in YYFILL) cannot form a valid lexeme suffix. This is not the
case with strings, as they may contain arbitrary characters. The fix
is to manually loop over string characters in lexer, stopping at each
zero to check if it's the end of input.
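
A minimal sketch of such a manual loop, not the actual re2c lexer code. It assumes the input occupies `[cursor, limit)` and is followed by at least one zero padding byte, and it treats a backslash as escaping the next character. The name `scan_string` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Scans a double-quoted string that starts at `cursor`.
// Returns a pointer one past the closing quote, or nullptr if the
// input ends before the string is closed.
const char *scan_string(const char *cursor, const char *limit)
{
    assert(cursor < limit && *cursor == '"');
    ++cursor;  // skip the opening quote
    for (;;) {
        const char c = *cursor;  // safe at `limit`: padding is readable
        if (c == '\0' && cursor >= limit) {
            return nullptr;  // zero from the padding: real end of input
        }
        ++cursor;  // any other zero is an ordinary character of the string
        if (c == '"') {
            return cursor;
        }
        if (c == '\\') {
            if (cursor >= limit) return nullptr;  // EOF right after escape
            ++cursor;  // skip the escaped character
        }
    }
}
```

The check against `limit` runs only when a zero is seen, so the common path costs one comparison per character.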

Found by american fuzzy lop (thanks to Henri Salo).

6 years agoUpdated README in libre2c (added a warning that the library is not maintained).
Ulya Trofimovich [Sun, 28 Oct 2018 09:53:14 +0000 (09:53 +0000)]
Updated README in libre2c (added a warning that the library is not maintained).

6 years agoUpdated README.
Ulya Trofimovich [Sun, 28 Oct 2018 09:49:06 +0000 (09:49 +0000)]
Updated README.

6 years agoPaper: made TNFA description closer to practice.
Ulya Trofimovich [Sat, 27 Oct 2018 21:32:49 +0000 (22:32 +0100)]
Paper: made TNFA description closer to practice.

6 years agoPaper: tweaks in the TNFA example.
Ulya Trofimovich [Tue, 23 Oct 2018 21:12:20 +0000 (22:12 +0100)]
Paper: tweaks in the TNFA example.

6 years agoPaper: changed description of GOR1 following the rework of the algorithm.
Ulya Trofimovich [Sat, 20 Oct 2018 21:06:59 +0000 (22:06 +0100)]
Paper: changed description of GOR1 following the rework of the algorithm.

6 years agoPaper: tweaks in picture layout.
Ulya Trofimovich [Wed, 17 Oct 2018 21:29:31 +0000 (22:29 +0100)]
Paper: tweaks in picture layout.

6 years agoPaper: added TNFA example.
Ulya Trofimovich [Mon, 15 Oct 2018 06:32:04 +0000 (07:32 +0100)]
Paper: added TNFA example.

6 years agoPaper: added example for "empty match is better than no match" POSIX rule.
Ulya Trofimovich [Sun, 14 Oct 2018 09:31:13 +0000 (10:31 +0100)]
Paper: added example for "empty match is better than no match" POSIX rule.

6 years agoPaper: packed multiple examples of PE comparison in one page.
Ulya Trofimovich [Fri, 12 Oct 2018 22:37:47 +0000 (23:37 +0100)]
Paper: packed multiple examples of PE comparison in one page.

6 years agoPaper: dropped explicit submatch indices in TNFA definition.
Ulya Trofimovich [Thu, 11 Oct 2018 22:38:03 +0000 (23:38 +0100)]
Paper: dropped explicit submatch indices in TNFA definition.

6 years agoPaper: handle (e) as (e){1,1} to avoid collapsing multiple submatch groups into one.
Ulya Trofimovich [Thu, 11 Oct 2018 06:08:48 +0000 (07:08 +0100)]
Paper: handle (e) as (e){1,1} to avoid collapsing multiple submatch groups into one.

Previously we collapsed ((e)) into (e).

6 years agoPaper: minor tweaks in pseudocode.
Ulya Trofimovich [Mon, 8 Oct 2018 21:56:57 +0000 (22:56 +0100)]
Paper: minor tweaks in pseudocode.

6 years agoPaper: added description of GTOP closure algorithm.
Ulya Trofimovich [Sat, 6 Oct 2018 22:44:35 +0000 (23:44 +0100)]
Paper: added description of GTOP closure algorithm.

6 years agoPaper: made GOR pseudocode slightly easier to read.
Ulya Trofimovich [Sat, 6 Oct 2018 21:24:04 +0000 (22:24 +0100)]
Paper: made GOR pseudocode slightly easier to read.

6 years agoMerge pull request #224 from trofi/master
Ulya Trofimovich [Mon, 22 Oct 2018 22:33:02 +0000 (23:33 +0100)]
Merge pull request #224 from trofi/master

src/dfa/closure_posix.cc: pack() tweaks

6 years agosrc/dfa/closure_posix.cc: fix pack() to drop two highest bits 224/head
Sergei Trofimovich [Mon, 22 Oct 2018 22:05:56 +0000 (23:05 +0100)]
src/dfa/closure_posix.cc: fix pack() to drop two highest bits

```c
longest | (leftmost << 30);
```
assumes that `longest` fits in 30 bits. It does not if the original
value is negative.
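
A minimal sketch of a packing function that drops the two highest bits, under the assumption that both inputs are 32-bit signed integers and the result is an unsigned 32-bit word. The signature is hypothetical and may differ from the one in `closure_posix.cc`.

```cpp
#include <cassert>
#include <cstdint>

// Packs `leftmost` into the two highest bits and `longest` into the
// lower 30 bits of one unsigned word.
std::uint32_t pack(std::int32_t longest, std::int32_t leftmost)
{
    // Mask off the two highest bits, so a negative `longest`
    // (all high bits set) cannot leak into the `leftmost` field.
    const std::uint32_t low = static_cast<std::uint32_t>(longest) & 0x3fffffffu;
    const std::uint32_t high = static_cast<std::uint32_t>(leftmost) << 30;
    assert((low >> 30) == 0);
    return low | high;
}
```

Without the mask, `pack(-1, 0)` would return a word with all bits set and the `leftmost` field would read as 3 rather than 0.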

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
6 years agosrc/dfa/closure_posix.cc: fix signed shift overflow
Sergei Trofimovich [Mon, 22 Oct 2018 21:58:34 +0000 (22:58 +0100)]
src/dfa/closure_posix.cc: fix signed shift overflow

Signed shift overflow is not defined by the C standard.
clang++ -fsanitize=undefined detects it as:

```
src/dfa/closure_posix.cc:207:32: runtime error: left shift of negative value -1
```

This change performs the bit shift arithmetic on unsigned types.
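
A minimal sketch of the technique, with a hypothetical helper name `shift_left_bits`. It converts the operand to an unsigned type before shifting, so bits shifted out of the top are discarded instead of triggering the sanitizer report.

```cpp
#include <cassert>
#include <cstdint>

// Shifts the bit pattern of `value` left by `count` positions.
// The shift happens on an unsigned type, where overflow is well defined.
std::uint32_t shift_left_bits(std::int32_t value, unsigned count)
{
    assert(count < 32);  // shifting by the full width or more is still undefined
    return static_cast<std::uint32_t>(value) << count;
}
```

The result stays unsigned on purpose: converting an out-of-range value back to `int32_t` is implementation-defined before C++20.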

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
6 years agoMerge pull request #223 from metab0t/master
Ulya Trofimovich [Wed, 17 Oct 2018 22:29:07 +0000 (23:29 +0100)]
Merge pull request #223 from metab0t/master

Fix typo

6 years agoFix typo 223/head
Nerd [Wed, 17 Oct 2018 08:23:22 +0000 (16:23 +0800)]
Fix typo

6 years agoMerge pull request #222 from trofi/master
Ulya Trofimovich [Tue, 16 Oct 2018 22:36:15 +0000 (23:36 +0100)]
Merge pull request #222 from trofi/master

configure.ac: enable xz tarballs instead of gzip by default

6 years agoconfigure.ac: enable xz tarballs instead of gzip by default 222/head
Sergei Trofimovich [Tue, 16 Oct 2018 19:36:53 +0000 (20:36 +0100)]
configure.ac: enable xz tarballs instead of gzip by default

`xz` compresses about twice as well as `gzip` on `re2c` sources:

```
$ ls -lh *1.1.1*
4,8M re2c-1.1.1.tar.gz
2,5M re2c-1.1.1.tar.xz
```

Switch `make dist` to `xz` by default. `gzip` is still available
via `make dist-gzip`.

Reported-by: rofl0r
Bug: https://github.com/skvadrik/re2c/issues/221
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
6 years agoPaper: added examples of the three rules of POSIX disambiguation.
Ulya Trofimovich [Thu, 6 Sep 2018 21:45:30 +0000 (22:45 +0100)]
Paper: added examples of the three rules of POSIX disambiguation.

6 years agoMerge pull request #220 from trofi/master
Ulya Trofimovich [Sat, 29 Sep 2018 21:29:34 +0000 (22:29 +0100)]
Merge pull request #220 from trofi/master

src/dfa/dfa.h: simplify constructor to avoid g++-3.4 bug

6 years agosrc/dfa/dfa.h: simplify constructor to avoid g++-3.4 bug 220/head
Sergei Trofimovich [Sat, 29 Sep 2018 21:11:27 +0000 (22:11 +0100)]
src/dfa/dfa.h: simplify constructor to avoid g++-3.4 bug

On g++-3.4.6 re2c tests SIGSEGVed due to use of uninitialized data:

```
$ valgrind ... ./re2c -8 a.re -o foo.c
Conditional jump or move depends on uninitialised value(s)
   at 0x432F23: re2c::tcpool_t::insert(re2c::tcmd_t const*) (tcmd.cc:202)
   by 0x421FDA: re2c::freeze_tags(re2c::dfa_t&) (freeze.cc:45)
   by 0x43A7FF: re2c::ast_to_dfa(re2c::spec_t const&, re2c::Output&) (compile.cc:88)
   by 0x43B052: push_back (stl_iterator.h:614)
   by 0x43B052: re2c::compile(re2c::Scanner&, re2c::Output&, re2c::Opt&) (???:0)
   by 0x449D29: main (main.cc:31)
 Uninitialised value was created by a heap allocation
   at 0x403252F: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x42FC9E: re2c::find_state(re2c::determ_context_t&) (dfa.h:37)
   by 0x429BD9: re2c::dfa_t::dfa_t(re2c::nfa_t const&, re2c::opt_t const*, std::string const&, re2c::Warn&) (determinization.cc:56)
   by 0x43A76C: re2c::ast_to_dfa(re2c::spec_t const&, re2c::Output&) (compile.cc:69)
   by 0x43B052: push_back (stl_iterator.h:614)
   by 0x43B052: re2c::compile(re2c::Scanner&, re2c::Output&, re2c::Opt&) (???:0)
   by 0x449D29: main (main.cc:31)
```

The problem here arose in the default array constructor:

```c++
     explicit dfa_state_t(size_t nchars)
         : // ...
         , tcmd(new tcmd_t*[nchars + 2]()) // +2 for final and fallback epsilon-transitions
         // ...
```

g++-3.4.6 does not apply the zero-initialization rule (likely a gcc bug).

The change uses non-initializing new[] and memset() instead.
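
A minimal sketch of the workaround as a free function, not the actual `dfa_state_t` constructor. `tcmd_t` is left as an opaque type and `alloc_commands` is a hypothetical name. It assumes a platform where an all-zero bit pattern is a null pointer, which holds on mainstream targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>

struct tcmd_t;  // opaque here; only pointers to it are stored

// Allocates `nchars + 2` command pointers, all set to null.
tcmd_t **alloc_commands(std::size_t nchars)
{
    assert(nchars <= std::numeric_limits<std::size_t>::max() - 2);
    const std::size_t n = nchars + 2;  // +2 for final and fallback transitions
    tcmd_t **tcmd = new tcmd_t*[n];    // no "()" initializer: contents undefined
    std::memset(tcmd, 0, n * sizeof(tcmd_t*));  // explicit zero fill
    return tcmd;
}
```

The caller owns the array and releases it with `delete[]`.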

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
6 years agoMerge pull request #216 from trofi/master
Ulya Trofimovich [Tue, 4 Sep 2018 19:49:46 +0000 (20:49 +0100)]
Merge pull request #216 from trofi/master

.travis.yml: run all tests behind 'make check'

6 years agoMerge pull request #217 from trofi/add-msan
Ulya Trofimovich [Tue, 4 Sep 2018 19:42:51 +0000 (20:42 +0100)]
Merge pull request #217 from trofi/add-msan

__alltest.sh: add clang's -fsanitize=memory flavour

6 years agoFixed bug #215 "A memory read overrun issue in s_to_n32_unsafe.cc".
Ulya Trofimovich [Tue, 4 Sep 2018 19:27:40 +0000 (20:27 +0100)]
Fixed bug #215 "A memory read overrun issue in s_to_n32_unsafe.cc".

The error was in the code of the test itself: the special case of zero
wasn't handled correctly by the function that prepares input data for
the test. As a result, zero-length input string was passed to the test,
which is unexpected: the tested function is an "unsafe" one (as the
name suggests) and is meant to be used on an already validated input.

6 years ago__alltest.sh: add clang's -fsanitize=memory flavour 217/head
Sergei Trofimovich [Tue, 4 Sep 2018 19:16:23 +0000 (20:16 +0100)]
__alltest.sh: add clang's -fsanitize=memory flavour

Bug: https://github.com/skvadrik/re2c/issues/215
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>