Sergei Trofimovich [Tue, 16 Oct 2018 19:36:53 +0000 (20:36 +0100)]
configure.ac: enable xz tarballs instead of gzip by default
`xz` compresses about twice as well as `gzip` on `re2c` sources:
```
$ ls -lh *1.1.1*
4,8M re2c-1.1.1.tar.gz
2,5M re2c-1.1.1.tar.xz
```
Switch `make dist` to `xz` by default. `gzip` is still available
via `make dist-gzip`.
Reported-by: rofl0r
Bug: https://github.com/skvadrik/re2c/issues/221
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Ulya Trofimovich [Thu, 6 Sep 2018 21:45:30 +0000 (22:45 +0100)]
Paper: added examples of the three rules of POSIX disambiguation.
Ulya Trofimovich [Sat, 29 Sep 2018 21:29:34 +0000 (22:29 +0100)]
Merge pull request #220 from trofi/master
src/dfa/dfa.h: simplify constructor to avoid g++-3.4 bug
Sergei Trofimovich [Sat, 29 Sep 2018 21:11:27 +0000 (22:11 +0100)]
src/dfa/dfa.h: simplify constructor to avoid g++-3.4 bug
On g++-3.4.6 re2c tests SIGSEGVed due to use of uninitialized data:
```
$ valgrind ... ./re2c -8 a.re -o foo.c
Conditional jump or move depends on uninitialised value(s)
at 0x432F23: re2c::tcpool_t::insert(re2c::tcmd_t const*) (tcmd.cc:202)
by 0x421FDA: re2c::freeze_tags(re2c::dfa_t&) (freeze.cc:45)
by 0x43A7FF: re2c::ast_to_dfa(re2c::spec_t const&, re2c::Output&) (compile.cc:88)
by 0x43B052: push_back (stl_iterator.h:614)
by 0x43B052: re2c::compile(re2c::Scanner&, re2c::Output&, re2c::Opt&) (???:0)
by 0x449D29: main (main.cc:31)
Uninitialised value was created by a heap allocation
at 0x403252F: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x42FC9E: re2c::find_state(re2c::determ_context_t&) (dfa.h:37)
by 0x429BD9: re2c::dfa_t::dfa_t(re2c::nfa_t const&, re2c::opt_t const*, std::string const&, re2c::Warn&) (determinization.cc:56)
by 0x43A76C: re2c::ast_to_dfa(re2c::spec_t const&, re2c::Output&) (compile.cc:69)
by 0x43B052: push_back (stl_iterator.h:614)
by 0x43B052: re2c::compile(re2c::Scanner&, re2c::Output&, re2c::Opt&) (???:0)
by 0x449D29: main (main.cc:31)
```
The problem arose from the value-initialized array allocation in the constructor:
```c++
explicit dfa_state_t(size_t nchars)
: // ...
, tcmd(new tcmd_t*[nchars + 2]()) // +2 for final and fallback epsilon-transitions
// ...
```
g++-3.4.6 fails to apply the zero-initialization rule here (likely a gcc bug).
The change uses non-initializing new[] and memset() instead.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
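A minimal sketch of the workaround described in the commit above, assuming that an all-zero byte pattern represents a null pointer on the target platform. `alloc_zeroed` is a hypothetical helper name and `tcmd_t` is left opaque; the actual constructor in src/dfa/dfa.h may look different:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct tcmd_t; // opaque stand-in for the real type

// Hypothetical helper: allocate an array of n pointers and zero it explicitly,
// instead of relying on `new tcmd_t*[n]()` to value-initialize the elements.
static tcmd_t **alloc_zeroed(std::size_t n)
{
    // Guard against overflow in the byte count passed to memset().
    assert(n <= static_cast<std::size_t>(-1) / sizeof(tcmd_t*));

    tcmd_t **p = new tcmd_t*[n];             // no "()": elements are left uninitialized
    std::memset(p, 0, n * sizeof(tcmd_t*));  // assumption: all-zero bytes == null pointer
    return p;
}
```

The caller still owns the array and releases it with `delete[]`, exactly as with the original `new[]` expression.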
Ulya Trofimovich [Tue, 4 Sep 2018 19:49:46 +0000 (20:49 +0100)]
Merge pull request #216 from trofi/master
.travis.yml: run all tests behind 'make check'
Ulya Trofimovich [Tue, 4 Sep 2018 19:42:51 +0000 (20:42 +0100)]
Merge pull request #217 from trofi/add-msan
__alltest.sh: add clang's -fsanitize=memory flavour
Ulya Trofimovich [Tue, 4 Sep 2018 19:27:40 +0000 (20:27 +0100)]
Fixed bug #215 "A memory read overrun issue in s_to_n32_unsafe.cc".
The error was in the code of the test itself: the special case of zero
wasn't handled correctly by the function that prepares input data for
the test. As a result, zero-length input string was passed to the test,
which is unexpected: the tested function is an "unsafe" one (as the
name suggests) and is meant to be used on an already validated input.
Sergei Trofimovich [Tue, 4 Sep 2018 19:16:23 +0000 (20:16 +0100)]
__alltest.sh: add clang's -fsanitize=memory flavour
Bug: https://github.com/skvadrik/re2c/issues/215
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Sergei Trofimovich [Tue, 4 Sep 2018 18:59:05 +0000 (19:59 +0100)]
.travis.yml: run all tests behind 'make check'
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Ulya Trofimovich [Thu, 30 Aug 2018 22:16:10 +0000 (23:16 +0100)]
Release 1.1.1.
Ulya Trofimovich [Thu, 30 Aug 2018 22:10:21 +0000 (23:10 +0100)]
Converted tabs to spaces in .re files and autogenerated files.
Ulya Trofimovich [Thu, 30 Aug 2018 22:00:56 +0000 (23:00 +0100)]
Updated CHANGELOG.
Ulya Trofimovich [Thu, 30 Aug 2018 21:51:53 +0000 (22:51 +0100)]
Makefile.am: reduced redundant variables.
Ulya Trofimovich [Thu, 30 Aug 2018 21:45:04 +0000 (22:45 +0100)]
Makefile.am: simplified clean-up part of bootstrap rule.
Ulya Trofimovich [Thu, 30 Aug 2018 21:38:42 +0000 (22:38 +0100)]
Rewrote version-to-vernum converter in RE2C; added more unit tests.
Sergei Trofimovich [Tue, 28 Aug 2018 22:05:59 +0000 (23:05 +0100)]
vernum: move version-string-to-vernum converter to a separate helper
No functional change. While at it, added tests
to cover past failures:
- "1.1": https://github.com/skvadrik/re2c/issues/211
- "0.14": https://sourceforge.net/p/re2c/bugs/55/
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
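A rough sketch of what such a converter might look like. The output layout (two decimal digits each for major, minor and patch, so that "1.1" becomes "010100") is an assumption made for illustration, and `version_to_vernum` is a hypothetical name rather than the helper added by this commit:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical converter: "MAJOR.MINOR[.PATCH][suffix]" -> "MMmmpp".
// Missing components default to 0; parsing stops at the first character
// that is neither a digit nor a separating dot (e.g. a "-dev" suffix).
static std::string version_to_vernum(const std::string &version)
{
    unsigned parts[3] = {0, 0, 0};
    std::size_t n = 0;       // index of the component being parsed
    bool digit_seen = false; // whether the current component has any digits

    for (std::size_t i = 0; i < version.size(); ++i) {
        const char c = version[i];
        if (c >= '0' && c <= '9') {
            parts[n] = parts[n] * 10 + static_cast<unsigned>(c - '0');
            digit_seen = true;
        } else if (c == '.' && digit_seen && n < 2) {
            ++n;
            digit_seen = false;
        } else {
            break;
        }
    }

    std::string out;
    for (std::size_t i = 0; i < 3; ++i) {
        assert(parts[i] < 100); // each component must fit into two digits
        out += static_cast<char>('0' + parts[i] / 10);
        out += static_cast<char>('0' + parts[i] % 10);
    }
    return out;
}
```

The two past failures listed above correspond to a two-component version ("1.1") and a multi-digit minor component ("0.14"), which is why both make useful test inputs.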
Mike Gilbert [Tue, 28 Aug 2018 16:01:07 +0000 (12:01 -0400)]
Rewrite vernum function
Fixes: https://github.com/skvadrik/re2c/issues/211
Ulya Trofimovich [Mon, 27 Aug 2018 21:44:50 +0000 (22:44 +0100)]
Release 1.1.
Ulya Trofimovich [Mon, 27 Aug 2018 20:45:44 +0000 (21:45 +0100)]
Regenerated docs.
Ulya Trofimovich [Mon, 27 Aug 2018 20:42:33 +0000 (21:42 +0100)]
Updated CHANGELOG.
Ulya Trofimovich [Mon, 27 Aug 2018 20:26:56 +0000 (21:26 +0100)]
Paper: more tweaks in examples of trace computation.
Ulya Trofimovich [Tue, 14 Aug 2018 06:10:26 +0000 (07:10 +0100)]
Increase allocator alignment to pointer size to avoid unaligned reads/writes.
Unaligned operations found by ubsan.
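One common way to get pointer-size alignment in a simple bump-style allocator is to round every requested size up to a multiple of `sizeof(void*)`. The sketch below shows only that rounding step; `ALIGN` and `align_up` are hypothetical names, and the actual allocator change may be structured differently:

```cpp
#include <cassert>
#include <cstddef>

// Assumed alignment unit: the size of a pointer on the target platform.
static const std::size_t ALIGN = sizeof(void*);

// Round size up to the nearest multiple of ALIGN.
static std::size_t align_up(std::size_t size)
{
    // The bit trick below only works for power-of-two alignments.
    assert(ALIGN != 0 && (ALIGN & (ALIGN - 1)) == 0);
    return (size + ALIGN - 1) & ~(ALIGN - 1);
}
```

If every chunk handed out starts at an aligned address and has an aligned size, the next chunk also starts at an aligned address, which avoids the unaligned accesses that ubsan reports.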
Ulya Trofimovich [Mon, 13 Aug 2018 22:41:56 +0000 (23:41 +0100)]
Fixed memory corruption bug (caused by wrong size passed to memcpy).
Found by asan.
Ulya Trofimovich [Mon, 13 Aug 2018 22:21:57 +0000 (23:21 +0100)]
Reordered function definitions.
Ulya Trofimovich [Mon, 13 Aug 2018 22:11:42 +0000 (23:11 +0100)]
Moved different closure construction algorithms to separate files.
Ulya Trofimovich [Mon, 13 Aug 2018 22:02:01 +0000 (23:02 +0100)]
Moved POSIX disambiguation algorithm to a separate file.
Ulya Trofimovich [Mon, 13 Aug 2018 21:49:44 +0000 (22:49 +0100)]
Converted tabs to spaces.
Ulya Trofimovich [Mon, 13 Aug 2018 21:43:16 +0000 (22:43 +0100)]
Renamed a couple of structs.
Ulya Trofimovich [Mon, 13 Aug 2018 20:47:13 +0000 (21:47 +0100)]
Merged a couple of small headers into one.
Ulya Trofimovich [Sun, 12 Aug 2018 19:38:11 +0000 (20:38 +0100)]
Gathered all determinization-related data in a struct to avoid passing many parameters.
Ulya Trofimovich [Sun, 12 Aug 2018 19:28:09 +0000 (20:28 +0100)]
Use fixed 32-bit indices in lookup tables instead of 'size_t'.
Ulya Trofimovich [Sat, 11 Aug 2018 08:01:57 +0000 (09:01 +0100)]
Moved all notes (lengthy comments with names) to the beginning of file.
Ulya Trofimovich [Fri, 10 Aug 2018 23:59:09 +0000 (00:59 +0100)]
Rearranged the code a bit with a couple of helper subroutines.
Ulya Trofimovich [Fri, 10 Aug 2018 23:32:11 +0000 (00:32 +0100)]
Don't use a dedicated struct for returning multiple values from function.
Ulya Trofimovich [Fri, 10 Aug 2018 23:13:04 +0000 (00:13 +0100)]
Simplified back up / restore of tag actions when mapping TDFA states.
Ulya Trofimovich [Fri, 10 Aug 2018 05:57:05 +0000 (06:57 +0100)]
Gathered various buffers used for TDFA state mapping in a struct.
Ulya Trofimovich [Thu, 9 Aug 2018 21:28:41 +0000 (22:28 +0100)]
Use somewhat more consistent variable naming.
Ulya Trofimovich [Thu, 9 Aug 2018 20:45:38 +0000 (21:45 +0100)]
Replaced Kuklewicz POSIX disambiguation algorithm with Okui algorithm.
Changes in the test results are caused by putting negative tags of the
right alternative *before* the alternative.
Ulya Trofimovich [Mon, 6 Aug 2018 21:37:38 +0000 (22:37 +0100)]
Pack tag index and sign into one 32-bit field.
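For illustration, one possible packing scheme keeps the sign in the lowest bit and the index in the remaining 31 bits. This layout and the function names are assumptions; the commit does not state which bits re2c actually uses:

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical layout: bit 0 holds the sign, bits 1..31 hold the tag index.
static uint32_t pack_tag(uint32_t idx, bool negative)
{
    assert(idx < (1u << 31)); // the index must fit into 31 bits
    return (idx << 1) | (negative ? 1u : 0u);
}

static uint32_t tag_index(uint32_t packed)
{
    return packed >> 1;
}

static bool tag_negative(uint32_t packed)
{
    return (packed & 1u) != 0;
}
```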
Ulya Trofimovich [Sun, 5 Aug 2018 09:55:58 +0000 (10:55 +0100)]
Compute and store tag "height" (needed for Okui disambiguation).
Ulya Trofimovich [Sat, 4 Aug 2018 09:44:56 +0000 (10:44 +0100)]
Always add structural tags to the RHS of alternative/catenation in POSIX captures.
(Preliminary work before switching from Kuklewicz POSIX disambiguation
algorithm to Okui algorithm.)
Ulya Trofimovich [Sat, 4 Aug 2018 09:25:01 +0000 (10:25 +0100)]
Don't move the closing tag of POSIX capture group out of the enclosing iteration.
RE2C used to perform the following optimization: when a POSIX capture is
under iteration, we only need to get tag values of the last iteration
(according to the POSIX standard). Therefore we can move the closing tag
out of loop.
This commit removes this optimization (as part of the effort to switch
from Kuklewicz POSIX disambiguation algorithm to Okui algorithm).
In other words, for RE (x)* re2c used to generate this "optimized" IRE:
1 (3 x)* 4 2
and now it generates the "canonical" IRE:
1 (3 x 4)* 2
Updated tests for '--posix-captures' that have been affected by the change.
Ulya Trofimovich [Sat, 4 Aug 2018 09:09:01 +0000 (10:09 +0100)]
Allow default copy for POD struct (fixes [-Wclass-memaccess] GCC warning).
Ulya Trofimovich [Sat, 28 Jul 2018 22:30:04 +0000 (23:30 +0100)]
Updated GOR1 (fixed the core algorithm to avoid useless re-scans of the same state).
Also, depth-first traversal was done in a slightly incorrect way:
we checked outgoing nodes for admissibility and pushed the corresponding
child states on stack all at once. This is not the same as checking
the first child and recursing into it, then checking the next child,
..., and so on (because we might discover the second child while exploring
the first, and admissibility check for the second child *after* that
might yield false, while *before* exploring the first child it yielded
true).
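The difference between the two traversal orders can be sketched with an explicit stack that remembers, for each node, which child to examine next. In this simplified illustration "not yet visited" stands in for the GOR1 admissibility check, and all names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<std::vector<std::size_t> > graph_t;

// Iterative depth-first traversal returning nodes in postorder. Each child is
// checked only at the moment the traversal reaches it, i.e. after all earlier
// siblings have been fully explored (not when the parent is first entered).
static std::vector<std::size_t> dfs_postorder(const graph_t &g, std::size_t root)
{
    assert(root < g.size());
    std::vector<bool> visited(g.size(), false);
    std::vector<std::size_t> order;
    // Stack items: (node, index of the next child to examine).
    std::vector<std::pair<std::size_t, std::size_t> > stack;

    visited[root] = true;
    stack.push_back(std::make_pair(root, std::size_t(0)));
    while (!stack.empty()) {
        const std::size_t node = stack.back().first;
        const std::size_t i = stack.back().second;
        if (i < g[node].size()) {
            ++stack.back().second;
            const std::size_t child = g[node][i];
            if (!visited[child]) { // lazy check, done just before recursing
                visited[child] = true;
                stack.push_back(std::make_pair(child, std::size_t(0)));
            }
        } else {
            order.push_back(node); // all children processed
            stack.pop_back();
        }
    }
    return order;
}
```

With edges 0→1, 0→2 and 1→2, node 2 is discovered while exploring node 1, so by the time node 0 examines its second child the check already fails. Pushing both children of node 0 at once would have evaluated that check too early.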
Ulya Trofimovich [Sat, 28 Jul 2018 21:52:44 +0000 (22:52 +0100)]
Pick the shortest available path suffix when generating skeleton path cover.
This also fixes an error in the generation process: sometimes in case
of loops the current node's suffix was set before all of its children
were processed.
Updated test results (in some cases .input files became larger because
of the above fix, in some cases they became smaller because we now pick
the shortest suffix).
Added new test; this one was found by slyfox's fuzzer and revealed the
above bug.
Ulya Trofimovich [Sat, 28 Jul 2018 09:53:34 +0000 (10:53 +0100)]
Changed the name of a local variable in the test to avoid collision with skeleton names.
Before tags were added to re2c, skeleton programs only used a limited
number of predefined names, such as 'yych', 'yystate', etc. With tags,
however, this is no longer true as tags may have any names. So now we need
to be more cautious when picking names for skeleton variables.
This patch is only a workaround to make all tests pass; the real solution
requires inventing a good naming scheme for skeleton programs and
regenerating all skeleton test results.
Ulya Trofimovich [Sat, 28 Jul 2018 09:36:08 +0000 (10:36 +0100)]
Paper: more tweaks of GOR1.
Ulya Trofimovich [Sat, 28 Jul 2018 09:21:37 +0000 (10:21 +0100)]
Fixed error in calculation of maximal skeleton path length.
The error was found by slyfox's fuzzer (a randomly-generated skeleton test).
The bug in the code was, apparently, too early modification of the state's
estimated maximal distance to the end states: the distance was set before
all of the state's children were processed, which resulted in aborting the
accumulation of distance from the remaining children, and, as a consequence,
shorter than necessary max distance for the root itself.
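The essence of the fix, storing a state's distance only after all of its children have contributed, can be sketched on an acyclic graph. This is a simplified illustration with hypothetical names; it deliberately ignores loops, which the real skeleton code has to handle:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<std::size_t> > graph_t;

static const std::size_t UNKNOWN = static_cast<std::size_t>(-1);

// Length (in edges) of the longest path from node to any end state, where an
// end state is a node without children. Assumes the graph has no cycles.
static std::size_t max_dist(const graph_t &g, std::size_t node,
    std::vector<std::size_t> &memo)
{
    assert(node < g.size() && memo.size() == g.size());
    if (memo[node] != UNKNOWN) return memo[node];

    std::size_t best = 0;
    for (std::size_t i = 0; i < g[node].size(); ++i) {
        best = std::max(best, 1 + max_dist(g, g[node][i], memo));
    }

    // Store the result only now. Writing memo[node] inside the loop would let
    // a later lookup observe a partial (too short) distance.
    memo[node] = best;
    return best;
}
```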
Ulya Trofimovich [Wed, 25 Jul 2018 21:12:23 +0000 (22:12 +0100)]
Paper: updated version of GOR1.
Ulya Trofimovich [Fri, 6 Jul 2018 23:01:15 +0000 (00:01 +0100)]
Paper: some tweaks for the examples of traces computation.
Ulya Trofimovich [Mon, 2 Jul 2018 22:06:40 +0000 (23:06 +0100)]
Paper: another example of traces computation.
Ulya Trofimovich [Sat, 30 Jun 2018 20:23:13 +0000 (21:23 +0100)]
Paper: added example of PEs and traces computation.
Ulya Trofimovich [Mon, 25 Jun 2018 21:42:33 +0000 (22:42 +0100)]
Fixed processing of #line directives in input files.
The correct behaviour was broken somewhere in between 0.16 and 1.0:
re2c was forgetting to output the chunk of input file that precedes
the #line directive.
Reported by pskocik in #98.
Ulya Trofimovich [Sun, 24 Jun 2018 22:06:06 +0000 (23:06 +0100)]
Paper: re-worked the theorem about compatibility of total and partial orders.
Ulya Trofimovich [Sun, 24 Jun 2018 08:37:48 +0000 (09:37 +0100)]
Paper: started re-working the theorem about compatibility of total and partial orders.
Ulya Trofimovich [Sat, 23 Jun 2018 10:06:02 +0000 (11:06 +0100)]
Paper: made example about parse trees consistent with its description.
Ulya Trofimovich [Sat, 23 Jun 2018 09:57:58 +0000 (10:57 +0100)]
Paper: continued restructuring the part about indexed parse trees.
Ulya Trofimovich [Wed, 20 Jun 2018 21:29:17 +0000 (22:29 +0100)]
Paper: restructured the IRE construction example.
Ulya Trofimovich [Mon, 18 Jun 2018 22:14:22 +0000 (23:14 +0100)]
Paper: added an example of IRE construction.
Ulya Trofimovich [Sun, 17 Jun 2018 09:21:02 +0000 (10:21 +0100)]
Paper: added introduction to the second chapter.
Ulya Trofimovich [Sat, 16 Jun 2018 09:41:18 +0000 (10:41 +0100)]
Paper: revise basic definitions before introducing partial order on trees.
Ulya Trofimovich [Wed, 13 Jun 2018 22:00:45 +0000 (23:00 +0100)]
Paper: took care of Angelo's remarks.
Ulya Trofimovich [Mon, 11 Jun 2018 20:27:27 +0000 (21:27 +0100)]
Added option "--conditions" (an alias for "-c" and "--start-conditions").
Fixes issue #206 "wrong long option for -c mode".
Ulya Trofimovich [Thu, 24 May 2018 22:44:23 +0000 (23:44 +0100)]
Added first part of TDFA paper v2.
Ulya Trofimovich [Wed, 25 Apr 2018 21:49:15 +0000 (22:49 +0100)]
Improved error reporting in fuzz-testing script.
Ulya Trofimovich [Sat, 14 Apr 2018 20:50:32 +0000 (21:50 +0100)]
If the input starts with a re2c block, apply re2c configurations immediately (see #201).
Ulya Trofimovich [Fri, 13 Apr 2018 23:23:35 +0000 (00:23 +0100)]
Escape backslashes in file names (see #201).
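A minimal sketch of such escaping, assuming the file name ends up inside a C string literal in the generated code (for example in a `#line` directive), where a single backslash would start an escape sequence. `escape_backslashes` is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: double every backslash so that the result is safe to
// embed in a C string literal. All other characters are copied unchanged.
static std::string escape_backslashes(const std::string &path)
{
    std::string out;
    out.reserve(path.size());
    for (std::size_t i = 0; i < path.size(); ++i) {
        if (path[i] == '\\') {
            out += "\\\\";
        } else {
            out += path[i];
        }
    }
    assert(out.size() >= path.size()); // escaping never shortens the string
    return out;
}
```

Paths that use forward slashes pass through unchanged, so the helper is harmless on platforms where backslashes do not occur in file names.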
Ulya Trofimovich [Wed, 8 Nov 2017 20:40:53 +0000 (20:40 +0000)]
Release 1.0.3.
Ulya Trofimovich [Wed, 8 Nov 2017 07:19:21 +0000 (07:19 +0000)]
Fix for #198.
GCC-4.2.1 is unable to compile code like this:
std::vector<int> v;
std::vector<int>::const_reverse_iterator i;
for (i = v.rbegin(); i != v.rend(); ++i) ;
It's unable to deduce const overload for 'rend':
"no match for ‘operator!=’ in ‘i != std::vector<_Tp, _Alloc>::rend()"
However, the following code compiles fine:
std::vector<int> v;
std::vector<int>::const_reverse_iterator i = v.rbegin(), e = v.rend();
for (; i != e; ++i) ;
This was reported by Ryan Shmidt.
Ulya Trofimovich [Thu, 14 Sep 2017 19:08:37 +0000 (20:08 +0100)]
Fixed typo in docs (found by Maxim Reznik).
Ulya Trofimovich [Mon, 28 Aug 2017 16:33:46 +0000 (17:33 +0100)]
Removed inaccurate example.
Parsing floating-point numbers is hard and re2c doesn't help much,
so this example was somewhat misleading.
Ulya Trofimovich [Sat, 26 Aug 2017 20:07:06 +0000 (21:07 +0100)]
Release 1.0.2.
Ulya Trofimovich [Sat, 26 Aug 2017 20:02:22 +0000 (21:02 +0100)]
Updated changelog.
Ulya Trofimovich [Sat, 26 Aug 2017 19:26:26 +0000 (20:26 +0100)]
Some more fixes to the documentation.
Ulya Trofimovich [Sat, 26 Aug 2017 18:10:24 +0000 (19:10 +0100)]
Updated documentation.
Ulya Trofimovich [Sat, 26 Aug 2017 09:31:35 +0000 (10:31 +0100)]
Disallow condition names and named definitions to start with digit.
This has always been the intended behavior and was accidentally broken
by commit
e3db638fc3e9bfb318edafedbefd02f25f1c1b8c.
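The intended rule can be sketched as a small validity check. The allowed character set (ASCII letters, digits and underscore) is an assumption based on typical C-like identifiers, and `is_valid_name` is a hypothetical function, not part of re2c:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

static bool is_letter_or_underscore(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

static bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// Hypothetical check: a name is a non-empty sequence of letters, digits and
// underscores whose first character is not a digit.
static bool is_valid_name(const std::string &s)
{
    if (s.empty() || !is_letter_or_underscore(s[0])) return false;
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (!is_letter_or_underscore(s[i]) && !is_digit(s[i])) return false;
    }
    assert(!is_digit(s[0])); // postcondition: accepted names never start with a digit
    return true;
}
```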
Ulya Trofimovich [Tue, 22 Aug 2017 20:39:33 +0000 (21:39 +0100)]
Renamed tests.
Ulya Trofimovich [Tue, 22 Aug 2017 17:06:55 +0000 (18:06 +0100)]
Updated examples and added them to 'run_tests.sh' script.
Ulya Trofimovich [Tue, 22 Aug 2017 08:15:47 +0000 (09:15 +0100)]
Updated changelog.
Ulya Trofimovich [Tue, 22 Aug 2017 08:09:35 +0000 (09:09 +0100)]
Added forgotten 'genhelp.sh' to distribution files.
This fixes bug #194 "Build with "--enable-docs" fails".
Ulya Trofimovich [Fri, 18 Aug 2017 15:09:59 +0000 (16:09 +0100)]
Added examples to test suite.
Ulya Trofimovich [Wed, 16 Aug 2017 17:43:13 +0000 (18:43 +0100)]
Added benchmarks to test suite.
Ulya Trofimovich [Fri, 11 Aug 2017 21:43:05 +0000 (22:43 +0100)]
Updated changelog for 1.0.1 version.
Ulya Trofimovich [Fri, 11 Aug 2017 21:31:09 +0000 (22:31 +0100)]
Release 1.0.1.
Ulya Trofimovich [Fri, 11 Aug 2017 21:16:04 +0000 (22:16 +0100)]
Makefile.am: add paper on Lookahead TDFA to distribution.
Ulya Trofimovich [Fri, 11 Aug 2017 21:04:05 +0000 (22:04 +0100)]
Fixed #193: "1.0 build failure on macOS: error: calling a private constructor of class 're2c::Rule'".
Copy constructor and assignment are required by the std::valarray
implementation on macOS.
Ulya Trofimovich [Fri, 11 Aug 2017 13:46:23 +0000 (14:46 +0100)]
Release 1.0.
Ulya Trofimovich [Fri, 11 Aug 2017 11:52:10 +0000 (12:52 +0100)]
Paper on lookahead TDFA: finished.
Ulya Trofimovich [Thu, 10 Aug 2017 15:03:46 +0000 (16:03 +0100)]
Updated help and manpage.
Ulya Trofimovich [Thu, 10 Aug 2017 12:25:07 +0000 (13:25 +0100)]
Leave the definition of 'yynmatch' and 'yypmatch' to the user.
With '--posix-captures' RE2C stores submatch results in 'yynmatch'
(the total number of capturing groups for the matching rule) and
'yypmatch' (an array of submatch values for each group).
These variables should be user-defined, so that users can override
default implementation (e.g. make 'yypmatch' an array of integer
offsets rather than an array of pointers). Overriding is only possible
with generic API: if default API is used, then RE2C can autogenerate
'yynmatch' and 'yypmatch' (and so it did prior to this commit).
However, it is better to have the same behavior with both APIs; also,
it is coherent with '--tags' option (RE2C leaves tag definition to
the user).
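As a sketch of what user code might do with these variables, the helper below extracts the text of one capture group. It assumes that 'yypmatch' is an array of pointers in which entries 2*k and 2*k+1 mark the start and end of group k, and that unmatched groups hold null pointers; both the layout and the name `group_text` are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: return the text matched by capture group k, or an
// empty string if the group did not participate in the match.
static std::string group_text(const char *const *yypmatch,
    std::size_t yynmatch, std::size_t k)
{
    assert(k < yynmatch); // k must refer to an existing group
    const char *b = yypmatch[2 * k];
    const char *e = yypmatch[2 * k + 1];
    if (!b || !e) return std::string();
    return std::string(b, e);
}
```

A user who prefers integer offsets over pointers would declare 'yypmatch' with a different element type and adapt the helper accordingly, which is the kind of override the commit message refers to.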
Ulya Trofimovich [Wed, 9 Aug 2017 17:35:15 +0000 (18:35 +0100)]
Updated options list and regenerated docs.
Ulya Trofimovich [Wed, 9 Aug 2017 17:07:50 +0000 (18:07 +0100)]
Added short option '-P' corresponding to '--posix-captures'.
Ulya Trofimovich [Wed, 9 Aug 2017 16:17:09 +0000 (17:17 +0100)]
Fixed includes with 'include-what-you-use'.
Command:
$ configure CXX=include-what-you-use CXXFLAGS="--check-also" \
&& make -k 2>log \
&& python2 `which fix_includes.py` < log
Ulya Trofimovich [Wed, 9 Aug 2017 13:04:10 +0000 (14:04 +0100)]
Paper on Lookahead TDFA: added bibliography.
Ulya Trofimovich [Wed, 9 Aug 2017 07:47:51 +0000 (08:47 +0100)]
Amended README instructions for benchmarks.
Ulya Trofimovich [Mon, 7 Aug 2017 11:57:30 +0000 (12:57 +0100)]
Paper on Lookahead TDFA: added pictures.
Ulya Trofimovich [Mon, 7 Aug 2017 11:54:29 +0000 (12:54 +0100)]
Paper on Lookahead TDFA: fixed captions and ran through aspell.
Ulya Trofimovich [Sat, 5 Aug 2017 08:33:30 +0000 (09:33 +0100)]
Paper on Lookahead TDFA: added benchmark results and graphs.
Ulya Trofimovich [Fri, 4 Aug 2017 08:57:58 +0000 (09:57 +0100)]
Paper on Lookahead TDFA: reformatted examples.
Ulya Trofimovich [Fri, 4 Aug 2017 08:52:14 +0000 (09:52 +0100)]
Tweaked CXXFLAGS in asan build script.