]> granicus.if.org Git - re2c/log
re2c
5 years agoDeduplicated tag history definitions used in re2c and libre2c.
Ulya Trofimovich [Fri, 1 Mar 2019 10:55:38 +0000 (10:55 +0000)]
Deduplicated tag history definitions used in re2c and libre2c.

5 years agoAvoid allocating new scratch buffer for each closure, reuse the same buffer.
Ulya Trofimovich [Thu, 28 Feb 2019 10:57:10 +0000 (10:57 +0000)]
Avoid allocating new scratch buffer for each closure, reuse the same buffer.

5 years agoExit early from loops in GOR1.
Ulya Trofimovich [Wed, 27 Feb 2019 11:46:55 +0000 (11:46 +0000)]
Exit early from loops in GOR1.

5 years agolibre2c: updated benchmark.
Ulya Trofimovich [Wed, 27 Feb 2019 11:28:33 +0000 (11:28 +0000)]
libre2c: updated benchmark.

5 years agolibre2c: use a more efficient algorithm for computing POSIX precedence matrix.
Ulya Trofimovich [Wed, 27 Feb 2019 10:50:07 +0000 (10:50 +0000)]
libre2c: use a more efficient algorithm for computing POSIX precedence matrix.

The old alrorithm was straightforward: it computed precedence for each
pair of core closure items. This has worst-case cubic complexity in the
size of TNFA, because closure can have O(N) states and tag history length
can be also O(N). One example is ((a?){1,1000})*.

The new algorithm avoids that by computing parecedence for all pairs of
core items in one traversal of the shortest path tree constructed by
closure. It avoids traversing deadend branches (those that were cancelled
by search because a better path was found) by marking used nodes of the
tree from each leaf down to the root. The new algorithm has worst-case
quadratic behaviour. It can be slightly slower for simple cases when the
number of core items is low and histories are short.

5 years agoMoved hot functions to header to enable inlining.
Ulya Trofimovich [Wed, 27 Feb 2019 10:44:15 +0000 (10:44 +0000)]
Moved hot functions to header to enable inlining.

5 years agolibre2c (benchmark): do not try to link with re2 if it's not available.
Ulya Trofimovich [Wed, 20 Feb 2019 17:29:03 +0000 (17:29 +0000)]
libre2c (benchmark): do not try to link with re2 if it's not available.

5 years agolibre2c: found pathological case for constant-memory POSIX algorithms.
Ulya Trofimovich [Wed, 20 Feb 2019 17:00:47 +0000 (17:00 +0000)]
libre2c: found pathological case for constant-memory POSIX algorithms.

The regexp is ((a?){1,200})*, and the input string is just "a".

Takes quibic time in the size of counter. This is caused by quadratic-
time computation of precedence matrix on each step (the number of TNFA
states in the closure approaches TNFA size), multiplied by the length
of compared histores (which also approaches TNFA size). Trie-based
algorithms are not affected, but they consume memory proportional to
the length of the input string, and so are also not practical.

5 years agolibre2c: added simplistic benchmark to compare different libre2c algorithms and re2.
Ulya Trofimovich [Wed, 20 Feb 2019 14:46:29 +0000 (14:46 +0000)]
libre2c: added simplistic benchmark to compare different libre2c algorithms and re2.

5 years agoSimplification of GOR1 (following similar changes in libre2c).
Ulya Trofimovich [Tue, 19 Feb 2019 23:10:37 +0000 (23:10 +0000)]
Simplification of GOR1 (following similar changes in libre2c).

5 years agoNo need to fully unwind tag histories when calculating POSIX precedence.
Ulya Trofimovich [Tue, 19 Feb 2019 21:51:43 +0000 (21:51 +0000)]
No need to fully unwind tag histories when calculating POSIX precedence.

Histories are stored in a trie. Each history is represented by an index
to its tail in the trie. In order to compare two histories, we can simply
unwind both indices, step by step, until they coincide or root index is
reached.

Related commits in libre2c:
    062da2dca6908c0f6da54711b897e3bafca2628c
    2a7158b401d7e925696fc1b0c8da37a59e364dd4

5 years agolibre2c: deleted unused buffers.
Ulya Trofimovich [Tue, 19 Feb 2019 21:44:38 +0000 (21:44 +0000)]
libre2c: deleted unused buffers.

5 years agoUse signed integers as tag history indices (this way root index -1 is the least one).
Ulya Trofimovich [Tue, 19 Feb 2019 21:40:46 +0000 (21:40 +0000)]
Use signed integers as tag history indices (this way root index -1 is the least one).

5 years agorun_tests.sh: ignore trailing whitespace in test diffs if compiler name matches ...
Ulya Trofimovich [Tue, 19 Feb 2019 17:52:36 +0000 (17:52 +0000)]
run_tests.sh: ignore trailing whitespace in test diffs if compiler name matches "mingw".

We already ignore trailing whitespace when running in Wine, but binaries
that built with Mingw and run on windows (not in Wine) also need this option.

5 years agoConstified std::map comparator (apparently MSVC fails to compile otherwise).
Ulya Trofimovich [Tue, 19 Feb 2019 16:51:02 +0000 (16:51 +0000)]
Constified std::map comparator (apparently MSVC fails to compile otherwise).

Problem reported by Denis Naumov.

5 years agoCorrectly handle multi-character newlines (e.g. CR LF) when parsing #line directives.
Ulya Trofimovich [Tue, 19 Feb 2019 16:32:15 +0000 (16:32 +0000)]
Correctly handle multi-character newlines (e.g. CR LF) when parsing #line directives.

Bug report thanks to Denis Naumov.

5 years agolibre2c: avoid -Wreturn compiler warnings (no return value from non-void function).
Ulya Trofimovich [Tue, 19 Feb 2019 15:59:20 +0000 (15:59 +0000)]
libre2c: avoid -Wreturn compiler warnings (no return value from non-void function).

5 years agolibre2c: use core state indices when possible.
Ulya Trofimovich [Tue, 19 Feb 2019 14:22:40 +0000 (14:22 +0000)]
libre2c: use core state indices when possible.

5 years agolibre2c: inlined hot function.
Ulya Trofimovich [Tue, 19 Feb 2019 13:50:57 +0000 (13:50 +0000)]
libre2c: inlined hot function.

5 years agolibre2c: updating marker of last matched rule after closure has been constructed...
Ulya Trofimovich [Tue, 19 Feb 2019 13:47:01 +0000 (13:47 +0000)]
libre2c: updating marker of last matched rule after closure has been constructed (non-constant-memory version).

See commit 9998f616a314d635adbfea360d9882276c237e6d (same for constant-memory version).

5 years agolibre2c: compare histories in a trie without reconstructing them (non-constant-memory...
Ulya Trofimovich [Tue, 19 Feb 2019 13:32:34 +0000 (13:32 +0000)]
libre2c: compare histories in a trie without reconstructing them (non-constant-memory version.)

See commit 2a7158b401d7e925696fc1b0c8da37a59e364dd4.

5 years agolibre2c: added forgotten file.
Ulya Trofimovich [Tue, 19 Feb 2019 12:43:27 +0000 (12:43 +0000)]
libre2c: added forgotten file.

5 years agolibre2c: rearranged code to avoid checking configuration type when computing preceden...
Ulya Trofimovich [Tue, 19 Feb 2019 12:12:46 +0000 (12:12 +0000)]
libre2c: rearranged code to avoid checking configuration type when computing precedence matrix.

5 years agolibre2c: ensure TNFA simulation algorithms clean up context before exiting.
Ulya Trofimovich [Tue, 19 Feb 2019 12:03:12 +0000 (12:03 +0000)]
libre2c: ensure TNFA simulation algorithms clean up context before exiting.

5 years agolibre2c: use templates to avoid dynamic dispatch between closure algorithms.
Ulya Trofimovich [Tue, 19 Feb 2019 08:49:27 +0000 (08:49 +0000)]
libre2c: use templates to avoid dynamic dispatch between closure algorithms.

5 years agolibre2c: hid some of the implementation details behind a pointer in regex_t struct.
Ulya Trofimovich [Mon, 18 Feb 2019 22:48:00 +0000 (22:48 +0000)]
libre2c: hid some of the implementation details behind a pointer in regex_t struct.

5 years agolibre2c: simplified construction of configurations.
Ulya Trofimovich [Mon, 18 Feb 2019 22:15:50 +0000 (22:15 +0000)]
libre2c: simplified construction of configurations.

5 years agolibre2c: added GOR1 algorithm for POSIX closure construction.
Ulya Trofimovich [Mon, 18 Feb 2019 21:44:29 +0000 (21:44 +0000)]
libre2c: added GOR1 algorithm for POSIX closure construction.

5 years agolibre2c: compare histories in a trie without reconstructing them.
Ulya Trofimovich [Mon, 18 Feb 2019 16:42:07 +0000 (16:42 +0000)]
libre2c: compare histories in a trie without reconstructing them.

Previously we used to unwing each history, store it as a linear sequence
in a separate buffer, and then iterate buffers from the beginning of
histories to fing the fork point.

This reconstruction step can be avoided, because we can find fork by
tracing both histories together from tail to root, element by element,
until their indices coincide (this is the fork point. Someone called
thos "two-fingers algorithm".

5 years agolibre2c: use signed indices in tag history (this way root index -1 is the least one).
Ulya Trofimovich [Mon, 18 Feb 2019 16:26:44 +0000 (16:26 +0000)]
libre2c: use signed indices in tag history (this way root index -1 is the least one).

5 years agolibre2c: marked hot function as inline.
Ulya Trofimovich [Mon, 18 Feb 2019 12:46:38 +0000 (12:46 +0000)]
libre2c: marked hot function as inline.

5 years agolibre2c: updating marker of last matched rule after closure has been constructed.
Ulya Trofimovich [Mon, 18 Feb 2019 12:36:40 +0000 (12:36 +0000)]
libre2c: updating marker of last matched rule after closure has been constructed.

5 years agolibre2c: use lookahead to to filter leftmost closure before computing offsets.
Ulya Trofimovich [Mon, 18 Feb 2019 12:27:19 +0000 (12:27 +0000)]
libre2c: use lookahead to  to filter leftmost closure before computing offsets.

5 years agolibre2c: use lookahead to filter POSIX closure before computing offsets and precedenc...
Ulya Trofimovich [Mon, 18 Feb 2019 11:59:55 +0000 (11:59 +0000)]
libre2c: use lookahead to filter POSIX closure before computing offsets and precedence matrix.

This significantly reduces the amount of work on each step.

5 years agolibre2c: implemented constant-memory (non trie-based) TNFA POSIX matching algorithm.
Ulya Trofimovich [Mon, 18 Feb 2019 11:32:30 +0000 (11:32 +0000)]
libre2c: implemented constant-memory (non trie-based) TNFA POSIX matching algorithm.

5 years agolibre2c: renamed a couple of struct fields (cosmetic).
Ulya Trofimovich [Mon, 18 Feb 2019 11:30:57 +0000 (11:30 +0000)]
libre2c: renamed a couple of struct fields (cosmetic).

5 years agoCorrectly set values on main diagonal of POSIX precedence matrix (configuration compa...
Ulya Trofimovich [Mon, 18 Feb 2019 10:52:26 +0000 (10:52 +0000)]
Correctly set values on main diagonal of POSIX precedence matrix (configuration compared to itself).

It doesn't show any difference on the tests, probably because previously
these values were set to zero, which unpacks to symmetrical (though
incorrect) values, and thus doesn't affect comparison.

5 years agolibre2c: added (stub for) constant-memory (non trie-based) TNFA POSIX matching algorithm.
Ulya Trofimovich [Thu, 14 Feb 2019 13:35:58 +0000 (13:35 +0000)]
libre2c: added (stub for) constant-memory (non trie-based) TNFA POSIX matching algorithm.

5 years agolibre2c: use preallocated buffer.
Ulya Trofimovich [Thu, 14 Feb 2019 13:10:42 +0000 (13:10 +0000)]
libre2c: use preallocated buffer.

5 years agolibre2c: added constant-memory (non trie-based) TNFA matching algorithm for leftmost...
Ulya Trofimovich [Thu, 14 Feb 2019 12:56:29 +0000 (12:56 +0000)]
libre2c: added constant-memory (non trie-based) TNFA matching algorithm for leftmost greedy.

5 years agoAssign indices to core TNFA states (those that are final or have transitions on symbols).
Ulya Trofimovich [Thu, 14 Feb 2019 12:52:50 +0000 (12:52 +0000)]
Assign indices to core TNFA states (those that are final or have transitions on symbols).

During epsilon closure constructions, many operations only need to be
performed for core states. This allows to save some memory, e.g.
precedence matrix can be indexed by core states.

5 years agoStructural tags are only needed for POSIX disambiguation, not for POSIX syntax.
Ulya Trofimovich [Thu, 14 Feb 2019 12:51:06 +0000 (12:51 +0000)]
Structural tags are only needed for POSIX disambiguation, not for POSIX syntax.

5 years agoAdded option --verbose that prints a message to stderr in case of success.
Ulya Trofimovich [Wed, 13 Feb 2019 15:27:55 +0000 (15:27 +0000)]
Added option --verbose that prints a message to stderr in case of success.

5 years agoRemoved useless code in tests for misused option arguments.
Ulya Trofimovich [Wed, 13 Feb 2019 15:22:01 +0000 (15:22 +0000)]
Removed useless code in tests for misused option arguments.

5 years agoTweaked error messages for misused option arguments, added some tests for them.
Ulya Trofimovich [Wed, 13 Feb 2019 14:33:39 +0000 (14:33 +0000)]
Tweaked error messages for misused option arguments, added some tests for them.

5 years agoAdded option --location-format <gnu | msvc> (default is gnu).
Ulya Trofimovich [Wed, 13 Feb 2019 13:50:00 +0000 (13:50 +0000)]
Added option --location-format <gnu | msvc> (default is gnu).

This option controls the formatting of locations in error messages
and warnings.

With --location-format gnu:

    filename:line:column: ...

With --location-format msvc:

    filename(line,column): ...

5 years agoMoved files.
Ulya Trofimovich [Wed, 13 Feb 2019 07:23:27 +0000 (07:23 +0000)]
Moved files.

5 years agoDeduplicated filename strings in locations by interning them in a table and using...
Ulya Trofimovich [Tue, 12 Feb 2019 22:53:54 +0000 (22:53 +0000)]
Deduplicated filename strings in locations by interning them in a table and using indices.

5 years agoUnified location handling by grouping relevant information in a struct.
Ulya Trofimovich [Tue, 12 Feb 2019 17:31:25 +0000 (17:31 +0000)]
Unified location handling by grouping relevant information in a struct.

Massive test updates are caused by refined locations for various error
messages and warnings (re2c now reports not only line, but also column).

5 years agoAdded test for EOF rule.
Ulya Trofimovich [Mon, 11 Feb 2019 21:43:52 +0000 (21:43 +0000)]
Added test for EOF rule.

5 years agoUse GNU-style location format 'file:line:column: ...' in warnings and error messages.
Ulya Trofimovich [Mon, 11 Feb 2019 13:11:47 +0000 (13:11 +0000)]
Use GNU-style location format 'file:line:column: ...' in warnings and error messages.

5 years agoAdded build script that uses '-D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC'.
Ulya Trofimovich [Sat, 9 Feb 2019 18:18:48 +0000 (18:18 +0000)]
Added build script that uses '-D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC'.

5 years agoFixed out of bounds read when sorting one-element initial closure.
Ulya Trofimovich [Sat, 9 Feb 2019 17:54:25 +0000 (17:54 +0000)]
Fixed out of bounds read when sorting one-element initial closure.

The error only occurred on some libc implementations, e.g. on debug
glibc (options -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC).

5 years agoFixed use of invalidated iterator after erase on std::map.
Ulya Trofimovich [Fri, 8 Feb 2019 23:15:21 +0000 (23:15 +0000)]
Fixed use of invalidated iterator after erase on std::map.

Problem reported by Denis Naumov.

5 years agoUse per-tag cache of history comparisons to avoid repeated computation.
Ulya Trofimovich [Fri, 8 Feb 2019 16:40:47 +0000 (16:40 +0000)]
Use per-tag cache of history comparisons to avoid repeated computation.

5 years agoUse counting sort for tag histories (the number tags is limited, so it is faster).
Ulya Trofimovich [Fri, 8 Feb 2019 16:08:58 +0000 (16:08 +0000)]
Use counting sort for tag histories (the number tags is limited, so it is faster).

5 years agoUse insertion sort instead of bubble sort (it is slightly faster).
Ulya Trofimovich [Fri, 8 Feb 2019 15:25:54 +0000 (15:25 +0000)]
Use insertion sort instead of bubble sort (it is slightly faster).

5 years agoFixed typo.
Ulya Trofimovich [Fri, 8 Feb 2019 13:58:43 +0000 (13:58 +0000)]
Fixed typo.

5 years agoSpeedup in determinization (using a more efficient way to compare lookahead tags).
Ulya Trofimovich [Fri, 8 Feb 2019 10:59:58 +0000 (10:59 +0000)]
Speedup in determinization (using a more efficient way to compare lookahead tags).

5 years agoYet another speedup in determinization (early exit when comparing history with itself).
Ulya Trofimovich [Thu, 7 Feb 2019 22:40:18 +0000 (22:40 +0000)]
Yet another speedup in determinization (early exit when comparing history with itself).

5 years agoSmall speedup in determinization (by inlining small hot functions).
Ulya Trofimovich [Thu, 7 Feb 2019 22:06:40 +0000 (22:06 +0000)]
Small speedup in determinization (by inlining small hot functions).

5 years agoSmall speedup in determinization by delaying computation of costly conditions.
Ulya Trofimovich [Thu, 7 Feb 2019 21:41:18 +0000 (21:41 +0000)]
Small speedup in determinization by delaying computation of costly conditions.

5 years agolibre2c: added comment.
Ulya Trofimovich [Thu, 7 Feb 2019 11:31:47 +0000 (11:31 +0000)]
libre2c: added comment.

5 years agolibre2c: speedup of POSIX NFA matcher by lazy computation and caching of precedence.
Ulya Trofimovich [Thu, 7 Feb 2019 10:50:35 +0000 (10:50 +0000)]
libre2c: speedup of POSIX NFA matcher by lazy computation and caching of precedence.

A lot of the match time was taken by eagerly computing precedence values
on each step for each pair of closure states. Most of these values are not
needed: RE may be unambigous; or the given input string may be unambigous,
even if there is ambiguity, it may take only a few comparisons to resolve.
All the rest is wasted effort.

We can avoid it by delaying precedence computation until necessary, and
then unwinding all the steps backwards, computing precedence for each step
and caching the computed values (so that the same pair of histories is not
compared twice). It is still the same incremental comparison as with
precedence matrices: we compare step by step, going from the fork frame to
the join frame. The result of comparison on each step is folded to a triple
of numbers and recorded in cache. It is important that we do record each
step, not just the first step, because the next pair of ambiguous histories
may unwind to the same pair of prefixes that was compared before.

For all this to work, it is necessary that we can store all history until
the very end, because at any step we might need to unwind an arbitrary
number of steps back. We also need to address individual subhistories
efficiently in order to use them as keys in the cache. All this is achieved
by storing history in the form of a trie and addressing individual
histories by indices in the trie. We also use trie to compute the resulting
tag values (instead of storing tags in registers at each step).

5 years agolibre2c: small performance improvement in POSIX TNFA matcher.
Ulya Trofimovich [Tue, 5 Feb 2019 17:21:48 +0000 (17:21 +0000)]
libre2c: small performance improvement in POSIX TNFA matcher.

5 years agolibre2c: extracted common parts of TNFA matchers (leftmost and POSIX).
Ulya Trofimovich [Tue, 5 Feb 2019 15:47:01 +0000 (15:47 +0000)]
libre2c: extracted common parts of TNFA matchers (leftmost and POSIX).

5 years agolibre2c: performance optimization: get all tag values in one scan of history.
Ulya Trofimovich [Tue, 5 Feb 2019 14:57:42 +0000 (14:57 +0000)]
libre2c: performance optimization: get all tag values in one scan of history.

5 years agoRenamed libre2c_posix -> libre2c (as it now also supports leftmost greedy semantics).
Ulya Trofimovich [Tue, 5 Feb 2019 13:33:04 +0000 (13:33 +0000)]
Renamed libre2c_posix -> libre2c (as it now also supports leftmost greedy semantics).

5 years agoDifferentiate between "POSIX syntax" and "POSIX semantics" options.
Ulya Trofimovich [Tue, 5 Feb 2019 13:10:56 +0000 (13:10 +0000)]
Differentiate between "POSIX syntax" and "POSIX semantics" options.

This is needed because we may want to use POSIX syntax with leftmost
greedy disambiguation. In that case, we still need parentheses-as-tags
attitude, but we don't want all the overhead and heuristics to spped
up POSIX closure.

libre2c_posix: added test for leftmost TNFA matcher.

5 years agolibre2c_posix: added TNFA matcher with leftmost greedy disambiguation semantics.
Ulya Trofimovich [Tue, 5 Feb 2019 11:05:51 +0000 (11:05 +0000)]
libre2c_posix: added TNFA matcher with leftmost greedy disambiguation semantics.

5 years agolibre2c_posix: small performance improvement in NFA-based matcher.
Ulya Trofimovich [Sat, 2 Feb 2019 19:08:45 +0000 (19:08 +0000)]
libre2c_posix: small performance improvement in NFA-based matcher.

5 years agolibre2c_posix: added NFA-based matcher.
Ulya Trofimovich [Sat, 2 Feb 2019 18:16:32 +0000 (18:16 +0000)]
libre2c_posix: added NFA-based matcher.

5 years agoConstified function parameter.
Ulya Trofimovich [Sat, 2 Feb 2019 18:12:18 +0000 (18:12 +0000)]
Constified function parameter.

5 years agolibre2c_posix: another small improvement in regexec() implementation.
Ulya Trofimovich [Sun, 27 Jan 2019 22:57:25 +0000 (22:57 +0000)]
libre2c_posix: another small improvement in regexec() implementation.

5 years agolibre2c_posix: a small performance improvement in regexec() implementation.
Ulya Trofimovich [Sun, 27 Jan 2019 18:05:23 +0000 (18:05 +0000)]
libre2c_posix: a small performance improvement in regexec() implementation.

5 years agolibre2c_posix: regex_t definition must be exposed to users.
Ulya Trofimovich [Sun, 27 Jan 2019 17:45:37 +0000 (17:45 +0000)]
libre2c_posix: regex_t definition must be exposed to users.

POSIX standard says that:

    The regex_t structure is defined in <regex.h> and contains at
    least the following member: re_nsub (number of parenthesized
    subexpressions).

5 years agolibre2c_posix: enable tag optimizations.
Ulya Trofimovich [Sun, 27 Jan 2019 11:35:18 +0000 (11:35 +0000)]
libre2c_posix: enable tag optimizations.

5 years agolibre2c_posix: extended regex_t structure to hold more submatch data.
Ulya Trofimovich [Sun, 27 Jan 2019 10:23:23 +0000 (10:23 +0000)]
libre2c_posix: extended regex_t structure to hold more submatch data.

Added fields:
    - re_nsub: total number of submatch groups, required by POSIX standard

    - pmatch: buffer for submatch results, usually supplied by the user to
      regexec(), but we allow to do the allocation and storage in regex_t.
      This is convenient for users that have hard time managing memory, e.g.
      java bindings to libre2c_posix.

    - regs: buffer for internal use by regexec(), strored in regex_t to
      avoid repeated memory allocation on each call to regexec() with the
      same regex.

5 years agoAdjusted build system to correctly build DLLs for windows.
Ulya Trofimovich [Sun, 27 Jan 2019 10:06:22 +0000 (10:06 +0000)]
Adjusted build system to correctly build DLLs for windows.

Adjustments:
    - configure.ac:  pass win32-dll option to LT_INIT

    - Makefile_libre2c_posix.am: use -no-undefined in LDFLAGS

    - use slibtool: https://github.com/midipix-project/slibtool for windows
      builds The problem with libtool is that it doesn't allow to link libstdc++
      and libgcc statically, which is necessary to build portable DLLs with
      Mingw. Libtool adds -nostdlib option to LDFLAGS and links some predefined
      objects that pull in dependency on dynamic libstdc++ and libgcc, even in
      the presence of -static-libstdc++ -static-libgcc.

5 years agoUse libtool to build libraries.
Ulya Trofimovich [Sat, 19 Jan 2019 11:57:16 +0000 (11:57 +0000)]
Use libtool to build libraries.

5 years agoFixed operator precedence with --flex-syntax option.
Ulya Trofimovich [Thu, 17 Jan 2019 22:52:04 +0000 (22:52 +0000)]
Fixed operator precedence with --flex-syntax option.

Operator precedence was broken because re2c tried to parse whole strings
of characters at once instead of parsing one character at a time (in
much the same way as it would parse properly quotes string literals in
the original re2c format). This caused ab* being parsed as (ab)*, which
is clearly wrong (should be a(b)*).

This fixes bug #242:
    "Operator precedence with --flex-syntax is broken."

5 years agolibre2c_posix: handle empty group () in parser; added more tests.
Ulya Trofimovich [Wed, 16 Jan 2019 21:49:38 +0000 (21:49 +0000)]
libre2c_posix: handle empty group () in parser; added more tests.

5 years agoMake re2c:eof usable with push-model lexers (-f, --storable-state option).
Ulya Trofimovich [Tue, 15 Jan 2019 23:45:53 +0000 (23:45 +0000)]
Make re2c:eof usable with push-model lexers (-f, --storable-state option).

5 years agolibre2c_posix: parse repetition counters in the lexer.
Ulya Trofimovich [Tue, 15 Jan 2019 00:24:22 +0000 (00:24 +0000)]
libre2c_posix: parse repetition counters in the lexer.

5 years agolibre2c_posix: initial support for character classes in parser.
Ulya Trofimovich [Mon, 14 Jan 2019 23:04:23 +0000 (23:04 +0000)]
libre2c_posix: initial support for character classes in parser.

5 years agolibre2c_posix (test): clear error status on expected failures.
Ulya Trofimovich [Sun, 13 Jan 2019 11:11:45 +0000 (11:11 +0000)]
libre2c_posix (test): clear error status on expected failures.

5 years agolibre2c_posix: skip fictive tags in regexec(); handle dot in parser; added more tests.
Ulya Trofimovich [Sun, 13 Jan 2019 10:57:53 +0000 (10:57 +0000)]
libre2c_posix: skip fictive tags in regexec(); handle dot in parser; added more tests.

5 years agolibre2c_posix: use C-array initializers instead of variadic functions for offset...
Ulya Trofimovich [Sat, 12 Jan 2019 20:32:15 +0000 (20:32 +0000)]
libre2c_posix: use C-array initializers instead of variadic functions for offset lists.

Variadic functions cause subtle toolchain-specific errors because integer
literals used to initialize offsets may have different type than the type
expected by variadic function (e.g. int vs long).

5 years agolibre2c_posix (test): a more concise error message.
Ulya Trofimovich [Sat, 12 Jan 2019 14:42:24 +0000 (14:42 +0000)]
libre2c_posix (test): a more concise error message.

5 years agolibre2c_posix: fixed -Wmaybe-uninitialized GCC warning.
Ulya Trofimovich [Sat, 12 Jan 2019 14:19:21 +0000 (14:19 +0000)]
libre2c_posix: fixed -Wmaybe-uninitialized GCC warning.

5 years agolibre2c_posix: fixed memleaks.
Ulya Trofimovich [Sat, 12 Jan 2019 14:18:21 +0000 (14:18 +0000)]
libre2c_posix: fixed memleaks.

5 years agoExperimental: initial implementation of libre2c_posix, POSIX regexp library based...
Ulya Trofimovich [Sat, 12 Jan 2019 12:50:47 +0000 (12:50 +0000)]
Experimental: initial implementation of libre2c_posix, POSIX regexp library based on re2c.

5 years agoPaper: don't forget to pass precedence matrices to initial closure.
Ulya Trofimovich [Fri, 11 Jan 2019 08:16:00 +0000 (08:16 +0000)]
Paper: don't forget to pass precedence matrices to initial closure.

5 years agoMakefile.am: further simplify parser generation rules and avoid copying file to itself.
Ulya Trofimovich [Fri, 11 Jan 2019 00:34:07 +0000 (00:34 +0000)]
Makefile.am: further simplify parser generation rules and avoid copying file to itself.

5 years agoMakefile.am: don't forget to run main test suite on `make check`.
Ulya Trofimovich [Fri, 11 Jan 2019 00:19:01 +0000 (00:19 +0000)]
Makefile.am: don't forget to run main test suite on `make check`.

5 years agoMakefile.am: use more universal rules for autogenerated lexers and parsers.
Ulya Trofimovich [Thu, 10 Jan 2019 23:28:30 +0000 (23:28 +0000)]
Makefile.am: use more universal rules for autogenerated lexers and parsers.

5 years agoAvoid unexpected removal of untracked files and directories.
Ulya Trofimovich [Thu, 10 Jan 2019 22:16:07 +0000 (22:16 +0000)]
Avoid unexpected removal of untracked files and directories.

5 years agoMoved function declarations to the header they belong in.
Ulya Trofimovich [Thu, 10 Jan 2019 22:09:25 +0000 (22:09 +0000)]
Moved function declarations to the header they belong in.

5 years agoUpdated range test.
Ulya Trofimovich [Sun, 6 Jan 2019 16:44:45 +0000 (16:44 +0000)]
Updated range test.

5 years agoUse a simple fixed-size slab allocator for ranges.
Ulya Trofimovich [Sun, 6 Jan 2019 16:25:39 +0000 (16:25 +0000)]
Use a simple fixed-size slab allocator for ranges.