Ulya Trofimovich [Wed, 25 Mar 2015 16:25:10 +0000 (16:25 +0000)]
Fixed bug #57: Wrong result only if another rule is present
When building a regexp alternation, only the 'RegExp::PRIVATE' attribute
should be propagated. Propagating the 'RegExp::SHARED' attribute is
a mistake, as the following example shows:
name = "smth1";
"smth2" | name | "smth3" { ... }
name { ... }
Here, 'name' must have the 'RegExp::PRIVATE' attribute, but the attribute
is lost after alternation. See #bug57 or test 'bug57.re' for a full
working example.
Ulya Trofimovich [Mon, 23 Feb 2015 13:30:42 +0000 (13:30 +0000)]
Added tests from PHP repository: https://github.com/php/php-src
Test results are almost identical to re2c-0.13.6
(there are a few changes; I believe they are due to
commit 255262b02928d3f38c00dd91952e3253c11c78f1 and
completely harmless).
Ulya Trofimovich [Sun, 18 Jan 2015 14:12:16 +0000 (14:12 +0000)]
Replaced "YYHAS (n)" with "YYEOI (n)".
This primitive checks whether fewer than n characters
are left in the input stream,
e.g. "(YYLIMIT - YYCURSOR) < n" or an equivalent test.
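For a plain pointer-based input, the check could be sketched roughly as follows. Only the names YYEOI, YYLIMIT and YYCURSOR come from this commit message; the macro body and the helper 'not_enough_input' are illustrative assumptions, not re2c's generated code.

```cpp
#include <cstddef>

// Assumed definition for a pointer-based buffer: true when fewer than n
// characters remain between YYCURSOR and YYLIMIT.
#define YYEOI(n) ((YYLIMIT - YYCURSOR) < static_cast<std::ptrdiff_t>(n))

// Hypothetical helper (not part of re2c) that wraps the macro so it can be
// called with explicit cursor and limit pointers.
inline bool not_enough_input(const char *YYCURSOR, const char *YYLIMIT, std::size_t n)
{
    return YYEOI(n);
}
```

With other input models (for example a stream), the same primitive would be defined differently, which is why the name describes the condition rather than the pointer arithmetic.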
Ulya Trofimovich [Sun, 18 Jan 2015 13:47:51 +0000 (13:47 +0000)]
Added tests for "--input custom".
This implied modifying runtests.sh, as it couldn't handle
test names of the form "basename.--long-switch.re":
it inserted '-' in front of all switches.
Ulya Trofimovich [Tue, 13 Jan 2015 15:30:01 +0000 (15:30 +0000)]
A little cleanup of new input API:
- moved enum and pretty-printing functions to a class
- renamed files 'input.{h,cc}' to 'input_api.{h,cc}'
- for "--input istream": moved input position increment to 'stmt_restorectx'
- main.cc: removed useless include
Double-escape special characters for dot.
Example:
17 -> 18 [label="\n"]
results in an "unlabeled" arrow in the rendered graph, but
17 -> 18 [label="\\n"]
is ok.
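A minimal sketch of this kind of double escaping, assuming edge labels are emitted one character at a time. The function names 'dot_label_char' and 'dot_edge' are hypothetical and do not come from the re2c sources.

```cpp
#include <string>

// Hypothetical helper: renders one character for use inside a double-quoted
// dot label. The backslash of each escape sequence is itself escaped, so the
// rendered graph shows the two characters '\' and 'n' instead of a line break.
inline std::string dot_label_char(char c)
{
    switch (c) {
        case '\n': return "\\\\n";
        case '\t': return "\\\\t";
        case '\r': return "\\\\r";
        case '\\': return "\\\\";   // a literal backslash in the label
        case '"':  return "\\\"";   // a quote must not terminate the label
        default:   return std::string(1, c);
    }
}

// Hypothetical helper: formats an edge in the same shape as the example above.
inline std::string dot_edge(int from, int to, char c)
{
    return std::to_string(from) + " -> " + std::to_string(to)
        + " [label=\"" + dot_label_char(c) + "\"]";
}
```

Under these assumptions, 'dot_edge(17, 18, '\n')' produces the second (correctly rendered) form of the edge.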
Ulya Trofimovich [Fri, 22 Aug 2014 20:15:11 +0000 (23:15 +0300)]
Alternation of 'RegExp's should preserve 'ins_access' attribute.
When one builds 'AltOp' from two 'RegExp's, one sometimes has to
break these 'RegExp's into pieces in order to merge their common prefix.
In such cases, if one of the original 'RegExp's has 'ins_access'
set to 'PRIVATE', it is lost (defaults to 'SHARED') after alternation.
This commit fixes Gentoo bug https://bugs.gentoo.org/show_bug.cgi?id=518904.
Fixed compile error for freebsd5 (found by Sergei Trofimovich).
Sample error:
parser.y: In function `void re2c::parse(re2c::Scanner&, std::ostream&, std::ostream*)':
parser.y:564: error: `yyparse' undeclared (first use this function)
When re2c encounters an invalid code point (e.g., a surrogate in Unicode),
it acts according to the current encoding policy:
'fail' - fail with error;
'substitute' - silently substitute offending code point with
error code point;
'ignore' - ignore offending code point, consider it valid.
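The three policies can be sketched as a single dispatch function. This is an illustration only: the names 'Policy' and 'apply_policy' are hypothetical, and the use of U+FFFD as the error code point is an assumption not stated in this commit message.

```cpp
#include <cstdint>

enum class Policy { Fail, Substitute, Ignore };

// Assumption: the "error code point" is U+FFFD (REPLACEMENT CHARACTER).
constexpr std::uint32_t ERROR_CODE_POINT = 0xFFFD;

// Unicode surrogates occupy the range U+D800..U+DFFF.
inline bool is_surrogate(std::uint32_t c)
{
    return c >= 0xD800 && c <= 0xDFFF;
}

// Hypothetical helper: applies the policy to code point c. Returns false if
// the policy demands an error; otherwise stores the resulting code point in
// 'out' and returns true.
inline bool apply_policy(std::uint32_t c, Policy p, std::uint32_t &out)
{
    if (!is_surrogate(c)) {
        out = c;                    // valid code point: nothing to do
        return true;
    }
    switch (p) {
        case Policy::Fail:       return false;
        case Policy::Substitute: out = ERROR_CODE_POINT; return true;
        case Policy::Ignore:     out = c; return true;
    }
    return false;                   // not reached for valid Policy values
}
```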
Fail if someone tries to set a non-ASCII encoding
when another non-ASCII encoding is already set.
If an encoding has been set successfully, it is
guaranteed to be valid.
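One way to picture this rule is a setter that refuses conflicting requests. The class 'EncodingSketch' and the enumerator names below are hypothetical; in particular, rejecting a request for ASCII once a non-ASCII encoding is set is an assumption, since the commit message only covers non-ASCII requests.

```cpp
// Hypothetical list of encodings, used for illustration only.
enum class EncType { ASCII, EBCDIC, UCS2, UTF16, UTF32, UTF8 };

class EncodingSketch {
    EncType type_ = EncType::ASCII;   // ASCII is assumed to be the default

public:
    // Returns false and leaves the state unchanged if a different non-ASCII
    // encoding is already set; otherwise stores t and returns true.
    bool set(EncType t)
    {
        if (type_ == t) return true;               // same encoding again: fine
        if (type_ != EncType::ASCII) return false; // conflicting request
        type_ = t;
        return true;
    }

    EncType type() const { return type_; }
};
```

Because every successful call leaves exactly one well-defined encoding in place, code that reads the value afterwards does not need to validate it again.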
Ulya Fokanova [Mon, 20 Jan 2014 16:07:42 +0000 (19:07 +0300)]
Fixed segfault: regexp [^]* with 'bFlag' enabled.
Changed 'unmap' function: spans that must be unmapped
are now always bound to the upper adjacent span (if such a span exists).
The previous version of 'unmap' bound unmapped spans
either to the upper or to the lower adjacent span. Because of this change,
some tests need to be updated.
Ulya Fokanova [Fri, 10 Jan 2014 15:29:12 +0000 (18:29 +0300)]
Replaced 'echo' with 'printf' in generation of switches for test.
'echo' doesn't distinguish between options and arguments,
so in cases like 'echo "-e"' it outputs an empty string. This
results in dropping re2c's "-e" flag in tests with names
like "test.e.re".
Ulya Fokanova [Wed, 8 Jan 2014 09:40:44 +0000 (12:40 +0300)]
Moved all encoding-related stuff to a separate class.
This class stores only encoding flags.
Everything else (number of code points,
number of different characters in input
stream, size of symbol, size of character,
etc.) is generated on the fly.
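A rough sketch of this design, in which the only stored state is the encoding itself and derived quantities are computed on request. The type 'EncodingInfo', its member names and the exact values returned are assumptions for illustration, not the actual re2c class.

```cpp
#include <cstdint>

// Hypothetical list of encodings, used for illustration only.
enum class EncType { ASCII, EBCDIC, UCS2, UTF16, UTF32, UTF8 };

struct EncodingInfo {
    EncType type;   // the only stored state

    // Size of one code unit in bytes, derived from the encoding on demand.
    std::uint32_t code_unit_size() const
    {
        switch (type) {
            case EncType::UCS2:
            case EncType::UTF16: return 2;
            case EncType::UTF32: return 4;
            default:             return 1;   // ASCII, EBCDIC, UTF8
        }
    }

    // Number of code points, derived from the encoding on demand.
    std::uint32_t code_point_count() const
    {
        switch (type) {
            case EncType::ASCII:
            case EncType::EBCDIC: return 0x100;    // assumption: all 8-bit values
            case EncType::UCS2:   return 0x10000;
            default:              return 0x110000; // full Unicode range
        }
    }
};
```

Keeping derived values out of the stored state means they cannot fall out of sync with the encoding flag.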