ChangeLog for PCRE
------------------
-Version 7.2 13-June-07
+Version 7.2 19-June-07
---------------------
1. If the fr_FR locale cannot be found for test 3, try the "french" locale,
pcrecpp::RE("a*?").FullMatch("aaa") does not, and
pcrecpp::RE("a*?\\z").FullMatch("aaa") does again.
+12. If \p or \P was used in non-UTF-8 mode on a character greater than 127
+ it matched the wrong number of bytes.
+
Version 7.1 24-Apr-07
---------------------
------------------------
-Release 7.2 13-Jun-07
+Release 7.2 19-Jun-07
---------------------
WARNING: saved patterns that were compiled by earlier versions of PCRE must be
#define PACKAGE_NAME "PCRE"
/* Define to the full name and version of this package. */
-#define PACKAGE_STRING "PCRE 7.2-RC3"
+#define PACKAGE_STRING "PCRE 7.2"
/* Define to the one symbol short name of this package. */
#define PACKAGE_TARNAME "pcre"
/* Define to the version of this package. */
-#define PACKAGE_VERSION "7.2-RC3"
+#define PACKAGE_VERSION "7.2"
/* If you are compiling for a system other than a Unix-like system or
/* Version number of package */
#ifndef VERSION
-#define VERSION "7.2-RC3"
+#define VERSION "7.2"
#endif
/* Define to empty if `const' does not conform to ANSI C. */
Unicode character properties
When PCRE is built with Unicode character property support, three addi-
- tional escape sequences to match character properties are available
- when UTF-8 mode is selected. They are:
+ tional escape sequences that match characters with specific properties
+ are available. When not in UTF-8 mode, these sequences are of course
+ limited to testing characters whose codepoints are less than 256, but
+ they do work in this mode. The extra escape sequences are:
\p{xx} a character with the xx property
\P{xx} a character without the xx property
That is, it matches a character without the "mark" property, followed
by zero or more characters with the "mark" property, and treats the
sequence as an atomic group (see below). Characters with the "mark"
- property are typically accents that affect the preceding character.
+ property are typically accents that affect the preceding character.
+ None of them have codepoints less than 256, so in non-UTF-8 mode \X
+ matches any one character.
Matching characters by Unicode property is not fast, because PCRE has
to search a structure that contains data for over fifteen thousand
REVISION
- Last updated: 13 June 2007
+ Last updated: 19 June 2007
Copyright (c) 1997-2007 University of Cambridge.
------------------------------------------------------------------------------
#define PCRE_MAJOR 7
#define PCRE_MINOR 2
-#define PCRE_PRERELEASE -RC3
-#define PCRE_DATE 2007-06-13
+#define PCRE_PRERELEASE
+#define PCRE_DATE 2007-06-19
/* When an application links to a PCRE DLL in Windows, the symbols that are
imported have to be identified as such. When building PCRE, the appropriate
for (i = 1; i <= min; i++)
{
if (eptr >= md->end_subject) RRETURN(MATCH_NOMATCH);
- GETCHARINC(c, eptr);
+ GETCHARINCTEST(c, eptr);
}
break;
for (i = 1; i <= min; i++)
{
if (eptr >= md->end_subject) RRETURN(MATCH_NOMATCH);
- GETCHARINC(c, eptr);
+ GETCHARINCTEST(c, eptr);
prop_category = _pcre_ucp_findprop(c, &prop_chartype, &prop_script);
if ((prop_chartype == ucp_Lu ||
prop_chartype == ucp_Ll ||
for (i = 1; i <= min; i++)
{
if (eptr >= md->end_subject) RRETURN(MATCH_NOMATCH);
- GETCHARINC(c, eptr);
+ GETCHARINCTEST(c, eptr);
prop_category = _pcre_ucp_findprop(c, &prop_chartype, &prop_script);
if ((prop_category == prop_value) == prop_fail_result)
RRETURN(MATCH_NOMATCH);
for (i = 1; i <= min; i++)
{
if (eptr >= md->end_subject) RRETURN(MATCH_NOMATCH);
- GETCHARINC(c, eptr);
+ GETCHARINCTEST(c, eptr);
prop_category = _pcre_ucp_findprop(c, &prop_chartype, &prop_script);
if ((prop_chartype == prop_value) == prop_fail_result)
RRETURN(MATCH_NOMATCH);
for (i = 1; i <= min; i++)
{
if (eptr >= md->end_subject) RRETURN(MATCH_NOMATCH);
- GETCHARINC(c, eptr);
+ GETCHARINCTEST(c, eptr);
prop_category = _pcre_ucp_findprop(c, &prop_chartype, &prop_script);
if ((prop_script == prop_value) == prop_fail_result)
RRETURN(MATCH_NOMATCH);
/^\x{023a}+([^X])/8i
\x{023a}\x{2c65}X
+
+/Check property support in non-UTF-8 mode/
+/\p{L}{4}/
+ 123abcdefg
+ 123abc\xc4\xc5zz
+
/ End of testinput6 /
/^\x{023a}+([^X])/8i
\x{023a}\x{2c65}X
+/Check property support in non-UTF-8 mode/
+
+/\p{L}{4}/
+ 123abcdefg
+ 123abc\xc4\xc5zz
+
/ End /
\x{023a}\x{2c65}X
0: \x{23a}\x{2c65}
1: \x{2c65}
+
+/Check property support in non-UTF-8 mode/
+/\p{L}{4}/
+ 123abcdefg
+ 0: abcd
+ 123abc\xc4\xc5zz
+ 0: abc\xc4
+
/ End of testinput6 /
\x{023a}\x{2c65}X
0: \x{23a}\x{2c65}
+/Check property support in non-UTF-8 mode/
+
+/\p{L}{4}/
+ 123abcdefg
+ 0: abcd
+ 123abc\xc4\xc5zz
+ 0: abc\xc4
+
/ End /
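
Note on the GETCHARINC to GETCHARINCTEST change and the new test: the sketch below is not PCRE code. It is a minimal illustration, assuming (as ChangeLog item 12 suggests) that the old macro always decoded a multibyte UTF-8 sequence while the new one first checks whether UTF-8 mode is active. The helper names `char_length` and `span_length` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: number of bytes consumed when
   reading one character at p. With utf8 == 0 every character is one byte.
   With utf8 != 0 the length is taken from the UTF-8 lead byte. */
static size_t char_length(const unsigned char *p, int utf8)
{
  assert(p != NULL);
  if (!utf8 || *p < 0xc0) return 1;   /* single byte (or byte mode) */
  if (*p < 0xe0) return 2;            /* 110xxxxx lead byte */
  if (*p < 0xf0) return 3;            /* 1110xxxx lead byte */
  return 4;                           /* longer forms, simplified to 4 */
}

/* Hypothetical helper: total bytes occupied by `count` characters at p.
   The caller must make sure the buffer is long enough. */
static size_t span_length(const unsigned char *p, size_t count, int utf8)
{
  size_t total = 0;
  while (count-- > 0)
    total += char_length(p + total, utf8);
  return total;
}
```

For the test subject bytes `abc\xc4\xc5zz`, four characters span four bytes when the utf8 flag is tested and found off, which corresponds to the expected match `abc\xc4`. Decoding unconditionally treats `\xc4` as a two-byte lead and spans five bytes, which is the "wrong number of bytes" symptom described in the ChangeLog.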