\ escape (enable or disable meta character meaning)
| alternation
(...) group
- [...] character class
+ [...] character class
2. Characters
\w word character
Not Unicode:
- alphanumeric, "_" and multibyte char.
+ alphanumeric, "_" and multibyte char.
Unicode:
General_Category -- (Letter|Mark|Number|Connector_Punctuation)
\t, \n, \v, \f, \r, \x20
Unicode:
- 0009, 000A, 000B, 000C, 000D, 0085(NEL),
+ 0009, 000A, 000B, 000C, 000D, 0085(NEL),
General_Category -- Line_Separator
-- Paragraph_Separator
-- Space_Separator
+ works on all encodings
Alnum, Alpha, Blank, Cntrl, Digit, Graph, Lower,
- Print, Punct, Space, Upper, XDigit, Word, ASCII,
+ Print, Punct, Space, Upper, XDigit, Word, ASCII
+ works on EUC_JP, Shift_JIS
Hiragana, Katakana
?? 1 or 0 times
*? 0 or more times
+? 1 or more times
- {n,m}? at least n but not more than m times
+ {n,m}? at least n but not more than m times
{n,}? at least n times
{,n}? at least 0 but not more than n times (== {0,n}?)
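The reluctant quantifiers listed above can be tried in Ruby, whose regexp literals this document already uses in its examples. This is only a sketch: it assumes a Ruby build whose regexp engine follows this syntax (Ruby's engine is derived from Oniguruma), and `first_match` is a hypothetical helper, not part of any library.

```ruby
# Hypothetical helper: return the first substring matched by pattern, or nil.
def first_match(pattern, text)
  text[pattern]
end

p first_match(/a{1,3}/,  "aaaa")    # greedy: takes as many "a" as allowed
p first_match(/a{1,3}?/, "aaaa")    # reluctant: stops at the minimum count
p first_match(/<.+>/,    "<a><b>")  # greedy: runs on to the last ">"
p first_match(/<.+?>/,   "<a><b>")  # reluctant: stops at the first ">"
p first_match(/a{,2}?/,  "aaaa")    # {,n}? is allowed to match zero times
```

A reluctant quantifier only changes which match is preferred; it does not change whether the overall pattern can match.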
\A beginning of string
\Z end of string, or before newline at the end
\z end of string
- \G matching start position
+ \G matching start position
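The difference between the anchors above is easiest to see on a string that ends with a newline. The following is a minimal sketch in Ruby, assuming its regexp engine follows this syntax; `match_index` is a hypothetical helper introduced only for illustration.

```ruby
# Hypothetical helper: index of the first match, or nil when nothing matches.
def match_index(pattern, text)
  text =~ pattern
end

line = "abc\n"
p match_index(/\Aabc/, line)  # \A anchors at the beginning of the string
p match_index(/c\Z/, line)    # \Z also matches before a newline at the end
p match_index(/c\z/, line)    # \z matches only at the very end of the string

# \G anchors each attempt at the position where that attempt started, so
# scan stops as soon as the run of matches is interrupted.
p "aab aa".scan(/\Ga/)
```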
6. Character class
x-y range from x to y
[...] set (character class in character class)
..&&.. intersection (low precedence at the next of ^)
-
+
ex. [a-w&&[^c-g]z] ==> ([a-w] AND ([^c-g] OR z)) ==> [abh-w]
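The expansion in this example can be checked mechanically by comparing both classes over every lowercase ASCII letter. A small Ruby sketch, assuming a Ruby whose engine implements `&&` as described here; the constant and method names are hypothetical.

```ruby
# The class from the example and the class it is claimed to be equal to.
INTERSECTION_CLASS = /[a-w&&[^c-g]z]/
EXPANDED_CLASS     = /[abh-w]/

# Letters on which the two classes disagree (expected to be none).
def class_mismatches
  ("a".."z").reject do |ch|
    INTERSECTION_CLASS.match?(ch) == EXPANDED_CLASS.match?(ch)
  end
end

p class_mismatches
p INTERSECTION_CLASS.match?("z")  # "z" is outside [a-w], so it is excluded
```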
* If you want to use '[', '-', ']' as a normal character
alternatives only.
ex. (?<=a|bc) is OK. (?<=aaa(?:b|cd)) is not allowed.
- In negative-look-behind, captured group isn't allowed,
+ In negative-look-behind, captured group isn't allowed,
but shy group(?:) is allowed.
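The permitted form, a look-behind whose top-level alternatives have different lengths, can be exercised as follows. This is a sketch in Ruby under the assumption that its engine applies the same rule; `LOOK_BEHIND` is a name chosen only for this illustration.

```ruby
# Top-level alternatives "a" and "bc" have different lengths, which is allowed.
LOOK_BEHIND = /(?<=a|bc)x/

p("ax"  =~ LOOK_BEHIND)  # "x" preceded by "a"
p("bcx" =~ LOOK_BEHIND)  # "x" preceded by "bc"
p("cx"  =~ LOOK_BEHIND)  # "c" alone satisfies neither alternative
```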
(?>subexp) atomic group
a subexp with a large number is referred to preferentially.
(When not matched, a group of the small number is referred to.)
- * Back reference by group number is forbidden if named group is defined
- in the pattern and ONIG_OPTION_CAPTURE_GROUP is not setted.
+ * Back reference by group number is forbidden if named group is defined
+ in the pattern and ONIG_OPTION_CAPTURE_GROUP is not set.
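The restriction can be observed from Ruby, which does not expose ONIG_OPTION_CAPTURE_GROUP; the sketch below therefore assumes that option is off, as in ordinary Ruby patterns. `compiles?` is a hypothetical helper that reports whether a pattern source is accepted.

```ruby
# Hypothetical helper: true if the pattern source compiles, false otherwise.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

p compiles?('(a)\1')         # numbered back reference, no named group
p compiles?('(?<x>a)\k<x>')  # named group referenced by name
p compiles?('(?<x>a)\1')     # named group referenced by number
```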
back reference with nest level
\k'name+level'
\k'name-level'
- Destinate relative nest level from back reference position.
+ Designate the nest level relative to the back reference position.
ex 1.
(?<name>a|b\g<name>c) => OK
* Call by group number is forbidden if named group is defined in the pattern
- and ONIG_OPTION_CAPTURE_GROUP is not setted.
+ and ONIG_OPTION_CAPTURE_GROUP is not set.
* If the option status of called group is different from calling position
then the group's option is effective.
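The recursive call in ex 1 matches an "a" wrapped in any number of balanced b...c pairs. A brief Ruby sketch, assuming its engine supports `\g<name>` as described; the anchors `\A` and `\z` are added here only so that partial matches do not hide failures.

```ruby
# The pattern from ex 1, anchored at both ends for the purpose of this check.
NESTED = /\A(?<name>a|b\g<name>c)\z/

p NESTED.match?("a")      # base case
p NESTED.match?("bac")    # one level of nesting
p NESTED.match?("bbacc")  # two levels of nesting
p NESTED.match?("bbac")   # unbalanced: one "c" is missing
```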
+ add operations in character class. [], &&
('[' must be escaped as an usual char in character class.)
+ add named group and subexp call.
- + octal or hexadecimal number sequence can be treated as
+ + octal or hexadecimal number sequence can be treated as
a multibyte code char in character class if multibyte encoding
is specified.
(ex. [\xa1\xa2], [\xa1\xa7-\xa4\xa1])
ex. (?:(?i)a|b) is interpreted as (?:(?i:a|b)), not (?:(?i:a)|b).
+ isolated option is not transparent to previous pattern.
ex. a(?i)* is a syntax error pattern.
- + allowed incompleted left brace as an usual string.
+ + allowed incomplete left brace as a usual string.
ex. /{/, /({)/, /a{2,3/ etc...
+ negative POSIX bracket [:^xxxx:] is supported.
+ POSIX bracket [:ascii:] is added.
ex. /\x61/i =~ "A"
+ In the range quantifier, the number of the minimum is omissible.
/a{,n}/ == /a{0,n}/
- The simultanious abbreviation of the number of times of the minimum
+ The simultaneous abbreviation of the number of times of the minimum
and the maximum is not allowed. (/a{,}/)
+ /a{n}?/ is not a non-greedy operator.
/a{n}?/ == /(?:a{n})?/
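Both range-quantifier notes can be checked with short patterns. The sketch below is written for Ruby and assumes its engine keeps this Ruby-syntax behaviour; the constant names are hypothetical.

```ruby
# An omitted minimum is read as zero.
p "aaa"[/a{,2}/]
p "aaa"[/a{0,2}/]

# If {n}? were a reluctant "exactly n", the empty string could not match.
# Read as (?:a{n})?, the whole repetition is optional, so it does.
FIXED_THEN_QMARK = /\Aa{2}?\z/
EXPLICIT_GROUP   = /\A(?:a{2})?\z/

["", "a", "aa"].each do |s|
  p [s, FIXED_THEN_QMARK.match?(s), EXPLICIT_GROUP.match?(s)]
end
```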