https://github.com/kkos/oniguruma

--------------------------
**CVE-2017-9224, CVE-2017-9225, CVE-2017-9226**
**CVE-2017-9227, CVE-2017-9228, CVE-2017-9229**

Oniguruma is a modern and flexible regular expressions library. It
encompasses features from different regular expression implementations
that traditionally exist in different languages. It comes close to
being a complete superset of all regular expression features found
in other regular expression implementations.

* Character encoding can be specified per regular expression object.
* Several regular expression types are supported:

Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB18030: contributed by KUBO Takehiro
* CP1251: contributed by Byte

New feature of version 6.8.0
--------------------------

* NEW API: onig_search_with_param(), onig_match_with_param()
* NEW: Callouts of contents (?{....}) (?{{....}})
* NEW: Callouts of name (*name) (*name\[tag](a,b...))
* NEW: Builtin callout functions (*FAIL) (*SUCCESS) (*ABORT) (*ERROR(n)) (*COUNT)
* Retry-limit-in-match function enabled by default
* NEW: configure option --enable-posix-api=no (* enabled by default)

(* Callout functions are experimental and currently undocumented)

New feature of version 6.7.1
--------------------------

* NEW: Mechanism of retry-limit-in-match (* disabled by default)


New feature of version 6.7.0
--------------------------

* NEW: hexadecimal codepoint \uHHHH
* NEW: add ONIG_SYNTAX_ONIGURUMA (== ONIG_SYNTAX_DEFAULT)
* Disabled \N and \O on ONIG_SYNTAX_RUBY
* Reduced size of object file


New feature of version 6.6.0
--------------------------

* NEW: ASCII only mode options for character type/property (?WDSP)
* NEW: Extended Grapheme Cluster boundary \y, \Y (*original)
* NEW: Extended Grapheme Cluster \X
* Range-clear (Absent-clear) operator restores previous range in retractions.


New feature of version 6.5.0
--------------------------

* NEW: \R (general newline) \N (no newline)
* NEW: \O (true anychar)
* NEW: if-then-else (?(...)...\|...)
* NEW: Backreference validity checker (?(xxx)) (*original)
* NEW: Absent repeater (?~absent) \[is equal to (?\~\|absent|\O*)]
* NEW: Absent expression (?~|absent|expr) (*original)
* NEW: Absent stopper (?~|absent) (*original)


New feature of version 6.4.0
--------------------------

* Fix fatal problem of endless repeat on Windows
* NEW: call zero (call the total regexp) \g<0>
* NEW: relative backref/call by positive number \k<+n>, \g<+n>


New feature of version 6.3.0
--------------------------

* NEW: octal codepoint \o{.....}


New feature of version 6.1.2
--------------------------

* Allow word boundary, word begin and word end in look-behind.
* NEW option: ONIG_OPTION_CHECK_VALIDITY_OF_STRING


New feature of version 6.1
--------------------------

* NEW API: onig_scan()


New feature of version 6.0
--------------------------

* Update Unicode 8.0 Property/Case-folding
* NEW API: onig_unicode_define_user_property()


### Case 1: Unix and Cygwin platform

   1. autoreconf -vfi   (* only if the configure script is not found)

   * configuration check

         onig-config --exec-prefix

### Case 2: Windows 64/32bit platform (Visual Studio)

   Execute make_win64 or make_win32.

      onig_s.lib:  static link library
      onig.dll:    dynamic link library

   * test (ASCII/Shift_JIS)

      2. copy ..\windows\testc.c .
      3. nmake -f Makefile.windows ctest

   (Tested with Visual Studio Community 2015)


See [doc/RE](doc/RE) for the regular expression syntax, or [doc/RE.ja](doc/RE.ja) for the Japanese version.

Include oniguruma.h in your program. (Oniguruma API)
See doc/API for Oniguruma API.

If you want to disable UChar type (== unsigned char) definition
in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
include oniguruma.h.

If you want to disable regex_t type definition in oniguruma.h,
define ONIG_ESCAPE_REGEX_T_COLLISION and then include oniguruma.h.


Example of the compile/link command line on Unix or Cygwin
(assuming prefix == /usr/local):

    cc sample.c -L/usr/local/lib -lonig

If you want to use the static link library (onig_s.lib) on Win32,
add the option -DONIG_EXTERN=extern to the C compiler.


|File                  |Description                               |
|:---------------------|:-----------------------------------------|
|sample/simple.c       |example of the minimum (Oniguruma API)    |
|sample/names.c        |example of the named group callback.      |
|sample/encode.c       |example of some encodings.                |
|sample/listcap.c      |example of the capture history.           |
|sample/posix.c        |POSIX API sample.                         |
|sample/scan.c         |example of using onig_scan().             |
|sample/sql.c          |example of the variable meta characters.  |
|sample/user_property.c|example of user defined Unicode property. |
|sample/callout.c      |example of callouts.                      |


|File               |Description                            |
|:------------------|:--------------------------------------|
|sample/syntax.c    |Perl, Java and ASIS syntax test.       |
|sample/crnl.c      |--enable-crnl-as-line-terminator test  |


|File               |Description                                             |
|:------------------|:-------------------------------------------------------|
|oniguruma.h        |Oniguruma API header file (public)                      |
|onig-config.in     |configuration check program template                    |
|regenc.h           |character encodings framework header file               |
|regint.h           |internal definitions                                    |
|regparse.h         |internal definitions for regparse.c and regcomp.c       |
|regcomp.c          |compiling and optimization functions                    |
|regenc.c           |character encodings framework                           |
|regerror.c         |error message function                                  |
|regext.c           |extended API functions (deluxe version API)             |
|regexec.c          |search and match functions                              |
|regparse.c         |parsing functions.                                      |
|regsyntax.c        |pattern syntax functions and built-in syntax definitions|
|regtrav.c          |capture history tree data traverse functions            |
|regversion.c       |version info function                                   |
|st.h               |hash table functions header file                        |
|st.c               |hash table functions                                    |
|oniggnu.h          |GNU regex API header file (public)                      |
|reggnu.c           |GNU regex API functions                                 |
|onigposix.h        |POSIX API header file (public)                          |
|regposerr.c        |POSIX error message function                            |
|regposix.c         |POSIX API functions                                     |
|mktable.c          |character type table generator                          |
|ascii.c            |ASCII encoding                                          |
|euc_jp.c           |EUC-JP encoding                                         |
|euc_tw.c           |EUC-TW encoding                                         |
|euc_kr.c           |EUC-KR, EUC-CN encoding                                 |
|sjis.c             |Shift_JIS encoding                                      |
|big5.c             |Big5 encoding                                           |
|gb18030.c          |GB18030 encoding                                        |
|koi8.c             |KOI8 encoding                                           |
|koi8_r.c           |KOI8-R encoding                                         |
|cp1251.c           |CP1251 encoding                                         |
|iso8859_1.c        |ISO-8859-1 (Latin-1)                                    |
|iso8859_2.c        |ISO-8859-2 (Latin-2)                                    |
|iso8859_3.c        |ISO-8859-3 (Latin-3)                                    |
|iso8859_4.c        |ISO-8859-4 (Latin-4)                                    |
|iso8859_5.c        |ISO-8859-5 (Cyrillic)                                   |
|iso8859_6.c        |ISO-8859-6 (Arabic)                                     |
|iso8859_7.c        |ISO-8859-7 (Greek)                                      |
|iso8859_8.c        |ISO-8859-8 (Hebrew)                                     |
|iso8859_9.c        |ISO-8859-9 (Latin-5 or Turkish)                         |
|iso8859_10.c       |ISO-8859-10 (Latin-6 or Nordic)                         |
|iso8859_11.c       |ISO-8859-11 (Thai)                                      |
|iso8859_13.c       |ISO-8859-13 (Latin-7 or Baltic Rim)                     |
|iso8859_14.c       |ISO-8859-14 (Latin-8 or Celtic)                         |
|iso8859_15.c       |ISO-8859-15 (Latin-9 or West European with Euro)        |
|iso8859_16.c       |ISO-8859-16 (Latin-10)                                  |
|utf8.c             |UTF-8 encoding                                          |
|utf16_be.c         |UTF-16BE encoding                                       |
|utf16_le.c         |UTF-16LE encoding                                       |
|utf32_be.c         |UTF-32BE encoding                                       |
|utf32_le.c         |UTF-32LE encoding                                       |
|unicode.c          |common codes of Unicode encoding                        |
|unicode_fold_data.c|Unicode folding data                                    |
|win32/Makefile     |Makefile for Win32 (VC++)                               |
|win32/config.h     |config.h for Win32                                      |