3 Oniguruma ---- (C) K.Kosako <kkosako0@gmail.com>
5 https://github.com/kkos/oniguruma
7 Oniguruma is a regular expressions library.
8 A characteristic of this library is that a different character encoding
9 can be specified for each regular expression object.
11 Supported character encodings:
13 ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
14 EUC-JP, EUC-TW, EUC-KR, EUC-CN,
15 Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
16 ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
17 ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
18 ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16
20 * GB18030: contributed by KUBO Takehiro
21 * CP1251: contributed by Byte
22 ------------------------------------------------------------
31 Case 1: Unix and Cygwin platform
33 1. autoreconf -vfi (* only needed if the configure script is not found.)
48 onig-config --exec-prefix
52 Case 2: Windows 64/32bit platform (Visual Studio)
54 execute make_win64 or make_win32
56 src/onig_s.lib: static link library
57 src/onig.dll: dynamic link library
59 * test (ASCII/Shift_JIS)
61 2. copy ..\windows\testc.c .
62 3. nmake -f Makefile.windows ctest
64 (Checked with Visual Studio Community 2015.)
70 See doc/RE (or doc/RE.ja for Japanese).
75 Include oniguruma.h in your program. (Oniguruma API)
76 See doc/API for Oniguruma API.
78 If you want to disable the UChar type (== unsigned char) definition
79 in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
include oniguruma.h.
82 If you want to disable regex_t type definition in oniguruma.h,
83 define ONIG_ESCAPE_REGEX_T_COLLISION and then include oniguruma.h.
85 Example of a compile/link command line on Unix or Cygwin
86 (when prefix == /usr/local):
88 cc sample.c -L/usr/local/lib -lonig
91 If you want to use the static link library (onig_s.lib) on Win32,
92 add the option -DONIG_EXTERN=extern to the C compiler.
98 sample/simple.c minimal example (Oniguruma API)
99 sample/names.c example of the named group callback.
100 sample/encode.c example of some encodings.
101 sample/listcap.c example of the capture history.
102 sample/posix.c POSIX API sample.
103 sample/sql.c example of the variable meta characters.
104 (SQL-like pattern matching)
107 sample/syntax.c Perl, Java and ASIS syntax test.
108 sample/crnl.c --enable-crnl-as-line-terminator test
113 oniguruma.h Oniguruma API header file. (public)
114 onig-config.in configuration check program template.
116 regenc.h character encodings framework header file.
117 regint.h internal definitions
118 regparse.h internal definitions for regparse.c and regcomp.c
119 regcomp.c compiling and optimization functions
120 regenc.c character encodings framework.
121 regerror.c error message function
122 regext.c extended API functions. (deluxe version API)
123 regexec.c search and match functions
124 regparse.c parsing functions.
125 regsyntax.c pattern syntax functions and built-in syntax definitions.
126 regtrav.c capture history tree data traverse functions.
127 regversion.c version info function.
128 st.h hash table functions header file
129 st.c hash table functions
131 oniggnu.h GNU regex API header file. (public)
132 reggnu.c GNU regex API functions
134 onigposix.h POSIX API header file. (public)
135 regposerr.c POSIX error message function.
136 regposix.c POSIX API functions.
138 mktable.c character type table generator.
139 ascii.c ASCII encoding.
140 euc_jp.c EUC-JP encoding.
141 euc_tw.c EUC-TW encoding.
142 euc_kr.c EUC-KR, EUC-CN encoding.
143 sjis.c Shift_JIS encoding.
144 big5.c Big5 encoding.
145 gb18030.c GB18030 encoding.
146 koi8.c KOI8 encoding.
147 koi8_r.c KOI8-R encoding.
148 cp1251.c CP1251 encoding.
149 iso8859_1.c ISO-8859-1 encoding. (Latin-1)
150 iso8859_2.c ISO-8859-2 encoding. (Latin-2)
151 iso8859_3.c ISO-8859-3 encoding. (Latin-3)
152 iso8859_4.c ISO-8859-4 encoding. (Latin-4)
153 iso8859_5.c ISO-8859-5 encoding. (Cyrillic)
154 iso8859_6.c ISO-8859-6 encoding. (Arabic)
155 iso8859_7.c ISO-8859-7 encoding. (Greek)
156 iso8859_8.c ISO-8859-8 encoding. (Hebrew)
157 iso8859_9.c ISO-8859-9 encoding. (Latin-5 or Turkish)
158 iso8859_10.c ISO-8859-10 encoding. (Latin-6 or Nordic)
159 iso8859_11.c ISO-8859-11 encoding. (Thai)
160 iso8859_13.c ISO-8859-13 encoding. (Latin-7 or Baltic Rim)
161 iso8859_14.c ISO-8859-14 encoding. (Latin-8 or Celtic)
162 iso8859_15.c ISO-8859-15 encoding. (Latin-9 or West European with Euro)
163 iso8859_16.c ISO-8859-16 encoding.
164 (Latin-10 or South-Eastern European with Euro)
165 utf8.c UTF-8 encoding.
166 utf16_be.c UTF-16BE encoding.
167 utf16_le.c UTF-16LE encoding.
168 utf32_be.c UTF-32BE encoding.
169 utf32_le.c UTF-32LE encoding.
170 unicode.c Unicode information data.
172 win32/Makefile Makefile for Win32 (VC++)
173 win32/config.h config.h for Win32
179 ? case fold flag: Katakana <-> Hiragana.
180 ? add ONIG_OPTION_NOTBOS/NOTEOS. (\A, \z, \Z)
182 ?? implement syntax behavior ONIG_SYN_CONTEXT_INDEP_ANCHORS.
183 ?? transmission stopper. (return ONIG_STOP from match_at())
185 and I'm thankful to Akinori MUSHA.
188 Mail Address: K.Kosako <kkosako0@gmail.com>