Oniguruma  ----   (C) K.Kosako <kkosako0@gmail.com>

https://github.com/kkos/oniguruma

Oniguruma is a regular expression library.
Its distinguishing feature is that a different character encoding
can be specified for each regular expression object.
Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB18030: contributed by KUBO Takehiro
* CP1251:  contributed by Byte
------------------------------------------------------------
 Case 1: Unix and Cygwin platform

     onig-config --exec-prefix

 Case 2: Win32 platform (VC++)

   1. copy win32\Makefile src\Makefile
   2. copy win32\config.h src\config.h

      onig_s.lib:  static link library
      onig.dll:    dynamic link library

   * test (ASCII/Shift_JIS)
      1. copy win32\testc.c src\testc.c
  For the regular expression syntax, see doc/RE (or doc/RE.ja for Japanese).

  Include oniguruma.h in your program. (Oniguruma API)
  See doc/API for details of the Oniguruma API.
  If you want to disable the UChar type (== unsigned char) definition
  in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then
  include oniguruma.h.

  If you want to disable the regex_t type definition in oniguruma.h,
  define ONIG_ESCAPE_REGEX_T_COLLISION and then include oniguruma.h.
  Example compile/link command line on Unix or Cygwin
  (with prefix == /usr/local):

    cc sample.c -L/usr/local/lib -lonig

  If you want to use the static link library (onig_s.lib) on Win32,
  add the option -DONIG_EXTERN=extern to the C compiler.
  sample/simple.c    minimal example (Oniguruma API)
  sample/names.c     example of the named group callback.
  sample/encode.c    example of some encodings.
  sample/listcap.c   example of the capture history.
  sample/posix.c     POSIX API sample.
  sample/sql.c       example of the variable meta characters.
                     (SQL-like pattern matching)

  sample/syntax.c    Perl, Java and ASIS syntax test.
  sample/crnl.c      --enable-crnl-as-line-terminator test
  oniguruma.h        Oniguruma API header file. (public)
  onig-config.in     configuration check program template.

  regenc.h           character encodings framework header file.
  regint.h           internal definitions
  regparse.h         internal definitions for regparse.c and regcomp.c
  regcomp.c          compiling and optimization functions
  regenc.c           character encodings framework.
  regerror.c         error message function
  regext.c           extended API functions. (deluxe version API)
  regexec.c          search and match functions
  regparse.c         parsing functions.
  regsyntax.c        pattern syntax functions and built-in syntax definitions.
  regtrav.c          capture history tree data traverse functions.
  regversion.c       version info function.
  st.h               hash table functions header file
  st.c               hash table functions

  oniggnu.h          GNU regex API header file. (public)
  reggnu.c           GNU regex API functions

  onigposix.h        POSIX API header file. (public)
  regposerr.c        POSIX error message function.
  regposix.c         POSIX API functions.

  mktable.c          character type table generator.
  ascii.c            ASCII encoding.
  euc_jp.c           EUC-JP encoding.
  euc_tw.c           EUC-TW encoding.
  euc_kr.c           EUC-KR, EUC-CN encoding.
  sjis.c             Shift_JIS encoding.
  big5.c             Big5 encoding.
  gb18030.c          GB18030 encoding.
  koi8.c             KOI8 encoding.
  koi8_r.c           KOI8-R encoding.
  cp1251.c           CP1251 encoding.
  iso8859_1.c        ISO-8859-1 encoding. (Latin-1)
  iso8859_2.c        ISO-8859-2 encoding. (Latin-2)
  iso8859_3.c        ISO-8859-3 encoding. (Latin-3)
  iso8859_4.c        ISO-8859-4 encoding. (Latin-4)
  iso8859_5.c        ISO-8859-5 encoding. (Cyrillic)
  iso8859_6.c        ISO-8859-6 encoding. (Arabic)
  iso8859_7.c        ISO-8859-7 encoding. (Greek)
  iso8859_8.c        ISO-8859-8 encoding. (Hebrew)
  iso8859_9.c        ISO-8859-9 encoding. (Latin-5 or Turkish)
  iso8859_10.c       ISO-8859-10 encoding. (Latin-6 or Nordic)
  iso8859_11.c       ISO-8859-11 encoding. (Thai)
  iso8859_13.c       ISO-8859-13 encoding. (Latin-7 or Baltic Rim)
  iso8859_14.c       ISO-8859-14 encoding. (Latin-8 or Celtic)
  iso8859_15.c       ISO-8859-15 encoding. (Latin-9 or West European with Euro)
  iso8859_16.c       ISO-8859-16 encoding.
                     (Latin-10 or South-Eastern European with Euro)
  utf8.c             UTF-8 encoding.
  utf16_be.c         UTF-16BE encoding.
  utf16_le.c         UTF-16LE encoding.
  utf32_be.c         UTF-32BE encoding.
  utf32_le.c         UTF-32LE encoding.
  unicode.c          Unicode information data.

  win32/Makefile     Makefile for Win32 (VC++)
  win32/config.h     config.h for Win32
  ? case fold flag: Katakana <-> Hiragana.
  ? add ONIG_OPTION_NOTBOS/NOTEOS. (\A, \z, \Z)

  ?? implement syntax behavior ONIG_SYN_CONTEXT_INDEP_ANCHORS.
  ?? transmission stopper. (return ONIG_STOP from match_at())

and I'm thankful to Akinori MUSHA.

Mail Address: K.Kosako <kkosako0@gmail.com>