From: helly
Date: Sun, 26 Jun 2005 12:45:32 +0000 (+0000)
Subject: - Add MSVC.NET build files
X-Git-Tag: 0.13.6~631
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=a06f9e9d6f6ac74c6b8dd446f1aceba479c79dde;p=re2c

- Add MSVC.NET build files
- Add html documents
---

diff --git a/.cvsignore b/.cvsignore
index 1e445e38..9f35b76d 100644
--- a/.cvsignore
+++ b/.cvsignore
@@ -37,3 +37,6 @@ version.h
 *.spec
 .#*
 run_tests.sh
+Release
+re2c.ncb
+re2c.suo
diff --git a/htdocs/index.html b/htdocs/index.html
new file mode 100755
index 00000000..18682a2e
--- /dev/null
+++ b/htdocs/index.html
@@ -0,0 +1,265 @@
+re2c Home

re2c

+re2c is a great tool and has been unmaintained for quite some time, +and in fact doesn't even compile with recent versions of gcc. I've used +re2c in a few of my projects (e.g. OpenWBEM), and I have an interest in lexers +(I wrote a dynamic lexer called slex as an example for the Spirit parser +framework (see http://spirit.sf.net/)). +I didn't want to see it suffer bit-rot as there are a few bugs which need to be +fixed as well as some new features that would be nice to add. So, after +trying to contact either Peter Bumbulis or Brian Young, I decided to adopt the +project and use SourceForge.net to host it.
+
+I very much welcome anyone who would like to contribute to the project, either +as a developer with CVS access or by simply sending patches, bug reports, or +suggestions for improvement. I have created a mailing list: re2c-general +at lists dot sourceforge dot net which should be used for all communication +about re2c.
+
+Please use the SourceForge +facilities to download re2c, report bugs, subscribe to the mailing list, +etc.
+
+You can view the manual online here.
+
+Dan Nuffer (nuffer@users.sourceforge.net)
+
+re2c is hosted at
+

+
+Other re2c links: +

yasm is a tool which uses re2c; its developers created a C version and made some good +fixes, which I incorporated. http://cvs.tortall.net/cgi-bin/viewcvs.cgi/yasm/tools/re2c/
FreeBSD page on the re2c ports package. http://www.freshports.org/devel/re2c/
Paper on re2c. http://citeseer.nj.nec.com/cowan94rec.html
The Debian and Gentoo Linux distributions also provide re2c packages.

Changelog

2005-06-26: 0.9.8 released

Fixed code generation for -b switch.

2005-04-30: 0.9.7 released

Applied #1181535 storable state patch.
Added -d flag which outputs a debuggable parser.
Fixed generation of '#line' directives (according to ISO-C99).
Fixed bug #1187785 Re2c fails to generate valid code.
Fixed bug #1187452 unused variable `yyaccept'.

2005-04-14: 0.9.6 released

Fix build with gcc >= 3.4.

2005-04-08: 0.9.5 released

Added /*!max:re2c */, which emits a '#define YYMAXFILL <max>\n' line. +This makes it possible to define buffers of the minimum required length. The occurrence must +follow the '/*!re2c */' block and cannot precede it.
Changed re2c to two pass generation to output warning free code.
Fixed bug #1163046 re2c hangs when processing valid re-file.
Fixed bug #1022799 re2c scanner has buffering bug.

2005-03-12: 0.9.4 released

Added --vernum support.
Fixed bug #1054496 incorrect code generated with -b option.
Fixed bug #1012748 re2c does not emit last line if '\n' missing.
Fixed bug #999104 --output=output option does not work as documented.
Fixed bug #999103 Invalid options prefixed with two dashes cause program +crash.

2004-05-26: 0.9.3 released

Fixes one small possible bug in the generated output. ych instead of yych +is output in certain circumstances.

2004-05-26: 0.9.2 released

Added -o option to specify the output file which also will set the #line +directives to something useful.
Print version to cout instead of cerr.
Added -h and -- style options.
Moved development to http://sourceforge.net/projects/re2c
Fixed bug #960144 minor cosmetic problem.
Fixed bug #953181 cannot compile with.
Fixed bug #939277 Windows support.
Fixed bug #914462 automake build patch.
Fixed bug #891940 braced quantifiers: {\d+(,|,\d+)?} style.
Fixed bug #869298 Add case insensitive string literals.
Fixed bug #869297 Input buffer overrun.

2003-12-13: 0.9.1 released

Removed rcs comments in source files.

2003-12-09: re2c adopted

Version 0.9.1 README

+
+Originally written by Peter Bumbulis (peter@csg.uwaterloo.ca)
+Currently maintained by Brian Young (bayoung@acm.org)
+
+The re2c distribution can be found at:
+
+http://www.tildeslash.org/re2c/index.html
+
+The source distribution is available from:
+
+http://www.tildeslash.org/re2c/re2c-0.9.1.tar.gz
+
+This distribution is a cleaned up version of the 0.5 release
+maintained by me (Brian Young). Several bugs were fixed as well
+as code cleanup for warning free compilation. It has been developed
+and tested with egcs 1.0.2 and gcc 2.7.2.3 on Linux x86. Peter
+Bumbulis' original release can be found at:
+
+ftp://csg.uwaterloo.ca/pub/peter/re2c.0.5.tar.gz
+
+re2c is a great tool for writing fast and flexible lexers. It has
+served many people well for many years and it deserves to be
+maintained more actively. re2c is on the order of 2-3 times faster
+than a flex based scanner, and its input model is much more
+flexible.
+
+Patches and requests for features will be entertained. Areas of
+particular interest to me are porting (a Solaris and an NT
+version will be forthcoming) and wide character support. Note
+that the code is already quite portable and should be buildable
+on any platform with minor makefile changes.
+

Version 0.5 Peter's original ANNOUNCE and README

+re2c is a tool for generating C-based recognizers from regular
+expressions. re2c-based scanners are efficient: for programming
+languages, given similar specifications, an re2c-based scanner is
+typically almost twice as fast as a flex-based scanner with little or no
+increase in size (possibly a decrease on cisc architectures). Indeed,
+re2c-based scanners are quite competitive with hand-crafted ones.
+
+Unlike flex, re2c does not generate complete scanners: the user must
+supply some interface code. While this code is not bulky (about 50-100
+lines for a flex-like scanner; see the man page and examples in the
+distribution) careful coding is required for efficiency (and
+correctness). One advantage of this arrangement is that the generated
+code is not tied to any particular input model. For example, re2c
+generated code can be used to scan data from a null-byte terminated
+buffer as illustrated below.
+
+Given the following source
+
+#define NULL ((char*) 0)
+char *scan(char *p){
+char *q;
+#define YYCTYPE char
+#define YYCURSOR p
+#define YYLIMIT p
+#define YYMARKER q
+#define YYFILL(n)
+/*!re2c
+[0-9]+ {return YYCURSOR;}
+[\000-\377] {return NULL;}
+*/
+}
+
+re2c will generate
+
+/* Generated by re2c on Sat Apr 16 11:40:58 1994 */
+#line 1 "simple.re"
+#define NULL ((char*) 0)
+char *scan(char *p){
+char *q;
+#define YYCTYPE char
+#define YYCURSOR p
+#define YYLIMIT p
+#define YYMARKER q
+#define YYFILL(n)
+{
+YYCTYPE yych;
+unsigned int yyaccept;
+goto yy0;
+yy1: ++YYCURSOR;
+yy0:
+if((YYLIMIT - YYCURSOR) < 2) YYFILL(2);
+yych = *YYCURSOR;
+if(yych <= '/') goto yy4;
+if(yych >= ':') goto yy4;
+yy2: yych = *++YYCURSOR;
+goto yy7;
+yy3:
+#line 10
+{return YYCURSOR;}
+yy4: yych = *++YYCURSOR;
+yy5:
+#line 11
+{return NULL;}
+yy6: ++YYCURSOR;
+if(YYLIMIT == YYCURSOR) YYFILL(1);
+yych = *YYCURSOR;
+yy7: if(yych <= '/') goto yy3;
+if(yych <= '9') goto yy6;
+goto yy3;
+}
+#line 12
+
+}
+
+Note that most compilers will perform dead-code elimination to remove
+all YYCURSOR, YYLIMIT comparisons.
+
+re2c was developed for a particular project (constructing a fast REXX
+scanner of all things!) and so while it has some rough edges, it should
+be quite usable. More information about re2c can be found in the
+(admittedly skimpy) man page; the algorithms and heuristics used are
+described in an upcoming LOPLAS article (included in the distribution).
+Probably the best way to find out more about re2c is to try the supplied
+examples. re2c is written in C++, and is currently being developed
+under Linux using gcc 2.5.8.
+
+Peter
+
+--
+
+re2c is distributed with no warranty whatever. The code is certain to
+contain errors. Neither the author nor any contributor takes
+responsibility for any consequences of its use.
+
+re2c is in the public domain. The data structures and algorithms used
+in re2c are all either taken from documents available to the general
+public or are inventions of the author. Programs generated by re2c may
+be distributed freely. re2c itself may be distributed freely, in source
+or binary, unchanged or modified. Distributors may charge whatever fees
+they can obtain for re2c.
+
+If you do make use of re2c, or incorporate it into a larger project an
+acknowledgement somewhere (documentation, research report, etc.) would
+be appreciated.
+
+Please send bug reports and feedback (including suggestions for
+improving the distribution) to
+
+peter@csg.uwaterloo.ca
+
+Include a small example and the banner from parser.y with bug reports.
diff --git a/htdocs/manual.html b/htdocs/manual.html
new file mode 100755
index 00000000..052d190c
--- /dev/null
+++ b/htdocs/manual.html
@@ -0,0 +1,599 @@
+Manpage of RE2C

RE2C

+Section: User Commands (1)
+Updated: 26 June 2005
+Index +

+ +

NAME

re2c - convert regular expressions to C/C++

+ +

SYNOPSIS

re2c [-efsbvhd] [-o output] file

+ +

DESCRIPTION

re2c is a preprocessor that generates C-based recognizers from +regular expressions. The input to re2c consists of C/C++ source +interleaved with comments of the form /*!re2c ... */ which contain scanner +specifications. In the output these comments are replaced with code that, when +executed, will find the next input token and then execute some user-supplied +token-specific code.

For example, given the following code

+#define NULL            ((char*) 0)
+char *scan(char *p){
+char *q;
+#define YYCTYPE         char
+#define YYCURSOR        p
+#define YYLIMIT         p
+#define YYMARKER        q
+#define YYFILL(n)
+/*!re2c
+        [0-9]+          {return YYCURSOR;}
+        [\000-\377]     {return NULL;}
+*/
+}
+

+
+
+

re2c will generate

+/* Generated by re2c on Sat Apr 16 11:40:58 1994 */
+#line 1 "simple.re"
+#define NULL            ((char*) 0)
+char *scan(char *p){
+char *q;
+#define YYCTYPE         char
+#define YYCURSOR        p
+#define YYLIMIT         p
+#define YYMARKER        q
+#define YYFILL(n)
+{
+        YYCTYPE yych;
+        unsigned int yyaccept;
+        goto yy0;
+yy1:    ++YYCURSOR;
+yy0:
+        if((YYLIMIT - YYCURSOR) < 2) YYFILL(2);
+        yych = *YYCURSOR;
+        if(yych <= '/') goto yy4;
+        if(yych >= ':') goto yy4;
+yy2:    yych = *++YYCURSOR;
+        goto yy7;
+yy3:
+#line 10
+        {return YYCURSOR;}
+yy4:    yych = *++YYCURSOR;
+yy5:
+#line 11
+        {return NULL;}
+yy6:    ++YYCURSOR;
+        if(YYLIMIT == YYCURSOR) YYFILL(1);
+        yych = *YYCURSOR;
+yy7:    if(yych <= '/') goto yy3;
+        if(yych <= '9') goto yy6;
+        goto yy3;
+}
+#line 12
+
+}
+

+
+
+

After the /*!re2c */ blocks you can place a /*!max:re2c */ block that will +output a define that holds the maximum number of characters required to parse +the input. That is the maximum value YYFILL() will receive.

+ +

OPTIONS

re2c provides the following options:

-e: Cross-compile from an ASCII platform to an EBCDIC one.
-f: Generate a scanner with support for storable state.
-s: Generate nested ifs for some switches. Many compilers need this assist to +generate better code.
-b: Implies -s. Use bit vectors as well in the attempt to coax better +code out of the compiler. Most useful for specifications with more than a few +keywords (e.g. for most programming languages).
-d: Creates a parser that dumps information about the current position and in +which state the parser is while parsing the input. This is useful to debug +parser issues and states. If you use this switch you need to define a macro +YYDEBUG that is called like a function with two parameters: void +YYDEBUG(int state, char current). The first parameter receives the state or +-1 and the second parameter receives the input at the current cursor.
-h, -?: Show a short help message.
-v: Show version information.
-V: Show the version as a number XXYYZZ.
-o output: Specify the output file.
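
A minimal sketch of a YYDEBUG definition for the -d switch, under stated assumptions: the manual only says the macro is called like a function with (int state, char current), so the trace function, its counters, and the digit-skipping loop below are all hypothetical. The loop is a hand-written stand-in, not real re2c output, and its state number 1 is made up.

```c
#include <stdio.h>

/* Hypothetical trace sink: counts calls and remembers the last state. */
static int debug_calls = 0;
static int debug_last_state = 0;

static void debug_trace(int state, char current)
{
    ++debug_calls;
    debug_last_state = state;
    fprintf(stderr, "state %d, input 0x%02x\n",
            state, (unsigned char)current);
}

/* The macro the -d switch expects the user to provide. */
#define YYDEBUG(state, current) debug_trace((state), (current))

/* Hand-written stand-in for generated code (NOT real re2c output):
 * skips a run of digits and reports each step through YYDEBUG. */
static const char *skip_digits(const char *p)
{
    YYDEBUG(-1, *p);                  /* entry: no state yet */
    while (*p >= '0' && *p <= '9') {
        YYDEBUG(1, *p);               /* made-up state number */
        ++p;
    }
    return p;
}
```

Routing the macro through a function keeps the trace format in one place; defining YYDEBUG directly as a printf call would work just as well.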

+
+
+ +

INTERFACE CODE

Unlike other scanner generators, re2c does not generate complete +scanners: the user must supply some interface code. In particular, the user +must define the following macros:

YYCTYPE: Type used to hold an input symbol. Usually char or unsigned char.
YYCURSOR: l-expression of type *YYCTYPE that points to the current input +symbol. The generated code advances YYCURSOR as symbols are matched. On entry, +YYCURSOR is assumed to point to the first character of the current token. On +exit, YYCURSOR will point to the first character of the following token.
YYLIMIT: Expression of type *YYCTYPE that marks the end of the buffer (YYLIMIT[-1] is +the last character in the buffer). The generated code repeatedly compares +YYCURSOR to YYLIMIT to determine when the buffer needs (re)filling.
YYMARKER: l-expression of type *YYCTYPE. The generated code saves backtracking +information in YYMARKER.
YYFILL(n): The generated code "calls" YYFILL when the buffer needs (re)filling: at +least n additional characters should be provided. YYFILL should adjust +YYCURSOR, YYLIMIT and YYMARKER as needed. Note that for typical programming +languages n will be the length of the longest keyword plus one.
YYGETSTATE(): The user only needs to define this macro if the -f flag was +specified. In that case, the generated code "calls" YYGETSTATE at the very +beginning of the scanner in order to obtain the saved state. YYGETSTATE must +return a signed integer. The value must be either -1, indicating that the +scanner is entered for the first time, or a value previously saved by +YYSETSTATE. In the second case, the scanner will resume operations right after +where the last YYFILL was called.
YYSETSTATE(n): The user only needs to define this macro if the -f flag was +specified. In that case, the generated code "calls" YYSETSTATE just before +calling YYFILL. The parameter to YYSETSTATE is a signed integer that uniquely +identifies the specific instance of YYFILL that is about to be called. Should +the user wish to save the state of the scanner and have YYFILL return to the +caller, all they have to do is store that unique identifier in a variable. Later, +when the scanner is called again, it will call YYGETSTATE() and resume +execution right where it left off.
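
To make the roles of these macros concrete, here is a small sketch for input that is already completely in memory. The macro names come from the list above; the function match_digits and its loop are hypothetical, a hand-written stand-in for what re2c would generate for a rule such as [0-9]+, and YYMARKER is used here simply to remember the end of the match so far.

```c
#include <stddef.h>

#define YYCTYPE   unsigned char
#define YYCURSOR  cursor
#define YYLIMIT   limit
#define YYMARKER  marker
/* Nothing to refill: the whole input is in memory already. */
#define YYFILL(n) do { } while (0)

/* Returns the number of leading decimal digits in buf[0..len).
 * Hand-written stand-in for generated code, not real re2c output. */
static size_t match_digits(const unsigned char *buf, size_t len)
{
    const YYCTYPE *cursor = buf;        /* current input symbol */
    const YYCTYPE *limit  = buf + len;  /* one past the last symbol */
    const YYCTYPE *marker = buf;        /* end of the match so far */
    YYCTYPE yych;

    for (;;) {
        if (YYLIMIT == YYCURSOR) {      /* buffer exhausted */
            YYFILL(1);                  /* no-op here, so stop */
            break;
        }
        yych = *YYCURSOR;
        if (yych < '0' || yych > '9')
            break;
        ++YYCURSOR;
        YYMARKER = YYCURSOR;
    }
    return (size_t)(YYMARKER - buf);
}
```

For example, match_digits((const unsigned char *)"123abc", 6) yields 3. A scanner reading from a file would instead give YYFILL a real body, as the fill() function in the larger example does.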

+
+
+ +

SCANNER WITH STORABLE STATES

When the -f flag is specified, re2c generates a scanner that can +store its current state, return to the caller, and later resume operations +exactly where it left off.

The default operation of re2c is a "pull" model, where the scanner asks for +extra input whenever it needs it. However, this mode of operation assumes that +the scanner is the "owner" of the parsing loop, and that may not always be +convenient.

Typically, if there is a preprocessor ahead of the scanner in the stream, or +for that matter any other procedural source of data, the scanner cannot "ask" +for more data unless scanner and source live in separate threads.

The -f flag is useful for just this situation: it lets users design +scanners that work in a "push" model, i.e. where data is fed to the scanner +chunk by chunk. When the scanner runs out of data to consume, it just stores +its state and returns to the caller. When more input data is fed to the +scanner, it resumes operations exactly where it left off.

At this point, the -f option only works with "mono-block" re2c scanners: if +the scanner is described with more than one /*!re2c ... */ block, re2c -f fails +with an error.

Please see examples/push.re for a push-model scanner.
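
The push model can be sketched in plain C as follows. This is not the content of examples/push.re and not real re2c -f output: the PushScanner struct, the push_* functions, and the state value 0 are hypothetical, chosen only to show how YYGETSTATE and YYSETSTATE let a scanner hand control back to the caller and pick up again when the next chunk arrives.

```c
#include <stddef.h>

/* Hypothetical scanner object; field names are made up for this sketch. */
typedef struct PushScanner {
    int state;             /* -1 on first entry, else set by YYSETSTATE */
    const char *cursor;    /* next unread character of the current chunk */
    const char *limit;     /* one past the end of the current chunk */
    unsigned long digits;  /* digits seen so far, across all chunks */
} PushScanner;

#define YYGETSTATE()  (s->state)
#define YYSETSTATE(n) (s->state = (n))

enum { NEED_MORE = 0, DONE = 1 };

static void push_init(PushScanner *s)
{
    s->state = -1;
    s->cursor = s->limit = NULL;
    s->digits = 0;
}

/* The caller "pushes" the next chunk of input into the scanner. */
static void push_feed(PushScanner *s, const char *chunk, size_t len)
{
    s->cursor = chunk;
    s->limit = chunk + len;
}

/* Hand-written stand-in for a -f scanner matching a run of digits.
 * Returns NEED_MORE when the chunk is exhausted and DONE at the first
 * non-digit character. */
static int push_scan(PushScanner *s)
{
    if (YYGETSTATE() == -1)
        s->digits = 0;          /* first entry: start a new token */
    /* any other value: we are resuming inside the digit loop */

    while (s->cursor != s->limit) {
        if (*s->cursor < '0' || *s->cursor > '9') {
            YYSETSTATE(-1);     /* token finished; next call starts fresh */
            return DONE;
        }
        ++s->digits;
        ++s->cursor;
    }
    YYSETSTATE(0);              /* remember where to resume ... */
    return NEED_MORE;           /* ... and return to the caller */
}
```

Feeding "12" and then "345;" makes the first push_scan call return NEED_MORE and the second return DONE with five digits counted, even though the token was split across two chunks.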

+ +

SCANNER SPECIFICATIONS

Each scanner specification consists of a set of rules and name +definitions. Rules consist of a regular expression along with a block of C/C++ +code that is to be executed when the associated regular expression is matched. +Name definitions are of the form ``name = regular +expression;''.

+ +

SUMMARY OF RE2C REGULAR EXPRESSIONS

"foo": the literal string foo. ANSI-C escape sequences can be used.
'foo': the literal string foo (characters [a-zA-Z] treated case-insensitive). +ANSI-C escape sequences can be used.
[xyz]: a "character class"; in this case, the regular expression matches either an +'x', a 'y', or a 'z'.
[abj-oZ]: a "character class" with a range in it; matches an 'a', a 'b', any letter +from 'j' through 'o', or a 'Z'.
r\s: match any r which isn't an s. r and s must be +regular expressions which can be expressed as character classes.
r*: zero or more r's, where r is any regular expression
r+: one or more r's
r?: zero or one r's (that is, "an optional r")
name: the expansion of the "name" definition (see above)
(r): an r; parentheses are used to override precedence (see below)
rs: an r followed by an s ("concatenation")
r|s: either an r or an s
r/s: an r but only if it is followed by an s. The s is not part of +the matched text. This type of regular expression is called "trailing +context".
r{n}: matches r exactly n times.
r{n,}: matches r at least n times.
r{n,m}: matches r at least n but not more than m times.

+
+
+

The regular expressions listed above are grouped according to precedence, +from highest precedence at the top to lowest at the bottom. Those grouped +together have equal precedence.
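
The difference operator r\s can be read as set subtraction on character classes. As a sketch, the class any\[\n\\"] that the larger example below uses for the body of a string literal corresponds to the following hand-written C predicate; the function name is hypothetical.

```c
/* Hypothetical helper: returns 1 if c belongs to  any \ [\n\\"],
 * i.e. any byte except newline, backslash, and double quote. */
static int in_string_body(unsigned char c)
{
    return c != '\n' && c != '\\' && c != '"';
}
```

The same reading explains why both operands must be expressible as character classes: the result has to be a set of single characters again.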

+ +

A LARGER EXAMPLE

+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+
+#define ADDEQ   257
+#define ANDAND  258
+#define ANDEQ   259
+#define ARRAY   260
+#define ASM     261
+#define AUTO    262
+#define BREAK   263
+#define CASE    264
+#define CHAR    265
+#define CONST   266
+#define CONTINUE        267
+#define DECR    268
+#define DEFAULT 269
+#define DEREF   270
+#define DIVEQ   271
+#define DO      272
+#define DOUBLE  273
+#define ELLIPSIS        274
+#define ELSE    275
+#define ENUM    276
+#define EQL     277
+#define EXTERN  278
+#define FCON    279
+#define FLOAT   280
+#define FOR     281
+#define FUNCTION        282
+#define GEQ     283
+#define GOTO    284
+#define ICON    285
+#define ID      286
+#define IF      287
+#define INCR    288
+#define INT     289
+#define LEQ     290
+#define LONG    291
+#define LSHIFT  292
+#define LSHIFTEQ        293
+#define MODEQ   294
+#define MULEQ   295
+#define NEQ     296
+#define OREQ    297
+#define OROR    298
+#define POINTER 299
+#define REGISTER        300
+#define RETURN  301
+#define RSHIFT  302
+#define RSHIFTEQ        303
+#define SCON    304
+#define SHORT   305
+#define SIGNED  306
+#define SIZEOF  307
+#define STATIC  308
+#define STRUCT  309
+#define SUBEQ   310
+#define SWITCH  311
+#define TYPEDEF 312
+#define UNION   313
+#define UNSIGNED        314
+#define VOID    315
+#define VOLATILE        316
+#define WHILE   317
+#define XOREQ   318
+#define EOI     319
+
+typedef unsigned int uint;
+typedef unsigned char uchar;
+
+#define BSIZE   8192
+
+#define YYCTYPE         uchar
+#define YYCURSOR        cursor
+#define YYLIMIT         s->lim
+#define YYMARKER        s->ptr
+#define YYFILL(n)       {cursor = fill(s, cursor);}
+
+#define RET(i)  {s->cur = cursor; return i;}
+
+typedef struct Scanner {
+    int                 fd;
+    uchar               *bot, *tok, *ptr, *cur, *pos, *lim, *top, *eof;
+    uint                line;
+} Scanner;
+
+uchar *fill(Scanner *s, uchar *cursor){
+    if(!s->eof){
+        uint cnt = s->tok - s->bot;
+        if(cnt){
+            memmove(s->bot, s->tok, s->lim - s->tok);  /* source and destination may overlap */
+            s->tok = s->bot;
+            s->ptr -= cnt;
+            cursor -= cnt;
+            s->pos -= cnt;
+            s->lim -= cnt;
+        }
+        if((s->top - s->lim) < BSIZE){
+            uchar *buf = (uchar*)
+                malloc(((s->lim - s->bot) + BSIZE)*sizeof(uchar));
+            memcpy(buf, s->tok, s->lim - s->tok);
+            s->tok = buf;
+            s->ptr = &buf[s->ptr - s->bot];
+            cursor = &buf[cursor - s->bot];
+            s->pos = &buf[s->pos - s->bot];
+            s->lim = &buf[s->lim - s->bot];
+            s->top = &s->lim[BSIZE];
+            free(s->bot);
+            s->bot = buf;
+        }
+        if((cnt = read(s->fd, (char*) s->lim, BSIZE)) != BSIZE){
+            s->eof = &s->lim[cnt]; *(s->eof)++ = '\n';
+        }
+        s->lim += cnt;
+    }
+    s->cur = cursor;
+    return cursor;
+}
+
+int scan(Scanner *s){
+        uchar *cursor = s->cur;
+std:
+        s->tok = cursor;
+/*!re2c
+any     = [\000-\377];
+O       = [0-7];
+D       = [0-9];
+L       = [a-zA-Z_];
+H       = [a-fA-F0-9];
+E       = [Ee] [+-]? D+;
+FS      = [fFlL];
+IS      = [uUlL]*;
+ESC     = [\\] ([abfnrtv?'"\\] | "x" H+ | O+);
+*/
+
+/*!re2c
+        "/*"                    { goto comment; }
+        
+        "auto"                  { RET(AUTO); }
+        "break"                 { RET(BREAK); }
+        "case"                  { RET(CASE); }
+        "char"                  { RET(CHAR); }
+        "const"                 { RET(CONST); }
+        "continue"              { RET(CONTINUE); }
+        "default"               { RET(DEFAULT); }
+        "do"                    { RET(DO); }
+        "double"                { RET(DOUBLE); }
+        "else"                  { RET(ELSE); }
+        "enum"                  { RET(ENUM); }
+        "extern"                { RET(EXTERN); }
+        "float"                 { RET(FLOAT); }
+        "for"                   { RET(FOR); }
+        "goto"                  { RET(GOTO); }
+        "if"                    { RET(IF); }
+        "int"                   { RET(INT); }
+        "long"                  { RET(LONG); }
+        "register"              { RET(REGISTER); }
+        "return"                { RET(RETURN); }
+        "short"                 { RET(SHORT); }
+        "signed"                { RET(SIGNED); }
+        "sizeof"                { RET(SIZEOF); }
+        "static"                { RET(STATIC); }
+        "struct"                { RET(STRUCT); }
+        "switch"                { RET(SWITCH); }
+        "typedef"               { RET(TYPEDEF); }
+        "union"                 { RET(UNION); }
+        "unsigned"              { RET(UNSIGNED); }
+        "void"                  { RET(VOID); }
+        "volatile"              { RET(VOLATILE); }
+        "while"                 { RET(WHILE); }
+        
+        L (L|D)*                { RET(ID); }
+        
+        ("0" [xX] H+ IS?) | ("0" D+ IS?) | (D+ IS?) |
+        (['] (ESC|any\[\n\\'])* ['])
+                                { RET(ICON); }
+        
+        (D+ E FS?) | (D* "." D+ E? FS?) | (D+ "." D* E? FS?)
+                                { RET(FCON); }
+        
+        (["] (ESC|any\[\n\\"])* ["])
+                                { RET(SCON); }
+        
+        "..."                   { RET(ELLIPSIS); }
+        ">>="                   { RET(RSHIFTEQ); }
+        "<<="                   { RET(LSHIFTEQ); }
+        "+="                    { RET(ADDEQ); }
+        "-="                    { RET(SUBEQ); }
+        "*="                    { RET(MULEQ); }
+        "/="                    { RET(DIVEQ); }
+        "%="                    { RET(MODEQ); }
+        "&="                    { RET(ANDEQ); }
+        "^="                    { RET(XOREQ); }
+        "|="                    { RET(OREQ); }
+        ">>"                    { RET(RSHIFT); }
+        "<<"                    { RET(LSHIFT); }
+        "++"                    { RET(INCR); }
+        "--"                    { RET(DECR); }
+        "->"                    { RET(DEREF); }
+        "&&"                    { RET(ANDAND); }
+        "||"                    { RET(OROR); }
+        "<="                    { RET(LEQ); }
+        ">="                    { RET(GEQ); }
+        "=="                    { RET(EQL); }
+        "!="                    { RET(NEQ); }
+        ";"                     { RET(';'); }
+        "{"                     { RET('{'); }
+        "}"                     { RET('}'); }
+        ","                     { RET(','); }
+        ":"                     { RET(':'); }
+        "="                     { RET('='); }
+        "("                     { RET('('); }
+        ")"                     { RET(')'); }
+        "["                     { RET('['); }
+        "]"                     { RET(']'); }
+        "."                     { RET('.'); }
+        "&"                     { RET('&'); }
+        "!"                     { RET('!'); }
+        "~"                     { RET('~'); }
+        "-"                     { RET('-'); }
+        "+"                     { RET('+'); }
+        "*"                     { RET('*'); }
+        "/"                     { RET('/'); }
+        "%"                     { RET('%'); }
+        "<"                     { RET('<'); }
+        ">"                     { RET('>'); }
+        "^"                     { RET('^'); }
+        "|"                     { RET('|'); }
+        "?"                     { RET('?'); }
+
+
+        [ \t\v\f]+           { goto std; }
+
+        "\n"
+            {
+                if(cursor == s->eof) RET(EOI);
+                s->pos = cursor; s->line++;
+                goto std;
+            }
+
+        any
+            {
+                printf("unexpected character: %c\n", *s->tok);
+                goto std;
+            }
+*/
+
+comment:
+/*!re2c
+        "*/"                    { goto std; }
+        "\n"
+            {
+                if(cursor == s->eof) RET(EOI);
+                s->tok = s->pos = cursor; s->line++;
+                goto comment;
+            }
+        any                     { goto comment; }
+*/
+}
+
+int main(void){
+    Scanner in;
+    int t;
+    memset((char*) &in, 0, sizeof(in));
+    in.fd = 0;
+    while((t = scan(&in)) != EOI){
+/*
+        printf("%d\t%.*s\n", t, in.cur - in.tok, in.tok);
+        printf("%d\n", t);
+*/
+    }
+    close(in.fd);
+}
+

+
+
+ +

FEATURES

re2c does not provide a default action: the generated code assumes +that the input will consist of a sequence of tokens. Typically this can be +dealt with by adding a rule such as the one for unexpected characters in the +example above.

The user must arrange for a sentinel token to appear at the end of input +(and provide a rule for matching it): re2c does not provide an +<<EOF>> expression. If the source is from a null-byte terminated +string, a rule matching a null character will suffice. If the source is from a +file then the approach taken in the example can be used: pad the input with a +newline (or some other character that can't appear within another token); upon +recognizing such a character check to see if it is the sentinel and act +accordingly.
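
For the null-byte terminated case, the idea can be sketched like this. The code is a hand-written stand-in for a generated scanner, not real re2c output, and the token codes and function names are made up; it shows a sentinel rule that matches the terminating null character and a catch-all rule for everything that is not otherwise matched.

```c
#include <stddef.h>

/* Made-up token codes for this sketch. */
enum { TOK_EOI = 0, TOK_WORD = 1, TOK_OTHER = 2 };

static int is_letter(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Hand-written stand-in for a generated scanner; *pp plays the role
 * of YYCURSOR and is advanced past the token that was matched. */
static int next_token(const char **pp)
{
    const char *p = *pp;
    int tok;

    if (*p == '\0') {
        tok = TOK_EOI;          /* sentinel rule: do not advance past it */
    } else if (is_letter(*p)) {
        while (is_letter(*p))
            ++p;
        tok = TOK_WORD;
    } else {
        ++p;                    /* catch-all rule, like `any` above */
        tok = TOK_OTHER;
    }
    *pp = p;
    return tok;
}

/* Counts the words in a null-terminated string. */
static size_t count_words(const char *text)
{
    size_t words = 0;
    int tok;

    while ((tok = next_token(&text)) != TOK_EOI)
        if (tok == TOK_WORD)
            ++words;
    return words;
}
```

Because the sentinel rule never moves past the null character, calling next_token again at the end of input keeps returning TOK_EOI instead of reading beyond the string.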

re2c does not provide start conditions: use a separate scanner +specification for each start condition (as illustrated in the above +example).

+re2c has no [^x] syntax. Use the difference operator (r\s) instead. +

BUGS

Only fixed length trailing context can be handled.

Difference only works for character sets.

The re2c internal algorithms need documentation.

+ +

AUTHORS

Peter Bumbulis <peter@csg.uwaterloo.ca>
Brian Young <bayoung@acm.org>
Dan Nuffer <nuffer@users.sourceforge.net>
Marcus Boerger <helly@users.sourceforge.net>
Hartmut Kaiser <hkaiser@users.sourceforge.net>
Emmanuel Mogenet <mgix@mgix.com> +(added storable state)

+
+
+ +

VERSION INFORMATION

This manpage describes re2c, version 0.9.8.

+ +

+
+
+

This document was created by man2html, using the manual pages.
+Time: 10:30:42 GMT, June 26, 2005

diff --git a/re2c.sln b/re2c.sln
new file mode 100755
index 00000000..2fe56782
--- /dev/null
+++ b/re2c.sln
@@ -0,0 +1,21 @@
+Microsoft Visual Studio Solution File, Format Version 7.00
+Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "re2c", "re2c.vcproj", "{18C5E289-8D90-400D-9F80-766F174CEDC9}"
+EndProject
+Global
+    GlobalSection(SolutionConfiguration) = preSolution
+        ConfigName.0 = Debug
+        ConfigName.1 = Release
+    EndGlobalSection
+    GlobalSection(ProjectDependencies) = postSolution
+    EndGlobalSection
+    GlobalSection(ProjectConfiguration) = postSolution
+        {18C5E289-8D90-400D-9F80-766F174CEDC9}.Debug.ActiveCfg = Debug|Win32
+        {18C5E289-8D90-400D-9F80-766F174CEDC9}.Debug.Build.0 = Debug|Win32
+        {18C5E289-8D90-400D-9F80-766F174CEDC9}.Release.ActiveCfg = Release|Win32
+        {18C5E289-8D90-400D-9F80-766F174CEDC9}.Release.Build.0 = Release|Win32
+    EndGlobalSection
+    GlobalSection(ExtensibilityGlobals) = postSolution
+    EndGlobalSection
+    GlobalSection(ExtensibilityAddIns) = postSolution
+    EndGlobalSection
+EndGlobal
diff --git a/re2c.vcproj b/re2c.vcproj
new file mode 100755
index 00000000..883eeba0
--- /dev/null
+++ b/re2c.vcproj
@@ -0,0 +1,155 @@
