From: helly
Date: Sat, 18 Feb 2006 23:21:24 +0000 (+0000)
Subject: - Update
X-Git-Tag: 0.13.6~454
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=0e6792928f14995afed326448043f0b05f5a94b5;p=re2c

- Update
---

diff --git a/htdocs/manual.html b/htdocs/manual.html
index 48c6bfc6..69b7448d 100755
--- a/htdocs/manual.html
+++ b/htdocs/manual.html
@@ -87,19 +87,14 @@ yy7: if(yych <= '/') goto yy3;

After the /*!re2c */ blocks you can place a /*!max:re2c */ block that will -output a define that holds the maximum number of characters required to parse +output a define (YYMAXFILL) that holds the maximum number of characters required to parse the input. That is the maximum value YYFILL() will receive.
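One common use of such a define is sizing the padding at the end of an input buffer, so that a scanner asking for up to YYMAXFILL characters never reads past the allocation. The following is a minimal sketch under that assumption. The fallback value 4 and the helper name make_padded_buffer are hypothetical and only stand in for what a real /*!max:re2c */ block and application would provide.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption: in a real build this define comes from the max:re2c block.
 * The value 4 is a placeholder so the sketch compiles on its own. */
#ifndef YYMAXFILL
#define YYMAXFILL 4
#endif

/* Hypothetical helper: copy len bytes of input into a fresh buffer that is
 * followed by YYMAXFILL zero bytes of padding. Returns NULL on failure;
 * the caller frees the result. */
static char *make_padded_buffer(const char *input, size_t len)
{
    char *buf = malloc(len + YYMAXFILL);
    if (buf == NULL)
        return NULL;
    memcpy(buf, input, len);
    memset(buf + len, 0, YYMAXFILL);
    return buf;
}
```

With such a buffer, YYLIMIT could point at buf + len + YYMAXFILL, and the zero padding would act as an end-of-input marker for rules that match it.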

OPTIONS

re2c provides the following options:

-
-e
-
Cross-compile from an ASCII platform to an EBCDIC one.

-
-f
-
Generate a scanner with support for storable state.

-
-s
-
Generate nested ifs for some switches. Many compilers need this assist to -generate better code.

+
-?
+
-h Invoke a short help.

-b
Implies -s. Use bit vectors as well in the attempt to coax better code out of the compiler. Most useful for specifications with more than a few @@ -111,18 +106,26 @@ parser issues and states. If you use this switch you need to define a macro YYDEBUG that is called like a function with two parameters: void YYDEBUG(int state, char current). The first parameter receives the state or -1 and the second parameter receives the input at the current cursor.

+
-e
+
Cross-compile from an ASCII platform to an EBCDIC one.

+
-f
+
Generate a scanner with support for storable state. For details see below +at SCANNER WITH STORABLE STATES.

-i
Do not output #line information. This is useful when you want to use a CMS tool with the re2c output which you might want if you do not require your users -to have re2c themselves when building from your source.

-
-h
-
-? Invoke a short help.

+to have re2c themselves when building from your source. -o output +Specify the output file.

+
-s
+
Generate nested ifs for some switches. Many compilers need this assist to +generate better code.

-v
Show version information.

-V
Show the version as a number XXYYZZ.

-
-o output
-
Specify the output file.

+
-w
+
Create a parser that supports wide chars (UCS-2). This implies -s +and cannot be used together with the -e switch.
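The YYDEBUG macro required by the debug switch described in the option list above could be sketched as follows. This is only one possible definition: the output format and the helper name format_trace are assumptions made for illustration and are not prescribed by re2c.

```c
#include <stdio.h>

/* Hypothetical helper: render one trace line into out (at most cap bytes).
 * state is the scanner state or -1, current is the input at the cursor. */
static int format_trace(char *out, size_t cap, int state, char current)
{
    return snprintf(out, cap, "state=%d char=0x%02x",
                    state, (unsigned)(unsigned char)current);
}

/* Called like a function with two parameters, as the manual requires.
 * Here every call writes one line to stderr. */
#define YYDEBUG(state, current)                                   \
    do {                                                          \
        char yydebug_line[64];                                    \
        format_trace(yydebug_line, sizeof yydebug_line,           \
                     (state), (current));                         \
        fprintf(stderr, "%s\n", yydebug_line);                    \
    } while (0)
```

Wrapping the body in do { ... } while (0) keeps the macro safe to use as a single statement, for example inside an unbraced if.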



@@ -200,14 +203,48 @@ scanner, it resumes operations exactly where it left off.

At this point, the -f option only works with "mono-block" re2c scanners: if the scanner is described with more than one /*!re2c ... */ block, re2c -f fails with an error.

+

Changes needed compared to the "pull" model.

+

1. The user has to supply the macros YYSETSTATE(state) and YYGETSTATE().

+

2. The -f option inhibits declaration of yych and +yyaccept. So the user has to declare these. Also the user has to save +and restore these. In the example examples/push.re these are declared as +fields of the (C++) class of which the scanner is a method, so they do not need +to be saved/restored explicitly. For C they could e.g. be made macros that +select fields from a structure passed in as parameter. Alternatively, they +could be declared as local variables, saved with YYFILL(n) when it decides +to return and restored at entry to the function. Also, it could be more +efficient to save the state from YYFILL(n) because +YYSETSTATE(state) is called unconditionally. YYFILL(n) however does not +get state as parameter, so we would have to store state in a local +variable by YYSETSTATE(state).

+

3. Modify YYFILL(n) to return (from +the function calling it) if more input is needed.

+

4. Modify caller to recognise "more input is needed" and respond +appropriately.

Please see examples/push.re for a push-model scanner.
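The four changes listed above can be illustrated with a small hand-written scanner. This is a sketch, not re2c output: the Scanner struct, the SCAN_* codes, the scan and count_numbers functions, and the convention that the input stream ends with a NUL byte are all assumptions chosen for the example. It only shows how the state, yych, and an early-returning YYFILL(n) fit together.

```c
#include <stddef.h>

enum { SCAN_NEED_MORE = -1, SCAN_END = 0, SCAN_NUMBER = 1, SCAN_OTHER = 2 };

/* Step 2: yych and the state live in a structure passed in as parameter,
 * so they survive a return from the scanner. */
typedef struct Scanner {
    const char *cursor; /* next unread character */
    const char *limit;  /* one past the last available character */
    int state;          /* -1 = start a new token, otherwise resume point */
    char yych;          /* current input character */
} Scanner;

/* Step 1: user-supplied state macros. */
#define YYGETSTATE()  (s->state)
#define YYSETSTATE(x) (s->state = (x))
/* Step 3: YYFILL returns from the function calling it. */
#define YYFILL(n) do { (void)(n); return SCAN_NEED_MORE; } while (0)

/* Hand-written stand-in for a generated scanner: a run of digits is one
 * SCAN_NUMBER, a NUL byte is SCAN_END, anything else is SCAN_OTHER. */
static int scan(Scanner *s)
{
    switch (YYGETSTATE()) {
    case 0: goto start;
    case 1: goto in_number;
    }
start:
    YYSETSTATE(0);
    if (s->cursor == s->limit) YYFILL(1);
    s->yych = *s->cursor++;
    if (s->yych == '\0') { YYSETSTATE(-1); return SCAN_END; }
    if (s->yych < '0' || s->yych > '9') { YYSETSTATE(-1); return SCAN_OTHER; }
in_number:
    YYSETSTATE(1);
    for (;;) {
        if (s->cursor == s->limit) YYFILL(1);
        s->yych = *s->cursor;
        if (s->yych < '0' || s->yych > '9') break;
        ++s->cursor;
    }
    YYSETSTATE(-1);
    return SCAN_NUMBER;
}

/* Step 4: the caller recognises "more input is needed" and pushes the
 * next chunk before calling the scanner again. */
static int count_numbers(const char **chunks, const size_t *lens, size_t nchunks)
{
    Scanner sc;
    size_t next = 0;
    int numbers = 0;

    sc.cursor = sc.limit = NULL;
    sc.state = -1;
    sc.yych = 0;
    for (;;) {
        int tok = scan(&sc);
        if (tok == SCAN_NEED_MORE) {
            if (next == nchunks) break; /* no more input to push */
            sc.cursor = chunks[next];
            sc.limit = chunks[next] + lens[next];
            ++next;
        } else if (tok == SCAN_END) {
            break;
        } else if (tok == SCAN_NUMBER) {
            ++numbers;
        }
    }
    return numbers;
}
```

Because the sketch never needs the text of a token, pushing a chunk simply replaces cursor and limit. A scanner that reports token text would instead have to keep the unfinished part of the previous chunk in its buffer.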

SCANNER SPECIFICATIONS

-

Each scanner specification consists of a set of rules and name -definitions. Rules consist of a regular expression along with a block of C/C++ -code that is to be executed when the associated regular expression is matched. -Name definitions are of the form ``name = regular -expression;''.

+

Each scanner specification consists of a set of rules, name +definitions and configurations.

+

Rules consist of a regular expression along with a block of C/C++ +code that is to be executed when the associated regular expression is +matched.

+
+
regular expression { C/C++ code }
+
+

Name definitions are of the form:

+
+
name = regular expression;
+
+

Configurations look like name definitions whose names start with +"re2c:":

+
+
re2c:name = value;
+

SUMMARY OF RE2C REGULAR EXPRESSIONS

@@ -242,7 +279,7 @@ regular expressions which can be expressed as character classes.
r|s
either an r or an s
r/s
-
an r but only if it is followed by an s. The s is not part of +
an r but only if it is followed by an s. The s is not part of the matched text. This type of regular expression is called "trailing context".
r{n}
@@ -259,14 +296,48 @@ context".

Character classes and string literals may contain octal or hexadecimal -character definitions and the following set of escape sequences (\n,
- \t, \v, \b, \r, \f, \a, \\). An octal character is defined by a backslash -followed by its three octal digits and a hexadecimal character is defined by -backslash, a lower cased 'x' and its two hexadecimal digits.

+character definitions and the following set of escape sequences +(\n,
+ \t, \v, \b, \r, \f, \a, \\). +An octal character is defined by a backslash followed by its three octal digits +and a hexadecimal character is defined by backslash, a lower cased 'x' +and its two hexadecimal digits or a backslash, an upper cased X and its +four hexadecimal digits.

+

re2c furthermore supports the C/C++ Unicode notation. That is a backslash +followed by either a lowercase u and its four hexadecimal digits or an +uppercase U and its eight hexadecimal digits. However, using the U +notation it is not possible to support characters greater than \U0000FFFF due +to an internal limitation of re2c.

+

Since characters greater than \X00FF are not allowed in non-Unicode mode, +the only portable "any" rules are (.|"\n") and [^].
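The numeric escape forms described above (three octal digits, \x with two hexadecimal digits, \X and \u with four, \U with eight, and the \U0000FFFF upper bound) can be summarised in a small decoder. This is an illustrative sketch only: decode_escape is a hypothetical helper, not part of re2c, and it deliberately ignores the symbolic escapes such as \n.

```c
#include <stddef.h>

/* Hypothetical helper: decode one numeric escape that starts at s[0] == '\\'.
 * Returns the character value, or -1 if the escape is malformed or
 * exceeds the 0xFFFF limit mentioned in the manual. */
static long decode_escape(const char *s)
{
    const char *p;
    size_t digits, i;
    unsigned base = 16;
    unsigned long value = 0;

    if (s[0] != '\\')
        return -1;
    switch (s[1]) {
    case 'x': digits = 2; p = s + 2; break;
    case 'X': /* fall through: four hexadecimal digits */
    case 'u': digits = 4; p = s + 2; break;
    case 'U': digits = 8; p = s + 2; break;
    default:  digits = 3; base = 8; p = s + 1; break; /* octal */
    }
    for (i = 0; i < digits; ++i) {
        unsigned c = (unsigned char)p[i];
        unsigned d;
        if (c >= '0' && c <= '9')      d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
        else return -1;                /* also stops at the terminating NUL */
        if (d >= base)
            return -1;
        value = value * base + d;
    }
    return value > 0xFFFFUL ? -1L : (long)value;
}
```

For example, the octal form \101, the hexadecimal forms \x41 and \X0041, and the Unicode form \u0041 all decode to the same value, the letter A in ASCII.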

The regular expressions listed above are grouped according to precedence, from highest precedence at the top to lowest at the bottom. Those grouped together have equal precedence.

+

INPLACE CONFIGURATION

+

It is possible to configure code generation inside re2c blocks. The +following lists the available configurations:

+
+
re2c:indent:top = 0 ;
+
Specifies the minimum level of indentation to use. Requires a numeric +value greater than or equal to zero.
+
re2c:indent:string = "\t" ;
+
Specifies the string to use for indentation. Requires a string that should +contain only whitespace unless you need this for external tools. The easiest +way to specify spaces is to enclose them in single or double quotes. If you do +not want any indentation at all you can simply set this to "".
+
re2c:yybm:hex = 0 ;
+
If set to zero, a decimal table is used; otherwise a hexadecimal table +is generated.
+
re2c:startlabel = 0 ;
+
If set to a non-zero integer, the start label of the next scanner +block is generated even if the scanner itself does not use it. Otherwise the +normal yy0-like start label is generated only if needed. If set to +a text value, a label with that text is generated regardless of +whether the normal start label is used or not. This setting is +reset to 0 after a start label has been generated.
+
+

A LARGER EXAMPLE

 #include <stdlib.h>
@@ -593,7 +664,7 @@ specification for each start condition (as illustrated in the above example).
 

VERSION INFORMATION

-

This manpage describes re2c, version 0.9.10.

+

This manpage describes re2c, version 0.10.0.


Index

@@ -617,6 +688,6 @@ specification for each start condition (as illustrated in the above example).

This document was created by man2html, using the manual pages.
-Time: 11:33:44 GMT, September 04, 2005

+Time: 23:55:44 GMT, February 18, 2006