From: helly
Date: Sun, 22 Apr 2007 19:51:00 +0000 (+0000)
Subject: - Update docu
X-Git-Tag: 0.13.6~184
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=43a72447ae2082a3cd3f4a71db8b68f507593931;p=re2c

- Update docu
---

diff --git a/re2c/CHANGELOG b/re2c/CHANGELOG
index 6921f9be..f41d3e41 100644
--- a/re2c/CHANGELOG
+++ b/re2c/CHANGELOG
@@ -1,5 +1,6 @@
 Version 0.13.0 (2007-??-??)
 ---------------------------
+- Add -c and -t to generate scanners with (f)lex-like condition support.
 - Fix issue with short form of switches and parameter if not first switch.
 
 Version 0.12.0 (2007-??-??)
diff --git a/re2c/htdocs/index.html b/re2c/htdocs/index.html
index b027ce9b..8cc2339b 100755
--- a/re2c/htdocs/index.html
+++ b/re2c/htdocs/index.html
@@ -81,6 +81,7 @@ fixes which were incorporated.
Changelog

2007-??-??: 0.13.0

2007-??-??: 0.12.0

diff --git a/re2c/htdocs/manual.html b/re2c/htdocs/manual.html index 59dc99e7..fa04370d 100755 --- a/re2c/htdocs/manual.html +++ b/re2c/htdocs/manual.html @@ -7,7 +7,7 @@

RE2C

Section: User Commands (1)
-Updated: 01 Apr 2007
+Updated: 22 Apr 2007
Index
  @@ -15,7 +15,7 @@ Updated: 01 Apr 2007

re2c - convert regular expressions to C/C++

 

SYNOPSIS

-

re2c [-bdefghisuvVw1] [-o output] file

+

re2c [-bdefghisuvVw1] [-o output] [-c [-t header]] file

 

DESCRIPTION

re2c is a preprocessor that generates C-based recognizers from @@ -91,6 +91,8 @@ scanner code and will not be part of the output.

Implies -s. Use bit vectors as well in the attempt to coax better code out of the compiler. Most useful for specifications with more than a few keywords (e.g. for most programming languages).

+
-c
+
Generate a scanner with (f)lex-like condition support.

-d
Creates a parser that dumps information about the current position and in which state the parser is while parsing the input. This is useful to debug @@ -98,9 +100,6 @@ parser issues and states. If you use this switch you need to define a macro YYDEBUG that is called like a function with two parameters: void YYDEBUG(int state, char current). The first parameter receives the state or -1 and the second parameter receives the input at the current cursor.

-
--no-generation-date
-
Suppress date output in the generated output so that it only shows the re2c -version.
-e
Cross-compile from an ASCII platform to an EBCDIC one.

-f
@@ -121,6 +120,9 @@ Specify the output file.

-s
Generate nested ifs for some switches. Many compilers need this assist to generate better code.

+
-t
+
Create a header file that contains types for the (f)lex-like condition support. +This can only be activated when -c is in use.

-u
Generate a parser that supports Unicode chars (UTF-32). This means the generated code can deal with any valid Unicode character up to 0x10FFFF. When @@ -136,6 +138,9 @@ and cannot be used together with -e switch.

-1
Force single pass generation, this cannot be combined with -f and disables YYMAXFILL generation prior to last re2c block.

+
--no-generation-date
+
Suppress date output in the generated output so that it only shows the re2c +version.



@@ -146,6 +151,20 @@ scanners: the user must supply some interface code. In particular, the user must define the following macros or use the corresponding inplace configurations:

+
YYCONDITION
+
+This variable holds the condition prior to entering the scanner code when using +the -c switch. The value must be initialized with a value from the +YYCONDTYPE enumeration.
+
YYCONDTYPE
+
In -c mode you can use -t to generate a file that contains the +enumeration used as conditions. Each of the values refers to a condition of +a rule set.
+
YYCTXMARKER
+
l-expression of type *YYCTYPE. The generated code saves context +backtracking information in YYCTXMARKER. The user only needs to define this +macro if a scanner specification uses trailing context in one or more of its +regular expressions.

YYCTYPE
Type used to hold an input symbol. Usually char or unsigned char.

YYCURSOR
@@ -153,18 +172,12 @@ corresponding inplace configurations:

symbol. The generated code advances YYCURSOR as symbols are matched. On entry, YYCURSOR is assumed to point to the first character of the current token. On exit, YYCURSOR will point to the first character of the following token.

-
YYLIMIT
-
Expression of type *YYCTYPE that marks the end of the buffer (YYLIMIT[-1] is -the last character in the buffer). The generated code repeatedly compares -YYCURSOR to YYLIMIT to determine when the buffer needs (re)filling.

-
YYMARKER
-
l-expression of type *YYCTYPE. The generated code saves backtracking -information in YYMARKER. Some easy scanners might not use this.

-
YYCTXMARKER
-
l-expression of type *YYCTYPE. The generated code saves context -backtracking information in YYCTXMARKER. The user only needs to define this -macro if a scanner specification uses trailing context in one or more of its -regular expressions.

+
YYDEBUG(state,current)
+
This is only needed if the -d flag was specified. It allows to +easily debug the generated parser by calling a user defined function for every +state. The function should have the following signature: void YYDEBUG(int +state, char current). The first parameter receives the state or -1 and the +second parameter receives the input at the current cursor.
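As an illustration of the signature described above, here is a minimal sketch of a user-supplied YYDEBUG hook. The counters `debug_calls`, `last_state` and `last_char` are hypothetical names introduced only to make the hook observable; they are not part of re2c.

```c
#include <stdio.h>

/* Hypothetical bookkeeping, not part of re2c. */
static int  debug_calls = 0;
static int  last_state  = 0;
static char last_char   = '\0';

/* Signature as documented for -d: the state (or -1) and the input
   symbol at the current cursor. */
void YYDEBUG(int state, char current)
{
    debug_calls++;
    last_state = state;
    last_char  = current;
    /* Print the symbol both as a character and as a hex value. */
    fprintf(stderr, "state=%d char='%c' (0x%02x)\n",
            state, current, (unsigned char)current);
}
```

A scanner generated with -d would then invoke this hook on its own; writing to stderr keeps the trace separate from normal output.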

YYFILL(n)
The generated code "calls" YYFILL(n) when the buffer needs (re)filling: at least n additional characters should be provided. YYFILL(n) should adjust @@ -182,6 +195,16 @@ return a signed integer. The value must be either -1, indicating that the scanner is entered for the first time, or a value previously saved by YYSETSTATE(s). In the second case, the scanner will resume operations right after where the last YYFILL(n) was called.

+
YYLIMIT
+
Expression of type *YYCTYPE that marks the end of the buffer (YYLIMIT[-1] is +the last character in the buffer). The generated code repeatedly compares +YYCURSOR to YYLIMIT to determine when the buffer needs (re)filling.
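The relationship between the cursor and the limit can be sketched in plain C. This is only an illustration under stated assumptions: `buffer_limit`, `symbols_left` and `needs_fill` are hypothetical helper names, and the comparison shown is a hand-written approximation of the check described above, not code emitted by re2c.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char YYCTYPE;   /* type holding one input symbol */

/* Hypothetical helper: the limit points one past the last symbol,
   so limit[-1] is the last symbol in the buffer. */
static const YYCTYPE *buffer_limit(const YYCTYPE *start, size_t len)
{
    return start + len;
}

/* Hypothetical helper: number of symbols between cursor and limit. */
static size_t symbols_left(const YYCTYPE *cursor, const YYCTYPE *limit)
{
    assert(cursor <= limit);
    return (size_t)(limit - cursor);
}

/* Hypothetical helper: nonzero when fewer than n symbols remain,
   i.e. when a refill of at least n symbols would be required. */
static int needs_fill(const YYCTYPE *cursor, const YYCTYPE *limit, size_t n)
{
    return symbols_left(cursor, limit) < n;
}
```

With a three-symbol buffer, `needs_fill(cursor, limit, 4)` is true at the start of the buffer while `needs_fill(cursor, limit, 3)` is not.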

+
YYMARKER
+
l-expression of type *YYCTYPE. The generated code saves backtracking +information in YYMARKER. Some easy scanners might not use this.

+
YYMAXFILL
+
This will be automatically defined by /*!max:re2c */ blocks as explained +above.

YYSETSTATE(s)
The user only needs to define this macro if the -f flag was specified. In that case, the generated code "calls" YYSETSTATE just before @@ -193,15 +216,6 @@ when the scanner is called again, it will call YYGETSTATE() and resume execution right where it left off. The generated code will contain both YYSETSTATE(s) and YYGETSTATE() even if YYFILL(n) is being disabled.

-
YYDEBUG(state,current)
-
This is only needed if the -d flag was specified. It allows to -easily debug the generated parser by calling a user defined function for every -state. The function should have the following signature: void YYDEBUG(int -state, char current). The first parameter receives the state or -1 and the -second parameter receives the input at the current cursor.

-
YYMAXFILL
-
This will be automatically defined by /*!max:re2c */ blocks as explained -above.



@@ -253,6 +267,20 @@ the scanner code should be wrapped inside a loop.

Please see examples/push.re for push-model scanner. The generated code can be tweaked using inplace configurations "state:abort" and "state:nextlabel".

+

SCANNER WITH CONDITION SUPPORT

+

+You can precede regular expressions with a list of condition names when using the -c +switch. In this case re2c generates a scanner for each condition. Each of the +generated scanners has its own precondition. The precondition is given by the +interface variable YYCONDITION and must be of type YYCONDTYPE. +

+There are two special conditions. First, the rules of the condition '*' are +merged into all conditions. Second, the empty condition list lets you +provide a code block that does not have a scanner part, meaning it does not +allow any regular expression. The condition value referring to this special +block is always the one with the enumeration value 0. +
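To make the idea concrete, the following hand-written C sketch mimics a scanner with conditions. It is not re2c output: the enumeration values `COND_SETUP`, `COND_CODE` and `COND_COMMENT`, the token names, and the `scanner_t` struct are all hypothetical. The `condition` field plays the role of YYCONDITION, value 0 stands for the block with the empty condition list, and the end-of-input check stands for a rule that applies in every condition.

```c
/* Hypothetical stand-in for the enumeration that -t would generate. */
typedef enum { COND_SETUP = 0, COND_CODE, COND_COMMENT } YYCONDTYPE;

typedef enum { TOK_END, TOK_CODE_CHAR, TOK_COMMENT_CHAR } token_t;

typedef struct {
    const char *cursor;     /* plays the role of YYCURSOR */
    YYCONDTYPE  condition;  /* plays the role of YYCONDITION */
} scanner_t;

static void scanner_init(scanner_t *s, const char *input)
{
    s->cursor = input;
    s->condition = COND_SETUP;  /* must start with a YYCONDTYPE value */
}

static token_t scan(scanner_t *s)
{
    for (;;) {
        char c;

        if (s->condition == COND_SETUP) {
            /* Like the empty condition list: code only, no regular
               expression is matched here. */
            s->condition = COND_CODE;
            continue;
        }

        c = *s->cursor;
        /* Like a rule of condition '*': checked in every condition. */
        if (c == '\0')
            return TOK_END;
        s->cursor++;

        switch (s->condition) {
        case COND_CODE:
            if (c == '#') {               /* '#' starts a comment */
                s->condition = COND_COMMENT;
                return TOK_COMMENT_CHAR;
            }
            return TOK_CODE_CHAR;
        case COND_COMMENT:
            if (c == '\n')                /* newline ends the comment */
                s->condition = COND_CODE;
            return TOK_COMMENT_CHAR;
        default:                          /* not reached */
            return TOK_END;
        }
    }
}
```

Calling `scan` repeatedly on the input `"a#b\nc"` classifies `a` and `c` as code characters and `#`, `b` and the newline as comment characters, switching conditions along the way.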

+

SCANNER SPECIFICATIONS

Each scanner specification consists of a set of rules, named definitions and configurations.

@@ -260,7 +288,18 @@ definitions and configurations.

code that is to be executed when the associated regular expression is matched.

-
regular expression { C/C++ code }
+
regular-expression { C/C++ code }
+
+

+If -c is active then each regular expression is preceded by a list of +comma-separated condition names. Besides normal naming rules there are two +special cases. A rule may contain the single condition name '*' or no condition +name at all. In the latter case the rule cannot have a regular expression. +

+
+
<condition-list> regular-expression { C/C++ code }
+
<*> regular-expression { C/C++ code }
+
<> { C/C++ code }

Named definitions are of the form:

@@ -272,7 +311,7 @@ matched.

re2c:name = value;
re2c:name = "value";
- +

SUMMARY OF RE2C REGULAR EXPRESSIONS

"foo"
@@ -340,7 +379,7 @@ the only portable "any" rules are (.|"\n") and [^].

The regular expressions listed above are grouped according to precedence, from highest precedence at the top to lowest at the bottom. Those grouped together have equal precedence.

- +

INPLACE CONFIGURATION

It is possible to configure code generation inside re2c blocks. The following lists the available configurations:

@@ -401,6 +440,10 @@ and changes to that configuration are no longer possible. When this setting is a string the braces must be specified. Now assuming your input is a char* buffer and you are using above mentioned switches you can set YYCTYPE to unsigned char and this setting to either 1 or "(unsigned char)". +
re2c:define:YYCONDITION = YYCONDITION ;
+
Variable or function of type YYCONDTYPE used in -c mode.
+
re2c:define:YYCONDTYPE = YYCONDTYPE ;
+
Enumeration used for condition support with -c mode.
re2c:define:YYCTXMARKER = YYCTXMARKER ;
Allows to overwrite the define YYCTXMARKER and thus avoiding it by setting the value to the actual code needed.
@@ -438,15 +481,18 @@ value to the actual code needed.
Allows to overwrite the name of the variable yybm.
re2c:variable:yych = yych ;
Allows to overwrite the name of the variable yych.
+
re2c:variable:yyctable = yyctable ;
+
When both -c and -g are active then re2c uses this variable to +generate a static jump table for conditions.
re2c:variable:yytarget = yytarget ;
Allows to overwrite the name of the variable yytarget.
- +

UNDERSTANDING RE2C

The subdirectory lessons of the re2c distribution contains a few step by step lessons to get you started with re2c. All examples in the lessons subdirectory can be compiled and actually work.

- +

FEATURES

re2c does not provide a default action: the generated code assumes that the input will consist of a sequence of tokens. Typically this can be @@ -463,16 +509,16 @@ to end the scanner in case not enough characters are available which is nothing else than the detection of end of data/file.

re2c does not provide start conditions: use a separate scanner specification for each start condition (as illustrated in the above example). -

+

BUGS

Difference only works for character sets.

The re2c internal algorithms need documentation.

- +

SEE ALSO

flex(1), lex(1). More information on re2c can be found here: http://re2c.org/

- +

AUTHORS