flex \- fast lexical analyzer generator
.SH SYNOPSIS
.B flex
-.B [\-bcdfhilnpstvwBFILTV78+? \-C[aefFmr] \-ooutput \-Pprefix \-Sskeleton]
+.B [\-bcdfhilnpstvwBFILTV78+? \-R[b] \-C[aefFmr] \-ooutput \-Pprefix \-Sskeleton]
.B [\-\-help \-\-version]
.I [filename ...]
.SH OVERVIEW
the (experimental) facility for generating C++
scanner classes
+ Reentrant C Scanners
+ how to generate reentrant C scanners
+
+ Reentrant C Scanners With Bison Pure Parsers
+ connecting a reentrant scanner to a bison pure parser
+
+ Functions And Macros Available In Reentrant C Scanners
+ a summary of functions and macros for use within reentrant scanners
+
Incompatibilities With Lex And POSIX
how flex differs from AT&T lex and the POSIX lex
standard
fault -- you should report these sorts of errors to the email address
given below).
.TP
+.B \-R[b]
+instructs flex to generate a reentrant C scanner.
+The generated scanner may safely be used in a multi-threaded
+environment. The API for a reentrant scanner is different
+from that of a non-reentrant scanner (see
+.B "Reentrant C Scanners"
+below).
+Because of the API difference between reentrant and non-reentrant
+.I flex
+scanners, non-reentrant flex code must be modified before
+it is suitable for use with this option.
+This option is not compatible with the
+.B \-+
+option.
+.IP
+.B \-Rb
+(reentrant bison) instructs flex to generate a reentrant C scanner that is
+meant to be called by a
+.I GNU bison
+pure parser. The scanner is the same as the scanner generated by the
+.B \-R
+option, but with minor API changes for
+.I bison
+compatibility. In particular, the declaration of
+.I yylex
+is modified, and support for
+.I yylval_r
+and
+.I yylloc_r
+is incorporated. See
+.B "Reentrant C Scanners With Bison Pure Parsers"
+below.
+.IP
+The options
+.B \-R
+and
+.B \-Rb
+do not affect the performance of the scanner.
+.TP
.B \-T
makes
.I flex
meta-ecs -Cm option
perf-report -p option
read -Cr option
+ reentrant -R option
+ reentrant-bison -Rb option
stdout -t option
verbose -v option
warn opposite of -w option
.I stdin
and
.I stdout
-to be compile-time constant.
+to be compile-time constant. In a reentrant scanner, however, this is not
+a problem since initialization is performed in
+.B yylex_init
+at runtime.
.TP
.B yylineno
directs
.B yylineno.
This option is implied by
.B %option lex-compat.
+In a reentrant C scanner, the macro
+.B yylineno_r
+is accessible regardless of the value of
+.BR "%option yylineno" ;
+however, its value is not modified by flex unless
+.B %option yylineno
+is enabled.
.TP
.B yywrap
if unset (i.e.,
yy_push_state, yy_pop_state, yy_top_state
yy_scan_buffer, yy_scan_bytes, yy_scan_string
+ yyget_extra, yyset_extra, yyget_leng, yyget_text,
+ yyget_lineno, yyset_lineno, yyget_in, yyset_in,
+ yyget_out, yyset_out, yyget_lval, yyset_lval,
+ yyget_lloc, yyset_lloc
+
.fi
(though
.B yy_push_state()
IMPORTANT: the present form of the scanning class is
.I experimental
and may change considerably between major releases.
+.SH REENTRANT C SCANNERS
+.PP
+Flex has the ability to generate a reentrant C scanner. This is
+accomplished by specifying
+.BR " %option reentrant " ( "-R" ") or " "%option reentrant-bison " ( "-Rb" ).
+The generated scanner is both portable and safe to use in one or more separate threads of control.
+The most common use for reentrant scanners is from within multi-threaded applications.
+Any thread may create and execute a reentrant
+.B flex
+scanner without the need for synchronization with other threads.
+.PP
+However, there are other uses for a reentrant scanner.
+For example, you could scan two or more files simultaneously to implement a 'diff'
+at the token level (i.e., instead of at the character level):
+.nf
+
+    /* Example of maintaining more than one active scanner. */
+
+    /* Declared outside the loop so the loop condition can see them. */
+    int tok1, tok2;
+
+    do {
+        tok1 = yylex( scanner_1 );
+        tok2 = yylex( scanner_2 );
+
+        if( tok1 != tok2 ) {
+            printf("Files are different.\\n");
+            break;
+        }
+
+    } while ( tok1 && tok2 );
+
+.fi
+.PP
+Another use for a reentrant scanner is recursion.
+(Note that a recursive scanner can also be created using a non-reentrant scanner and
+buffer states. See
+.BR "Multiple Input Buffers" ,
+above.)
+The following crude scanner supports the "eval" command by invoking another
+instance of itself.
+.nf
+
+ /* Example of recursive invocation. */
+
+ %option reentrant
+
+ %%
+ "eval(".+")" {
+ void * scanner;
+ yylex_init( &scanner );
+ yytext_r[yyleng_r-1] = '\\0';
+
+ yy_scan_string( yytext_r + 5, scanner );
+ yylex( scanner );
+
+ yylex_destroy( scanner );
+ }
+ ...
+ %%
+
+.fi
+.PP
+The API for reentrant scanners is different from that of
+non-reentrant scanners. Here is a quick overview of the API:
+.TP
+.B 1.
+.B %option reentrant
+must be specified.
+.TP
+.B 2.
+All functions take one additional argument:
+.IR yy_globals .
+.TP
+.B 3.
+All global variables are replaced by their "_r" equivalents.
+.TP
+.B 4.
+.B yylex_init
+and
+.B yylex_destroy
+must be called before and after
+.BR yylex ,
+respectively.
+.TP
+.B 5.
+Accessor methods (get/set functions) provide access to common flex variables.
+.TP
+.B 6.
+User-specific data can be stored in
+.IR yyextra_r .
+.PP
+The above list is explained in detail below. First, an example of a reentrant scanner:
+.nf
+
+ /* This scanner prints "//" comments. */
+
+ %option reentrant stack
+ %x COMMENT
+
+ %%
+ "//" yy_push_state( COMMENT, yy_globals);
+ .|\\n
+
+ <COMMENT>\\n yy_pop_state( yy_globals );
+
+ <COMMENT>[^\\n]+ fprintf( yyout_r, "%s\\n", yytext_r);
+
+ %%
+ int main ( int argc, char * argv[] )
+ {
+ void * scanner;
+
+ yylex_init ( &scanner );
+ yylex ( scanner );
+ yylex_destroy ( scanner );
+
+ return 0;
+ }
+
+.fi
+.br
+.PP
+.B 1. %option reentrant
+must be specified.
+.RS
+.PP
+Notice that
+.B %option reentrant
+is specified in the above example. Had this option not been
+specified, flex would have happily generated a non-reentrant scanner without
+complaining. You may explicitly specify
+.BR "%option noreentrant" ,
+if you do
+.I not
+want a reentrant scanner, although it is not necessary. The default is to
+generate a non-reentrant scanner.
+.RE
+.PP
+.B 2.
+All functions take one additional argument:
+.IR yy_globals .
+.RS
+.PP
+Notice that the calls to
+.B yy_push_state
+and
+.B yy_pop_state
+both have an argument,
+.IR yy_globals ,
+that is not present in a non-reentrant scanner.
+Here are the declarations of
+.B yy_push_state
+and
+.B yy_pop_state
+in the generated scanner.
+.nf
+
+ static void yy_push_state ( int new_state , void * yy_globals ) ;
+ static void yy_pop_state ( void * yy_globals ) ;
+.fi
+.PP
+Notice that the argument
+.I yy_globals
+appears in the declaration of both functions.
+In fact, all flex functions in a reentrant scanner have this additional argument.
+It is always the last argument in the argument list, it is always of type
+.BR "void *" ,
+and it is always named
+.IR yy_globals .
+As you may have guessed,
+.I yy_globals
+is a pointer to an opaque data structure encapsulating the current state of the
+scanner.
+For a list of function declarations, see "Functions And Macros Available In
+Reentrant C Scanners" below.
+.PP
+Note that preprocessor macros, such as
+.BR BEGIN ,
+.BR ECHO ", and " REJECT ,
+do not take this additional argument.
+.RE
+.PP
+.B 3.
+All global variables are replaced by their
+.B _r
+equivalents.
+.RS
+.PP
+Notice in the above example that
+.B yyout
+and
+.B yytext
+are replaced by
+.B yyout_r
+and
+.BR yytext_r .
+These
+are macros that will expand to their equivalent lvalue.
+All of the familiar flex globals have been replaced by their
+.B _r
+equivalents. Wherever you would
+normally use
+.B yytext
+in actions, you must use
+.B yytext_r
+instead. This rule
+applies to all flex variables. The following is an example that uses the
+.B _r
+macros:
+.nf
+
+ %%
+ #define SWAP(a,b) do{int t=a; a=b; b=t;}while(0)
+
+ "reverse me" {
+ int i;
+
+ for( i =0; i < yyleng_r/2 ; i++ )
+ SWAP( yytext_r[i], yytext_r[yyleng_r-i-1] );
+
+ fprintf( yyout_r, "%s", yytext_r );
+ }
+
+.fi
+.PP
+One important thing to remember about
+.B yytext_r
+and friends is that they are not global variables in a reentrant
+scanner, so you cannot access them directly from outside an action or from
+other functions. You must use the accessor methods, such as
+.BR yyget_text ,
+to accomplish this (see below).
+.RE
+.PP
+.B 4.
+.BR yylex_init " and " yylex_destroy " must be called before and after "
+.BR yylex ", respectively."
+.RS
+.nf
+
+ int yylex_init ( void ** ptr_yy_globals ) ;
+ int yylex ( void * yy_globals ) ;
+ int yylex_destroy ( void * yy_globals ) ;
+
+.fi
+.PP
+The function
+.B yylex_init
+must be called before calling any other function. The
+argument to
+.B yylex_init
+is the address of an uninitialized pointer to be filled
+in by flex. The contents of
+.I ptr_yy_globals
+need not be initialized, since flex
+will overwrite it anyway. The value stored in
+.I ptr_yy_globals
+should thereafter be passed
+to
+.B yylex
+and
+.BR yylex_destroy .
+Flex does not save the argument passed to
+.BR yylex_init ,
+so it is safe to pass the address of a local pointer to
+.BR yylex_init .
+.PP
+The function
+.B yylex
+should be familiar to you by now. The reentrant version
+takes one argument, which is the value returned (via an argument) by
+.BR yylex_init .
+Otherwise, it behaves the same as the non-reentrant version of
+.BR yylex .
+.PP
+The function
+.B yylex_destroy
+should be called to free resources used by the
+scanner. After
+.B yylex_destroy
+is called, the contents of
+.I yy_globals
+should not be
+used.
+.PP
+Of course, there is no need to destroy a scanner if you plan to reuse it.
+A flex scanner (both reentrant and non-reentrant) may be restarted by calling
+.BR yyrestart .
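+.PP
+As a rough sketch of such reuse, the hypothetical helper
+.B scan_files
+(not part of flex) scans several files with a single scanner. It assumes
+that the reentrant
+.B yyrestart
+takes the new input stream followed by the usual
+.I yy_globals
+argument, like the other functions described here.
+.nf
+
+    /* Sketch only: scan_files is an illustrative name. */
+
+    void scan_files ( int nfiles, char * names[] )
+        {
+        void * scanner;
+        int i;
+
+        yylex_init( &scanner );
+
+        for ( i = 0; i < nfiles; ++i )
+            {
+            FILE * fp = fopen( names[i], "r" );
+
+            if ( ! fp )
+                continue;   /* skip unreadable files */
+
+            /* Assumed signature: stream first, scanner last. */
+            yyrestart( fp, scanner );
+
+            while ( yylex( scanner ) != 0 )
+                ;
+
+            fclose( fp );
+            }
+
+        yylex_destroy( scanner );
+        }
+
+.fi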
+.RE
+.PP
+.B 5.
+Accessor methods (get/set functions) provide access to common flex variables.
+.RS
+.PP
+Many scanners that you build will be part of a larger project. Portions of your
+project will need access to flex values, such as
+.BR yytext .
+In a non-reentrant
+scanner, these values are global, so there is no problem. However, in a
+reentrant scanner, there are no global flex values. You can not access them
+directly.
+.PP
+Instead, you must access flex values using accessor methods (get/set
+functions). Each accessor method is named
+.B yyget_NAME
+or
+.BR yyset_NAME ,
+where
+.B NAME
+is the name of the flex variable you want. For example:
+.nf
+
+ /* Set the last character of yytext to NUL. */
+
+ void chop ( void * scanner )
+ {
+ int len = yyget_leng( scanner );
+ yyget_text( scanner )[len - 1] = '\\0';
+ }
+.fi
+.PP
+The above code may be called from within an action like this:
+.nf
+
+ %%
+ .+\\n { chop( yy_globals ); }
+
+.fi
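+.PP
+Accessor methods are equally useful in driver code. As a minimal sketch, the
+hypothetical function
+.B count_tokens
+(not part of flex) points a scanner at an already-open stream with
+.B yyset_in
+and counts the tokens returned by
+.BR yylex .
+It assumes that the actions return a nonzero value for each token.
+.nf
+
+    /* Sketch only: count_tokens is an illustrative name. */
+
+    int count_tokens ( FILE * fp )
+        {
+        void * scanner;
+        int ntokens = 0;
+
+        yylex_init( &scanner );
+        yyset_in( fp, scanner );
+
+        /* yylex returns 0 at end of input. */
+        while ( yylex( scanner ) != 0 )
+            ++ntokens;
+
+        yylex_destroy( scanner );
+        return ntokens;
+        }
+
+.fi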
+.RE
+.PP
+.B 6.
+User-specific data can be stored in
+.IR yyextra_r .
+.RS
+.PP
+In a reentrant scanner, it is unwise to use global variables to
+communicate with or maintain state between different pieces of your program.
+However, you may need access to external data or invoke external functions
+from within the scanner actions.
+Likewise, you may need to pass information to your scanner
+(e.g., open file descriptors, or database connections).
+In a non-reentrant scanner, the only way to do this would be through the
+use of global variables.
+.PP
+Flex allows you to store arbitrary, "extra" data in a scanner.
+This data is accessible through the accessor methods
+.B yyget_extra
+and
+.B yyset_extra
+from outside the scanner, and through the shortcut macro
+.B yyextra_r
+from within the scanner itself. They are defined as
+.nf
+
+ #define YY_EXTRA_TYPE void*
+
+ YY_EXTRA_TYPE yyget_extra ( void * scanner ) ;
+ void yyset_extra ( YY_EXTRA_TYPE arbitrary_data , void * scanner) ;
+
+.fi
+.PP
+By default,
+.B YY_EXTRA_TYPE
+is defined as type
+.BR "void *" .
+You will have to cast
+.B yyextra_r
+and the return value from
+.B yyget_extra
+to the appropriate type each time you access the extra data.
+To avoid casting, you may override the default type by defining
+.B YY_EXTRA_TYPE
+in section 1 of your scanner:
+.nf
+
+ /* An example of overriding YY_EXTRA_TYPE. */
+
+ %{
+ #include <sys/stat.h>
+ #include <unistd.h>
+
+ #define YY_EXTRA_TYPE struct stat*
+ %}
+
+ %option reentrant
+
+ %%
+
+ __filesize__ printf( "%ld", (long) yyextra_r->st_size );
+ __lastmod__ printf( "%ld", (long) yyextra_r->st_mtime );
+
+ %%
+
+ void scan_file( char* filename )
+ {
+ void * scanner;
+ struct stat buf;
+
+ yylex_init ( &scanner );
+ yyset_in( fopen(filename,"r"), scanner );
+
+ stat( filename, &buf);
+ yyset_extra( &buf, scanner );
+
+ yylex ( scanner );
+ yylex_destroy( scanner );
+ }
+
+.fi
+.RE
+.SH REENTRANT C SCANNERS WITH BISON PURE PARSERS
+.PP
+This section describes the flex features useful when integrating
+.B flex
+with
+.BR bison .
+Skip this section if you are not using
+.B bison
+with your scanner.
+Here we discuss only the flex half of the flex/bison pair.
+We do not discuss bison in any detail.
+For more information about generating pure bison parsers, see the
+.SM GNU
+Bison Manual.
+.PP
+A bison-compatible scanner is generated by declaring
+.B %option reentrant-bison
+or by supplying
+.B -Rb
+when invoking flex.
+This instructs flex that the macros
+.B yylval_r
+and
+.B yylloc_r
+may be used. The data types for
+.B yylval_r
+and
+.B yylloc_r
+.RB ( YYSTYPE
+and
+.BR YYLTYPE )
+are typically defined in a header file, included in section 1 of
+the flex input file.
+.B %option reentrant-bison
+implies
+.BR "%option reentrant" .
+.PP
+If
+.B %option reentrant-bison
+is specified, flex provides support for the functions
+.BR yyget_lval ,
+.BR yyset_lval ,
+.BR yyget_lloc ,
+and
+.BR yyset_lloc ,
+defined below, and the corresponding macros
+.B yylval_r
+and
+.BR yylloc_r ,
+for use within actions.
+.nf
+
+ YYSTYPE * yyget_lval ( void * scanner ) ;
+ void yyset_lval ( YYSTYPE * lvalp, void * scanner );
+
+ YYLTYPE * yyget_lloc ( void * scanner ) ;
+ void yyset_lloc ( YYLTYPE * llocp, void * scanner );
+
+.fi
+.PP
+Accordingly, the declaration of yylex becomes one of the following:
+.nf
+
+ int yylex ( YYSTYPE * lvalp, void * scanner );
+ int yylex ( YYSTYPE * lvalp, YYLTYPE * llocp, void * scanner );
+
+.fi
+.PP
+Note that the macros
+.B yylval_r
+and
+.B yylloc_r
+evaluate to pointers.
+Support for yylloc is optional in bison, so it is optional
+in flex as well. This support is automatically handled
+by flex.
+Specifically, support for yylloc is only present in a flex scanner if the preprocessor
+symbol
+.B YYLTYPE
+is defined.
+.PP
+The following is an example of a flex scanner that is bison-compatible.
+.nf
+
+ /* Scanner for "C" assignment statements... sort of. */
+
+ %{
+ #include <stdlib.h> /* atoi */
+ #include <string.h> /* strdup */
+ #include "y.tab.h" /* Generated by bison. */
+ %}
+
+ %option reentrant-bison
+
+ %%
+
+ [[:digit:]]+ { yylval_r->num = atoi(yytext_r); return NUMBER; }
+ [[:alnum:]]+ { yylval_r->str = strdup(yytext_r); return STRING; }
+ "="|";" { return yytext_r[0]; }
+
+ . { }
+ %%
+
+.fi
+.PP
+As you can see, there really is no magic here. We just use
+.B yylval_r
+as we would any other variable. The data type of
+.B yylval_r
+is generated by bison, and included in the file
+.IR y.tab.h .
+Here is the corresponding bison parser:
+.nf
+
+ /* Parser to convert "C" assignments to lisp. */
+
+ %{
+ #include <stdio.h> /* printf */
+ /* Pass the argument to yyparse through to yylex. */
+ #define YYPARSE_PARAM scanner
+ #define YYLEX_PARAM scanner
+ %}
+
+ %pure_parser
+
+ %union {
+ int num;
+ char* str;
+ }
+ %token <str> STRING
+ %token <num> NUMBER
+ %%
+
+ assignment:
+ STRING '=' NUMBER ';' {
+ printf( "(setf %s %d)", $1, $3 );
+ }
+ ;
+
+.fi
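+.PP
+To tie the two halves together, a driver might look like the following
+sketch. It assumes the bison behavior in which defining
+.B YYPARSE_PARAM
+makes
+.B yyparse
+accept a single
+.B "void *"
+argument; the prototypes are written out by hand here for illustration only.
+.nf
+
+    /* Sketch only: not generated by flex or bison. */
+
+    int yylex_init ( void ** ptr_yy_globals );
+    int yylex_destroy ( void * yy_globals );
+    int yyparse ( void * scanner );   /* assumed prototype */
+
+    int main ( void )
+        {
+        void * scanner;
+        int status;
+
+        yylex_init( &scanner );
+        status = yyparse( scanner );   /* yyparse hands scanner to yylex */
+        yylex_destroy( scanner );
+
+        return status;
+        }
+
+.fi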
+.SH FUNCTIONS AND MACROS AVAILABLE IN REENTRANT C SCANNERS
+.PP
+Functions available in a reentrant scanner:
+.nf
+
+ char *yyget_text ( void * scanner );
+ int yyget_leng ( void * scanner );
+
+ FILE *yyget_in ( void * scanner );
+ FILE *yyget_out ( void * scanner );
+ int yyget_lineno ( void * scanner );
+ YY_EXTRA_TYPE yyget_extra ( void * scanner );
+
+ void yyset_in ( FILE * in_str , void * scanner );
+ void yyset_out ( FILE * out_str , void * scanner );
+ void yyset_lineno ( int line_number , void * scanner );
+ void yyset_extra ( YY_EXTRA_TYPE user_defined , void * scanner );
+
+.fi
+.IP
+\- There are no "set" functions for yytext_r and yyleng_r. This is intentional.
+.PP
+Macro shortcuts available in actions in a reentrant scanner:
+.nf
+
+ yytext_r
+ yyleng_r
+ yyin_r
+ yyout_r
+ yylineno_r
+ yyextra_r
+
+.fi
+.IP
+\- In a reentrant C scanner, support for yylineno_r is always present
+(i.e., you may access yylineno_r), but the value is never modified by flex unless
+.B %option yylineno
+is enabled. This is to allow the user to maintain the line count herself.
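+.IP
+For instance, a scanner that leaves
+.B %option yylineno
+disabled could keep its own count. The rules below are a sketch of that idea;
+they assume the count starts at whatever value the scanner was initialized with
+or was last given through
+.BR yyset_lineno .
+.nf
+
+    %option reentrant
+
+    %%
+    \\n     ++yylineno_r;   /* count lines ourselves */
+    .       ;               /* discard everything else */
+
+.fi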
+.PP
+Additional functions and macros made available when
+.B %option reentrant-bison
+.RB ( -Rb )
+is specified:
+.nf
+
+ YYSTYPE * yyget_lval ( void * scanner );
+ YYLTYPE * yyget_lloc ( void * scanner );
+ void yyset_lval ( YYSTYPE * yylvalp , void * scanner );
+ void yyset_lloc ( YYLTYPE * yyllocp , void * scanner );
+
+ yylval_r
+ yylloc_r
+
+.fi
+.IP
+\- Support for yylloc is dependent upon the presence of the preprocessor
+symbol
+.BR YYLTYPE .
+Support for yylval requires the type
+.B YYSTYPE
+to be defined. Typically, these definitions are generated by bison, in a .h file,
+and are included in section 1 of the flex input.
.SH INCOMPATIBILITIES WITH LEX AND POSIX
.I flex
is a rewrite of the AT&T Unix