* Reentrant::
* Lex and Posix::
* Memory Management::
+* Serialized Tables::
* Diagnostics::
* Limitations::
* Bibliography::
* Overriding The Default Memory Management::
* A Note About yytext And Memory::
+Serialized Tables
+
+* Creating Serialized Tables::
+* Loading Serialized Tables::
+* Freeing Serialized Tables::
+* Tables File Format::
+
FAQ
* When was flex born?::
somehow. Our experience has shown that a garbage collection mechanism or a
pooled memory mechanism will save you a lot of grief when writing parsers.
+@node Serialized Tables
+@chapter Serialized Tables
+@cindex serialization
+@cindex memory, serialized tables
+
+A @code{flex} scanner has the ability to save the DFA tables to a file, and
+load them runtime when needed. The motivation for this feature is to reduce
+the runtime memory footprint. Traditionally, these tables are compiled into
+the scanner as C arrays, and are sometimes quite large. Since the tables are
+compiled into the scanner, the memory used by the tables can never be freed.
+This is a waste of memory, especially if an application uses several scanners,
+but none of them at the same time.
+
+The serialization feature allows the tables to be loaded at runtime, before
+scanning begins. The tables may be discarded when scanning is finished.
+
+@menu
+* Creating Serialized Tables::
+* Loading Serialized Tables::
+* Freeing Serialized Tables::
+* Tables File Format::
+@end menu
+
+@node Creating Serialized Tables
+@section Creating Serialized Tables
+@cindex tables, creating serialized
+@cindex creating serialized tables
+
+how to create serialized tables
+
+@node Loading Serialized Tables
+@section Loading Serialized Tables
+@cindex tables, loading
+@cindex loading tables at runtime
+
+how to load serialized tables
+
+@node Freeing Serialized Tables
+@section Freeing Serialized Tables
+@cindex tables, freeing
+@cindex freeing tables
+
+how to free serialized tables
+
+@node Tables File Format
+@section Tables File Format
+@cindex tables, file format
+@cindex file format, serialized tables
+
+This section defines the file format of serialized @code{flex} tables.
+
+The tables format allows for one or more sets of tables to be
+specified, where each set corresponds to a given scanner. Scanners are
+indexed by name, as described below. The file format is as follows:
+
+@example
+@verbatim
+ TABLE SET
+ +-------------------------------+
+ Header | uint32 th_magic; |
+ | uint32 th_hsize; |
+ | uint32 th_ssize; |
+ | uint8 th_version[4];|
+ | uint16 th_flags; |
+ | char th_name[]; |
+ | char th_pad64[]; |
+ +-------------------------------+
+ Table 1 | uint16 t_id; |
+ | uint16 t_flags; |
+ | uint32 t_len[2]; |
+ | void t_data[]; |
+ | char t_pad64[]; |
+ +-------------------------------+
+ Table 2 | |
+ . . .
+ . . .
+ . . .
+ . . .
+ Table n | |
+ +-------------------------------+
+ TABLE SET 2
+ .
+ .
+ .
+ TABLE SET N
+@end verbatim
+@end example
+
+The above diagram shows that a complete set consists of a header followed by
+multiple tables. Furthermore, multiple complete sets may be present in the same
+file, each set with its own header and tables. The sets are contiguous in the
+file. The only way to know if another set follows is to check the next four
+bytes for the magic number. The header and tables sections are padded to 64-bit
+boundaries. Below we describe each field in detail. This format does not
+specify how the scanner will expand the given data, i.e., data may be
+serialized as int16, but expanded to an int32 array.
+Remember, @emph{all integer values are in network byte order}.
+
+@noindent
+Fields of a table header:
+
+@table @code
+@item th_magic
+Magic number, always 0xF13C57B1.
+@item th_hsize
+Size of this entire header, in bytes.
+@item th_ssize
+Size of this entire set, in bytes.
+@item th_version[4]
+Flex version numbers: major,minor,r1,r2. Currently flex only specfies three
+revision levels, so 'r2' will simply be zero, e.g. 2.5.13.0
+@item th_flags
+Bit flags for this table set. Currently unused.
+@item th_name[]
+Contains the name of this table set. The default is @samp{yytables},
+and is prefixed accordingly. Must be NUL-terminated.
+@item th_pad64[]
+Zero or more NUL bytes, padding the entire header to the next 64-bit boundary.
+@end table
+
+@noindent
+Fields of a table:
+
+@table @code
+@item t_id
+Specifies the table identifier. Possible values are:
+@table @code
+@item YYT_ID_ACCEPT (0x01)
+@code{yy_accept}
+@item YYT_ID_BASE (0x02)
+@code{yy_base}
+@item YYT_ID_CHK (0x03)
+@code{yy_chk}
+@item YYT_ID_DEF (0x04)
+@code{yy_def}
+@item YYT_ID_EC (0x05)
+@code{yy_ec }
+@item YYT_ID_META (0x06)
+@code{yy_meta}
+@item YYT_ID_NUL_TRANS (0x07)
+@code{yy_NUL_trans}
+@item YYT_ID_NXT (0x08)
+@code{yy_nxt}
+@item YYT_ID_RULE_CAN_MATCH_EOL (0x09)
+@code{yy_rule_can_match_eol}
+@item YYT_ID_START_STATE_LIST (0x0A)
+@code{yy_start_state_list}. This array is handled specially because it is an array of
+pointers to srtucts. See the @code{t_flags} field below.
+@item YYT_ID_TRANSITION (0x0B)
+@code{yy_transition}. This array is handled specially because it is an array of
+structs. See the the @code{t_len} field below.
+@end table
+
+@item t_flags
+Bit flags describing how to interpret the data in @code{t_data}.
+The data arrays are one-dimensional by default, but may be
+two dimensional as specified in the @code{t_len} field.
+
+@table @code
+@item YYT_DATA8 (0x01)
+The data is serialized as an array of type int8.
+@item YYT_DATA16 (0x02)
+The data is serialized as an array of type int16.
+@item YYT_DATA32 (0x04)
+The data is serialized as an array of type int32.
+@item YYT_PTRANS (0x08)
+The data is a list of indexes of entries in the expanded @code{yy_transition}
+array. Each index should be expanded to a pointer to the corresponding entry
+in the @code{yy_transition} array. It is not guaranteed that the
+@code{yy_transition} array has already been seen, but we can still calculate
+the offsets.
+@end table
+
+@item t_len[2]
+Specifies the number of elements in the array.
+If @code{t_len[0]} is non-zero, then the data is a two-dimensional array.
+Otherwise, the data is a one-dimensional array. @code{t_len[0]} contains the
+number of elements in the higher dimensional array, and @code{t_len[1]} contains the number
+of elements in the lowest dimension. Conceptually, @code{t_data} is either
+@code{int t_data[t_len[1]]}, or @code{int t_data[t_len[0]][t_len[1]]}.
+It is possible for both values of @code{t_len} to be zero, in which case
+@code{t_data} is a zero length array, and no data is loaded, i.e., this table
+is simply skipped. Note that if @code{t_id} equals
+@code{YYT_ID_TRANSITION}, then @code{t_len[1]} is the number of structs in
+the array. Each struct contains two integer fields, with the @code{yy_verify} field first,
+followed by the @code{yy_nxt} field.
+
+@item t_data[]
+The table data. This array may be a one- or two-dimensional array, of type int8,
+int16, or int32, depending upon the values in the @code{t_flags} and
+the @code{t_len} fields.
+@item t_pad64[]
+Zero or more NUL bytes, padding the table to the next 64-bit boundary.
+@end table
+
+
@node Diagnostics
@chapter Diagnostics