-Re2c has two options for submatch extraction.
+Submatch extraction in re2c is based on the lookahead-TDFA algorithm described
+in the
+`Tagged Deterministic Finite Automata with Lookahead <https://arxiv.org/abs/1907.08837>`_
+paper. The algorithm uses the notion of "tags" --- markers placed at positions
+in the regular expression for which the lexer must determine the
+corresponding position in the input string.
+Re2c provides two options for submatch extraction: the first one allows the use of
+raw tags, and the second one allows the use of the more conventional parenthesized
+capturing groups.
The first option is ``-T --tags``. With this option one can use standalone tags
of the form ``@stag`` and ``#mtag``, where ``stag`` and ``mtag`` are arbitrary
-used-defined names. Tags can be used anywhere inside of a regular expression;
-semantically they are just position markers. Tags of the form ``@stag`` are
+user-defined names. Tags can be used anywhere inside of a regular expression.
+Tags of the form ``@stag`` are
called s-tags: they denote a single submatch value (the last input position
where this tag matched). Tags of the form ``#mtag`` are called m-tags: they
denote multiple submatch values (the whole history of repetitions of this tag).
POSIX-compliant disambiguation: each subexpression matches as long as possible,
and subexpressions that start earlier in regular expression have priority over
those starting later. Capturing groups are translated into s-tags under the
-hood, therefore we use the word "tag" to describe them as well.
+hood.
-With both ``-P --posix-captures`` and ``T --tags`` options re2c uses efficient
-submatch extraction algorithm described in the
-`Tagged Deterministic Finite Automata with Lookahead <https://arxiv.org/abs/1907.08837>`_
-paper. The overhead on submatch extraction in the generated lexer grows with the
+The overhead on submatch extraction in the generated lexer grows with the
number of tags --- if this number is moderate, the overhead is barely
noticeable. In the lexer tags are implemented using a number of tag variables
generated by re2c. There is no one-to-one correspondence between tag variables