--- /dev/null
+1 TDFA(0) 45 452 250 63 135 339 247 12.86 10.27 99.09 55.83
+2 TDFA(0) 18 70 32 15 31 41 31 7.66 5.47 71.60 33.90
+3 TDFA(0) 23 252 152 39 75 203 155 10.01 6.01 111.76 73.75
+4 TDFA(0) 16 26 17 11 19 23 19 8.34 3.55 102.72 59.84
+
+
+1 TDFA(1) 42 457 183 55 139 213 151 6.43 5.59 67.00 27.93
+2 TDFA(1) 16 73 33 15 35 41 31 5.30 3.83 63.30 26.74
+3 TDFA(1) 20 256 115 35 75 138 103 6.78 3.23 104.36 51.00
+4 TDFA(1) 13 28 19 11 19 25 23 6.04 3.12 100.28 47.85
+
+
+1 DFA -- 414 135 35 111 145 91 4.96 4.46 62.04 23.67
+2 DFA -- 69 25 15 31 31 23 4.90 3.34 62.00 23.59
+3 DFA -- 198 67 23 55 73 55 7.06 3.19 97.87 51.37
+4 DFA -- 22 10 11 15 14 15 5.89 2.66 97.95 47.01
+
+
+1 TDFA(0) 45 452 295 63 59 352 267 11.95 10.30 65.47 36.95
+2 TDFA(0) 18 70 31 15 19 31 31 7.12 7.30 31.81 17.44
+3 TDFA(0) 23 252 165 39 35 181 151 8.36 8.58 39.51 31.81
+4 TDFA(0) 16 26 20 11 11 22 23 7.14 6.67 23.19 18.73
+
+
+1 TDFA(1) 42 457 171 55 51 144 111 6.01 5.40 15.94 10.53
+2 TDFA(1) 16 73 29 15 19 29 27 5.24 4.43 13.50 8.84
+3 TDFA(1) 20 256 127 55 31 130 107 5.21 4.81 12.02 10.01
+4 TDFA(1) 13 28 17 11 11 19 19 4.02 3.08 8.56 6.90
+
+
+1 DFA -- 414 123 35 39 75 51 4.71 4.76 10.88 5.61
+2 DFA -- 69 19 11 15 15 15 4.64 3.94 11.00 5.77
+3 DFA -- 198 60 19 23 39 35 4.04 4.06 9.13 8.17
+4 DFA -- 22 7 11 11 8 11 3.90 2.52 8.00 4.40
+
+
+1 TDFA(0) 2054 625 816 275 267 1107 839 14.11 13.25 105.58 59.60
+2 TDFA(0) 72 106 57 23 55 73 55 8.61 6.77 72.96 34.63
+3 TDFA(0) 611 280 426 127 151 536 463 10.39 7.51 127.35 75.23
+4 TDFA(0) 79 29 33 19 23 43 39 7.43 4.05 105.06 61.74
+
+
+1 TDFA(1) 149 462 200 63 147 233 167 6.47 5.90 68.43 29.09
+2 TDFA(1) 44 82 39 19 43 49 39 6.00 5.39 63.79 27.37
+3 TDFA(1) 64 256 131 43 87 156 123 6.74 3.54 103.91 51.08
+4 TDFA(1) 40 31 28 15 23 36 31 6.27 3.32 101.79 48.15
+
\begin{multicols}{2}
From these examples we can draw the following conclusions.
-First, TDFA(1) are generally better than TDFA(0): delaying register operations allows to get rid of many conflicts.
+First, TDFA(1) is generally better than TDFA(0): delaying register operations eliminates many conflicts.
Second, both kinds of automata are only suitable for RE with modest levels of ambiguity
and low submatch detail: TDFA can be applied to full parsing, but other methods would probably outperform them.
However, RE of this form are very common in practice, and for them TDFA can be very efficient.
\subsection*{Fixed tags}
-It may happen that two tags in TRE are bound: separated by a fixed number of characters, so that
+It may happen that two tags in TRE are separated by a fixed number of characters:
each offset of one tag is equal to the corresponding offset of the other tag plus some static offset.
%the value of one tag is always equal to the value of the other plus some static offset.
In this case we can track only one of the tags; we say that the second tag is \emph{fixed} on the first one.
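As a small illustration (a hypothetical TRE that is not among the benchmarks; the tag names $t_1$ and $t_2$ are notation assumed here),
consider $a^* \, t_1 \, b \, b \, t_2 \, c^*$.
Every match places exactly two symbols between the tags, so
\[
    t_2 = t_1 + 2
\]
and it suffices to track $t_1$ alone.
On input $aabbc$ the prefix $aa$ is consumed before $t_1$, which gives $t_1 = 2$ and therefore $t_2 = 4$.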
\hline \hline
\multicolumn{12}{|c|}{re2c} \\
\hline
- TDFA(0) & 45 & 452 & 250 & 63 & 135 & 339 & 247 & 12.88 & 10.31 & 99.12 & 55.91 \\
- TDFA(1) & 42 & 457 & 183 & 55 & 139 & 213 & 151 & 6.42 & 5.59 & 67.04 & 27.96 \\
- DFA & -- & 414 & 135 & 35 & 111 & 145 & 91 & 4.96 & 4.46 & 62.15 & 23.74 \\
-% TDFA(0) & 45 & 452 & 255712 & 63544 & 137320 & 346408 & 252024 & 12.88 & 10.31 & 99.12 & 55.91 \\
-% TDFA(1) & 42 & 457 & 186600 & 55352 & 141416 & 217160 & 153720 & 6.42 & 5.59 & 67.04 & 27.96 \\
-% DFA & -- & 414 & 137816 & 34864 & 112728 & 148048 & 92256 & 4.96 & 4.46 & 62.15 & 23.74 \\
+ TDFA(0) & 45 & 452 & 250 & 63 & 135 & 339 & 247 & 12.86 & 10.27 & 99.09 & 55.83 \\
+ TDFA(1) & 42 & 457 & 183 & 55 & 139 & 213 & 151 & 6.43 & 5.59 & 67.00 & 27.93 \\
+ DFA & -- & 414 & 135 & 35 & 111 & 145 & 91 & 4.96 & 4.46 & 62.04 & 23.67 \\
\hline \hline
\multicolumn{12}{|c|}{re2c -b} \\
\hline
- TDFA(0) & 45 & 452 & 295 & 63 & 59 & 352 & 267 & 11.96 & 10.31 & 65.53 & 36.98 \\
- TDFA(1) & 42 & 457 & 171 & 55 & 51 & 144 & 111 & 6.01 & 5.40 & 15.96 & 10.59 \\
- DFA & -- & 414 & 123 & 35 & 39 & 75 & 51 & 4.73 & 4.78 & 10.93 & 5.63 \\
-% TDFA(0) & 45 & 452 & 301968 & 63544 & 59496 & 360136 & 272504 & 11.96 & 10.31 & 65.53 & 36.98 \\
-% TDFA(1) & 42 & 457 & 174903 & 55352 & 51304 & 147016 & 112760 & 6.01 & 5.40 & 15.96 & 10.59 \\
-% DFA & -- & 414 & 125389 & 34864 & 39000 & 76272 & 51296 & 4.73 & 4.78 & 10.93 & 5.63 \\
+ TDFA(0) & 45 & 452 & 295 & 63 & 59 & 352 & 267 & 11.95 & 10.30 & 65.47 & 36.95 \\
+ TDFA(1) & 42 & 457 & 171 & 55 & 51 & 144 & 111 & 6.01 & 5.40 & 15.94 & 10.53 \\
+ DFA & -- & 414 & 123 & 35 & 39 & 75 & 51 & 4.71 & 4.76 & 10.88 & 5.61 \\
\hline \hline
\multicolumn{12}{|c|}{re2c --no-optimize-tags} \\
\hline
- TDFA(0) & 2054 & 625 & 816 & 275 & 267 & 1107 & 839 & 14.14 & 13.24 & 105.87 & 59.71 \\
- TDFA(1) & 149 & 462 & 200 & 63 & 147 & 233 & 167 & 6.64 & 5.90 & 68.50 & 29.39 \\
-% TDFA(0) & 2054 & 625 & 835285 & 280632 & 272488 & 1132616 & 858232 & 14.14 & 13.24 & 105.87 & 59.71 \\
-% TDFA(1) & 149 & 462 & 204119 & 63544 & 149608 & 238568 & 170104 & 6.64 & 5.90 & 68.50 & 29.39 \\
+ TDFA(0) & 2054 & 625 & 816 & 275 & 267 & 1107 & 839 & 14.11 & 13.25 & 105.58 & 59.60 \\
+ TDFA(1) & 149 & 462 & 200 & 63 & 147 & 233 & 167 & 6.47 & 5.90 & 68.43 & 29.09 \\
\hline
\end{tabular}\\*
\medskip
\hline \hline
\multicolumn{12}{|c|}{re2c} \\
\hline
- TDFA(0) & 18 & 70 & 32 & 15 & 31 & 41 & 31 & 7.65 & 5.50 & 71.60 & 33.96 \\
- TDFA(1) & 16 & 73 & 33 & 15 & 35 & 41 & 31 & 5.31 & 3.83 & 63.36 & 26.78 \\
- DFA & -- & 69 & 25 & 15 & 31 & 31 & 23 & 4.90 & 3.34 & 62.12 & 23.64 \\
-% TDFA(0) & & & & 14392 & 30816 & 41160 & 30840 & 7.65 & 5.50 & 71.60 & 33.96 \\
-% TDFA(1) & & & & 14392 & 34912 & 41704 & 30840 & 5.31 & 3.83 & 63.36 & 26.78 \\
-% DFA & -- & 69 & 24937 & 14384 & 30808 & 31280 & 22624 & 4.90 & 3.34 & 62.12 & 23.64 \\
+ TDFA(0) & 18 & 70 & 32 & 15 & 31 & 41 & 31 & 7.66 & 5.47 & 71.60 & 33.90 \\
+ TDFA(1) & 16 & 73 & 33 & 15 & 35 & 41 & 31 & 5.30 & 3.83 & 63.30 & 26.74 \\
+ DFA & -- & 69 & 25 & 15 & 31 & 31 & 23 & 4.90 & 3.34 & 62.00 & 23.59 \\
\hline \hline
\multicolumn{12}{|c|}{re2c -b} \\
\hline
- TDFA(0) & 18 & 70 & 31 & 15 & 19 & 31 & 31 & 7.12 & 7.31 & 31.85 & 17.47 \\
- TDFA(1) & 16 & 73 & 29 & 15 & 19 & 29 & 27 & 5.25 & 4.42 & 13.52 & 8.86 \\
- DFA & -- & 69 & 19 & 11 & 15 & 15 & 15 & 4.66 & 3.96 & 11.00 & 5.79 \\
-% TDFA(0) & & & & 14392 & 18528 & 31336 & 30840 & 7.12 & 7.31 & 31.85 & 17.47 \\
-% TDFA(1) & & & & 14392 & 18528 & 29288 & 26744 & 5.25 & 4.42 & 13.52 & 8.86 \\
-% DFA & -- & 69 & 18472 & 10288 & 14424 & 14832 & 14432 & 4.66 & 3.96 & 11.00 & 5.79 \\
+ TDFA(0) & 18 & 70 & 31 & 15 & 19 & 31 & 31 & 7.12 & 7.30 & 31.81 & 17.44 \\
+ TDFA(1) & 16 & 73 & 29 & 15 & 19 & 29 & 27 & 5.24 & 4.43 & 13.50 & 8.84 \\
+ DFA & -- & 69 & 19 & 11 & 15 & 15 & 15 & 4.64 & 3.94 & 11.00 & 5.77 \\
\hline \hline
\multicolumn{12}{|c|}{re2c --no-optimize-tags} \\
\hline
- TDFA(0) & 72 & 106 & 57 & 23 & 55 & 73 & 55 & 8.61 & 6.77 & 73.05 & 34.68 \\
- TDFA(1) & 44 & 82 & 39 & 19 & 43 & 49 & 39 & 6.01 & 5.38 & 63.87 & 27.44 \\
-% TDFA(0) & 72 & 106 & 57956 & 22584 & 55400 & 73928 & 55416 & 8.61 & 6.77 & 73.05 & 34.68 \\
-% TDFA(1) & 44 & 82 & 39674 & 18488 & 43112 & 49480 & 39032 & 6.01 & 5.38 & 63.87 & 27.44 \\
+ TDFA(0) & 72 & 106 & 57 & 23 & 55 & 73 & 55 & 8.61 & 6.77 & 72.96 & 34.63 \\
+ TDFA(1) & 44 & 82 & 39 & 19 & 43 & 49 & 39 & 6.00 & 5.39 & 63.79 & 27.37 \\
\hline
\end{tabular}\\*
\medskip
\hline \hline
\multicolumn{12}{|c|}{re2c} \\
\hline
- TDFA(0) & 23 & 252 & 152 & 39 & 75 & 203 & 155 & 10.03 & 6.10 & 111.90 & 73.81 \\
- TDFA(1) & 20 & 256 & 115 & 35 & 75 & 138 & 103 & 6.75 & 3.24 & 104.56 & 50.90 \\
- DFA & -- & 198 & 67 & 23 & 55 & 73 & 55 & 7.05 & 3.21 & 97.89 & 51.43 \\
-% TDFA(0) & 23 & 252 & 154776 & 38960 & 75864 & 207600 & 157792 & 10.03 & 6.10 & 111.90 & 73.81 \\
-% TDFA(1) & 20 & 256 & 117498 & 34864 & 75864 & 140560 & 104544 & 6.75 & 3.24 & 104.56 & 50.90 \\
-% DFA & -- & 198 & 67617 & 22576 & 55384 & 74384 & 55392 & 7.05 & 3.21 & 97.89 & 51.43 \\
+ TDFA(0) & 23 & 252 & 152 & 39 & 75 & 203 & 155 & 10.01 & 6.01 & 111.76 & 73.75 \\
+ TDFA(1) & 20 & 256 & 115 & 35 & 75 & 138 & 103 & 6.78 & 3.23 & 104.36 & 51.00 \\
+ DFA & -- & 198 & 67 & 23 & 55 & 73 & 55 & 7.06 & 3.19 & 97.87 & 51.37 \\
\hline \hline
\multicolumn{12}{|c|}{re2c -b} \\
\hline
- TDFA(0) & 23 & 252 & 165 & 39 & 35 & 181 & 151 & 8.40 & 8.56 & 39.56 & 31.84 \\
- TDFA(1) & 20 & 256 & 127 & 55 & 31 & 130 & 107 & 5.23 & 4.83 & 12.04 & 10.02 \\
- DFA & -- & 198 & 60 & 19 & 23 & 39 & 35 & 4.05 & 4.08 & 9.23 & 8.19 \\
-% TDFA(0) & 23 & 252 & 168684 & 38960 & 34904 & 186704 & 153696 & 8.40 & 8.56 & 39.56 & 31.84 \\
-% TDFA(1) & 20 & 256 & 129322 & 55344 & 30808 & 132912 & 108640 & 5.23 & 4.83 & 12.04 & 10.02 \\
-% DFA & -- & 198 & 60759 & 18480 & 22616 & 39376 & 34912 & 4.05 & 4.08 & 9.23 & 8.19 \\
+ TDFA(0) & 23 & 252 & 165 & 39 & 35 & 181 & 151 & 8.36 & 8.58 & 39.51 & 31.81 \\
+ TDFA(1) & 20 & 256 & 127 & 55 & 31 & 130 & 107 & 5.21 & 4.81 & 12.02 & 10.01 \\
+ DFA & -- & 198 & 60 & 19 & 23 & 39 & 35 & 4.04 & 4.06 & 9.13 & 8.17 \\
\hline \hline
\multicolumn{12}{|c|}{re2c --no-optimize-tags} \\
\hline
- TDFA(0) & 611 & 280 & 426 & 127 & 151 & 536 & 463 & 10.41 & 7.56 & 127.48 & 75.46 \\
- TDFA(1) & 64 & 256 & 131 & 43 & 87 & 156 & 123 & 6.74 & 3.55 & 103.98 & 51.12 \\
-% TDFA(0) & 611 & 280 & 435350 & 129072 & 153696 & 548272 & 473184 & 10.41 & 7.56 & 127.48 & 75.46 \\
-% TDFA(1) & 64 & 256 & 133518 & 43056 & 88160 & 159248 & 125024 & 6.74 & 3.55 & 103.98 & 51.12 \\
+ TDFA(0) & 611 & 280 & 426 & 127 & 151 & 536 & 463 & 10.39 & 7.51 & 127.35 & 75.23 \\
+ TDFA(1) & 64 & 256 & 131 & 43 & 87 & 156 & 123 & 6.74 & 3.54 & 103.91 & 51.08 \\
\hline
\end{tabular}\\*
\medskip
\hline \hline
\multicolumn{12}{|c|}{re2c} \\
\hline
- TDFA(0) & 16 & 26 & 17 & 11 & 19 & 23 & 19 & 8.34 & 3.57 & 102.84 & 59.88 \\
- TDFA(1) & 13 & 28 & 19 & 11 & 19 & 25 & 23 & 6.06 & 3.14 & 100.33 & 48.02 \\
- DFA & -- & 22 & 10 & 11 & 15 & 14 & 15 & 5.91 & 2.68 & 98.10 & 47.25 \\
-% TDFA(0) & & & & 10288 & 18520 & 22960 & 18528 & 8.34 & 3.57 & 102.84 & 59.88 \\
-% TDFA(1) & & & & 10288 & 18520 & 25424 & 22624 & 6.06 & 3.14 & 100.33 & 48.02 \\
-% DFA & -- & & & 10288 & 14424 & 14256 & 14432 & 5.91 & 2.68 & 98.10 & 47.25 \\
+ TDFA(0) & 16 & 26 & 17 & 11 & 19 & 23 & 19 & 8.34 & 3.55 & 102.72 & 59.84 \\
+ TDFA(1) & 13 & 28 & 19 & 11 & 19 & 25 & 23 & 6.04 & 3.12 & 100.28 & 47.85 \\
+ DFA & -- & 22 & 10 & 11 & 15 & 14 & 15 & 5.89 & 2.66 & 97.95 & 47.01 \\
\hline \hline
\multicolumn{12}{|c|}{re2c -b} \\
\hline
- TDFA(0) & 16 & 26 & 20 & 11 & 11 & 22 & 23 & 7.17 & 6.66 & 23.21 & 18.77 \\
- TDFA(1) & 13 & 28 & 17 & 11 & 11 & 19 & 19 & 4.05 & 3.09 & 8.59 & 6.94 \\
- DFA & -- & 22 & 7 & 11 & 11 & 8 & 11 & 3.92 & 2.56 & 8.06 & 4.42 \\
-% TDFA(0) & & & & 10288 & 10328 & 22352 & 22624 & 7.17 & 6.66 & 23.21 & 18.77 \\
-% TDFA(1) & & & & 10288 & 10328 & 18960 & 18528 & 4.05 & 3.09 & 8.59 & 6.94 \\
-% DFA & -- & & 6483 & 10288 & 10328 & 7888 & 10336 & 3.92 & 2.56 & 8.06 & 4.42 \\
+ TDFA(0) & 16 & 26 & 20 & 11 & 11 & 22 & 23 & 7.14 & 6.67 & 23.19 & 18.73 \\
+ TDFA(1) & 13 & 28 & 17 & 11 & 11 & 19 & 19 & 4.02 & 3.08 & 8.56 & 6.90 \\
+ DFA & -- & 22 & 7 & 11 & 11 & 8 & 11 & 3.90 & 2.52 & 8.00 & 4.40 \\
\hline \hline
\multicolumn{12}{|c|}{re2c --no-optimize-tags} \\
\hline
- TDFA(0) & 79 & 29 & 33 & 19 & 23 & 43 & 39 & 7.46 & 3.94 & 105.22 & 61.72 \\
- TDFA(1) & 40 & 31 & 28 & 15 & 23 & 36 & 31 & 6.29 & 3.33 & 102.00 & 48.22 \\
-% TDFA(0) & 79 & 29 & 33745 & 18480 & 22624 & 43504 & 39008 & 7.46 & 3.94 & 105.22 & 61.72 \\
-% TDFA(1) & 40 & 31 & 28013 & 14384 & 22624 & 36080 & 30816 & 6.29 & 3.33 & 102.00 & 48.22 \\
+ TDFA(0) & 79 & 29 & 33 & 19 & 23 & 43 & 39 & 7.43 & 4.05 & 105.06 & 61.74 \\
+ TDFA(1) & 40 & 31 & 28 & 15 & 23 & 36 & 31 & 6.27 & 3.32 & 101.79 & 48.15 \\
\hline
\end{tabular}\\*
\medskip
\item \! [Cox10] Russ Cox, \textit{"Regular Expression Matching in the Wild"}, March 2010, \\
https://swtch.com/\textasciitilde rsc/regexp/regexp3.html
+ \item https://github.com/google/re2/issues/146
\end{enumerate}