From: Serhiy Storchaka
Date: Thu, 4 Jan 2018 09:08:24 +0000 (+0200)
Subject: bpo-32211: Document the existing bug in re.findall() and re.finditer(). (#4695)
X-Git-Tag: v3.6.5rc1~170
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=1e6d8525f9dd3dcdc83adb93b164082c8b95d17a;p=python

bpo-32211: Document the existing bug in re.findall() and re.finditer(). (#4695)
---

diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index fae8945f8b..874c8ddce6 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -719,14 +719,21 @@ form.
       Splitting on a pattern that could match an empty string now raises
       a warning. Patterns that can only match empty strings are now rejected.
 
+
 .. function:: findall(pattern, string, flags=0)
 
    Return all non-overlapping matches of *pattern* in *string*, as a list of
    strings. The *string* is scanned left-to-right, and matches are returned in
    the order found. If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern has more than
-   one group. Empty matches are included in the result unless they touch the
-   beginning of another match.
+   one group. Empty matches are included in the result.
+
+   .. note::
+
+      Due to the limitation of the current implementation the character
+      following an empty match is not included in a next match, so
+      ``findall(r'^|\w+', 'two words')`` returns ``['', 'wo', 'words']``
+      (note missed "t"). This is changed in Python 3.7.
 
 
 .. function:: finditer(pattern, string, flags=0)
@@ -734,8 +741,7 @@ form.
    Return an :term:`iterator` yielding :ref:`match objects <match-objects>` over
    all non-overlapping matches for the RE *pattern* in *string*. The *string*
    is scanned left-to-right, and matches are returned in the order found. Empty
-   matches are included in the result unless they touch the beginning of another
-   match.
+   matches are included in the result. See also the note about :func:`findall`.
 
 
 .. function:: sub(pattern, repl, string, count=0, flags=0)
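
The behaviour described in the added note can be checked with a short script. This is only a sketch and not part of the patch: the names ``PATTERN``, ``TEXT`` and ``expected`` are illustrative, and the result listed for Python 3.7 and later is an assumption based on the note's statement that the behaviour changed in that release (a non-empty match may then start right after an empty one).

```python
import re
import sys

# Pattern and input taken from the note in the patch.
PATTERN = r'^|\w+'
TEXT = 'two words'

# findall() returns the matched strings; the first one is the empty
# match produced by '^' at position 0.
matches = re.findall(PATTERN, TEXT)

# finditer() scans the string the same way. The pattern has no groups,
# so group() gives the whole match and the two lists should agree.
iter_matches = [m.group() for m in re.finditer(PATTERN, TEXT)]

if sys.version_info >= (3, 7):
    # Assumed post-change behaviour: the "t" is no longer skipped.
    expected = ['', 'two', 'words']
else:
    # Behaviour documented by the note for 3.6 and earlier.
    expected = ['', 'wo', 'words']

print(sys.version_info[:2], matches)
```

Comparing ``matches`` with ``expected`` on a given interpreter shows which of the two behaviours it implements; ``iter_matches`` illustrates why the ``finditer`` entry simply points back to the ``findall`` note.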