Reposted from: 81117919
Here is a pitfall I ran into when matching strings with findall. I am sharing it so that others can avoid it.
Example:
Two patterns, regex A and regex B, differ only by a ?: inside the parentheses, yet findall returns different results for them.
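The difference can be reproduced with a minimal sketch like the following. The sample text and the exact patterns are assumptions made up for illustration, since the original snippet is not reproduced here. Only the capturing versus non-capturing group matters.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "alice@qq.com, bob@163.com, carol@126.com"

# Regex A: the domain alternatives sit in a capturing group.
regex_a = r"\w+@(qq|163|126)\.com"
# Regex B: the same pattern, but the group is non-capturing.
regex_b = r"\w+@(?:qq|163|126)\.com"

result_a = re.findall(regex_a, text)
result_b = re.findall(regex_b, text)

print(result_a)  # ['qq', '163', '126']
print(result_b)  # ['alice@qq.com', 'bob@163.com', 'carol@126.com']
```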
The effect of ?: is to turn a capturing group into a non-capturing group.
What are capturing groups and non-capturing groups?
(qq|163|126) ---> A plain pair of parentheses like this is a capturing group.
(?:qq|163|126) ---> Adding ?: right after the opening parenthesis turns the capturing group into a non-capturing group.
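One way to see the distinction directly is to ask a compiled pattern how many capturing groups it has. This is a small sketch using only the standard re module; the variable names are made up for illustration.

```python
import re

capturing = re.compile(r"(qq|163|126)")
non_capturing = re.compile(r"(?:qq|163|126)")

# Pattern.groups is the number of capturing groups in the pattern.
print(capturing.groups)      # 1
print(non_capturing.groups)  # 0

m1 = capturing.search("163")
m2 = non_capturing.search("163")

# Both patterns match the same text; only the captured groups differ.
print(m1.group(0), m1.groups())  # 163 ('163',)
print(m2.group(0), m2.groups())  # 163 ()
```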
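What the findall documentation says
We can read the documentation of findall with help(re.findall). In essence it says: return a list of all non-overlapping matches in the string; if one or more capturing groups are present in the pattern, return a list of groups, which will be a list of tuples if the pattern has more than one group.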
In plain terms:
When the pattern contains one group, findall returns only the text matched by that group, as a list. When the pattern contains more than one group, the groups of each match are combined into a tuple, and findall returns a list of those tuples.
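The three cases (no group, one group, several groups) can be sketched as follows. The sample text and patterns are hypothetical and chosen only to show the shape of the return value.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "alice@qq.com, bob@163.com"

# No group: findall returns the full matches.
no_group = re.findall(r"\w+@\w+\.com", text)
# One group: findall returns only what the group matched.
one_group = re.findall(r"(\w+)@\w+\.com", text)
# Two groups: findall returns one tuple per match.
two_groups = re.findall(r"(\w+)@(\w+)\.com", text)

print(no_group)    # ['alice@qq.com', 'bob@163.com']
print(one_group)   # ['alice', 'bob']
print(two_groups)  # [('alice', 'qq'), ('bob', '163')]
```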
Back to the example:
Once capturing and non-capturing groups are told apart, the email-matching puzzle in the first example is resolved.
Regex A: r"\[email protected](qq|163|126).com" contains a capturing group, so findall returns only the group contents, giving the list ['qq', '163', '126'].
Regex B: r"\[email protected](?:qq|163|126).com" uses ?: to turn the capturing group into a non-capturing group, so findall returns each match from beginning to end, giving the list of email addresses ['[email protected]', '[email protected]', '[email protected]'].
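If both the full address and the captured provider are needed, one possible alternative is to keep the capturing group and iterate with re.finditer, reading group(0) for the whole match. This is a sketch under assumed sample data and an assumed pattern, not something taken from the original example.

```python
import re

# Hypothetical sample text and pattern, invented for this illustration.
text = "alice@qq.com, bob@163.com"
pattern = re.compile(r"\w+@(qq|163|126)\.com")

# group(0) is the whole match; group(1) is the first capturing group.
pairs = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]

print(pairs)  # [('alice@qq.com', 'qq'), ('bob@163.com', '163')]
```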
Python: how findall results differ between capturing groups (xxx) and non-capturing groups (?:xxx)