This article records and summarizes what I learned and understood while reading the text chapter of the book "Python Standard Library".
Working with text is one of the most common tasks in Python. In general, the string class is the most basic standard class used for it.
1.3.6 Using groups to dissect matches
match.groups() returns a sequence of strings, in the order of the groups within the expression that matched the string.
Use group() to get the match for a single group; group(0) is the entire match.
# Group parsing
import re

text = 'This was a text--with punctuation.'
print('Input text: %s' % text)

# A word starting with 't', then non-word characters, then another word
regex = re.compile(r'(\bt\w+)\W+(\w+)')
print('Pattern: %s' % regex.pattern)

match = regex.search(text)
print('Entire match: %s' % match.group(0))
print('Word starting with "t": %s' % match.group(1))
print('Word after "t" word: %s' % match.group(2))
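The example above only calls group(), so here is a minimal sketch of how groups() relates to it. It reuses the same text and pattern; the variable name all_groups is only for illustration.

```python
import re

text = 'This was a text--with punctuation.'

# A word starting with 't', some non-word characters, then another word.
match = re.search(r'(\bt\w+)\W+(\w+)', text)

# groups() returns every captured group as a tuple, in pattern order.
all_groups = match.groups()
print(all_groups)          # ('text', 'with')

# group(n) returns one group; group(0) is the entire matched substring.
print(match.group(0))      # text--with
print(match.group(1))      # text
print(match.group(2))      # with
```

Note that groups() does not include the entire match, so match.groups()[0] corresponds to match.group(1).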
Python extends the basic grouping syntax by adding named groups. Referring to a group by its name makes it easier to modify the pattern later without also having to modify the code that uses the match results.
Syntax: (?P<name>pattern)
# Named groups
print('-' * 30)

for pattern in [
    r'^(?P<first_word>\w+)',
    r'(?P<last_word>\w+)\S*$',
    r'(?P<t_word>\bt\w+)\W+(?P<other_word>\w+)',
    r'(?P<ends_with_t>\w+t)\b',
]:
    regex = re.compile(pattern)
    match = regex.search(text)
    print('Matching "%s"' % pattern)
    print('  %r' % (match.groups(),))
    print('  %r' % (match.groupdict(),))
    print('\n')
Use groupdict() to get a dictionary that maps group names to the substrings they matched. Named groups are also included in the ordered sequence returned by groups().
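To illustrate why names make a pattern easier to change, here is a rough sketch. The revised pattern and the helper t_word_pair are hypothetical and exist only for this example; the point is that code reading groups by name keeps working when an extra group shifts the numbering.

```python
import re

text = 'This was a text--with punctuation.'

# The pattern used earlier, with two named groups.
original = re.compile(r'(?P<t_word>\bt\w+)\W+(?P<other_word>\w+)')

# Hypothetical revision: an extra unnamed group is added in front,
# which shifts the numeric index of every group after it.
revised = re.compile(r'(\w+)\s+(?P<t_word>\bt\w+)\W+(?P<other_word>\w+)')


def t_word_pair(regex, text):
    """Return the two named groups, or None if there is no match."""
    match = regex.search(text)
    if match is None:
        return None
    return match.group('t_word'), match.group('other_word')


print(t_word_pair(original, text))     # ('text', 'with')
print(t_word_pair(revised, text))      # ('text', 'with')

# Numeric access is affected by the change: group 1 is now the new group.
print(revised.search(text).group(1))   # a
```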
# Updated test_pattern()
print('-' * 30)

def test_pattern(text, patterns=[]):
    """Given the source text and a list of patterns, look for
    matches for each pattern within the text and print
    them to stdout.
    """
    # Look for each pattern in the text and print the results
    for pattern, desc in patterns:
        print('Pattern %r (%s)\n' % (pattern, desc))
        print('  %r' % text)
        for match in re.finditer(pattern, text):
            s = match.start()
            e = match.end()
            prefix = ' ' * s
            # Show the match under its position in the text, then its groups
            print('  %s%r%s  %r' % (prefix, text[s:e],
                                    ' ' * (len(text) - e), match.groups()))
            if match.groupdict():
                print('%s%s' % (' ' * (len(text) - s), match.groupdict()))
        print('')
    return

test_pattern(
    'abbaabbba',
    [(r'a((a*)(b*))', 'a followed by 0-n a and 0-n b')],
)
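The pattern passed to test_pattern() nests two groups inside a third. As a small sketch of what groups() yields in that case, the snippet below collects the results into a list instead of printing them in aligned columns; the name results is only for illustration.

```python
import re

text = 'abbaabbba'
pattern = r'a((a*)(b*))'

# Collect (matched substring, groups) for every non-overlapping match.
results = [(m.group(0), m.groups()) for m in re.finditer(pattern, text)]

for matched, groups in results:
    print(matched, groups)
# abb ('bb', '', 'bb')
# aabbb ('abbb', 'a', 'bbb')
# a ('', '', '')
```

Groups are numbered by the position of their opening parenthesis, so group 1 is the outer group and groups 2 and 3 are the inner a* and b* parts. A group that matches nothing, as in the last match, yields an empty string.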
Text in Python (ii)