Python regular expression metacharacters explained, with usage examples
The role of the backslash:
To treat a metacharacter such as ^ as an ordinary character, put a backslash in front of it.
For example:
>>> import re
>>> r = r'\^abc'
>>> re.findall(r, '^abc ^abc ^abc')
['^abc', '^abc', '^abc']
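When the literal text comes from a variable, typing each backslash by hand is error-prone. The sketch below compares manual escaping with the standard library's re.escape; the sample text and variable names are made up for illustration.

```python
import re

# Assumed sample text containing several metacharacters.
text = 'price: $5.00 (approx.) ^abc'

# Escaping by hand: each metacharacter gets its own backslash.
manual = re.findall(r'\$5\.00', text)

# re.escape adds the backslashes automatically, which helps when
# the literal text is held in a variable.
literal = '^abc'
escaped = re.findall(re.escape(literal), text)

print(manual)   # ['$5.00']
print(escaped)  # ['^abc']
```

Without the escaping, ^ would be read as "start of string" and $ as "end of string", so neither pattern would find the literal characters.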
\d matches any decimal digit; it is equivalent to the class [0-9].
\D matches any non-digit character; it is equivalent to the class [^0-9].
\s matches any whitespace character; it is equivalent to the class [ \t\n\r\f\v].
\S matches any non-whitespace character; it is equivalent to the class [^ \t\n\r\f\v].
\w matches any alphanumeric character or underscore; it is equivalent to the class [a-zA-Z0-9_].
\W matches any character that is not alphanumeric or an underscore; it is equivalent to the class [^a-zA-Z0-9_].
(These equivalences are exact for bytes patterns or when the re.ASCII flag is set. For str patterns, Python 3 also matches the corresponding Unicode characters.)
>>> r = r'[0-9]'
>>> re.findall(r, '1234567890')
['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
>>> r = r'\d'
>>> re.findall(r, '1234567890')
['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
>>> r = r'^010-\d\d\d\d\d\d\d\d'
>>> re.findall(r, '010-87654321')
['010-87654321']
>>> re.findall(r, '010-8765432')
[]
>>> r = r'^010-\d{8}'  # \d repeated eight times
>>> re.findall(r, '010-12345678')
['010-12345678']
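To see all six shorthand classes at work on the same input, here is a small sketch. The sample string is invented for illustration; the last two lines show how the re.ASCII flag restricts \d to [0-9].

```python
import re

# Assumed sample text mixing letters, digits, punctuation and whitespace.
sample = 'Room 42,\tfloor_3\n'

digits = re.findall(r'\d', sample)       # each decimal digit
spaces = re.findall(r'\s', sample)       # each whitespace character
words = re.findall(r'\w+', sample)       # runs of word characters
non_words = re.findall(r'\W+', sample)   # runs of everything else

print(digits)     # ['4', '2', '3']
print(spaces)     # [' ', '\t', '\n']
print(words)      # ['Room', '42', 'floor_3']
print(non_words)  # [' ', ',\t', '\n']

# With str patterns, \d also matches non-ASCII digits such as
# U+0665 (ARABIC-INDIC DIGIT FIVE) unless re.ASCII is given.
unicode_digits = re.findall(r'\d', '5\u0665')
ascii_digits = re.findall(r'\d', '5\u0665', flags=re.ASCII)
```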
The asterisk: (*)
Matches the preceding character zero or more times.
>>> r = r'ab*'
>>> re.findall(r, 'a')
['a']
>>> re.findall(r, 'ab')
['ab']
>>> re.findall(r, 'abbbbbb')
['abbbbbb']
The plus sign: (+)
Matches the preceding character one or more times.
>>> r = r'ab+'
>>> re.findall(r, 'a')
[]
>>> re.findall(r, 'ab')
['ab']
>>> re.findall(r, 'abbbb')
['abbbb']
Making the "-" in the middle of a phone number optional (note that -* accepts any number of hyphens, including none):
>>> r = r'^010-*\d{8}'
>>> re.findall(r, '010-12345678')
['010-12345678']
>>> re.findall(r, '01012345678')
['01012345678']
>>> re.findall(r, '010---12345678')
['010---12345678']
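To put * and + side by side, the following sketch checks the same sample strings against both quantifiers. It uses re.fullmatch so that the whole string has to fit the pattern; the sample list is an assumption chosen for illustration.

```python
import re

# Assumed sample strings.
samples = ['a', 'ab', 'abbb', 'b']

# fullmatch succeeds only if the entire string fits the pattern.
star = {s: bool(re.fullmatch(r'ab*', s)) for s in samples}
plus = {s: bool(re.fullmatch(r'ab+', s)) for s in samples}

for s in samples:
    print(s, star[s], plus[s])
```

The only string on which the two differ is 'a': ab* accepts it because zero b's are allowed, while ab+ requires at least one b.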
The question mark: (?)
Matches the preceding character zero times or once.
>>> r = r'^010-?\d{8}$'
>>> re.findall(r, '010--12345678')
[]
>>> re.findall(r, '010-12345678')
['010-12345678']
>>> re.findall(r, '01012345678')
['01012345678']
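The same check can be wrapped in a small reusable function. This is only a sketch: the helper name is_valid and the compiled PHONE pattern are hypothetical, and re.fullmatch is used in place of the explicit ^ and $ anchors.

```python
import re

# Hypothetical pattern: "010", an optional hyphen, then eight digits.
PHONE = re.compile(r'010-?\d{8}')

def is_valid(number):
    """Return True if the whole string is a number in the assumed format."""
    return PHONE.fullmatch(number) is not None

candidates = ['010-12345678', '01012345678', '010--12345678', '010-1234567']
results = {c: is_valid(c) for c in candidates}

for number, ok in results.items():
    print(number, ok)
```

Compiling the pattern once avoids re-parsing it on every call, which matters mainly when many strings are validated.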
Non-greedy (minimal) matching:
By default, quantifiers are greedy and match as much as possible:
>>> r = r'ab+'
>>> re.findall(r, 'abbbbbbbbbbb')
['abbbbbbbbbbb']
For a non-greedy match, add a question mark after the quantifier so that it matches as little as possible:
>>> r = r'ab+?'
>>> re.findall(r, 'abbbbbbbbbbb')
['ab']
>>> r = r'ab*?'
>>> re.findall(r, 'abbbbbbbbbbbb')
['a']
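The difference matters most when a pattern has a closing delimiter. The sketch below uses an invented HTML-like string to contrast the greedy and non-greedy forms; it is meant as an illustration, not as a way to parse real HTML.

```python
import re

# Assumed sample markup.
html = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ runs to the last '>' in the string.
greedy = re.findall(r'<.+>', html)

# Non-greedy: .+? stops at the first '>' it can.
lazy = re.findall(r'<.+?>', html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```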
Curly braces: ({m,n})
Here m and n are decimal integers. The quantifier means the preceding character is repeated at least m times and at most n times.
>>> r = r'a{1,3}'  # a repeated one to three times
>>> re.findall(r, 'a')
['a']
>>> re.findall(r, 'aa')
['aa']
>>> re.findall(r, 'aaa')
['aaa']
>>> re.findall(r, 'aaaa')
['aaa', 'a']
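The braces also accept a single count or an open-ended range. The following sketch shows {m}, {m,} and {,n}; the sample strings are assumptions. The last line illustrates a pitfall in Python's re module: with a space inside the braces, the text is no longer parsed as a quantifier and is matched literally.

```python
import re

exactly_two = re.findall(r'a{2}', 'aaaaa')        # exactly two a's per match
two_or_more = re.findall(r'a{2,}', 'a aa aaaa')   # two or more a's
up_to_two = bool(re.fullmatch(r'a{,2}', 'aa'))    # zero to two a's
too_many = bool(re.fullmatch(r'a{,2}', 'aaa'))    # three a's is too many

# 'a{1, 3}' (with a space) is treated as literal text, so it finds
# nothing in 'aaa' but does match the literal string 'a{1, 3}'.
spaced = re.findall(r'a{1, 3}', 'aaa')
literal = re.findall(r'a{1, 3}', 'a{1, 3}')

print(exactly_two)  # ['aa', 'aa']
print(two_or_more)  # ['aa', 'aaaa']
```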
Groups: "(" and ")"
>>> import re
>>> email = r'\w{3}@\w+(\.com|\.cn)'  # (\.com|\.cn) is a group; | means "or", so it matches either .com or .cn
>>> re.match(email, 'www@owolf.com')  # match
<_sre.SRE_Match object; span=(0, 13), match='www@owolf.com'>
>>> re.match(email, 'www@owolf.cn')
<_sre.SRE_Match object; span=(0, 12), match='www@owolf.cn'>
>>> re.match(email, 'www@owolf.org')  # no match: returns None, so nothing is displayed
>>> re.findall(email, 'www@owolf.com')  # when the pattern has a group, findall returns the group's content
['.com']
>>> re.findall(email, 'www@owolf.cn')
['.cn']
(Python 3.7 and later display the match object as <re.Match object; ...>.)
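If the whole address is wanted rather than just the suffix, two options are worth sketching: read group(0) from the match object, or make the group non-capturing with (?:...) so that findall returns full matches. The second address in the sample text is invented for illustration.

```python
import re

email = r'\w{3}@\w+(\.com|\.cn)'

# A match object exposes both the whole match and each group.
m = re.match(email, 'www@owolf.com')
whole = m.group(0)    # the entire matched text
suffix = m.group(1)   # the content of the first group

# (?:...) groups without capturing, so findall returns whole matches.
email_nc = r'\w{3}@\w+(?:\.com|\.cn)'
addresses = re.findall(email_nc, 'www@owolf.com abc@test.cn')

print(whole, suffix)  # www@owolf.com .com
print(addresses)      # ['www@owolf.com', 'abc@test.cn']
```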
>>> s = '''ajhfa kasjf owolf english=chinese yes no
... printlafl int=456 yes
... floatint=789 yes
... owolf english=france yes aklfl'''  # define the string
>>> r = r'owolf english=.+ yes'  # define a regular expression
>>> re.findall(r, s)  # find every match
['owolf english=chinese yes', 'owolf english=france yes']
>>> r = r'owolf english=(.+) yes'
>>> re.findall(r, s)  # with a group, findall returns only the group's content
['chinese', 'france']
Using a group to return only the captured data is common in crawlers.
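When a pattern contains more than one group, findall returns a tuple per match, and named groups can make the result easier to read. The sketch below uses a shortened version of the string above; the group names key and value are hypothetical. It also shows that the two matches stay separate only because "." does not match a newline by default.

```python
import re

# Shortened sample text, assumed for illustration.
s = '''owolf english=chinese yes
printlafl int=456 yes
owolf english=france yes'''

# Two groups: findall returns one tuple per match.
pairs = re.findall(r'owolf (\w+)=(\w+) yes', s)

# Named groups (hypothetical names) read back through finditer.
named = [(m.group('key'), m.group('value'))
         for m in re.finditer(r'(?P<key>\w+)=(?P<value>\w+)', s)]

# With re.DOTALL, "." also matches newlines, so the greedy .+
# swallows everything up to the last " yes" and only one match remains.
across_lines = re.findall(r'owolf english=(.+) yes', s, flags=re.DOTALL)

print(pairs)  # [('english', 'chinese'), ('english', 'france')]
print(named)  # [('english', 'chinese'), ('int', '456'), ('english', 'france')]
```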
Summary
That covers the usage of Python regular expression metacharacters described in this article. I hope it is helpful to you. If you are interested, you can browse the other related topics on this site. If you find any mistakes or omissions, please leave a comment. Thank you for your support!