re Module
Be able to read regular expressions.
Be able to write a regular expression for a given task.
Be able to use the re module to work with regular expressions.
Regular Expressions
A regular expression is a set of rules used to check whether a string meets a given requirement, and to find the content in a string that meets it. Online testing site: http://tool.chinaz.com/regex/
Metacharacters: for more general matches
Metacharacter | Matches
. | Any character except a line break
^ | The beginning of the string only
$ | The end of the string only
\w | A letter, digit, or underscore
\s | Any whitespace character
\d | A digit
\n | A line break
\t | A tab
\W | Any character that is not a letter, digit, or underscore
\S | Any non-whitespace character
\D | Any non-digit character
a|b | Either a or b
() | The expression inside the parentheses; also defines a group
[ ] | Any one character in the character group
[^ ] | Any one character that is not in the character group
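The metacharacters above can be tried out with a short script. This is only a sketch: the sample strings and variable names are made up for illustration, and it uses re.findall, re.match and re.search from the standard re module.

```python
import re

# Hypothetical sample text containing spaces, a tab, and a line break
text = "user_1 paid 42 dollars\ton 2024-01-05\n"

# \d matches one digit; findall returns every non-overlapping match
digits = re.findall(r"\d", text)

# \w matches letters, digits, and underscore, so the first "word" is user_1
first_word = re.match(r"\w+", text).group()

# \s matches whitespace: spaces, the tab, and the line break
whitespace = re.findall(r"\s", text)

# [ ] is a character group, [^ ] is its negation
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# ^ and $ anchor the pattern to the beginning and end of the string
starts_with_user = bool(re.search(r"^user", text))
ends_with_digit = bool(re.search(r"\d$", "room 101"))

# a|b matches either alternative; . matches anything except a line break
alternatives = re.findall(r"cat|dog", "cat, dog, bird")
dots = re.findall(r"a.c", "abc a\nc a-c")

print(first_word, vowels, non_vowels, alternatives, dots)
```

Note that "a\nc" is not matched by a.c, because . does not match a line break.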
Quantifiers:
Quantifier | Usage notes
* | Repeat zero or more times
+ | Repeat one or more times
? | Repeat zero or one time
{n} | Repeat exactly n times
{n,} | Repeat n or more times
{n,m} | Repeat n to m times
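A quick way to check each quantifier is to run it against a small sample. The sketch below uses made-up strings; it also uses \b (word boundary), which is not in the table above and is added here only to keep the digit examples from matching inside longer numbers.

```python
import re

# * allows zero repetitions, + requires at least one
star = re.findall(r"ba*", "b ba baa")
plus = re.findall(r"ba+", "b ba baa")

# ? makes the preceding character optional
optional = re.findall(r"colou?r", "color colour")

numbers = "7 42 123 4567"
exactly_three = re.findall(r"\b\d{3}\b", numbers)     # {n}
two_or_more = re.findall(r"\b\d{2,}\b", numbers)      # {n,}
two_to_three = re.findall(r"\b\d{2,3}\b", numbers)    # {n,m}

print(star, plus, optional)
print(exactly_three, two_or_more, two_to_three)
```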
How .*? works:
. matches any character
* repeats it zero or more times
? switches the quantifier to non-greedy mode
Together, .*?x takes as few characters as possible, stopping at the first x
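The difference is easiest to see by running the lazy and greedy forms on the same input. A minimal sketch, with a made-up sample string:

```python
import re

text = "aaxbbxccx"

# Lazy: stop at the first x
lazy = re.match(r".*?x", text).group()

# Greedy: run to the last x
greedy = re.match(r".*x", text).group()

print(lazy)    # aax
print(greedy)  # aaxbbxccx
```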
Additional usage notes:
* + ? { }:
Note: *, + and the other quantifiers are greedy by default, that is, they match as much as possible. Adding ? after a quantifier makes it lazy, so it matches as little as possible.
Character sets [ ] and [^ ]:
Groups ( ) and alternation |:
An ID number is a string of 15 or 18 characters. If it has 15 characters, they are all digits and the first cannot be 0.
If it has 18 characters, the first 17 are all digits and the last may be a digit or X.
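One possible pattern for the ID number rule combines a group, alternation, and quantifiers. This is a sketch under assumptions, not a full validator: it assumes the non-zero first digit applies to the 18-character form as well, accepts both x and X as the last character, and checks the format only. The names ID_PATTERN and looks_like_id are hypothetical.

```python
import re

# Either 15 digits, or 17 digits followed by a digit or x/X.
# Assumption: the first digit is non-zero in both forms.
ID_PATTERN = re.compile(r"(?:[1-9]\d{14}|[1-9]\d{16}[0-9xX])")

def looks_like_id(candidate):
    """Return True if the whole string has the 15- or 18-character format."""
    # fullmatch plays the role of ^ and $ here: the entire string must match
    return ID_PATTERN.fullmatch(candidate) is not None

print(looks_like_id("110105199003071234"))  # 18 digits
print(looks_like_id("11010519900307123X"))  # 17 digits + X
print(looks_like_id("010105900307123"))     # starts with 0
```

The non-capturing group (?: ) keeps the two alternatives together without creating a capture.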
Escape character \:
In a regular expression, many metacharacters such as \d and \s have special meanings. To match the literal text "\d" instead of a digit, the backslash itself has to be escaped, so the regular expression becomes \\d.
In Python, both the regular expression and the text to be matched are strings, and \ also has a special meaning inside a string literal, so it would need to be escaped again: the regular expression \\d would be written as '\\\\d'. Raw strings avoid this: write r'\d' to match a digit and r'\\d' to match the literal text "\d".
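The two layers of escaping can be checked directly. A small sketch with a made-up sample string that contains a literal backslash followed by d:

```python
import re

# Raw string: the text really contains the two characters \ and d
text = r"the token \d means digit 7"

# r"\d" is the regex metacharacter: it matches the digit 7
digit_matches = re.findall(r"\d", text)

# r"\\d" is an escaped backslash followed by d: it matches the literal \d
literal_raw = re.findall(r"\\d", text)

# Without a raw string, each backslash has to be doubled again
literal_plain = re.findall("\\\\d", text)

print(digit_matches)                 # ['7']
print(literal_raw == literal_plain)  # True
```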
Greedy match:
By default, a quantifier matches the longest string that still satisfies the pattern.
Several common non-greedy forms: *?, +?, ??, {n,m}?, {n,}?
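Appending ? to a quantifier switches it from greedy to non-greedy. The following sketch compares the two modes side by side; the sample strings are made up for illustration.

```python
import re

html = "<b>bold</b><i>italic</i>"

# +  runs to the last >, +? stops at the first one
greedy_tags = re.findall(r"<.+>", html)
lazy_tags = re.findall(r"<.+?>", html)

# {n,m} takes as many as allowed, {n,m}? as few as allowed
greedy_digits = re.match(r"\d{2,4}", "123456").group()
lazy_digits = re.match(r"\d{2,4}?", "123456").group()

# ? prefers one occurrence, ?? prefers zero
greedy_optional = re.match(r"ab?", "ab").group()
lazy_optional = re.match(r"ab??", "ab").group()

print(greedy_tags)
print(lazy_tags)
print(greedy_digits, lazy_digits, greedy_optional, lazy_optional)
```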
Common methods of the re module
import re

# findall takes two arguments: the regular expression and the string to search.
# It returns a list containing every match of the regular expression.
ret = re.findall('a', 'eva egon yuan')
print(ret)  # ['a', 'a']
ret = re.findall(r'\d+', 'dsaglhlkdfh1892494kashdgkj h127839')
print(ret)  # ['1892494', '127839']

# search scans the string until it finds the first match, then returns an
# object holding the match information; call group() on it to get the matched
# text. If nothing matches, search returns None, and calling group() on None
# raises an error.
ret = re.search('a', 'eavegonyaun').group()
print(ret)  # a

# Differences between search and findall:
# 1. search returns as soon as it finds one match; findall looks for all of them
# 2. findall returns a list of results directly; search returns a match object

# match works like search with ^ added to the regular expression: it only
# matches at the beginning of the string. This string starts with 'e', so ret
# is None and nothing is printed.
ret = re.match('a', 'eva egon yuan')
if ret:
    print(ret.group())

# sub: replace the first two digits with 'H'
ret = re.sub(r'\d', 'H', 'eva3egon4yuan4', count=2)
print(ret)  # evaHegonHyuan4

# subn: replace every digit with 'H' and return a tuple
# (result, number of replacements)
ret = re.subn(r'\d', 'H', 'eva3egon4yuan4')
print(ret)  # ('evaHegonHyuanH', 3)

# split: by default the matched delimiters do not appear in the result list
ret = re.split(r'\d+', 'eva3egon4yuan')
print(ret)  # ['eva', 'egon', 'yuan']
# If the pattern is placed in a group, the delimiters are kept in the result list
ret = re.split(r'(\d+)', 'eva162784673egon44yuan')
print(ret)  # ['eva', '162784673', 'egon', '44', 'yuan']

# compile: useful when the same regular expression is executed many times,
# for example when matching phone numbers in a file; compiling once saves time
obj = re.compile(r'\d{3}')
ret1 = obj.search('abc123eeee')
ret2 = obj.findall('abc123eeee')
print(ret1.group())  # 123
print(ret2)  # ['123']

# finditer suits cases with many results and saves memory
ret = re.finditer(r'\d', 'ds3sy4784a')
print(ret)  # <callable_iterator object at 0x10195f940>
print(next(ret).group())  # the first result
print(next(ret).group())  # the second result
print([i.group() for i in ret])  # the remaining results
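Because search returns None when nothing matches, a compiled pattern is usually wrapped with a check before calling group(). A minimal sketch, assuming a list of lines stands in for a file; the names THREE_DIGITS and first_three_digits are hypothetical.

```python
import re

# Compiled once, reused for every line
THREE_DIGITS = re.compile(r"\d{3}")

def first_three_digits(line):
    """Return the first run of three digits in line, or None if there is none."""
    match = THREE_DIGITS.search(line)
    # Guard against None before calling group()
    return match.group() if match else None

lines = ["abc123eeee", "no digits here", "x4567"]
results = [first_three_digits(line) for line in lines]
print(results)  # ['123', None, '456']
```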
Group:
When a quantifier should apply to a sub-expression as a whole, put that sub-expression in a group.
# Groups and the re module
import re

ret1 = re.findall(r'www\.(baidu|oldboy)\.com', 'www.baidu.com')
ret2 = re.findall(r'www\.(?:baidu|oldboy)\.com', 'www.baidu.com')
print(ret1)  # ['baidu']
print(ret2)  # ['www.baidu.com']
# findall gives priority to the content of the group and returns that.
# To cancel the group priority, add ?: at the beginning of the group.
# What grouping is for:
# 1. Applying a quantifier to a group of regular rules as a whole
# 2. Giving the content of the group priority in the results of the full regular rule
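Both uses of grouping can be seen in a short sketch. The sample strings are made up; re.fullmatch and Match.group(1) are standard re module calls.

```python
import re

# 1. A quantifier after a group repeats the whole group
group_repeated = re.fullmatch(r"(?:ab){3}", "ababab") is not None

# Without the group, {3} applies only to the character b
char_repeated = re.fullmatch(r"ab{3}", "ababab") is not None

# 2. With search, group() is the whole match and group(1) is the first group
m = re.search(r"www\.(baidu|oldboy)\.com", "visit www.oldboy.com today")
whole = m.group()
inner = m.group(1)

print(group_repeated, char_repeated)  # True False
print(whole, inner)                   # www.oldboy.com oldboy
```

Unlike findall, search keeps both the full match and the group content available on the same match object.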