Regular expressions
When writing a program or web page that handles strings, you often need to find strings that match certain complex rules. Regular expressions are the tool for describing these rules. In other words, a regular expression is code that records a text rule.
Common syntax
Special usages and phenomena
Adding a ? after a quantifier turns off greedy matching and switches it to non-greedy (lazy) mode. The most commonly used lazy form is .*?x, which matches any characters up to the first x that is found.
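A minimal sketch of the difference between greedy and lazy matching. The sample string and variable names are assumptions chosen for illustration.

```python
import re

text = "axbxcx"  # hypothetical sample string

# Greedy: .* grabs as much as possible, then backs off to the last x.
greedy = re.search(r".*x", text).group()

# Lazy: .*? grabs as little as possible and stops at the first x.
lazy = re.search(r".*?x", text).group()

print(greedy)  # axbxcx
print(lazy)    # ax
```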
Python's re module has eight main methods:
import re

# findall(pattern, string): return a list of every match, or an empty list if there is none
ret = re.findall(r'\d+', 'fsf4131s4fsg74dsf')
print(ret)
# Result: ['4131', '4', '74']

# search(pattern, string): find the first match anywhere in the string.
# Returns a match object (read it with group()), or None if nothing matches.
ret2 = re.search(r'\d+', 'fng231523jsk6313')
print(ret2)
print(ret2.group())
# Result (the object is shown as _sre.SRE_Match before Python 3.7):
# <re.Match object; span=(3, 9), match='231523'>
# 231523

# match(pattern, string): like search, but only matches at the start of the string
ret3 = re.match(r'\d+', '3fsfs')
print(ret3)
print(ret3.group())
# Result:
# <re.Match object; span=(0, 1), match='3'>
# 3

# split(pattern, string): cut the string at every match and return the remaining pieces.
# A match at the very start or end leaves an empty string in the list.
ret4 = re.split(r'\d+', '3fsfs565df6s5dg6gd2')
print(ret4)
ret4.remove('')
ret4.remove('')
print(ret4)
# If the pattern is wrapped in a group, the text that was cut out is kept in the result
res = re.split(r'(\d+)', 'fsfs5df6s5dg6gd')
print(res)
# Result:
# ['', 'fsfs', 'df', 's', 'dg', 'gd', '']
# ['fsfs', 'df', 's', 'dg', 'gd']
# ['fsfs', '5', 'df', '6', 's', '5', 'dg', '6', 'gd']

# sub(pattern, replacement, string): return the string with matches replaced.
# By default every match is replaced; count limits the number of replacements.
ret5 = re.sub(r'\d+', 'B', 'afssggsdg45sg45s4g5s', count=3)
print(ret5)
# Result: afssggsdgBsgBsBg5s

# subn: same as sub, but returns a tuple of (new string, number of replacements)
ret6 = re.subn(r'\d+', 'B', 'fsd4gg5h4gzx323s', count=3)
print(ret6)
# Result: ('fsdBggBhBgzx323s', 3)

# compile: compile a pattern once, then reuse it with the other methods
res = re.compile(r'-[1-9]\d*[.\d]*|-0\.\d*[1-9]\d*')
ret7 = res.findall('-205fsf-21sd-2.5g6gg5g')
print(ret7)
# Result: ['-205', '-21', '-2.5']

# finditer: like findall, but returns an iterator of match objects;
# call group() on each one to get the matched text
ret8 = re.finditer(r'\d+', 'f56s5f3s')
for i in ret8:
    print(i)
    print(i.group())
# Result:
# <re.Match object; span=(1, 3), match='56'>
# 56
# <re.Match object; span=(4, 5), match='5'>
# 5
# <re.Match object; span=(6, 7), match='3'>
# 3
Group priority
findall gives priority to groups: if the pattern contains a group, only the content of the group is returned. To cancel this priority, make the group non-capturing with (?:regular expression).
split keeps the text that was cut out in the result list when the pattern contains a group.
search: if the pattern contains groups, group(n) returns the content matched by group n.
import re

# With a capturing group, findall returns only the group's content
ret = re.findall(r'-0\.\d+|-[1-9]\d*(\.\d+)?', '-1asdada-200')
print(ret)  # Result: ['', '']

# With a non-capturing group, findall returns the whole match
ret = re.findall(r'-0\.\d+|-[1-9]\d*(?:\.\d+)?', '-1asdada-200')
print(ret)  # Result: ['-1', '-200']

ret = re.findall(r'www\.baidu\.com|www\.oldboy\.com', 'www.oldboy.com')
print(ret)  # Result: ['www.oldboy.com']

ret = re.findall(r'www\.(baidu|oldboy)\.com', 'www.oldboy.com')
print(ret)  # Result: ['oldboy']

ret = re.findall(r'www\.(?:baidu|oldboy)\.com', 'www.oldboy.com')
print(ret)  # Result: ['www.oldboy.com']

ret = re.split(r'\d+', 'alex83egon20taibai40')
print(ret)  # Result: ['alex', 'egon', 'taibai', '']

ret = re.split(r'(\d+)', 'alex83egon20taibai40')
print(ret)  # Result: ['alex', '83', 'egon', '20', 'taibai', '40', '']

ret = re.search(r'\d+(\.\d+\.\d+)(\.\d+)', '1.2.3.4-2*(60+(-40.35/5)-(-4*3))')
print(ret.group(0))  # Result: 1.2.3.4 (the argument 0 can be omitted)
print(ret.group(1))  # Result: .2.3
print(ret.group(2))  # Result: .4
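Group priority in findall also applies when the pattern has more than one group: each match then comes back as a tuple of group contents. A small sketch, reusing the sample string from the split example. The patterns and variable names are assumptions for illustration.

```python
import re

text = "alex83egon20taibai40"  # sample string reused from the split example

# Two capturing groups: findall returns one (letters, digits) tuple per match.
pairs = re.findall(r"([a-z]+)(\d+)", text)
print(pairs)  # [('alex', '83'), ('egon', '20'), ('taibai', '40')]

# Non-capturing groups: findall goes back to returning the whole match.
whole = re.findall(r"(?:[a-z]+)(?:\d+)", text)
print(whole)  # ['alex83', 'egon20', 'taibai40']
```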
Group naming
(?P<name>regular expression) gives a group a name.
(?P=name) refers back to that named group; the text matched at this position must be exactly the same as the text the group matched.
import re
ret = re.search(r"<(?P<name>\w+)>(?P<content>\w+)</(?P=name)>", "
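A minimal sketch of named groups using the same pattern. The input strings here are hypothetical, since the string from the original example is not available.

```python
import re

# Hypothetical sample input; not taken from the original example.
sample = "<h1>hello</h1>"
pattern = r"<(?P<name>\w+)>(?P<content>\w+)</(?P=name)>"

ret = re.search(pattern, sample)
print(ret.group("name"))     # h1
print(ret.group("content"))  # hello
print(ret.group())           # <h1>hello</h1>

# The closing tag must repeat the text captured by "name", so this one fails.
mismatch = re.search(pattern, "<h1>hello</h2>")
print(mismatch)  # None
```

Because (?P=name) must repeat the captured text exactly, an opening and closing tag that differ produce no match at all.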
Using groups by index
import re
ret = re.search(r"<(\w+)>\w+</(\w+)><\1>\w+</\2>", "
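A minimal sketch of referring to a group by its index. It uses a simplified pattern and a hypothetical input string, not the ones from the original example: \1 must repeat whatever group 1 captured.

```python
import re

# Hypothetical sample input and simplified pattern, for illustration only.
sample = "<h1>hello</h1>"

# \1 must repeat the text captured by group 1 (the tag name).
ret = re.search(r"<(\w+)>(\w+)</\1>", sample)
print(ret.group(1))  # h1
print(ret.group(2))  # hello
print(ret.groups())  # ('h1', 'hello')
```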
Tool URL: http://tool.chinaz.com/regex/?qq-pf-to=pcqq.group
Regular expressions and the re module in Python