Python 3 Regular Expressions
'''
A regular expression (often abbreviated in code as regex, regexp, or RE) is a concept from computer science.
Regular expressions are usually used to find and replace text that conforms to a certain pattern (rule).
A regular expression is a logical formula for operating on strings: ordinary characters (for example, the letters a to z)
and special characters (called "metacharacters") are combined into a "rule string" that expresses a filtering logic for strings.
In other words, a regular expression is a text pattern that describes one or more strings to match when searching text.
'''

'''
First, be clear that the methods provided by str only do exact matching.
Regular expressions give us fuzzy matching, and they are used through the re module.
'''
# a = 'hobbyer is a student!'
# print(a.find('b'))              # find
# chg = a.replace('is', 'are')    # replace
# print(chg)
# print(a.split('u'))             # split
'''
Blog Park: Infi_chu
'''

import re  # import the re module
# print(re.findall(r'b\w{2}', 'hobbyer is a student!'))  # findall(pattern, string, flags): returns every match as a list

'''
Regular expression metacharacters (11):
. ^ $ * + ? {} [] | () \
'''
## . is a wildcard
# a1 = 'hello world'
# print(re.findall('w..l', a1))               # each . matches exactly one character
# print(re.findall('w...l', 'w\norld'))       # . matches any character except a newline, so nothing is found here
# print(re.split('[ac]', 'ddabcffadacd'))     # splits left to right; the adjacent 'ac' near the end leaves an empty string in the result
# print(re.sub('e.l', 'wc', a1))              # replace
#
# a3 = re.compile(r'\.com')                   # compile a pattern once when it is used many times; this improves efficiency
# a3_out = a3.findall('www.example.com.cn')
# print(a3_out)

## ^
# print(re.findall('w...d', a1))
# print(re.findall('^w...d', a1))             # matches only at the start of the string; if the string does not start with the pattern, nothing matches
#
## $
# print(re.findall('h...o$', a1))             # matches only at the end of the string
#
## * repeats the preceding character zero or more times
# print(re.findall('b.*u', 'http://www.baidu.com'))  # .* stands for any number of characters
# print(re.findall('.*', a))                  # the result ends with '': that is the zero-length (matched 0 times) case

## + repeats the preceding character one or more times
# print(re.findall('.+', a))                  # no empty string appears in the result
#
## ? matches the preceding character zero or one time
# print(re.findall('w?r', 'wwrr'))            # zero or one 'w' followed by 'r'
#
## {} repeats the preceding character a given number of times
# print(re.findall('w{5}r', 'wwwrrwwwrr'))        # exactly 5 'w' followed by 1 'r'
# print(re.findall('w{1,5}r', 'wwwrrwwwrrwr'))    # 1 to 5 'w' followed by 'r'

## [] is a character set
# print(re.findall('w[ce]r', 'wer'))          # list any characters inside []; exactly one of them is matched (no separator is needed, a comma inside [] is matched literally)
# print(re.findall('w[a-z]', 'wff'))          # range a-z
# print(re.findall('[a*]', 'ww'))             # * loses its special meaning inside [] and is an ordinary asterisk; the exceptions are \ ^ -
# print(re.findall('[a-z0-9A-Z]', 'afsasdsfa54ass'))
# print(re.findall('[^c]', 'acs'))            # ^ inside [] means negation
# print(re.findall('[^ab]', 'abcdefg'))       # neither a nor b

## \
r'''
A backslash followed by a metacharacter removes its special meaning.
A backslash followed by certain ordinary characters gives them a special meaning:
\d  matches any decimal digit
\D  matches any non-digit character
\s  matches any whitespace character
\S  matches any non-whitespace character
\w  matches any word character (letter, digit, or underscore)
\W  matches any non-word character
\b  matches a word boundary (the position between a word character and a non-word character)
'''
# print(re.findall(r'\d{2}', 'asdw5d31asdw1a3d5s48w4d3a1w'))  # match two consecutive digits
# print(re.findall(r'\swww', 'fg www'))       # match a whitespace character
# print(re.findall(r's\b', 's is a s$udent')) # \b also matches between a word character and a special character such as $
# print(re.findall(r's\b', 's is a student'))
# c1 = re.search('wc', 'wsdwcasdwcaff')       # search returns only the first match found
# print(c1)
# print(c1.group())                           # prints the matched text; if search finds nothing it returns None, and calling group() on None raises AttributeError
# print(re.findall('\\\\c', 'afekk:\\c'))     # Python's string escaping turns '\\\\' into two backslashes, and re reads those two as one literal backslash, hence four
# print(re.search(r'\bbasd', 'basd'))         # r marks a raw string, so no extra escaping is needed

## () group
# print(re.search('(wc)+', 'fswfefwcdwc'))    # group
# print(re.search('(wc)+', 'fswfefwwc'))

## | or
# print(re.search('(wc)|ff', 'ffwca'))        # or

'''
Advanced usage
'''
# print(re.search(r'(?P<id>\d{3})/(?P<name>\w{3})', 'weeew34ttt123/ooo'))  # named group syntax ?P<name>: the name goes inside <>, followed by the rule
# print(re.search(r'(?P<id>\d{3})/(?P<name>\w{3})', 'weeew34ttt123/ooo').group())
# print(re.search(r'(?P<id>\d{3})/(?P<name>\w{3})', 'weeew34ttt123/ooo').group('id'))
# print(re.search(r'(?P<id>\d{3})/(?P<name>\w{3})', 'weeew34ttt123/ooo').group('name'))

'''
Method summary
Regular expression methods:
findall(): returns all matches as a list
search(): returns a match object for the first match; call group() on it to get the matched text
match(): matches only at the start of the string and returns a match object for the first match
split(): splits the string at each match
sub(): replaces each match
compile(): compiles a rule into a reusable pattern object
'''
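The methods in the summary can be tried side by side on one string. The following is a minimal sketch rather than part of the original script: it reuses the sample string and named-group pattern from the advanced-usage lines, while the variable names and the `matched_text` helper are hypothetical and were chosen only for this illustration.

```python
import re

# Sample string taken from the named-group example above.
text = 'weeew34ttt123/ooo'

# compile(): build the pattern once so it can be reused.
# The raw string keeps the backslashes intact for the re module.
pattern = re.compile(r'(?P<id>\d{3})/(?P<name>\w{3})')


def matched_text(match):
    """Hypothetical helper: return the matched text, or None if nothing matched.

    search() and match() return None on failure, so calling group()
    directly on their result can raise AttributeError.
    """
    return match.group() if match is not None else None


all_digits = re.findall(r'\d+', text)   # every run of digits, as a list
first = pattern.search(text)            # first match anywhere in the string
at_start = pattern.match(text)          # only tries at the start of the string
parts = re.split(r'\d+', text)          # split at each run of digits
masked = re.sub(r'\d', '#', text)       # replace each digit with '#'

print(all_digits)
print(matched_text(first), first.group('id'), first.group('name'))
print(matched_text(at_start))
print(parts)
print(masked)
```

Because the sample string does not begin with three digits, `match()` finds nothing here while `search()` still locates the group later in the string. That difference is the main reason to check for `None` before calling `group()`.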