Regular expressions are essential in any modern language. In Python, they are provided by the re module.

    import re

This article does not explain regular expression syntax itself, such as \d for a digit, ^ for the start of the string (or for negation inside a character class; for example, [^a-z] matches any character that is not a lowercase letter), and $ for the end of the string. Reference material and other books cover these.
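For readers who want a quick reminder, the symbols mentioned above can be tried out as follows. This is only a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

# \d matches a single digit anywhere in the string
has_digit = re.search(r'\d', 'abc1') is not None

# ^ anchors the pattern at the start, so 'xabc' does not qualify
starts_with_abc = re.search(r'^abc', 'xabc') is not None

# Inside brackets, ^ negates the class: [^a-z] is anything
# that is not a lowercase letter
non_lowercase = re.findall(r'[^a-z]', 'ab1C')

# $ anchors the pattern at the end of the string
ends_with_one = re.search(r'1$', 'abc1') is not None

print(has_digit, starts_with_abc, non_lowercase, ends_with_one)
```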
Example one, checking whether a string contains a digit:

    import re

    userinput = input("Please input test string:")
    # re.search scans the whole string; re.match would only test the start
    if re.search(r'\d', userinput):
        print('contain number')
    else:
        print('no number in input string')

If the input does not contain a digit, re.search returns None; if it does, it returns a match object.
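One detail worth keeping in mind for this example: re.match only tries the pattern at the very start of the string, whereas re.search scans the whole string. The sketch below shows the difference on an assumed sample string, 'abc123', chosen purely for illustration.

```python
import re

text = 'abc123'  # illustrative input: the digits are not at the start

# re.match only looks at position 0, so it finds nothing here
at_start = re.match(r'\d', text)

# re.search scans forward and stops at the first digit
anywhere = re.search(r'\d', text)

print(at_start)          # None
print(anywhere.group())  # 1
```

So for a "does the string contain a digit" check, re.search is usually the function to reach for.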
Example two, split string:

    import re

    userinput = input("Please input test string:")
    temp = re.split(r'\s+', userinput)
    print(temp)

\s represents any whitespace character (space, tab, and so on), and the + sign means one or more. This code therefore splits the string on runs of whitespace. For example, the string "a b   dc" produces the list ['a', 'b', 'dc']. The ordinary str.split method with an explicit separator such as ' ' cannot do this directly, because consecutive separators produce empty strings (although str.split() with no arguments also splits on runs of whitespace).
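To make the comparison concrete, here is a small sketch that splits the same string three ways. The sample string is an assumption made for illustration.

```python
import re

sample = 'a b   dc'  # one space, then three spaces

# Splitting on runs of whitespace with a regular expression
by_regex = re.split(r'\s+', sample)

# Splitting on a single explicit space leaves empty strings behind
by_single_space = sample.split(' ')

# str.split() with no arguments also collapses whitespace runs
by_default = sample.split()

print(by_regex)         # ['a', 'b', 'dc']
print(by_single_space)  # ['a', 'b', '', '', 'dc']
print(by_default)       # ['a', 'b', 'dc']
```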
Example three, grouping:
Sometimes we need to extract parts of a string, such as a phone number consisting of a three- or four-digit area code and an eight-digit number.

    import re

    userinput = input("Please input test string:")
    m = re.match(r'(\d{3,4})-(\d{8})', userinput)
    if m:
        print('Area code:' + m.group(1))
        print('Number:' + m.group(2))
    else:
        print('Format Error')

Grouping uses parentheses (), which is a basic feature of regular expressions. m.group counts from 0: group 0 is the entire matched text, and the captured groups start at 1.
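The group numbering can be checked without typing any input. The sketch below runs the same pattern against an assumed sample number, '010-12345678', which is made up for illustration.

```python
import re

m = re.match(r'(\d{3,4})-(\d{8})', '010-12345678')

whole = m.group(0)      # the entire matched text
area_code = m.group(1)  # first pair of parentheses
number = m.group(2)     # second pair of parentheses
parts = m.groups()      # all captured groups as a tuple

print(whole)      # 010-12345678
print(area_code)  # 010
print(number)     # 12345678
print(parts)      # ('010', '12345678')
```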
Example four, greedy match:

    import re

    userinput = input("Please input test string:")
    m = re.match(r'^(\d+)(0*)$', userinput)
    if m:
        print(m.groups())
    else:
        print('Format Error')

Enter 102500 and we get ('102500', '').
The result we want is ('1025', '00'). This requires a non-greedy match, because regular expressions in Python are greedy by default (as in C#).
Modify the code as follows:

    import re

    userinput = input("Please input test string:")
    m = re.match(r'^(\d+?)(0*)$', userinput)
    if m:
        print(m.groups())
    else:
        print('Format Error')

That is, add a question mark after \d+. The result then matches what we expect.
Note that non-greedy matching can be less efficient than greedy matching, so use it only when greedy matching does not give the result you need.
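The two behaviours can be compared side by side on the input 102500 used in this example. This is a minimal sketch; the variable names are chosen for illustration.

```python
import re

text = '102500'

# Greedy: \d+ consumes every digit, leaving nothing for 0*
greedy = re.match(r'^(\d+)(0*)$', text).groups()

# Non-greedy: \d+? takes as few digits as possible, so the
# trailing zeros end up in the second group
lazy = re.match(r'^(\d+?)(0*)$', text).groups()

print(greedy)  # ('102500', '')
print(lazy)    # ('1025', '00')
```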
Example five, regular expression precompilation:
Use the re.compile function. When the same regular expression is needed in many places, precompile it once and then call the methods of the returned pattern object directly.
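A minimal sketch of precompilation might look like the following. It reuses the phone number pattern from example three; the names PHONE_RE and parse_phone, and the sample inputs, are hypothetical and chosen only for illustration.

```python
import re

# Compile once at module level, then reuse the pattern object everywhere
PHONE_RE = re.compile(r'(\d{3,4})-(\d{8})')


def parse_phone(text):
    """Return (area_code, number), or None if the format is wrong.

    Hypothetical helper, not part of the re module.
    """
    m = PHONE_RE.match(text)
    return m.groups() if m else None


for candidate in ['010-12345678', '0755-87654321', 'not a phone']:
    print(candidate, '->', parse_phone(candidate))
```

The pattern object offers the same match, search, and split operations as the module-level functions, just without passing the pattern string each time.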