One, regular expressions
1, online test tool: http://tool.chinaz.com/regex/
Character groups: a character group is written inside [] in a regular expression (in Python itself, [] denotes a list).
A character group lists the characters allowed at one position, either one by one or as ranges such as 0-9, a-z, A-Z. It always matches exactly one character: a single digit, letter, or symbol, never a multi-digit number or a fraction. A range must run from low to high, so 0-9 is valid and 9-0 is not.
Character groups for digits: [13466872], [0123456789]; ranges abbreviate them: [0-9], [2-8]
[0-9a-zA-Z] combines the digit range with both letter ranges; it still matches only one character at a time.
Simple examples:
[0123456789]------8----------True; the group lists 0 to 9 and 8 is in it. A character outside the group, such as a, does not match
[0-9]---------7---------True; same meaning as the example above, written as a range
[a-z]---------s-------True; matches any lowercase letter
[A-Z]----------B--------True; matches any uppercase letter
[0-9a-zA-Z]--------8, a, S-------True; each of these matches, but only one character at a time (ranges must be written low to high)
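The character-group examples above can be checked with Python's re module. This is a minimal sketch; the variable names (checks, reversed_range_ok) are chosen for illustration and do not come from the original notes.

```python
import re

# [0-9] and [0123456789] describe the same set of characters.
digit_short = re.compile(r"[0-9]")
digit_long = re.compile(r"[0123456789]")

# fullmatch succeeds only if the whole string is one character from the group.
checks = {
    "8": bool(digit_long.fullmatch("8")),
    "7": bool(digit_short.fullmatch("7")),
    "a": bool(digit_short.fullmatch("a")),
    "s": bool(re.fullmatch(r"[a-z]", "s")),
    "B": bool(re.fullmatch(r"[A-Z]", "B")),
    "87": bool(re.fullmatch(r"[0-9a-zA-Z]", "87")),  # two characters, one group
}
print(checks)

# A range written backwards (9-0) is rejected when the pattern is compiled.
try:
    re.compile(r"[9-0]")
    reversed_range_ok = True
except re.error:
    reversed_range_ok = False
print(reversed_range_ok)
```

The "87" case fails because one character group consumes exactly one character.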
2, Characters:
Metacharacter-----------what it matches
. (dot)-----------any character except a line break
\w-----------a letter, digit, or underscore
\s-----------any whitespace character (space, tab, line break)
\d-----------any digit
\W----------anything that is not a letter, digit, or underscore (usually symbols)
\D----------any non-digit
\S---------any non-whitespace character
\n-----------a line break
\t-----------a tab
\b----------a word boundary (for example, the end of a word)
^-----------the start of the string; placed at the beginning of the pattern
$-----------the end of the string; placed at the end of the pattern
(\b versus $) In Python, a pattern that uses \b should be written as a raw string (prefix r); otherwise \b is read as a backspace character and will not match. When the same word occurs several times in a string, separated by spaces, \b matches at the end of every occurrence, while $ matches only at the very end of the string.
a|b-----------character a or character b
()-----------groups the expression inside the parentheses
[...]----------any one character in the character group
[^...]----------any one character not in the character group
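A short sketch of the metacharacters listed above, using re.findall and re.finditer. The sample strings (text, sentence) are made up for illustration.

```python
import re

text = "hi_1 there\tx!\n"

words = re.findall(r"\w+", text)      # letters, digits, underscore
non_words = re.findall(r"\W", text)   # everything \w does not match
digits = re.findall(r"\d", text)
spaces = re.findall(r"\s", text)      # space, tab, line break
dots = re.findall(r".", "a\nb")       # . skips the line break

# \b versus $: \b matches at every word boundary, $ only at the end.
sentence = "cat concat cat"
boundary_starts = [m.start() for m in re.finditer(r"cat\b", sentence)]
end_starts = [m.start() for m in re.finditer(r"cat$", sentence)]

# Without the r prefix, "\b" is the backspace character, not a word boundary.
backspace_without_raw = ("\b" == "\x08")

either = re.findall(r"a|b", "cab")        # alternation
negated = re.findall(r"[^abc]", "abcd")   # negated character group

print(words, non_words, digits, spaces, dots)
print(boundary_starts, end_starts, backspace_without_raw, either, negated)
```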
3, quantifiers
*----------repeat 0 or more times
+----------repeat 1 or more times
?----------repeat 0 or 1 times
{n}---------repeat exactly n times
{n,}----------repeat n or more times
{n,m}----------repeat n to m times
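To see what each quantifier accepts, the sketch below tests a few strings of repeated a against each pattern. The helper accepted and the sample list are illustrative names, not part of the original notes.

```python
import re

samples = ["", "a", "aa", "aaa", "aaaa"]

def accepted(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

star = accepted(r"a*")         # 0 or more
plus = accepted(r"a+")         # 1 or more
optional = accepted(r"a?")     # 0 or 1
exact = accepted(r"a{2}")      # exactly 2
at_least = accepted(r"a{2,}")  # 2 or more
between = accepted(r"a{2,3}")  # 2 to 3

for name, result in [("*", star), ("+", plus), ("?", optional),
                     ("{2}", exact), ("{2,}", at_least), ("{2,3}", between)]:
    print(name, result)
```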
4, examples of . ^ $ (Hai, "sea", stands for a single Chinese character)
Hai.------------Haidong, Haiyan, Haijiao-----------matches the character Hai together with the one character that immediately follows it; nothing else matches
^Hai.-----------matches only at the start of the string: the first Hai and the character after it; later occurrences do not match
Five$------------matches Five only at the very end of the string, and only once; a Five anywhere else does not match
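The same three behaviours can be tried with an ASCII stand-in. In this sketch the prefix sea replaces the single character used in the notes, and the test string is an assumption made for illustration.

```python
import re

text = "seaA seaB seaC"

anywhere = re.findall(r"sea.", text)    # every "sea" plus one following character
at_start = re.findall(r"^sea.", text)   # only the occurrence at the start
at_end = re.findall(r"sea.$", text)     # only the occurrence at the end

print(anywhere, at_start, at_end)
```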
5, greedy matching with * + ?: match as much as possible.
Pattern------------string to match------------result------------description (the strings are Chinese names; each romanized syllable, and the word and, stands for one character)
Li.?------------Li Jie and Li Lianying and Li Ergunzi-----------Li Jie, Li Lian, Li Er------------? repeats 0 or 1 times, so only Li and at most one character after it are matched
Li.*------------Li Jie and Li Lianying and Li Ergunzi------------Li Jie and Li Lianying and Li Ergunzi----------* repeats 0 or more times, so Li and any number of characters after it are matched (greedy)
Li.+-----------Li Jie and Li Lianying and Li Ergunzi------------Li Jie and Li Lianying and Li Ergunzi----------+ repeats 1 or more times, so Li and at least one character after it are matched (greedy)
Li.{1,2}-----------Li Jie and Li Lianying and Li Ergunzi------------Li Jie and, Li Lianying, Li Ergun-----------{1,2} repeats 1 to 2 times, so Li and one or two characters after it are matched
Note: putting ? after * gives a lazy (non-greedy) match
Li.*?------------Li Jie and Li Lianying and Li Ergunzi----------Li, Li, Li------------lazy match: .*? takes as few characters as possible, here none
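The table rows above can be reproduced with single-byte stand-ins. This is a sketch under assumptions: L plays the role of the surname character, & plays the role of the separator, and the test string is invented so that each letter is one character.

```python
import re

text = "Lj&Lmn&Lxyz"   # three "names" starting with L, separated by &

zero_or_one = re.findall(r"L.?", text)    # L plus at most one character
zero_or_more = re.findall(r"L.*", text)   # greedy: runs to the end of the string
one_or_more = re.findall(r"L.+", text)    # greedy: runs to the end of the string
one_to_two = re.findall(r"L.{1,2}", text) # L plus one or two characters
lazy = re.findall(r"L.*?", text)          # lazy: takes nothing after L

print(zero_or_one, zero_or_more, one_or_more, one_to_two, lazy)
```

The greedy patterns swallow the separators because . matches & as well.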
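6, character sets [] [^]
Pattern------------string to match------------result------------description (each romanized syllable, and the word and, stands for one Chinese character)
Li[Jie Guang Gou]*------------Li Jie and Li Guangning and Li Gou------------Li Jie, Li Guang, Li Gou------------matches Li followed by any number of characters from the set
Li[^and]*------------Li Jie and Li Lianying and Li Ergunzi------------Li Jie, Li Lianying, Li Ergunzi------------[^and]* matches any number of characters that are not and
[\d]------------456ABC3------------4, 5, 6, 3------------matches a single digit
[\d]+------------456abc123------------456, 123------------matches one or more consecutive digits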
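A sketch of the character-set rows, again with stand-ins: L for the surname character and & for the separator. Both are assumptions made so that each symbol is one character. The digit examples use the strings from the table.

```python
import re

text = "Lj&Lmn&Lxyz"

from_set = re.findall(r"L[jmn]*", text)   # L plus any run of j, m, n
not_sep = re.findall(r"L[^&]*", text)     # L plus any run of non-& characters

single_digits = re.findall(r"[\d]", "456ABC3")
digit_runs = re.findall(r"[\d]+", "456abc123")

print(from_set, not_sep, single_digits, digit_runs)
```

The third name yields just L for the first pattern, because x is not in the set and * allows zero repetitions.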
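7, grouping () and alternation |, with [^]
Pattern------------string to match------------result------------description
^[1-9]\d{13,16}[0-9x]$------------110101198001017032------------110101198001017032------------matches an ID number of 15 to 18 characters: a non-zero first digit, 13 to 16 more digits, and a final digit or x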
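The pattern above accepts any length from 15 to 18. The sketch below compares it with a grouped alternative that accepts only 15 or 18 characters. The stricter pattern and the 15- and 16-character test values are assumptions added for illustration and are not part of the original notes.

```python
import re

# Pattern from the table: lengths 15, 16, 17 and 18 all pass.
loose = re.compile(r"^[1-9]\d{13,16}[0-9x]$")

# Hypothetical stricter variant using () and |:
# either 18 characters (last may be x) or exactly 15 digits.
strict = re.compile(r"^([1-9]\d{16}[0-9x]|[1-9]\d{14})$")

id_18 = "110101198001017032"
id_16 = "1101011980010170"    # made-up 16-digit value
id_15 = "110101800101703"     # made-up 15-digit value

loose_results = [bool(loose.match(s)) for s in (id_18, id_16, id_15)]
strict_results = [bool(strict.match(s)) for s in (id_18, id_16, id_15)]

print(loose_results)
print(strict_results)
```

Both patterns check only the shape of the number, not whether it is a real, valid ID.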
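8, the escape character \
Metacharacters such as \d and \s have a special meaning in a pattern, so matching the literal text \d requires escaping the \.
Pattern------------string to match------------result------------description
\d------------\d------------False------------\d is a metacharacter and cannot match the literal text \d
\\d------------\d------------True------------escaping \ as \\ makes it match a literal \
'\\\\d'------------'\\d'------------True------------in a Python string each \ must itself be escaped, so every \ is written twice
r'\\d'------------r'\d'------------True------------with the r prefix, the string is taken as written, with no escaping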
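The four rows of the escape table can be verified directly. A minimal sketch; target and the result names are illustrative.

```python
import re

target = r"\d"   # two characters: a backslash and the letter d

# \d as a pattern means "a digit", so it does not match the literal text.
as_metachar = re.search(r"\d", target) is not None

# \\d as a pattern means "a literal backslash followed by d".
escaped_raw = re.search(r"\\d", target) is not None

# Without the r prefix each backslash must be doubled again.
escaped_plain = re.search("\\\\d", target) is not None
same_pattern = ("\\\\d" == r"\\d")

# re.escape builds the escaped pattern automatically.
auto_escaped = re.escape(target)

print(len(target), as_metachar, escaped_raw, escaped_plain, same_pattern)
print(auto_escaped)
```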
9, greedy matching: match as long a string as possible; quantifiers are greedy by default
Pattern------------string to match------------result------------description
<.*>------------<script>.....<script>------------<script>.....<script>------------default greedy mode: matches as much as possible
<.*?>------------<script>.....<script>------------<script>, <script>------------adding ? turns the greedy match into a lazy match
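The difference between the two tag patterns is easy to see with re.findall. The sample string html is an assumption modelled on the table row.

```python
import re

html = "<script>.....<script>"

greedy = re.findall(r"<.*>", html)   # from the first < to the last >
lazy = re.findall(r"<.*?>", html)    # from each < to the nearest >

print(greedy)
print(lazy)
```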
10, common non-greedy quantifiers
*?------------repeat any number of times, but as few as possible
+?------------repeat 1 or more times, but as few as possible
??------------repeat 0 or 1 times, but as few as possible
{n,m}?------------repeat n to m times, but as few as possible
{n,}?------------repeat n or more times, but as few as possible
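A side-by-side sketch of each quantifier in greedy and non-greedy form, matched at the start of a made-up string of four a characters. The helper first is an illustrative name.

```python
import re

text = "aaaa"

def first(pattern):
    """Return the text matched at the start of the sample string."""
    return re.match(pattern, text).group()

pairs = {
    "*":     (first(r"a*"), first(r"a*?")),
    "+":     (first(r"a+"), first(r"a+?")),
    "?":     (first(r"a?"), first(r"a??")),
    "{2,3}": (first(r"a{2,3}"), first(r"a{2,3}?")),
    "{2,}":  (first(r"a{2,}"), first(r"a{2,}?")),
}

for quantifier, (greedy, lazy) in pairs.items():
    print(quantifier, repr(greedy), repr(lazy))
```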
11, use of .*?
. is any character, * repeats it 0 to infinitely many times, and ? switches to non-greedy mode. Together they take as few arbitrary characters as possible. The combination is rarely used alone; it mostly appears as .*?x, which takes characters of any length up to the first occurrence of x.
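A small sketch of the .*?x idiom, compared with the greedy .*x. The sample string is an assumption for illustration.

```python
import re

text = "abxcdx"

up_to_first_x = re.match(r".*?x", text).group()    # stops at the first x
up_to_last_x = re.match(r".*x", text).group()      # greedy: runs to the last x
before_first_x = re.match(r"(.*?)x", text).group(1)  # the group excludes x

print(up_to_first_x, up_to_last_x, before_first_x)
```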
Two, common modules (a Python module must be imported with the import keyword before use)
1, re module
Python common modules and regular expressions