Reference document: Python regular expressions
Regular expression definition: a regular expression is a small, highly specialized programming language embedded in other languages (in Python, through the re module). A regular expression can contain metacharacters; the complete list is: . ^ $ * + ? { } [ ] \ | ( ). These metacharacters take on their special meaning only when they appear in the right position.
1. Metacharacters ([ ]): specify a set of characters to match. The characters can be listed individually, or a range can be given by joining two characters with "-". For example, [abc] matches any one of the characters a, b, or c, and can also be written as [a-c].
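As a small illustration of the two ways to write a set, the sketch below runs both forms over a made-up sample string (the string and variable names are only for this example; the code is written for Python 3).

```python
import re

sample = "aXbYcZd"  # illustrative input, not from the original text

# [abc] lists the members one by one; [a-c] describes the same set as a range.
listed = re.findall(r"[abc]", sample)
ranged = re.findall(r"[a-c]", sample)

print(listed)  # ['a', 'b', 'c']
print(ranged)  # ['a', 'b', 'c']
```

Both patterns pick out the same three characters; "d" and the uppercase letters fall outside the set.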
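2. Metacharacters ([^ ]): a complemented set matches any character that is not in the set. To complement a set, put "^" as its first character; a "^" anywhere else inside the brackets simply matches the "^" character itself. So [^abc] matches any character except a, b, or c, while [a^] matches either "a" or "^". Outside a character set, "^" matches at the start of the string. For example: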
    In []: import re
    In []: m = re.search("^ab+", "asdfabbbb")
    In []: print(m)
    None
    In []: m = re.search("^ab+", "absdfabbbb")
    In []: print(m)
    <_sre.SRE_Match object at 0x7f8bc2466c60>
    In []: m.group()
    Out[]: 'ab'
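The effect is the same as using the re.match() function, which only tries to match at the beginning of the string:
    In []: m2 = re.match("ab+", "absdfabbbb")
    In []: print(m2)
    <_sre.SRE_Match object at 0x7f8bc2466e68>
    In []: print(m2.group())
    ab
Even with re.MULTILINE, re.match() still only tries the beginning of the string, not the beginning of each line:
    In []: m2 = re.match("ab+", "absdfabbbb\nabcdefghijklmn", re.MULTILINE)
    In []: print(m2.group())
    ab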
To summarize the match and search functions: both return as soon as they find the first match and do not look any further. If you need every match, on every line, call re.findall() instead:
    In []: m2 = re.findall("ab+", "absdfabbbb\nabcdefghijklmn")
    In []: print(m2)
    ['ab', 'abbbb', 'ab']
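To put the three functions side by side, here is a minimal Python 3 sketch that reuses the two-line string from the session above. The variable names are only illustrative, and re.finditer() is included as the iterator counterpart of re.findall().

```python
import re

text = "absdfabbbb\nabcdefghijklmn"

# search() and match() both stop at the first hit.
first_anywhere = re.search(r"ab+", text).group()
first_at_start = re.match(r"ab+", text).group()

# findall() returns every non-overlapping match as a string;
# finditer() yields match objects, which also carry positions.
all_matches = re.findall(r"ab+", text)
spans = [m.span() for m in re.finditer(r"ab+", text)]

# With re.MULTILINE, "^" matches at the start of every line for search/findall,
# but match() still only tries the very beginning of the string.
line_starts = re.findall(r"^ab+", text, re.MULTILINE)
match_second_line = re.match(r"abc", text, re.MULTILINE)
search_second_line = re.search(r"^abc", text, re.MULTILINE)

print(all_matches)        # ['ab', 'abbbb', 'ab']
print(spans)              # [(0, 2), (5, 10), (11, 13)]
print(line_starts)        # ['ab', 'ab']
print(match_second_line)  # None
```

The last two calls show the practical difference: to anchor at the start of each line, use re.search() or re.findall() with "^" and re.MULTILINE rather than re.match().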
3. Metacharacter (\): the backslash. A backslash followed by certain characters forms a sequence with a special meaning. It can also be placed in front of a metacharacter to cancel its special meaning, so that the metacharacter is treated as an ordinary character.
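Both uses of the backslash can be sketched as follows. The sample strings are made up, the patterns are written as raw strings (r"...") so that Python itself does not interpret the backslashes, and \d (any decimal digit) is used as one example of a special sequence.

```python
import re

sample = "3.5 305 3x5"  # illustrative input

# Unescaped, "." is a metacharacter that matches any character except a newline.
unescaped = re.findall(r"3.5", sample)

# Escaped, "\." matches only a literal dot.
escaped = re.findall(r"3\.5", sample)

# A backslash can also introduce a special sequence, e.g. \d for a digit.
digits = re.findall(r"\d+", "a1b22c333")

print(unescaped)  # ['3.5', '305', '3x5']
print(escaped)    # ['3.5']
print(digits)     # ['1', '22', '333']
```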
4. Metacharacter ($): matches at the end of the string, or just before the newline at the end of the string. In multiline mode, "$" also matches just before every newline. The regular expression "foo" matches both "foo" and "foobar", while "foo$" matches only "foo".
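The behaviour of "$" with and without multiline mode can be checked with a short Python 3 sketch. The strings "foo1\nfoo2\n" and the variable names are chosen only for illustration.

```python
import re

# "foo$" requires "foo" to sit at the end of the string.
plain_end = re.search(r"foo$", "foo") is not None
followed_by_text = re.search(r"foo$", "foobar") is not None

# "$" also matches just before a newline that ends the string.
before_final_newline = re.search(r"foo$", "foo\n") is not None

# Without re.MULTILINE only the last line can end the match;
# with it, "$" matches before every newline.
lines = "foo1\nfoo2\n"
single_mode = re.findall(r"foo.$", lines)
multi_mode = re.findall(r"foo.$", lines, re.MULTILINE)

print(plain_end, followed_by_text, before_final_newline)  # True False True
print(single_mode)  # ['foo2']
print(multi_mode)   # ['foo1', 'foo2']
```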
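5. Metacharacter (*): matches zero or more repetitions of the preceding expression.
6. Metacharacter (?): matches zero or one repetition of the preceding expression.
7. Metacharacter (+): matches one or more repetitions of the preceding expression.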
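The three quantifiers above differ only in how many repetitions they allow. A minimal Python 3 sketch, assuming three made-up sample strings and using re.fullmatch() so that the whole string has to fit the pattern:

```python
import re

samples = ["ac", "abc", "abbbc"]  # zero, one, and three "b" characters

# "*" allows zero or more "b".
star = [s for s in samples if re.fullmatch(r"ab*c", s)]

# "?" allows zero or one "b".
question = [s for s in samples if re.fullmatch(r"ab?c", s)]

# "+" requires at least one "b".
plus = [s for s in samples if re.fullmatch(r"ab+c", s)]

print(star)      # ['ac', 'abc', 'abbbc']
print(question)  # ['ac', 'abc']
print(plus)      # ['abc', 'abbbc']
```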
8. Metacharacter (|): means "or". In A|B, where A and B are regular expressions, the pattern matches either A or B.
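A short Python 3 sketch of alternation, with illustrative strings. It also shows that "|" separates whole alternatives, so parentheses are needed when only part of the pattern should vary.

```python
import re

# Either alternative may match.
animals = re.findall(r"cat|dog", "cat, dog, bird")

# Parentheses limit the alternation to one position: "gray" or "grey".
grouped_gray = re.fullmatch(r"gr(a|e)y", "gray") is not None
grouped_grey = re.fullmatch(r"gr(a|e)y", "grey") is not None

# Without parentheses the alternatives are "gra" and "ey",
# so neither covers the whole word "grey".
ungrouped_grey = re.fullmatch(r"gra|ey", "grey") is not None

print(animals)         # ['cat', 'dog']
print(grouped_gray, grouped_grey)  # True True
print(ungrouped_grey)  # False
```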
9. Metacharacters ({ })
{m} matches exactly m copies of the preceding regular expression. For example, "a{5}" matches exactly five "a" characters, that is, "aaaaa".
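The exact-count behaviour of {m} can be checked with a small Python 3 sketch; the test strings are made up for the example.

```python
import re

# fullmatch() requires the whole string to be exactly five "a" characters.
exactly_five = re.fullmatch(r"a{5}", "aaaaa") is not None
only_four = re.fullmatch(r"a{5}", "aaaa") is not None
six = re.fullmatch(r"a{5}", "aaaaaa") is not None

# search() finds five "a" characters inside a longer run and stops there.
inside_longer_run = re.search(r"a{5}", "aaaaaaa").group()

print(exactly_five, only_four, six)  # True False False
print(inside_longer_run)             # aaaaa
```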