Python3 Regular Expression (regex)

Source: Internet
Author: User
Tags: alphanumeric characters


Regular expressions provide a compact notation for representing a set of strings, and a single regular expression can represent an infinite number of strings. Five common uses: parsing, searching, search and replace, string splitting, and validation.


(i) Regular expression language
The special characters in Python regular expressions are \ . ^ $ ? + * { } [ ] ( ) |
1. Character class shorthands
^ negates the character class if it is the first character in the class;
- denotes a range of characters, or a literal hyphen if it is the first character in the class;
. matches any character except a newline, or any character at all with the re.DOTALL flag; inside a character class it matches a literal period;
\d matches a Unicode digit, or [0-9] with the re.ASCII flag;
\D matches a Unicode non-digit, or [^0-9] with the re.ASCII flag;
\s matches Unicode whitespace, or [ \t\n\r\f\v] with the re.ASCII flag;
\S matches Unicode non-whitespace, or [^ \t\n\r\f\v] with the re.ASCII flag;
\w matches a Unicode word character, or [a-zA-Z0-9_] with the re.ASCII flag;
\W matches a Unicode non-word character, or [^a-zA-Z0-9_] with the re.ASCII flag
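The difference between the Unicode and re.ASCII behaviour of the shorthands is easiest to see on a small input. The following is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

# "\u0664\u0662" is the number 42 written with Arabic-Indic digits.
text = "Room \u0664\u0662 and room 42"

# By default \d matches any Unicode decimal digit.
unicode_digits = re.findall(r"\d+", text)

# With re.ASCII, \d is restricted to [0-9].
ascii_digits = re.findall(r"\d+", text, re.ASCII)

# A leading ^ negates the class: everything except vowels and spaces.
non_vowels = re.findall(r"[^aeiou ]", "a big cat")

# A leading - is a literal hyphen; a-c is a range.
with_hyphen = re.findall(r"[-a-c]", "a-b-z")

print(unicode_digits)  # two matches: the Arabic-Indic number and '42'
print(ascii_digits)    # ['42']
print(non_vowels)      # ['b', 'g', 'c', 't']
print(with_hyphen)     # ['a', '-', 'b', '-']
```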


2. Quantifiers
A quantifier has the form {m,n}, where m and n are the minimum and maximum number of times the quantified expression must match. If only one number is given, as in {m}, it is both the minimum and the maximum.
Quantifier shorthand forms:
E{m} matches exactly m occurrences of expression E
E{m,} greedily matches at least m occurrences of E
E{m,}? non-greedily matches at least m occurrences of E
E{,n} greedily matches at most n occurrences of E
E{,n}? non-greedily matches at most n occurrences of E
E? is the same as E{0,1}
E?? is the same as E{0,1}?
E+ is the same as E{1,}
E+? is the same as E{1,}?
E* is the same as E{0,}
E*? is the same as E{0,}?
[] matches any one of the characters inside the brackets
() matches the contents of the parentheses as a single unit
A greedy quantifier matches as many qualifying characters as possible, while a non-greedy quantifier matches as few as possible.
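A short sketch of greedy versus non-greedy matching, plus the {m} and {m,n} forms. The sample strings are invented for illustration.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as possible, so one match spans the whole string.
greedy = re.findall(r"<.+>", html)

# Non-greedy: .+? stops at the first possible '>', so each tag is separate.
lazy = re.findall(r"<.+?>", html)

# {m}: exactly four digits.
exact = re.fullmatch(r"\d{4}", "2024")

# {m,n}: standalone numbers of two or three digits.
ranged = re.findall(r"\b\d{2,3}\b", "7 42 123 4567")

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
print(ranged)  # ['42', '123']
```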


3. air(craft|plane) matches both aircraft and airplane, and captures craft or plane.
air(?:craft|plane) matches the same strings without creating a capture, which keeps the number of captures down when groups are nested. Parentheses denote groups.
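The effect of (?:...) on captures can be checked directly. A minimal sketch, with made-up input strings:

```python
import re

# Capturing group: the alternative that matched is stored as group 1.
capturing = re.search(r"air(craft|plane)", "an airplane landed")

# Non-capturing group: same matching behaviour, but nothing is captured.
non_capturing = re.search(r"air(?:craft|plane)", "an aircraft landed")

print(capturing.group(0), capturing.groups())          # airplane ('plane',)
print(non_capturing.group(0), non_capturing.groups())  # aircraft ()
```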


4. \i is a backreference, where i is the number of an earlier capture. A capture can also be given a name by writing ?P<name> immediately after its opening parenthesis, and the name can then be used in place of the number,
for example: (?P<key>\w+)=(?P<value>.+). A named capture is referenced with (?P=name): (?P<word>\w+)\s+(?P=word) uses a capture named "word" to match duplicated words.
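The two patterns above can be tried out as follows. This is only a sketch; the input strings are invented.

```python
import re

# Named captures: retrieve the parts by name instead of by number.
pair = re.match(r"(?P<key>\w+)=(?P<value>.+)", "colour=dark blue")

# Named backreference: (?P=word) must repeat whatever "word" captured.
named_dup = re.search(r"(?P<word>\w+)\s+(?P=word)", "it is is wrong")

# Numbered backreference: \1 refers to the first capture.
numbered_dup = re.search(r"(\w+)\s+\1", "it is is wrong")

print(pair.group("key"), "/", pair.group("value"))  # colour / dark blue
print(named_dup.group(0))                           # is is
print(numbered_dup.group(1))                        # is
```

Without a leading \b the duplicate-word pattern can also match across word endings (for example "this is"), so a production pattern would likely anchor it with word boundaries.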


5. Regular expression assertions:
^ matches at the start, or also after each newline with the re.MULTILINE flag;
$ matches at the end, or also before each newline with the re.MULTILINE flag;
\A matches at the start;
\b matches at a word boundary and is affected by the re.ASCII flag; inside a character class it is the escape for the backspace character;
\B matches at a non-word boundary and is affected by the re.ASCII flag;
\Z matches at the end;
(?=e) matches if expression e matches at this assertion, without advancing over it; called a lookahead or positive lookahead;
(?!e) matches if expression e does not match at this assertion, without advancing over it; called a negative lookahead;
(?<=e) matches if expression e matches immediately before this assertion; called a positive lookbehind;
(?<!e) matches if expression e does not match immediately before this assertion; called a negative lookbehind;
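The assertions consume no characters; they only constrain where a match may occur. A small sketch with an invented sample string:

```python
import re

text = "price: $30, cost: $45, 30 items"

# Positive lookahead: words that are immediately followed by a colon.
labels = re.findall(r"\w+(?=:)", text)

# Positive lookbehind: digits immediately preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digit runs not preceded by '$' or another digit.
plain = re.findall(r"(?<![\$\d])\d+", text)

# Word boundaries: 'cat' only as a whole word.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

print(labels)       # ['price', 'cost']
print(dollars)      # ['30', '45']
print(plain)        # ['30']
print(whole_words)  # ['cat', 'cat']
```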


6. Comments in regular expressions
Use (?#the comment), or the re.VERBOSE flag.
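Both comment styles can be compared on the same pattern. A minimal sketch; the phone-number-like pattern and the comment text are only illustrative.

```python
import re

# Inline comments with (?#...).
inline = re.compile(r"\d{3}(?#prefix)-\d{4}(?#line number)")

# With re.VERBOSE, whitespace in the pattern is ignored and # starts a comment.
verbose = re.compile(r"""
    \d{3}   # prefix
    -       # separator
    \d{4}   # line number
""", re.VERBOSE)

inline_ok = inline.fullmatch("555-1234") is not None
verbose_ok = verbose.fullmatch("555-1234") is not None
print(inline_ok, verbose_ok)  # True True
```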


(ii) Regular expression module
Regular expression module flags:
re.A or re.ASCII makes \w, \W, \b, \B, \d, \D, \s and \S match ASCII only
re.I or re.IGNORECASE ignores case
re.M or re.MULTILINE makes ^ match at the start and after each newline, and $ match at the end and before each newline
re.S or re.DOTALL makes . match every character, including newlines
re.X or re.VERBOSE allows whitespace and comments inside the regular expression
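The flags can be passed individually or combined with |. A small sketch with an invented three-line string:

```python
import re

text = "first line\nSecond line\nthird LINE"

# Without re.MULTILINE, ^ matches only at the very start of the string.
first_only = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches after each newline.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Flags are combined with |.
line_ends = re.findall(r"line$", text, re.MULTILINE | re.IGNORECASE)

# By default . does not match a newline; re.DOTALL lifts that restriction.
no_dotall = re.search(r"first.*third", text)
with_dotall = re.search(r"first.*third", text, re.DOTALL)

print(first_only)   # ['first']
print(line_starts)  # ['first', 'Second', 'third']
print(line_ends)    # ['line', 'line', 'LINE']
print(no_dotall is None, with_dotall is not None)  # True True
```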

Functions of the regular expression module (for reference):
re.compile(r, f) returns the compiled regular expression r, with its flags set to f if given (f is one of the flags above, such as re.A; several flags can be combined with |). Writing the pattern as a raw string, r'regex', avoids having to double the backslashes;
re.escape(s) returns the string s with regular expression special characters escaped by a backslash, so the returned string contains no active special characters (before Python 3.7, all non-alphanumeric characters were escaped);
re.findall(r, s, f) returns all non-overlapping matches of the regular expression r in the string s (subject to the flags f, if given). If the regular expression contains one capture, each match is returned as that capture; if it contains several, each match is returned as a tuple of captures;
re.finditer(r, s, f) returns a match object for each non-overlapping match of the regular expression r in the string s (subject to the flags f, if given);
re.match(r, s, f) returns a match object if the regular expression r matches at the start of the string s (subject to the flags f, if given), otherwise None;
re.search(r, s, f) returns a match object if the regular expression r matches anywhere in the string s (subject to the flags f, if given), otherwise None;
re.split(r, s, m) returns the list of strings produced by splitting the string s at each occurrence of the regular expression r, making at most m splits (or as many as possible if m is not given); if the regular expression contains captures, they are included in the list between the split parts;
re.sub(r, x, s, m) returns a copy of the string s with each match of the regular expression r (at most m of them, if m is given) replaced by x, which can be a string or a function;
re.subn(r, x, s, m) is the same as re.sub(), except that it returns a two-tuple of the resulting string and the number of substitutions made;
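The functions listed above can be exercised as in the following sketch. The inputs are made up; the maximum counts are passed by keyword (maxsplit and count are the parameter names in the standard re module).

```python
import re

# compile once, reuse the pattern object
word = re.compile(r"[a-z]+", re.IGNORECASE)
words = word.findall("Hello, big World")

# findall with two captures returns a list of tuples
pairs = re.findall(r"(\w+)=(\d+)", "a=1, b=22")

# finditer yields match objects
spans = [m.span() for m in re.finditer(r"\d+", "a1b22")]

# split with a maximum number of splits, and with a capture in the pattern
parts = re.split(r"\s*,\s*", "a , b,c ,d", maxsplit=2)
parts_with_sep = re.split(r"(,)", "a,b")

# sub with a function as the replacement
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")

# subn also reports how many substitutions were made
squeezed = re.subn(r"\s+", " ", "a   b \t c")

# escape makes a literal string safe to use as a pattern
literal = re.fullmatch(re.escape("1+1=2"), "1+1=2")

print(words)           # ['Hello', 'big', 'World']
print(pairs)           # [('a', '1'), ('b', '22')]
print(spans)           # [(1, 2), (3, 5)]
print(parts)           # ['a', 'b', 'c ,d']
print(parts_with_sep)  # ['a', ',', 'b']
print(doubled)         # 6 apples, 20 pears
print(squeezed)        # ('a b c', 2)
```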

Match object attributes and methods:
m.end(g) returns the end index of the match in the text, or of group g if given (group 0 is the whole match); returns -1 if the group did not participate in the match;
m.endpos the end index of the search;
m.expand(s) returns the string s with capture markers (\1, \2, \g<name>, and so on) replaced by the corresponding captures;
m.group(g, ...) returns the numbered or named group g; if more than one is given, returns a tuple of the corresponding captures;
m.groupdict(default) returns a dictionary of all named capturing groups, with the group names as keys and the captures as values; if default is given, it is used as the value for capturing groups that did not participate in the match;
m.groups(default) returns a tuple of all the captures, starting from 1; if default is given, it is used as the value for capturing groups that did not participate in the match;
m.lastgroup the name of the last matched capturing group, or None if there is none or it has no name;
m.lastindex the number of the last matched capturing group, or None if there is none;
m.pos the start index of the search;
m.re the regular expression object that produced this match object;
m.span(g) returns the start and end index of the match in the text, or of group g if given (group 0 is the whole match); returns (-1, -1) if the group did not participate in the match;
m.start(g) returns the start index of the match in the text, or of group g if given (group 0 is the whole match); returns -1 if the group did not participate in the match;
m.string the string passed to match() or search();
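Most of these attributes can be inspected on a single match. A sketch, assuming an invented date-like pattern whose optional "day" group does not take part in the match:

```python
import re

pattern = r"(?P<year>\d{4})-(?P<month>\d{2})(?:-(?P<day>\d{2}))?"
m = re.search(pattern, "on 2024-05 we met")

whole = m.group(0)                      # the whole match
year_month = m.group("year", "month")   # several groups at once -> tuple
all_groups = m.groups()                 # unmatched groups are None
filled = m.groupdict(default="01")      # default fills unmatched groups

whole_span = m.span()                   # (start, end) of the whole match
day_start = m.start("day")              # -1: group did not participate
day_span = m.span("day")                # (-1, -1)

reordered = m.expand(r"\g<month>/\g<year>")

print(whole, year_month, all_groups)  # 2024-05 ('2024', '05') ('2024', '05', None)
print(filled)                         # {'year': '2024', 'month': '05', 'day': '01'}
print(whole_span, day_start, day_span)  # (3, 10) -1 (-1, -1)
print(m.lastgroup, m.lastindex)       # month 2
print(reordered)                      # 05/2024
```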
