Python standard library: regular expressions (re)

re.match attempts to match a pattern at the starting position of the string; if the match is not successful, match() returns None.

re.search scans the entire string and returns the first successful match, or None if there is no match.
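
The difference between the two functions can be sketched as follows. The sample string and variable names are only illustrative assumptions, not part of the original article.

```python
import re

text = "hello world"

# re.match only succeeds if the pattern matches at position 0.
m_miss = re.match(r"world", text)    # "world" is not at the start
m_start = re.match(r"hello", text)   # "hello" is at the start

# re.search scans forward and returns the first match anywhere.
m_any = re.search(r"world", text)

print(m_miss)            # None
print(m_start.group())   # hello
print(m_any.span())      # (6, 11)
```

Because both functions return None on failure, it is usually safest to test the result before calling group() on it.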

Replace:

re.sub(pattern, repl, string, count=0, flags=0)

Parameters:

pattern: The regular expression pattern string.
repl: The replacement string, or a function. (If a function is given, it is called with the match object for each match, which has a group() method, and its return value is used as the replacement.)
string: The original string in which to search and replace.
count: The maximum number of matches to replace; the default of 0 means that all matches are replaced.
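
A minimal sketch of the three ways re.sub is typically used: a plain replacement string, a limited count, and a replacement function. The sample strings and the helper name double are assumptions made for illustration.

```python
import re

phone = "2004-959-559 # a sample phone number"

# String replacement: strip the trailing comment.
no_comment = re.sub(r"\s*#.*$", "", phone)

# count=1 replaces only the first hyphen.
first_only = re.sub(r"-", "", no_comment, count=1)

# Function replacement: the callable receives a match object
# and returns the replacement text for that match.
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double, "A23G4HFD567")

print(no_comment)   # 2004-959-559
print(first_only)   # 2004959-559
print(doubled)      # A46G8HFD1134
```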

re.compile function

The compile function compiles a regular expression string and returns a regular expression (Pattern) object, whose methods such as match() and search() can then be called repeatedly.
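
As a rough illustration, a compiled pattern can be reused across calls, and its methods also accept an optional start position. The sample string here is an assumption.

```python
import re

digits = re.compile(r"\d+")
text = "one12twothree34four"

m_start = digits.match(text)     # None: no digits at position 0
m_any = digits.search(text)      # first run of digits anywhere
m_pos = digits.match(text, 3)    # begin matching at index 3

print(m_start)          # None
print(m_any.group())    # 12
print(m_pos.span())     # (3, 5)
```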

re.findall

Finds all substrings in the string that match the regular expression and returns them as a list; returns an empty list if no match is found.
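
A short sketch of findall, using made-up sample strings. Note that when the pattern contains capturing groups, the list holds the group contents rather than the whole match.

```python
import re

numbers = re.findall(r"\d+", "runoob 123 google 456")
nothing = re.findall(r"\d+", "no digits here")

# With capturing groups, each item is a tuple of the groups.
pairs = re.findall(r"(\w+)=(\d+)", "width=20 height=10")

print(numbers)   # ['123', '456']
print(nothing)   # []
print(pairs)     # [('width', '20'), ('height', '10')]
```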

re.finditer

Similar to findall: finds all substrings in the string that match the regular expression, but returns them as an iterator of match objects.
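
Since finditer yields match objects, positions are available as well as the matched text. A minimal sketch with an assumed sample string:

```python
import re

# Collect each matched substring together with its start index.
found = [(m.group(), m.start()) for m in re.finditer(r"\d+", "12a32bc43jf3")]

print(found)   # [('12', 0), ('32', 3), ('43', 7), ('3', 11)]
```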

re.split

The split function splits the string at each substring the pattern matches and returns the resulting pieces as a list.
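
The following sketch shows a plain split, a split that keeps the separators by using a capturing group, and a split limited with maxsplit. The inputs are illustrative assumptions.

```python
import re

# Split on runs of non-word characters; the trailing "." leaves an empty item.
parts = re.split(r"\W+", "runoob, runoob, runoob.")

# A capturing group in the pattern keeps the separators in the result.
kept = re.split(r"(\W+)", "a, b")

# maxsplit limits the number of splits performed.
limited = re.split(r"\W+", "a, b, c", maxsplit=1)

print(parts)     # ['runoob', 'runoob', 'runoob', '']
print(kept)      # ['a', ', ', 'b']
print(limited)   # ['a', 'b, c']
```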

All of these functions accept an optional flags argument, such as flags=re.S.

Regular expression modifiers: optional flags

A regular expression can take optional flag modifiers that control how the pattern is matched. Multiple flags can be combined with bitwise OR (|); for example, re.I | re.M sets both the I and M flags:

Modifier	Description
re.I	Makes the match case-insensitive.
re.L	Performs locale-aware matching.
re.M	Multiline matching; affects ^ and $.
re.S	Makes . match all characters, including line breaks.
re.U	Interprets characters according to the Unicode character set. This flag affects \w, \W, \b, and \B. It is the default for str patterns in Python 3.
re.X	Allows a more flexible (verbose) layout, with whitespace and comments, so that regular expressions are easier to read.
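
A brief sketch of how these flags change matching, assuming a small two-line sample string. It combines re.I and re.M, contrasts . with and without re.S, and writes a commented pattern under re.X.

```python
import re

text = "First line\nsecond LINE"

# re.M lets ^ and $ match at each line; re.I ignores case.
lines = re.findall(r"^\w+ line$", text, flags=re.I | re.M)

# Without re.S the dot does not match the newline.
dot_default = re.search(r"line.second", text)
dot_all = re.search(r"line.second", text, flags=re.S)

# re.X ignores whitespace in the pattern and allows comments.
number = re.compile(r"""
    \d+     # integer part
    \.      # decimal point
    \d*     # fractional digits
""", re.X)

print(lines)                          # ['First line', 'second LINE']
print(dot_default)                    # None
print(repr(dot_all.group()))          # 'line\nsecond'
print(number.match("3.14").group())   # 3.14
```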

Pattern	Description
^	Matches the beginning of the string.
$	Matches the end of the string.
.	Matches any character except a newline. When the re.DOTALL flag is specified, it matches any character, including a newline.
[...]	A set of characters, listed individually: [amk] matches 'a', 'm', or 'k'.
[^...]	Any character not in the set: [^abc] matches any character other than a, b, or c.
re*	Matches 0 or more occurrences of the preceding expression.
re+	Matches 1 or more occurrences of the preceding expression.
re?	Matches 0 or 1 occurrence of the preceding expression. Placed after another quantifier (as in *? or +?), it makes that quantifier non-greedy.
re{n}	Matches exactly n occurrences of the preceding expression. For example, o{2} cannot match the "o" in "Bob", but matches the two o's in "food".
re{n,}	Matches n or more occurrences of the preceding expression. For example, o{2,} cannot match the "o" in "Bob", but matches all the o's in "foooood". o{1,} is equivalent to o+, and o{0,} is equivalent to o*.
re{n,m}	Matches from n to m occurrences of the preceding expression, greedily.
a|b	Matches a or b.
(re)	Matches the expression within the parentheses and captures it as a group.
(?imx)	Turns on the i, m, or x optional flags inline. In Python's re these apply to the whole pattern and belong at its start.
(?-imx)	Turns off the i, m, or x optional flags in engines that support it. Python's re accepts flag removal only in the scoped form (?-imx:re).
(?:re)	Like (...), but does not create a capturing group.
(?imx:re)	Turns on the i, m, or x optional flags only within the parentheses.
(?-imx:re)	Turns off the i, m, or x optional flags only within the parentheses.
(?#...)	Comment.
(?=re)	Positive lookahead assertion. Succeeds if the contained expression matches at the current position, without consuming any characters; the rest of the pattern then continues matching from the same position.
(?!re)	Negative lookahead assertion. The opposite of positive lookahead: succeeds when the contained expression does not match at the current position.
(?>re)	Atomic (independent) group: matches without backtracking into the group. Supported by Python's re from version 3.11.
\w	Matches alphanumeric characters and the underscore.
\W	Matches any character that is not alphanumeric or an underscore.
\s	Matches any whitespace character; for ASCII text, equivalent to [ \t\n\r\f\v].
\S	Matches any non-whitespace character.
\d	Matches any digit; for ASCII text, equivalent to [0-9].
\D	Matches any non-digit.
\A	Matches the start of the string.
\Z	Matches the end of the string. In Perl-style engines it also matches just before a final newline; in Python's re it matches only at the very end.
\z	Matches the end of the string in Perl-style engines. Python's re has traditionally used \Z for this instead.
\G	Matches the position where the last match ended. Not supported by Python's re.
\b	Matches a word boundary, that is, the position between a word character and a non-word character. For example, 'er\b' matches the 'er' in 'never', but not the 'er' in 'verb'.
\B	Matches a non-word boundary. 'er\B' matches the 'er' in 'verb', but not the 'er' in 'never'.
\n, \t, etc.	Match a line break, a tab character, and so on.
\1...\9	Matches the contents of the nth captured group.
\10	Matches the contents of the 10th captured group, if that group exists. In Perl-style engines it otherwise denotes an octal character code.
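
A few of the constructs in the table can be tried out as below. This is only a sketch under assumed sample inputs; it covers counted repetition, greedy versus non-greedy matching, a backreference, and both kinds of lookahead.

```python
import re

# Counted repetition: two or more consecutive o's.
runs = re.findall(r"o{2,}", "Bob food foooood")

# Greedy .+ takes as much as possible; .+? takes as little as possible.
html = "<b>bold</b><i>italic</i>"
greedy = re.search(r"<.+>", html).group()
lazy = re.search(r"<.+?>", html).group()

# \1 must repeat exactly what group 1 captured.
repeated = re.search(r"\b(\w+) \1\b", "it is is fine").group(1)

# Positive lookahead: a name followed by ".py", without consuming ".py".
py_names = re.findall(r"\w+(?=\.py)", "a.py b.txt c.py")

# Negative lookahead: "foo" that is not followed by "bar".
not_bar = re.search(r"foo(?!bar)", "foobar foobaz").start()

print(runs)       # ['oo', 'ooooo']
print(greedy)     # <b>bold</b><i>italic</i>
print(lazy)       # <b>
print(repeated)   # is
print(py_names)   # ['a', 'c']
print(not_bar)    # 7
```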

Regular expression examples: character matching

Example	Description
Python	Match "Python".

Character class

Example	Description
[Pp]ython	Matches "Python" or "python".
rub[ye]	Matches "ruby" or "rube".
[aeiou]	Matches any one of the letters within the brackets.
[0-9]	Matches any digit. Same as [0123456789].
[a-z]	Matches any lowercase letter.
[A-Z]	Matches any uppercase letter.
[a-zA-Z0-9]	Matches any letter or digit.
[^aeiou]	Matches any character other than the letters a, e, i, o, u.
[^0-9]	Matches any character other than a digit.
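
The character classes above can be checked with findall. The input strings in this sketch are assumptions chosen to keep the results short.

```python
import re

either_case = re.findall(r"[Pp]ython", "Python and python")
vowels = re.findall(r"[aeiou]", "regex")
non_digits = re.findall(r"[^0-9]+", "ab12cd")
alnum = re.findall(r"[a-zA-Z0-9]+", "id_42!")

print(either_case)   # ['Python', 'python']
print(vowels)        # ['e', 'e']
print(non_digits)    # ['ab', 'cd']
print(alnum)         # ['id', '42']
```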

Special character classes

Example	Description
.	Matches any single character except "\n". To match any character including "\n", use the re.S (DOTALL) flag or a pattern like '[\s\S]'.
\d	Matches a digit character. Equivalent to [0-9].
\D	Matches a non-digit character. Equivalent to [^0-9].
\s	Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S	Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\w	Matches any word character, including the underscore. Equivalent to '[a-zA-Z0-9_]'.
\W	Matches any non-word character. Equivalent to '[^a-zA-Z0-9_]'.

Note that \w can also match Chinese characters, because str patterns in Python 3 use Unicode matching by default; the ASCII equivalents above hold only when the re.A (ASCII) flag is set.
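
The Unicode behavior of \w can be seen by comparing the default with the re.A (ASCII) flag. The sample text is an assumption for illustration.

```python
import re

text = "Python 正则 regex_1"

# Default for str patterns in Python 3: \w follows Unicode.
unicode_words = re.findall(r"\w+", text)

# re.A restricts \w to [a-zA-Z0-9_].
ascii_words = re.findall(r"\w+", text, flags=re.A)

print(unicode_words)   # ['Python', '正则', 'regex_1']
print(ascii_words)     # ['Python', 'regex_1']
```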

This article is an English version of an article originally published in Chinese on aliyun.com.
