Python Regular Expressions

Last Update: 2017-08-07 Source: Internet

Author: User


Regular expressions support two basic operations: matching and substitution.

Matching searches a text string for a particular expression.

Substitution finds the parts of a string that match a particular expression and replaces them.
1. Basic elements
A regular expression is built from a set of special character elements that describe what to match.

Basic regular expression characters

Character	Description
text	Matches the literal text string
.	Matches any single character except a line break
^	Matches the beginning of a string
$	Matches the end of a string
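
As a minimal sketch of the basic characters in the table above, the following uses the standard re module. The sample strings and variable names are made up for illustration.

```python
import re

# "." matches any single character except a line break,
# so "c\nt" is not matched by "c.t".
dot_matches = re.findall(r"c.t", "cat cot c\nt")

# "^" anchors the match at the beginning of the string.
starts_with_cat = re.search(r"^cat", "cat cot") is not None
starts_with_cot = re.search(r"^cot", "cat cot") is not None

# "$" anchors the match at the end of the string.
ends_with_cot = re.search(r"cot$", "cat cot") is not None

print(dot_matches)       # ['cat', 'cot']
print(starts_with_cat)   # True
print(starts_with_cot)   # False
print(ends_with_cot)     # True
```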

In regular expressions, we can also use quantifiers to constrain the number of matches.
Quantifiers

Greedy (maximum) match	Non-greedy (minimum) match	Description
*	*?	Matches the preceding expression 0 or more times
+	+?	Matches the preceding expression 1 or more times
?	??	Matches the preceding expression 0 or 1 time
{m}	{m}?	Matches the preceding expression exactly m times
{m,}	{m,}?	Matches the preceding expression at least m times
{m,n}	{m,n}?	Matches the preceding expression at least m times and at most n times

According to the table above, ".*" is a greedy (maximum) match: it matches as much of the source string as it can. ".*?" is a non-greedy (minimum) match: it stops at the first point where the rest of the pattern can match. For example, d.*g matches any string that starts with "d" and ends with "g", such as "debug" and "debugging", or even "dog is walking". In contrast, d.*?g matches only "debug" in "debugging", and only "dog" in "dog is walking".
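
The difference between the two forms can be checked directly. This is a small sketch using re.search() on the same sample strings; the variable names are illustrative only.

```python
import re

# Greedy: ".*" takes as much text as possible before the final "g".
greedy_walk = re.search(r"d.*g", "dog is walking").group()
greedy_debug = re.search(r"d.*g", "debugging").group()

# Non-greedy: ".*?" stops at the first "g" that completes the match.
lazy_walk = re.search(r"d.*?g", "dog is walking").group()
lazy_debug = re.search(r"d.*?g", "debugging").group()

print(greedy_walk)    # dog is walking
print(lazy_walk)      # dog
print(greedy_debug)   # debugging
print(lazy_debug)     # debug
```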

More complex matches can use groups and operators.
Groups and operators

Group	Description
[...]	Matches any one character in the set, such as [a-z], [1-9], or [,./;']
[^...]	Matches any one character that is not in the set; this negates the set
A|B	Matches expression A or expression B, equivalent to an OR operation
(...)	Groups an expression; each pair of parentheses is one group, as in ([a-b]+)([a-z]+)([1-9]+)
\number	Matches the same text that the group with that number matched
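
The following sketch exercises each row of the table above. The sample strings are invented for illustration, and the patterns are only one way to show each feature.

```python
import re

# Character set and negated set.
vowels = re.findall(r"[aeiou]", "regex")
non_digits = re.findall(r"[^0-9]", "a1b2")

# Alternation: match either expression.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# Grouping: each pair of parentheses captures part of the match.
parts = re.search(r"([a-z]+)-([0-9]+)", "item-42").groups()

# Backreference: \1 must repeat the text captured by group 1,
# which finds an accidentally doubled word here.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine").group()

print(vowels)      # ['e', 'e']
print(non_digits)  # ['a', 'b']
print(pets)        # ['cat', 'dog']
print(parts)       # ('item', '42')
print(doubled)     # is is
```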

A set of special character sequences matches specific character types or positions. For example, \b matches a word boundary: food\b matches "food" and "zoofood", but not "foodies".
Special character sequences

Sequence	Description
\A	Matches only at the beginning of the string
\b	Matches a word boundary
\B	Matches a position that is not a word boundary
\d	Matches any decimal digit; equivalent to r'[0-9]' for ASCII text
\D	Matches any non-digit character; equivalent to r'[^0-9]' for ASCII text
\s	Matches any whitespace character (space, tab, line feed, carriage return, form feed, vertical tab)
\S	Matches any non-whitespace character
\w	Matches any alphanumeric character or the underscore
\W	Matches any character that is not alphanumeric or the underscore
\Z	Matches only at the end of the string
\\	Matches a backslash character
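
A short sketch of several of these sequences follows. The input strings are made up, and only a few rows of the table are shown.

```python
import re

# \b: "food" must end at a word boundary.
candidates = ["food", "zoofood", "foodies"]
ends_word = [w for w in candidates if re.search(r"food\b", w)]

# \d, \w and \s match digits, word characters and whitespace.
digits = re.findall(r"\d+", "room 101, floor 3")
words = re.findall(r"\w+", "hello, world_1!")
fields = re.split(r"\s+", "a  b\tc\nd")

# \A and \Z anchor at the very start and very end of the string.
all_digits = re.search(r"\A\d+\Z", "2017") is not None

print(ends_word)   # ['food', 'zoofood']
print(digits)      # ['101', '3']
print(words)       # ['hello', 'world_1']
print(fields)      # ['a', 'b', 'c', 'd']
print(all_digits)  # True
```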

Assertions and other extension notations, written in the form (?...), let a pattern test a condition or label a group.
Regular expression assertions

Notation	Description
(?iLmsux)	Matches the empty string; each of the letters i, L, m, s, u, and x turns on the corresponding regular expression flag (see the flags table below).
(?:...)	Matches the expression inside the parentheses, but does not create a capturing group.
(?P<name>...)	Matches the expression inside the parentheses and also makes the matched text available as the group named name.
(?P=name)	Matches the same text that the earlier group named name matched.
(?#...)	Introduces a comment; the contents of the parentheses are ignored.
(?=...)	Matches if the text that follows matches ..., without consuming any of it, so the rest of the regular expression is unaffected. For example, Martin(?=Brown) matches "Martin" only when it is followed by "Brown".
(?!...)	Matches only if the text that follows does not match ...; this is the inverse of (?=...).
(?<=...)	Matches if the text immediately before the current position matches .... For example, (?<=abc)def matches the "def" in "abcdef". The pattern inside must match a fixed number of characters.
(?<!...)	Matches if the text immediately before the current position does not match ...; this is the inverse of (?<=...).
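
To make the notations above concrete, here is a minimal sketch built on the Martin/Brown and abcdef examples from the table. The other sample strings and the group name word are assumptions for illustration.

```python
import re

# Lookahead: "Martin" matches only when "Brown" follows; "Brown" is not consumed.
ahead = re.search(r"Martin(?=Brown)", "MartinBrown").group()
ahead_miss = re.search(r"Martin(?=Brown)", "MartinSmith")

# Negative lookahead: "Martin" matches only when "Brown" does not follow.
not_ahead = re.search(r"Martin(?!Brown)", "MartinSmith").group()

# Lookbehind and negative lookbehind.
behind = re.search(r"(?<=abc)def", "abcdef").group()
behind_miss = re.search(r"(?<!abc)def", "abcdef")

# Named group, then a reference back to it.
repeat = re.search(r"(?P<word>\w+) (?P=word)", "hello hello world")

# Non-capturing group: only (c) shows up in groups().
captured = re.search(r"(?:ab)+(c)", "ababc").groups()

print(ahead, ahead_miss)     # Martin None
print(not_ahead)             # Martin
print(behind, behind_miss)   # def None
print(repeat.group("word"))  # hello
print(captured)              # ('c',)
```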

Regular expressions also support processing flags, which change how the matching functions behave.
Processing flags

Flag	Description
I or IGNORECASE	Ignores case when matching the expression against the text.
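
As a brief sketch, the flag can be passed to a re function as re.IGNORECASE or its short form re.I, or written inline as (?i) at the start of the pattern. The sample text is made up.

```python
import re

text = "Python python PYTHON"

# Without the flag, only the exact lowercase spelling matches.
case_sensitive = re.findall(r"python", text)

# With the flag, every spelling matches.
ignore_case = re.findall(r"python", text, flags=re.IGNORECASE)
short_form = re.findall(r"python", text, flags=re.I)
inline_form = re.findall(r"(?i)python", text)

print(case_sensitive)  # ['python']
print(ignore_case)     # ['Python', 'python', 'PYTHON']
```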

2. Operation

With the re module, we can use regular expressions to search, extract, and replace strings in Python. For example, the re.search() function performs a basic search and returns a match object (or None if nothing matches), and the re.findall() function returns a list of all matches.

The code is as follows:

>>> import re
>>> a = "This is my re module test"
>>> obj = re.search(r'.*is', a)
>>> print(obj)
<re.Match object; span=(0, 7), match='This is'>
>>> obj.group()
'This is'
>>> re.findall(r'.*is', a)
['This is']

Match object methods and attributes

Method/Attribute	Description
m.expand(template)	Returns the template string with its backslash references (such as \1 or \g<name>) replaced by the text of the matching groups.
m.group([group, ...])	Returns the matched text. With no argument it returns the whole match; with one group number or name it returns that group's text as a string; with several it returns a tuple.
m.groups([default])	Returns a tuple containing the text matched by every group in the pattern. Groups that did not take part in the match get the default value, which is None if not given.
m.groupdict([default])	Returns a dictionary of all named groups, keyed by group name. Groups that did not take part in the match get the default value, which is None if not given.
m.start([group])	Returns the start position of the specified group, or of the whole match if no group is given.
m.end([group])	Returns the end position of the specified group, or of the whole match if no group is given.
m.span([group])	Returns the two-element tuple (m.start(group), m.end(group)) for the given group, or for the whole match if no group is given.
m.pos	The pos value passed to the match() or search() method.
m.endpos	The endpos value passed to the match() or search() method.
m.lastindex	The index of the last matched capturing group, or None if no group matched.
m.lastgroup	The name of the last matched capturing group, or None if it has no name or no group matched.
m.re	The regular expression object that produced this match object.
m.string	The string passed to the match() or search() function.
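
The sketch below calls most of these methods on one match. The pattern and its group names year and month are assumptions chosen for illustration.

```python
import re

m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Last update: 2017-08-07")

whole = m.group()                     # '2017-08'
first = m.group(1)                    # '2017'
by_name = m.group("month")            # '08'
several = m.group(1, 2)               # ('2017', '08')
all_groups = m.groups()               # ('2017', '08')
named = m.groupdict()                 # {'year': '2017', 'month': '08'}
whole_span = (m.start(), m.end())     # (13, 20)
month_span = m.span("month")          # (18, 20)
last = (m.lastindex, m.lastgroup)     # (2, 'month')
expanded = m.expand(r"\g<month>/\g<year>")  # '08/2017'

# A group that did not take part in the match gets the default value.
optional = re.search(r"(a)(b)?", "a")
with_none = optional.groups()         # ('a', None)
with_default = optional.groups("-")   # ('a', '-')
```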

Use the sub() or subn() function to perform a substitution on a string. The basic format of the sub() function is as follows:

sub(pattern, replace, string[, count])
Example

The code is as follows:

>>> text = 'The dog on my bed'
>>> rep = re.sub('dog', 'cat', text)
>>> print(rep)
The cat on my bed

The replace parameter can also be a function, which is called with each match object and returns the replacement text. Use the subn() function to get the number of replacements as well: it returns a tuple that contains the substituted text and the number of substitutions made.
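
Both points can be sketched as follows. The helper function double and the sample strings are hypothetical and exist only for this example.

```python
import re

def double(match):
    # Called once per match; the return value is the replacement text.
    return str(int(match.group()) * 2)

# A function as the replacement.
doubled = re.sub(r"\d+", double, "3 apples and 10 pears")

# subn() returns the new text together with the number of substitutions.
new_text, n = re.subn(r"dog", "cat", "dog eat dog")

# count limits how many matches are replaced.
once = re.sub(r"dog", "cat", "dog eat dog", count=1)

print(doubled)       # 6 apples and 20 pears
print(new_text, n)   # cat eat cat 2
print(once)          # cat eat dog
```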

If we need to match the same regular expression many times, we can compile it into an internal form to improve processing speed. This is done with the compile() function, whose basic format is as follows: compile(str[, flags])
str is the regular expression string to compile, and flags is an optional set of modifier flags. The result is a regular expression object that has several methods and attributes.
Regular expression object methods and attributes

Method/Attribute	Description
r.search(string[, pos[, endpos]])	Same as the search() function, but also lets you specify where the search starts and ends
r.match(string[, pos[, endpos]])	Same as the match() function, but also lets you specify where the match starts and ends
r.split(string[, max])	Same as the split() function
r.findall(string)	Same as the findall() function
r.sub(replace, string[, count])	Same as the sub() function
r.subn(replace, string[, count])	Same as the subn() function
r.flags	The flags defined when the object was created
r.groupindex	A dictionary that maps the group names defined with (?P<id>...) to their group numbers
r.pattern	The pattern string used when the object was created
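
Here is a minimal sketch of a compiled pattern in use. The key=value pattern, its group names, and the sample text are assumptions made up for this example.

```python
import re

# Hypothetical pattern for "key=value" pairs, compiled once and reused.
pair = re.compile(r"(?P<key>\w+)=(?P<value>\w+)")

text = "a=1, b=2"

all_pairs = pair.findall(text)                  # every (key, value) pair
from_pos = pair.search(text, 3).group()         # search starts at index 3
at_pos = pair.match(text, 5).group()            # match must begin at index 5
swapped = pair.sub(r"\g<value>=\g<key>", text)  # swap key and value
names = dict(pair.groupindex)                   # group name -> group number
source = pair.pattern                           # the original pattern string

print(all_pairs)  # [('a', '1'), ('b', '2')]
print(from_pos)   # b=2
print(swapped)    # 1=a, 2=b
print(names)      # {'key': 1, 'value': 2}
```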

Use the re.escape() function to escape the special characters in a string so that it can be matched literally.
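
A small sketch of why escaping matters: the sample text below contains +, ( and ?, which are special in a pattern. The strings are invented for illustration.

```python
import re

literal = "1+1=2 (really?)"
sentence = "He wrote 1+1=2 (really?) on the board"

# Used directly as a pattern, the special characters change the meaning
# and the literal text is not found.
unescaped = re.search(literal, sentence)

# After escaping, the text is matched character for character.
escaped = re.search(re.escape(literal), sentence)

print(unescaped)        # None
print(escaped.group())  # 1+1=2 (really?)
```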
Getting object references through getattr

The code is as follows:

>>> li = ['A', 'B']
>>> getattr(li, 'append')
<built-in method append of list object at 0xb7d4a52c>
>>> getattr(li, 'append')('C')  # equivalent to li.append('C')
>>> li
['A', 'B', 'C']
>>> handler = getattr(li, 'append', None)
>>> handler
<built-in method append of list object at 0xb7d4a52c>
>>> handler('cc')  # equivalent to li.append('cc')
>>> li
['A', 'B', 'C', 'cc']
>>> result = handler('BB')
>>> li
['A', 'B', 'C', 'cc', 'BB']
>>> print(result)
None

This article is from the "Big Plum" blog. Please keep this source link when reposting: http://n1lixing.blog.51cto.com/11772222/1954242
