A study of regular expressions and the re module in Python development

Last Update:2016-03-28 Source: Internet

Author: User


Regular expressions are a powerful tool that we run into in both JavaScript and Python web development (http://www.maiziedu.com/course/python-px/). Regular expressions differ little between JavaScript and Python, but they are an essential part of Python, so today let's talk about Python's re module.

The re module provides support for regular expressions.

What is a regular expression:
A regular expression is a pattern that can match fragments of text.
The regular expression 'python' matches the string 'python'.

Wildcard characters
. matches any single character (except a newline):
'.ython' matches 'python' and 'fython'

Escaping a special character:
'python\.org' matches 'python.org'

Character sets
'[pj]ython' matches 'python' and 'jython'

Negated character sets
'[^abc]' matches any character except a, b, or c
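
The wildcard, escaping, and character-set rules above can be checked with a short sketch. The sample strings (such as 'cython' and 'pythonXorg') are made up for illustration and are not from the original article.

```python
import re

# '.' stands for exactly one character, so the 5-character 'ython' does not match.
dot_hits = [bool(re.match(r'.ython', s)) for s in ('python', 'fython', 'ython')]

# An escaped dot matches only a literal '.', not an arbitrary character.
escaped_ok = bool(re.match(r'python\.org', 'python.org'))
escaped_bad = bool(re.match(r'python\.org', 'pythonXorg'))

# A character set matches one of the listed characters.
set_hits = re.findall(r'[pj]ython', 'python jython cython')

# A negated set matches one character that is NOT listed.
negated_hits = re.findall(r'[^abc]', 'aXbYc')

print(dot_hits, escaped_ok, escaped_bad, set_hits, negated_hits)
```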

Alternation
Use the pipe symbol | to match either of two alternatives.
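
As a small illustration of alternation, the sketch below uses the pattern 'python|perl' and a sample sentence; both are assumptions chosen for the example rather than taken from the article.

```python
import re

# Each alternative is tried at every position; matches come back in text order.
languages = re.findall(r'python|perl', 'perl, python, ruby')

print(languages)
```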

Optional subpatterns
Add a question mark after a subpattern to make it optional:
r'(http://)?(www\.)?python\.org' matches only the following strings:

'http://www.python.org'

'http://python.org'

'www.python.org'

'python.org'
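
One way to confirm that list is to test whole strings with re.fullmatch. This is only a sketch; the 'ftp://python.org' entry is an extra, made-up negative case.

```python
import re

pattern = re.compile(r'(http://)?(www\.)?python\.org')

candidates = [
    'http://www.python.org',
    'http://python.org',
    'www.python.org',
    'python.org',
    'ftp://python.org',  # hypothetical counter-example, not in the article
]

# fullmatch() requires the whole string to match, not just a prefix.
accepted = [url for url in candidates if pattern.fullmatch(url)]

print(accepted)
```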

Repeating subpatterns
*: the pattern may repeat 0 or more times
+: the pattern may repeat 1 or more times
{m,n}: the pattern may repeat from m to n times
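
The three repetition operators can be compared side by side. The patterns 'ab*', 'ab+', and 'ab{2,3}' below are illustrative choices, not examples from the article.

```python
import re

# '*' allows zero repetitions, so a lone 'a' matches 'ab*'.
star_matches = re.fullmatch(r'ab*', 'a') is not None

# '+' needs at least one 'b', so a lone 'a' does not match 'ab+'.
plus_matches = re.fullmatch(r'ab+', 'a') is not None

# '{2,3}' accepts two or three 'b' characters, no fewer and no more.
ranged = [s for s in ('ab', 'abb', 'abbb', 'abbbb') if re.fullmatch(r'ab{2,3}', s)]

print(star_matches, plus_matches, ranged)
```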

Of course, regular expression syntax has many more rules than these. We can only scratch the surface here, because the purpose of this post is to introduce Python's re module.

The re module gives Python full regular expression functionality.
The compile function builds a regular expression object from a pattern string and an optional flags argument. That object has a set of methods for regular expression matching and substitution.
The re module also provides module-level functions that mirror these methods; they take the pattern string as their first argument.
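
The relationship between a compiled pattern object and the module-level functions can be sketched as follows. The pattern and the sample text are assumptions made for the example.

```python
import re

# Compile once, then call methods on the pattern object.
word = re.compile(r'[a-z]+', re.IGNORECASE)
via_object = word.findall('Hello World 42')

# The module-level function takes the pattern string as its first argument.
via_module = re.findall(r'[a-z]+', 'Hello World 42', re.IGNORECASE)

print(via_object, via_module)
```

Compiling is mainly useful when the same pattern is reused many times; for one-off matches the module-level form is usually enough.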

Important functions in re:

compile(pattern[, flags]) creates a pattern object from a string containing a regular expression

search(pattern, string[, flags]) searches the string for the pattern

match(pattern, string[, flags]) matches the pattern at the beginning of the string

split(pattern, string[, maxsplit=0]) splits the string at each occurrence of the pattern

findall(pattern, string) lists all occurrences of the pattern in the string

sub(pat, repl, string[, count=0]) replaces all occurrences of pat in string with repl

escape(string) escapes all special regular expression characters in the string
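
The functions in this list that are not demonstrated later (split, findall, the count argument of sub, and escape) might be exercised like this. All sample strings are invented for illustration.

```python
import re

# split: break on runs of commas/spaces, but perform at most two splits.
parts = re.split(r'[, ]+', 'alpha, beta,gamma  delta', maxsplit=2)

# findall: every non-overlapping match, as a list of strings.
numbers = re.findall(r'\d+', 'a1 b22 c333')

# sub with count: replace only the first two matches.
replaced = re.sub(r'\d+', '#', 'a1 b22 c333', count=2)

# escape: turn arbitrary text into a pattern that matches it literally.
literal = re.escape('python.org')
literal_ok = re.fullmatch(literal, 'python.org') is not None
literal_bad = re.fullmatch(literal, 'pythonXorg') is not None

print(parts, numbers, replaced, literal_ok, literal_bad)
```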

The following are some simple applications:

Using match

import re

print(re.match('www', 'www.runoob.com').span())  # matches at the start position

print(re.match('com', 'www.runoob.com'))  # does not match at the start position

Using search

import re

print(re.search('www', 'www.runoob.com').span())  # matches at the start position

print(re.search('com', 'www.runoob.com').span())  # matches, though not at the start position

So what is the difference between match and search?
Look at the results first:

Results from the match example:

(0, 3)

None

Results from the search example:

(0, 3)

(11, 14)

match() only checks whether the pattern matches at the start of the string, while search() scans the whole string for a match.
That is, match() returns a match object only if the pattern matches at position 0; otherwise it returns None.

search() scans the entire string and returns the first successful match.
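
Because both functions return None when nothing matches, calling .span() directly on a failed match raises AttributeError. A minimal sketch of guarding against that is shown below; the helper names match_span and search_span are hypothetical, not part of the re module.

```python
import re

def match_span(pattern, text):
    """Return the span of a match anchored at position 0, or None."""
    m = re.match(pattern, text)
    return m.span() if m else None

def search_span(pattern, text):
    """Return the span of the first match anywhere in text, or None."""
    m = re.search(pattern, text)
    return m.span() if m else None

print(match_span('com', 'www.runoob.com'))   # pattern is not at the start
print(search_span('com', 'www.runoob.com'))  # pattern is found later in the string
```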

Using sub
The Python re module provides re.sub to replace matches in a string.

#!/usr/bin/python
import re

phone = "2004-959-559 # This is Phone number"

# Delete the Python-style comment

num = re.sub(r'#.*$', "", phone)
print("Phone num:", num)

# Remove anything other than digits

num = re.sub(r'\D', "", phone)
print("Phone num:", num)

Results:

Phone num: 2004-959-559

Phone num: 2004959559
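
re.sub also accepts group references and a function as the replacement. The sketch below reuses the phone number from the example; the grouping pattern and the doubling function are assumptions added for illustration.

```python
import re

phone = '2004-959-559'

# \1, \2, \3 in the replacement refer to the captured groups of the pattern.
spaced = re.sub(r'(\d{4})-(\d{3})-(\d{3})', r'\1 \2 \3', phone)

# A callable replacement receives each match object and returns the new text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), 'a1 b22')

print(spaced, doubled)
```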

Finally, a summary of the pattern syntax:

^ matches the beginning of the string.

$ matches the end of the string.

. matches any character except a newline; when the re.DOTALL flag is specified, it matches any character, including a newline.

[...] represents a set of characters, listed individually: [amk] matches 'a', 'm', or 'k'.

[^...] matches any character not in the brackets: [^abc] matches any character other than a, b, or c.

re* matches 0 or more occurrences of the preceding expression.

re+ matches 1 or more occurrences of the preceding expression.

re? matches 0 or 1 occurrence of the preceding expression (greedy); placed after another quantifier, as in re*? or re+?, it makes that quantifier non-greedy.

re{n} matches exactly n occurrences of the preceding expression.

re{n,} matches n or more occurrences of the preceding expression.

re{n,m} matches from n to m occurrences of the preceding expression, greedily.

a|b matches a or b.

(re) matches the expression in parentheses and also captures it as a group.

(?imx) turns on the optional flags i, m, or x. In Python, flags set this way apply to the whole pattern and must appear at its start.

(?-imx) turns off the optional flags i, m, or x in some regex flavors. Python's re module accepts only the scoped form (?-imx:re).

(?:re) is like (...), but does not create a group.

(?imx:re) turns on the i, m, or x flags inside the parentheses.

(?-imx:re) turns off the i, m, or x flags inside the parentheses.

(?#...) is a comment.

(?=re) positive lookahead. Succeeds if the contained expression matches at the current position, without consuming any of the string; the rest of the pattern is then tried from that same position.

(?!re) negative lookahead. The opposite of positive lookahead: succeeds when the contained expression does not match at the current position.

(?>re) atomic group: matches the subpattern independently and eliminates backtracking into it (supported by Python's re module since version 3.11).

\w matches an alphanumeric character or underscore.

\W matches a non-alphanumeric character.

\s matches any whitespace character, equivalent to [ \t\n\r\f\v] for ASCII text.

\S matches any non-whitespace character.

\d matches any digit, equivalent to [0-9] for ASCII text.

\D matches any non-digit.

\A matches the start of the string.

\Z matches the end of the string. (In some other regex flavors \Z also matches just before a trailing newline; in Python it matches only at the very end.)

\z matches the end of the string in other regex flavors; in Python, use \Z.

\G matches the position where the last match finished in other regex flavors; Python's re module does not support it.

\b matches a word boundary, the position between a word character and a non-word character. For example, 'er\b' matches the 'er' in 'never' but not the 'er' in 'verb'.

\B matches a non-word boundary. 'er\B' matches the 'er' in 'verb' but not the 'er' in 'never'.

\n, \t, etc. match a newline, a tab, and so on.

\1...\9 matches the contents of the nth captured group.

\10 matches the contents of the 10th captured group if that group exists; in some regex flavors it is otherwise read as an octal character code.
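
A few of the entries in this summary are easier to see in action. The sketch below covers re.DOTALL, greedy versus non-greedy quantifiers, lookahead, word boundaries, and backreferences; apart from the 'never' and 'verb' boundary examples, the sample strings are assumptions made for illustration.

```python
import re

# '.' skips newlines unless re.DOTALL is given.
dot_default = re.search(r'a.b', 'a\nb') is not None
dot_all = re.search(r'a.b', 'a\nb', re.DOTALL) is not None

# A greedy quantifier takes as much as possible; adding '?' makes it take as little as possible.
greedy = re.search(r'<.+>', '<a><b>').group()
lazy = re.search(r'<.+?>', '<a><b>').group()

# Positive lookahead: words followed by '.org', without including '.org' in the match.
before_org = re.findall(r'\w+(?=\.org)', 'python.org perl.com')

# Negative lookahead: the first 'python' that is NOT followed by '.org'.
not_org_start = re.search(r'python(?!\.org)', 'python.org python.com').start()

# Word boundary versus non-word boundary.
b_never = re.search(r'er\b', 'never') is not None
b_verb = re.search(r'er\b', 'verb') is not None
B_verb = re.search(r'er\B', 'verb') is not None
B_never = re.search(r'er\B', 'never') is not None

# Backreference: \1 must repeat whatever group 1 captured (a doubled word here).
repeated_word = re.search(r'\b(\w+) \1\b', 'it is is fine').group(1)

print(dot_default, dot_all, greedy, lazy, before_org, not_org_start, repeated_word)
```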

That covers the basic syntax of Python regular expressions and the details of the re module; I hope it helps.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
