Basic regular expression syntax and the re module: a basic Python tutorial

Last Update:2016-03-27 Source: Internet

Author: User

Tags alphanumeric characters

What is a regular expression?

A regular expression is a pattern that can match a fragment of text.

The simplest regular expression is an ordinary string: the pattern 'python' matches the string 'python'.

Regular expressions are powerful, and Python naturally supports them through the re module, which this article introduces.

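Wildcard

A period (.) matches any single character except a line break:

'.ython' matches 'python' and 'fython'

Escaping special characters

To match a special character literally, put a backslash in front of it:

'python\.org' matches 'python.org'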
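The wildcard and the escaped dot can be compared side by side. The following is a small sketch; the test strings 'jython', 'ython' and 'pythonzorg' are made up for illustration and do not come from the original article.

```python
import re

# '.' stands for exactly one character, so a five-letter 'ython' does not match.
wildcard = re.compile('.ython')
wildcard_hits = [bool(wildcard.match(s)) for s in ('python', 'jython', 'ython')]
print(wildcard_hits)

# Unescaped, the dot in 'python.org' accepts any character in that position.
loose = re.compile('python.org')
# Escaped, it accepts only a literal '.'.
strict = re.compile(r'python\.org')

print(bool(loose.match('pythonzorg')))   # True: '.' matched 'z'
print(bool(strict.match('pythonzorg')))  # False
print(bool(strict.match('python.org')))  # True
```

Writing the pattern as a raw string (the r prefix) keeps Python from interpreting the backslash before the re module sees it.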

Character set

'[pj]ython' matches 'python' and 'jython'

Negated character set

'[^abc]' matches any single character except a, b, or c
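A quick way to check both kinds of set is to run them over a few sample strings. This is only an illustrative sketch; 'cython' and 'aabbccxyz' are sample inputs chosen here, not taken from the article.

```python
import re

# A character set matches exactly one of the listed characters.
set_pattern = re.compile('[pj]ython')
set_hits = [bool(set_pattern.fullmatch(s)) for s in ('python', 'jython', 'cython')]
print(set_hits)

# A leading '^' inside the brackets negates the set.
negated = re.compile('[^abc]')
others = negated.findall('aabbccxyz')
print(others)
```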

Alternation

Use the pipe symbol | to match either of two patterns: 'python|perl' matches 'python' or 'perl'
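As a rough sketch of the pipe symbol in use (the sample sentence and the pattern 'p(?:ython|erl)' are illustrative choices, not from the original):

```python
import re

# 'python|perl' matches either whole word.
either = re.compile('python|perl')
found = either.findall('I use perl and python')
print(found)

# Parentheses limit the alternation to part of the pattern.
# 'p(?:ython|erl)' means: 'p' followed by either 'ython' or 'erl'.
grouped = re.compile('p(?:ython|erl)')
print(bool(grouped.fullmatch('perl')))
print(bool(grouped.fullmatch('pearl')))
```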

Optional subpatterns

Adding a question mark after a subpattern makes it optional:

r'(http://)?(www\.)?python\.org' matches only the following strings:

'http://www.python.org'
'http://python.org'
'www.python.org'
'python.org'
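The claim above can be checked with re.fullmatch, which requires the whole string to match. In this sketch the extra candidate 'ftp://python.org' is a made-up counterexample.

```python
import re

url = re.compile(r'(http://)?(www\.)?python\.org')

candidates = [
    'http://www.python.org',
    'http://python.org',
    'www.python.org',
    'python.org',
    'ftp://python.org',  # hypothetical string that should be rejected
]

# fullmatch() succeeds only if the entire string fits the pattern.
accepted = [s for s in candidates if url.fullmatch(s)]
print(accepted)
```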

Repeated subpatterns

(pattern)*: the pattern may repeat zero or more times.
(pattern)+: the pattern may repeat one or more times.
(pattern){m,n}: the pattern may repeat m to n times.
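The three repetition operators can be tried on short strings. This is a minimal sketch with made-up inputs; note that the braces are written without a space, as in 'w{2,3}'.

```python
import re

# '*' allows zero repetitions, so the empty string matches.
zero_ok = bool(re.fullmatch('w*', ''))

# '+' needs at least one repetition, so the empty string does not match.
one_required = bool(re.fullmatch('w+', ''))

# '{2,3}' accepts two or three repetitions only.
ranged = [bool(re.fullmatch('w{2,3}', s)) for s in ('w', 'ww', 'www', 'wwww')]

# Applied to a parenthesized group, the operator repeats the whole group.
group_repeat = bool(re.fullmatch('(ab)+', 'ababab'))

print(zero_ok, one_required, ranged, group_repeat)
```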

Of course, regular expression syntax has many more rules than those above. We stop here, though, because this post aims to introduce Python's re module.

The re module gives Python full regular expression functionality.

The compile function builds a regular expression object from a pattern string and optional flag arguments. This object has a set of methods for regular expression matching and replacement.

The re module also provides module-level functions that do exactly the same things as these methods. These functions take a pattern string as their first argument.
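The two calling styles described above can be sketched as follows. The sample text is invented for the example; re.IGNORECASE is one of the standard flags that compile accepts.

```python
import re

text = '3 apples, 12 pears'

# Compile once, then call methods on the pattern object.
digits = re.compile(r'\d+')
from_object = digits.findall(text)

# The module-level function takes the pattern string as its first argument.
from_module = re.findall(r'\d+', text)
print(from_object, from_module)

# Optional flags are passed to compile() after the pattern.
insensitive = re.compile('python', re.IGNORECASE)
hit = insensitive.search('I like Python').group()
print(hit)
```

Compiling is mainly worthwhile when the same pattern is reused many times; for a one-off match the module-level functions are usually enough.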

Important functions in re:

compile(pattern[, flags]) creates a pattern object from a string containing a regular expression.

search(pattern, string[, flags]) searches the string for the pattern.

match(pattern, string[, flags]) matches the pattern at the beginning of the string.

split(pattern, string[, maxsplit=0]) splits the string at each occurrence of the pattern.

findall(pattern, string) returns a list of all occurrences of the pattern in the string.

sub(pat, repl, string[, count=0]) replaces occurrences of pat in the string with repl.

escape(string) escapes all special regular expression characters in the string.
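The functions not demonstrated later in the article (split, findall and escape) can be sketched briefly. The word list and the unescaped counterexample 'wwwxpython.org' are illustrative assumptions.

```python
import re

text = 'alpha, beta,,gamma   delta'

# split() cuts the string wherever a run of commas or spaces occurs.
parts = re.split('[, ]+', text)
# maxsplit limits the number of cuts; the remainder stays in one piece.
first_cut = re.split('[, ]+', text, maxsplit=1)
print(parts)
print(first_cut)

# findall() returns every non-overlapping match as a list.
words = re.findall('[a-z]+', text)
print(words)

# escape() makes a literal string safe to use as a pattern:
# the dots no longer act as wildcards.
literal = re.escape('www.python.org')
print(bool(re.fullmatch(literal, 'www.python.org')))
print(bool(re.fullmatch(literal, 'wwwxpython.org')))
```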

The following are some simple applications:

Using match

import re
print(re.match('www', 'www.runoob.com').span())  # matches at the start position
print(re.match('com', 'www.runoob.com'))  # no match at the start position

Using search

import re
print(re.search('www', 'www.runoob.com').span())  # matches at the start position
print(re.search('com', 'www.runoob.com').span())  # matches, but not at the start position

Let's pause here. What is the difference between match and search?

Look at the results first:

Results of the match example:

(0, 3)
None

Results of the search example:

(0, 3)
(11, 14)

The match() function only checks whether the pattern matches at the start of the string, while search() scans the entire string for a match.
That is, match() returns a match object only if the match succeeds at position 0. If there is no match at the start position, match() returns None.

search() scans the entire string and returns the first successful match.
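Because both functions return None on failure, calling .span() directly on the result raises an AttributeError when nothing matched. One way to guard against that is sketched below; span_or_none is a hypothetical helper written for this example, not part of the re module.

```python
import re

def span_or_none(match):
    """Return match.span(), or None when there was no match (hypothetical helper)."""
    return match.span() if match is not None else None

url = 'www.runoob.com'

# match() only tries position 0, so 'com' is not found.
match_result = span_or_none(re.match('com', url))
# search() scans forward and finds 'com' at the end.
search_result = span_or_none(re.search('com', url))

print(match_result)
print(search_result)
```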

Using sub

Python's re module provides re.sub to replace matches in a string.

#!/usr/bin/python3
import re

phone = "2004-959-559 # This is Phone Number"

# Delete the Python-style comment
num = re.sub(r'#.*$', "", phone)
print("Phone Num:", num)

# Remove everything other than digits
num = re.sub(r'\D', "", phone)
print("Phone Num:", num)

Result:

Phone Num: 2004-959-559
Phone Num: 2004959559
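The optional count argument of sub, listed earlier, limits how many replacements are made. The sketch below also shows a group reference in the replacement string, which is an addition for illustration rather than something the article covers.

```python
import re

phone = '2004-959-559'

# With count=1 only the first dash is replaced.
first_only = re.sub('-', ' ', phone, count=1)
# Without count (default 0) every dash is replaced.
all_dashes = re.sub('-', ' ', phone)
print(first_only)
print(all_dashes)

# \1, \2, \3 in the replacement refer to the captured groups.
reordered = re.sub(r'(\d+)-(\d+)-(\d+)', r'\3-\2-\1', phone)
print(reordered)
```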

Finally, a syntax reference:

^ Matches the start of a string.
$ Matches the end of a string.
. Matches any character except a line break; when re.DOTALL is specified, it matches any character including line breaks.
[...] Matches any one of a set of characters: [amk] matches 'a', 'm', or 'k'.
[^...] Matches any character not in the set: [^abc] matches any character other than a, b, and c.
re* Matches zero or more occurrences of the preceding expression.
re+ Matches one or more occurrences of the preceding expression.
re? Matches zero or one occurrence of the preceding expression. It is greedy by default; a ? placed after another quantifier (*?, +?, ??) makes that quantifier non-greedy.
re{n} Matches exactly n occurrences of the preceding expression.
re{n,} Matches n or more occurrences of the preceding expression.
re{n,m} Matches n to m occurrences of the preceding expression. Greedy.
a|b Matches a or b.
(re) Groups the expression in the parentheses and captures the text it matches.
(?imx) Turns on the i, m, or x flag. In Python's re these inline flags apply to the whole pattern and belong at its start.
(?-imx) Turns off the i, m, or x flag. Python's re accepts this only in the scoped form (?-imx:re).
(?:re) Like (...), but does not capture a group.
(?imx:re) Turns on the i, m, or x flag within the parentheses only.
(?-imx:re) Turns off the i, m, or x flag within the parentheses only.
(?#...) Comment.
(?=re) Positive lookahead assertion. It succeeds if re matches at the current position, but consumes no characters: the rest of the pattern is then tried from the same position.
(?!re) Negative lookahead assertion, the opposite of the positive one. It succeeds only if re does not match at the current position.
(?>re) Atomic (independent) group, which eliminates backtracking into the group. Python's re supports it from version 3.11.
\w Matches letters, digits, and the underscore.
\W Matches any character that \w does not match.
\s Matches any whitespace character; for ASCII text this is equivalent to [ \t\n\r\f\v].
\S Matches any non-whitespace character.
\d Matches any digit; for ASCII text this is equivalent to [0-9].
\D Matches any non-digit.
\A Matches the start of the string.
\Z Matches the end of the string. In Perl-style engines \Z also matches just before a trailing line break; in Python's re it matches only at the very end.
\z Matches the very end of the string in Perl-style engines; in Python, use \Z.
\G Matches the position where the previous match ended. Python's re does not support it.
\b Matches a word boundary, that is, the position between a word character and a non-word character. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B Matches a non-word boundary. 'er\B' matches the 'er' in "verb" but not the 'er' in "never".
\n, \t, etc. Match a line break, a tab, and so on.
\1...\9 Matches the text captured by the nth group.
\10 Matches the text captured by group 10 if that group exists. Otherwise some engines read it as an octal character code; Python's re raises an error instead.
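A few of the less obvious entries in this reference (word boundaries, backreferences, non-capturing groups and lookahead) can be sketched as follows. All sample strings apart from "never" and "verb" are invented for the example.

```python
import re

text = 'never verb'

# \b: 'er' must be followed by a word boundary, so only "never" qualifies.
boundary_span = re.search(r'er\b', text).span()
# \B: 'er' must NOT be followed by a word boundary, so only "verb" qualifies.
inner_span = re.search(r'er\B', text).span()
print(boundary_span, inner_span)

# (re) captures a group; \1 must repeat exactly what group 1 matched.
doubled = re.search(r'(\w+) \1', 'hello hello world').group()
print(doubled)

# (?:re) groups without capturing, so only the capturing group is reported.
captured = re.match(r'(?:www\.)?(\w+)', 'www.python').groups()
print(captured)

# (?=re) looks ahead without consuming text.
before_org = re.findall(r'\w+(?=\.org)', 'python.org perl.com')
# (?!re) succeeds only when the lookahead fails.
not_org = re.search(r'python(?!\.org)', 'python.org')
print(before_org, not_org)
```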

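re regular expression syntax

The regular expression syntax is summarized as follows:

Syntax	Meaning	Description
"."	Any character	Any single character except a line break
"^"	Start of string	'^hello' matches 'helloworld' but does not match 'aaaahellobb'
"$"	End of string	'world$' matches 'helloworld' but does not match 'worldhello'
"*"	0 or more repetitions of the preceding item (greedy match)	'<.*>' matches all of '<a><b>'
"+"	1 or more repetitions of the preceding item (greedy match)	'<.+>' matches all of '<a><b>'
"?"	0 or 1 repetition of the preceding item (greedy match)	'colou?r' matches 'color' and 'colour'
"*?", "+?", "??"	Non-greedy versions of the above three, which match as little as possible	'<.*?>' matches only '<a>' in '<a><b>'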
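The difference between greedy and non-greedy matching is easiest to see on a string with several possible end points. A minimal sketch, using the made-up input '<a><b>':

```python
import re

html = '<a><b>'

# Greedy: '.*' grabs as much as possible, so the match runs to the last '>'.
greedy = re.search('<.*>', html).group()

# Non-greedy: '.*?' grabs as little as possible, stopping at the first '>'.
lazy = re.search('<.*?>', html).group()

# With findall, the non-greedy pattern yields each tag separately.
tags = re.findall('<.*?>', html)

print(greedy)
print(lazy)
print(tags)
```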

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
