Python learning notes: regular expressions

Last Update:2017-10-05 Source: Internet

Author: User


Regular expression: a pattern that can match fragments of text.

Wildcard: a pattern containing a wildcard can match more than one string. For example, '.' matches any single character except a newline.
Escaping special characters: to match the string 'python.org', using the pattern 'python.org' directly matches not only 'python.org' but also strings such as 'pythoniorg', because the '.' acts as a wildcard. The '.' has to be escaped: use 'python\\.org' or the raw string r'python\.org'.
Character sets: square brackets create a character set that lists all the characters you want to match. For example, '[pj]ython' matches 'python' and 'jython', and '[a-zA-Z0-9]' matches any uppercase letter, lowercase letter, or digit. A character set matches exactly one character. A negated character set such as '[^abc]' matches any character except a, b, and c.
Alternation and sub-patterns: the pipe symbol '|' expresses a choice. To match only 'python' and 'jython', use 'python|jython'. To apply the choice to only part of the pattern, enclose that part in parentheses, as in '(p|j)ython'. The parenthesized part is called a sub-pattern.
Optional and repeated sub-patterns: adding a question mark after a sub-pattern makes it optional. For example, r'(http://)?(www\.)?python\.org' matches 'http://www.python.org', 'http://python.org', 'www.python.org' and 'python.org'. The question mark means the sub-pattern may appear zero times or once.

(pattern)*: the sub-pattern may appear zero or more times

(pattern)+: the sub-pattern may appear one or more times

(pattern){m,n}: the sub-pattern is repeated from m to n times
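The syntax rules above can be checked with a short sketch. The helper name fully_matches and the test strings are assumptions made for illustration; re.fullmatch is used so that each pattern has to cover the whole string.

```python
import re

def fully_matches(pattern, text):
    """Hypothetical helper: True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# An unescaped '.' accepts any single character except a newline.
print(fully_matches('python.org', 'pythonzorg'))    # True
# An escaped dot accepts only a literal '.'.
print(fully_matches(r'python\.org', 'pythonzorg'))  # False

# A character set stands for exactly one character.
print(fully_matches('[pj]ython', 'jython'))         # True
print(fully_matches('[^abc]', 'a'))                 # False

# Alternation limited to a sub-pattern.
print(fully_matches('(p|j)ython', 'python'))        # True

# Optional sub-patterns: each '?' group may be present or absent.
url = r'(http://)?(www\.)?python\.org'
candidates = ['http://www.python.org', 'http://python.org',
              'www.python.org', 'python.org']
print(all(fully_matches(url, s) for s in candidates))  # True

# Repetition: '*' is zero or more, '+' one or more, '{m,n}' from m to n.
print(fully_matches('(ab)*', ''))            # True
print(fully_matches('(ab)+', ''))            # False
print(fully_matches('(ab){2,3}', 'ababab'))  # True
print(fully_matches('(ab){2,3}', 'ab'))      # False
```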

The beginning and end of a string: the patterns so far were matched against the entire string. A pattern can also be searched for as a substring, and it is then sometimes useful to anchor it. The '^' marker anchors the match at the beginning: '^ht+p' matches 'ht+p' only at the start of the string. Likewise, the '$' marker anchors the match at the end of the string.
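A minimal sketch of the two anchors, using re.search so that an unanchored pattern may match anywhere. The sample strings are assumptions chosen for illustration.

```python
import re

text = 'see http://python.org'

# Without an anchor, 'ht+p' is found in the middle of the string.
unanchored = re.search('ht+p', text)
# With '^', the match has to begin at the start of the string.
anchored = re.search('^ht+p', text)
at_start = re.search('^ht+p', 'http://python.org')
# With '$', the match has to finish at the end of the string.
at_end = re.search(r'\.org$', text)
not_at_end = re.search(r'\.org$', 'python.org/index')

print(unanchored.start())      # 4
print(anchored)                # None
print(at_start.group())        # http
print(at_end.group())          # .org
print(not_at_end)              # None
```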

Common functions of the re module

Function	Description
compile(pattern[, flags])	Creates a pattern object from a string containing a regular expression
search(pattern, string[, flags])	Searches for the pattern anywhere in the string
match(pattern, string[, flags])	Matches the pattern at the start of the string
split(pattern, string[, maxsplit=0])	Splits the string at each match of the pattern
findall(pattern, string)	Lists all occurrences of the pattern in the string
sub(pat, repl, string[, count=0])	Replaces occurrences of pat in the string with repl
escape(string)	Escapes all special regular expression characters in the string
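The difference between compile, match and search can be seen in a small sketch. The sample text is an assumption made for illustration.

```python
import re

text = 'learn python, then more python'

# compile() builds a reusable pattern object with the same methods
# as the module-level functions.
pat = re.compile('python')

# match() only succeeds at the start of the string, so this is None.
at_start = pat.match(text)
# search() scans the string and returns the first occurrence.
anywhere = pat.search(text)
# The module-level function accepts the pattern as a plain string.
leading = re.match('learn', text)

print(at_start)           # None
print(anywhere.start())   # 6
print(leading.group())    # learn
```

Compiling is mainly a convenience when the same pattern is used many times.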

The matching functions in the re module return a MatchObject when the match succeeds (and None otherwise). The object holds information about the substring that matched the pattern and about which part of the pattern matched which part of the substring. These parts are called groups. A group is a sub-pattern enclosed in parentheses, and group 0 is the entire pattern.

The pattern 'There (is a (wee) (Cooper)) who (lived in Fyfe)' contains the following groups:

0 There is a wee Cooper who lived in Fyfe

1 is a wee Cooper

2 wee

3 Cooper

4 lived in Fyfe
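The numbering follows the order of the opening parentheses, which a short sketch can confirm. The sentence being matched is an assumption: it is simply the pattern with the parentheses removed.

```python
import re

pattern = r'There (is a (wee) (Cooper)) who (lived in Fyfe)'
m = re.match(pattern, 'There is a wee Cooper who lived in Fyfe')

# Group 0 is the whole match; groups 1..4 are counted by their
# opening parenthesis, from left to right.
for n in range(5):
    print(n, m.group(n))

# groups() returns groups 1 and up as a tuple.
print(m.groups())
# ('is a wee Cooper', 'wee', 'Cooper', 'lived in Fyfe')
```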

Important methods of re match objects:

Method	Description
group([group1, ...])	Returns the match for the given sub-pattern (group)
start([group])	Returns the starting position of the match for the given group
end([group])	Returns the end position of the match for the given group
span([group])	Returns the start and end positions of the given group

>>> import re
>>> m = re.match(r'www\.(.*)\..{3}', 'www.python.org')
>>> m.group(1)
'python'
>>> m.end(1)
10
>>> m.span(1)
(4, 10)
>>> m.group(0)
'www.python.org'

Adding a '?' after a repetition operator turns it into its non-greedy version.

When re.split() splits a string and the pattern contains parentheses, the text matched by the parenthesized groups is also included in the result, between the substrings.

re.split(pattern, string[, maxsplit=0])

The maxsplit parameter of split limits the number of splits.
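A brief sketch of the three behaviors of re.split described above. The sample strings are assumptions chosen for illustration.

```python
import re

text = 'alpha, beta,,,,gamma   delta'

# Split on any run of commas and spaces.
parts = re.split('[, ]+', text)
# maxsplit stops after two splits; the rest stays in the last item.
limited = re.split('[, ]+', text, maxsplit=2)
# A parenthesized group in the pattern is kept in the result.
with_group = re.split('o(o)', 'foobar')

print(parts)       # ['alpha', 'beta', 'gamma', 'delta']
print(limited)     # ['alpha', 'beta', 'gamma   delta']
print(with_group)  # ['f', 'o', 'bar']
```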

The re.findall function returns all occurrences of the given pattern as a list.

re.findall(pattern, string)
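For instance, findall can collect every word or every number in a piece of text. The sample sentence is an assumption made for illustration.

```python
import re

text = 'Call 555 1234 before 9, or after 17.'

# Every run of letters, in order of appearance.
words = re.findall('[a-zA-Z]+', text)
# Every run of digits; the items are strings, not integers.
numbers = re.findall('[0-9]+', text)

print(words)    # ['Call', 'before', 'or', 'after']
print(numbers)  # ['555', '1234', '9', '17']
```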

The re.sub() function replaces the leftmost non-overlapping occurrences of the pattern with the specified replacement.

re.sub(pat, repl, string[, count=0])
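A small sketch of what "leftmost" and "non-overlapping" mean in practice, and of the count parameter. The sample strings are assumptions chosen for illustration.

```python
import re

# Every occurrence is replaced by default.
replaced_all = re.sub('a', 'A', 'banana')
# count limits the number of replacements, starting from the left.
replaced_two = re.sub('a', 'A', 'banana', count=2)
# Matches do not overlap: 'aaaaa' holds two 'aa' matches and a leftover 'a'.
non_overlapping = re.sub('aa', 'x', 'aaaaa')

print(replaced_all)     # bAnAnA
print(replaced_two)     # bAnAna
print(non_overlapping)  # xxa
```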

sub() can also substitute by group: any escape sequence of the form '\\n' in the replacement string is replaced by the text matched by group n of the pattern.

For example, to replace *something* in a text with <em>something</em>:

>>> pat = r'\*([^\*]+)\*'
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'
>>> pat = re.compile(r'\*([^\*]+)\*')
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'

Repetition operators are greedy: they match as much text as possible.

>>> pat = r'\*(.+)\*'
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'
>>> re.sub(pat, r'<em>\1</em>', '*hello* *world*!')
'<em>hello* *world</em>!'

In this case you need the non-greedy version, that is, a '?' added after the repetition operator.

>>> pat = r'\*(.+?)\*'
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'
>>> re.sub(pat, r'<em>\1</em>', '*hello* *world*!')
'<em>hello</em> <em>world</em>!'


The re.escape function escapes all characters in a string that could be interpreted as regular expression operators.

>>> re.escape('hello.python')
'hello\\.python'
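One typical use, sketched here under assumed inputs, is to turn a literal string (for example, text typed by a user) into a pattern that matches only that exact text.

```python
import re

literal = 'python.org'

# Escape the literal so that its '.' is no longer a wildcard.
pattern = re.escape(literal)

exact = re.fullmatch(pattern, 'python.org')
lookalike = re.fullmatch(pattern, 'pythonzorg')

print(pattern)            # python\.org
print(exact is not None)  # True
print(lookalike)          # None
```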

I have been studying regular expressions on and off since the summer. Don't give up!


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
