Python learning notes: regular expressions

Source: Internet
Author: User

Regular expression: a pattern that matches fragments of text.

    • Wildcard: a pattern can match more than one string. For example, '.' matches any single character except a newline.
    • Escaping special characters: to match the string 'python.org', the pattern 'python.org' is not enough, because it matches not only 'python.org' but also 'pythonzorg' and similar strings. The '.' has to be escaped: use 'python\\.org' or r'python\.org'.
    • Character sets: square brackets create a character set that lists all the characters you want to match. For example, '[pj]ython' matches 'python' and 'jython', and '[a-zA-Z0-9]' matches any uppercase letter, lowercase letter, or digit. A character set matches exactly one character. An inverted character set such as '[^abc]' matches any character except a, b, and c.
    • Alternatives and subpatterns: the pipe symbol '|' expresses a choice. To match only 'python' and 'jython', use 'python|jython'. To apply the choice to just part of the pattern, enclose that part in parentheses, as in '(p|j)ython'. The parenthesized part is called a subpattern.
    • Optional and repeated subpatterns: a question mark after a subpattern makes it optional, so that it may appear zero times or once. For example, r'(http://)?(www\.)?python\.org' matches 'http://www.python.org', 'http://python.org', 'www.python.org', and 'python.org'. The other repetition operators are:

      (pattern)*: the subpattern may appear 0 or more times

      (pattern)+: the subpattern may appear 1 or more times

      (pattern){m,n}: the subpattern may appear m to n times

    • Beginning and end of a string: a pattern can match anywhere inside a string. To match only at the beginning of the string, use '^'. For example, '^ht+p' matches 'ht+p' only when it occurs at the start of the string. Similarly, '$' anchors the match at the end of the string.
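The syntax elements listed above can be tried out in a short script. This is only a sketch: `matches` is a hypothetical helper introduced here for illustration, and the test strings are made up.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Wildcard: '.' stands for any single character except a newline.
print(matches(r'.ython', 'jython'))                 # True
print(matches(r'.ython', '\nython'))                # False

# Escaping: an unescaped dot also matches other characters.
print(matches('python.org', 'pythonzorg'))          # True
print(matches(r'python\.org', 'pythonzorg'))        # False

# Character sets match exactly one character.
print(matches(r'[pj]ython', 'python'))              # True
print(matches(r'[^abc]', 'a'))                      # False

# Alternatives and subpatterns.
print(matches(r'python|jython', 'jython'))          # True
print(matches(r'(p|j)ython', 'jython'))             # True

# Optional subpatterns.
url = r'(http://)?(www\.)?python\.org'
candidates = ['http://www.python.org', 'http://python.org',
              'www.python.org', 'python.org']
print(all(matches(url, c) for c in candidates))     # True

# Repetition operators.
print(matches(r'(ab)*', ''))                        # True
print(matches(r'(ab)+', ''))                        # False
print(matches(r'(ab){2,3}', 'ababab'))              # True

# Anchors: '^' ties the match to the start, '$' to the end.
print(re.search(r'^ht+p', 'http://python.org') is not None)   # True
print(re.search(r'^ht+p', 'www.http.org') is not None)        # False
print(re.search(r'org$', 'www.python.org') is not None)       # True
```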

Common functions of the re module

Function                                Description
compile(pattern[, flags])               Creates a pattern object from a string containing a regular expression
search(pattern, string[, flags])        Searches for the pattern in the string
match(pattern, string[, flags])         Matches the pattern at the beginning of the string
split(pattern, string[, maxsplit=0])    Splits the string at occurrences of the pattern
findall(pattern, string)                Returns a list of all occurrences of the pattern in the string
sub(pat, repl, string[, count=0])       Replaces occurrences of pat in the string with repl
escape(string)                          Escapes all special regular expression characters in the string
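As a rough illustration of the first three functions, the sketch below compiles a pattern once and then compares search with match. The sample text and the variable names are made up for this example.

```python
import re

# Made-up sample text for illustration.
text = 'Contact: alice@example.com, bob@example.org'

# compile() builds a reusable pattern object.
word_at = re.compile(r'\w+@')

# search() finds the first match anywhere in the string.
first = word_at.search(text)
print(first.group())                 # alice@

# match() only succeeds if the pattern matches at the start of the string.
at_start = word_at.match(text)
print(at_start)                      # None

# 'Contact' does sit at the start, so this match succeeds.
prefix = re.match(r'Contact', text)
print(prefix.group())                # Contact
```

The module-level functions such as re.search(pattern, string) and the methods of a compiled pattern object behave the same way; compiling is mainly convenient when the same pattern is used repeatedly.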

When one of the matching functions in the re module finds a match, it returns a MatchObject; otherwise it returns None. The match object holds information about the substring that matched the pattern, and about which part of the pattern matched which part of the substring. These parts are called groups. A group is a subpattern enclosed in parentheses, and group 0 is the entire pattern.

The pattern 'There (was a (wee) (Cooper)) who (lived in Fyfe)' contains the following groups:

0 There was a wee Cooper who lived in Fyfe

1 was a wee Cooper

2 wee

3 Cooper

4 lived in Fyfe
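The group numbering can be checked directly. The following sketch matches a pattern like the one above against a sentence chosen to fit it, and prints each group in turn.

```python
import re

pattern = r'There (was a (wee) (Cooper)) who (lived in Fyfe)'
m = re.match(pattern, 'There was a wee Cooper who lived in Fyfe')

# Group 0 is the whole match; groups 1 to 4 are numbered by the
# position of their opening parenthesis, from left to right.
for n in range(5):
    print(n, m.group(n))

# groups() returns groups 1 and up as a tuple.
print(m.groups())
```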

Important methods of re match objects:

Method                   Description
group([group1, ...])     Returns the text matched by the given subpatterns (groups)
start([group])           Returns the starting position of the match for the given group
end([group])             Returns the ending position of the match for the given group
span([group])            Returns the start and end positions of the group

>>> import re
>>> m = re.match(r'www\.(.*)\..{3}', 'www.python.org')
>>> m.group(1)
'python'
>>> m.start(1)
4
>>> m.end(1)
10
>>> m.span(1)
(4, 10)
>>> m.group(0)
'www.python.org'

Adding a '?' after a repetition operator turns it into its non-greedy version.

The re.split() function splits a string at every match of the pattern. If the pattern contains parentheses, the text matched by the groups is also returned, placed between the substrings.

re.split(pattern, string[, maxsplit=0])

The maxsplit parameter limits the number of splits.
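A small sketch of the three behaviors just described: plain splitting, splitting with a group in the pattern, and limiting the number of splits with maxsplit. The input strings are made up for illustration.

```python
import re

text = 'alpha, beta,,,,gamma   delta'

# Split on runs of commas and/or spaces.
parts = re.split(r'[, ]+', text)
print(parts)          # ['alpha', 'beta', 'gamma', 'delta']

# With a group in the pattern, the text matched by the group
# is inserted between the substrings.
with_group = re.split(r'o(o)', 'foobar')
print(with_group)     # ['f', 'o', 'bar']

# maxsplit=2 stops after two splits; the rest is returned unsplit.
limited = re.split(r'[, ]+', text, maxsplit=2)
print(limited)        # ['alpha', 'beta', 'gamma   delta']
```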

The re.findall() function returns all occurrences of the given pattern as a list.

re.findall(pattern, string)
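For instance, findall can pull every word out of a sentence. The sample sentence below is made up for this sketch.

```python
import re

text = '"Hm... Err -- are you sure?" he said, sounding insecure.'

# Every maximal run of letters counts as one word.
words = re.findall(r'[a-zA-Z]+', text)
print(words)
# ['Hm', 'Err', 'are', 'you', 'sure', 'he', 'said', 'sounding', 'insecure']
```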

The re.sub() function replaces the leftmost non-overlapping matches of the pattern with the given replacement.

re.sub(pat, repl, string[, count=0])

The replacement string can refer to groups: any escape sequence of the form '\\n' in the replacement (for example '\\1', or r'\1' as a raw string) is replaced with the text matched by group n of the pattern.

For example, to replace *something* in a text with <em>something</em>:

>>> pat = r'\*([^\*]+)\*'
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'
>>> pat = re.compile(r'\*([^\*]+)\*')
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'

The repetition operators are greedy: they match as much text as possible.

>>> pat = r'\*(.+)\*'
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'
>>> re.sub(pat, r'<em>\1</em>', '*hello* *world*!')
'<em>hello* *world</em>!'

In this case you need the non-greedy version, that is, a '?' added after the repetition operator.

>>> pat = r'\*(.+?)\*'
>>> re.sub(pat, r'<em>\1</em>', 'Hello *world*!')
'Hello <em>world</em>!'
>>> re.sub(pat, r'<em>\1</em>', '*hello* *world*!')
'<em>hello</em> <em>world</em>!'

The re.escape() function escapes all characters in a string that might be interpreted as regular expression operators.

>>> re.escape('hello.python')
'hello\\.python'
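One typical use, sketched here under the assumption that the text to search for is meant literally rather than as a pattern, is to escape it before compiling. The variable names and sample strings are made up for illustration.

```python
import re

# Made-up literal text that should be matched exactly as written.
literal = 'python.org'

# Escaping first makes the dot match only a real dot.
exact = re.compile(re.escape(literal))

hit = exact.search('visit python.org today') is not None
miss = exact.search('visit pythonzorg today') is not None
unescaped = re.search(literal, 'visit pythonzorg today') is not None

print(hit)         # True
print(miss)        # False
print(unescaped)   # True: without escaping, '.' matches the 'z'
```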

I have been studying regular expressions on and off since the summer. Don't give up!
