Python Learning Path - Regular Expressions

Source: Internet
Author: User

Regular Expressions

Roughly speaking, a regular expression is matched by comparing the expression against the text one character at a time: if every character matches, the match succeeds; as soon as one comparison fails, the match fails.

Python's re module provides the regular expression operations.

I. Functions for matching

1. findall(pattern, string, flags=0)

Finds all substrings that match the pattern and returns them as a list.

import re
obj = re.findall(r'\d+', 'hhh90080mmmbb2233pp')
print(obj)  # ['90080', '2233']

2. match(pattern, string, flags=0)

Matches the pattern at the start of the string and returns a single match. On success it returns a match object.

Note: the beginning of the string must match the regular expression; otherwise None is returned.

import re
obj = re.match(r'\d+', '0008lkk')
print(obj, type(obj))
if obj:
    print(obj.group())  # 0008

3. search(pattern, string, flags=0)

Scans the string for the first place where the pattern matches and returns a single match. On success it returns a match object.

import re
obj = re.search(r'\d+', 'hhh90080mmmbb2233pp')
if obj:
    print(obj.group())  # 90080
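
The difference between match() and search() is easiest to see on a string whose digits are not at the start. The following is a minimal sketch; the sample string and the variable names are illustrative assumptions, not part of the original examples.

```python
import re

text = "abc123"  # illustrative sample: the digits are not at the start

# match() only tries at position 0, so a digits pattern fails here
at_start = re.match(r"\d+", text)

# search() scans forward and stops at the first run of digits
anywhere = re.search(r"\d+", text)

print(at_start)          # None
print(anywhere.group())  # 123
```

Because match() can return None, it is safest to test the result before calling group(), as the examples in this section do.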

4. group() and groups()

Parentheses () create groups inside a regular expression.

group() returns the substring captured by one or more groups, or the entire matched string.

groups() returns the substrings captured by all groups as a tuple.

import re
a = "123abc456ooo"
pattern = r"([0-9]*)([a-z]*)([0-9]*)"
# Index 0 is the entire matched substring; group() with no argument is equivalent to group(0)
print(re.search(pattern, a).group(0))     # 123abc456
# When several indexes are given, the results are returned as a tuple
print(re.search(pattern, a).group(1, 2))  # ('123', 'abc')
# When one index is given, that group's string is returned
print(re.search(pattern, a).group(2))     # abc
print(re.search(pattern, a).groups())     # ('123', 'abc', '456')

5. sub(pattern, repl, string, count=0, flags=0)

Replaces the substrings that match a regular expression.

import re
s = "Hello World 001,789 Welcome"
s_c = re.sub(r'\d+', 'amy', s, 1)  # replace only once; without count, all matches are replaced
print(s_c)

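re.sub() is more powerful than str.replace(), because the text to replace is described by a pattern rather than by one fixed substring.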
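
As a rough illustration of what re.sub() can do beyond str.replace(), the sketch below reuses the sample sentence from above. The variable names are illustrative assumptions; the group reference and the function replacement are standard re.sub() features but do not appear in the original example.

```python
import re

text = "Hello World 001,789 Welcome"

# str.replace() can only swap one fixed substring for another
fixed = text.replace("001", "amy")

# re.sub() replaces every run of digits, whatever the digits are
by_pattern = re.sub(r"\d+", "amy", text)

# The replacement can refer back to a captured group with \1
bracketed = re.sub(r"(\d+)", r"[\1]", text)

# The replacement can also be a function that receives the match object
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)

print(fixed)       # Hello World amy,789 Welcome
print(by_pattern)  # Hello World amy,amy Welcome
print(bracketed)   # Hello World [001],[789] Welcome
print(doubled)     # Hello World 2,1578 Welcome
```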

6. split(pattern, string, maxsplit=0, flags=0)

Splits a string at each match of the given regular expression.

Note: if the pattern matches at the very end of the string, the resulting list ends with an empty string.

import re
n = "split1nnnnn2mmmmm3"
n_c = re.split(r'[0-9]', n, 1)  # split only once; without maxsplit, every match splits the string
print(n_c)
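
The note about the trailing element can be checked with the same sample string. This is a minimal sketch; the list comprehension used to drop empty pieces is one common approach, shown here as an assumption rather than something the original text prescribes.

```python
import re

n = "split1nnnnn2mmmmm3"

# The final "3" matches the pattern, so an empty string follows the last split point
parts = re.split(r"[0-9]", n)
print(parts)  # ['split', 'nnnnn', 'mmmmm', '']

# One way to discard empty pieces (illustrative choice)
non_empty = [p for p in parts if p]
print(non_empty)  # ['split', 'nnnnn', 'mmmmm']
```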

7. compile(pattern, flags=0)

Compiles a regular expression given as a string into a pattern object.

import re
text = "you is so cool, oo"
regex = re.compile(r'\w*oo\w*')  # compile the regular expression into a pattern object
print(regex.findall(text))       # find all the words that contain "oo"

Calling match() or search() on a pattern object returns a match object, or None when the text cannot be matched.

II. Matching syntax
Syntax | Description | Example expression | Matched string

Characters
ordinary characters | Match themselves | abc | abc
. | Matches any character except a line break; if the DOTALL flag is specified, matches any character including line breaks | a.c | akc
\ | Escape character; makes the character after it literal | a\.c | a.c
[...] | Character set. [a*] matches a or *; [a-z] matches any character from a to z; [^f] matches any character that is not f; [\d] matches a digit. Apart from -, ^, and \, special characters have no special meaning inside a character set | [a*]k | ak, *k

Quantifiers (used after a character or a (...) group)
* | Matches the preceding item 0 or more times
+ | Matches the preceding item 1 or more times
? | Matches the preceding item 0 or 1 times
{m} | Matches the preceding item exactly m times
{m,n} | Matches the preceding item m to n times; if m is omitted, 0 to n times; if n is omitted, m or more times

Predefined character classes (can also be written inside a [...] set)
\d | Digit: [0-9]
\D | Non-digit: [^\d]
\w | Letter, digit, or underscore: [a-zA-Z0-9_]
\W | Anything that \w does not match: [^\w]

Boundary matching
^ | Matches the start of the string | ^abc | abc
$ | Matches the end of the string | abc$ | abc
\A | Matches only at the start of the string | \Aabc | abc
\Z | Matches only at the end of the string | abc\Z | abc
\b | Matches the boundary between \w and \W, that is, the start or end of a word | a\b!bc | a!bc
\B | Matches wherever \b does not: [^\b] | a\Bbc | abc

Logic and grouping
...|... | Matches either the expression on its left or the one on its right. Alternatives are tried from left to right; once the left one matches, the right one is skipped | abc|def | abc, def
(...) | The enclosed expression forms a group. Groups are numbered from the left of the expression: each opening parenthesis encountered increases the number by 1. A group is treated as a whole and can be followed by a quantifier. A | inside a group applies only within that group | (abc){2}, a(123|456)c | abcabc, a456c
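
The rows of the syntax table can be checked mechanically. The sketch below pairs each example expression with its matched string and tests it with re.fullmatch(). Patterns are case-sensitive, so the pairs are written exactly as they should be typed. The quantifier and character class pairs in the second list are illustrative assumptions, since the table gives no examples for those rows.

```python
import re

# (pattern, text) pairs from the example columns of the table
table_examples = [
    (r"abc", "abc"),
    (r"a.c", "akc"),
    (r"a\.c", "a.c"),
    (r"[a*]k", "ak"),
    (r"[a*]k", "*k"),
    (r"^abc", "abc"),
    (r"abc$", "abc"),
    (r"\Aabc", "abc"),
    (r"abc\Z", "abc"),
    (r"a\b!bc", "a!bc"),
    (r"a\Bbc", "abc"),
    (r"abc|def", "def"),
    (r"(abc){2}", "abcabc"),
    (r"a(123|456)c", "a456c"),
]

# Illustrative pairs (assumptions) for rows that have no example in the table
extra_examples = [
    (r"ab*", "a"),        # * allows zero occurrences
    (r"ab+", "abb"),      # + needs at least one
    (r"ab?", "ab"),       # ? allows zero or one
    (r"ab{2}", "abb"),    # exactly two
    (r"ab{1,3}", "abb"),  # between one and three
    (r"\d\D", "1a"),      # a digit followed by a non-digit
    (r"\w\W", "_!"),      # a word character followed by a non-word character
]

for pattern, text in table_examples + extra_examples:
    matched = re.fullmatch(pattern, text) is not None
    print(f"{pattern!r:18} {text!r:10} {matched}")
```

Every pair is expected to report True; changing a text, for example "akc" to "a\nc", is a quick way to see a row fail.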

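III. Flags

re.I (re.IGNORECASE): ignore case
re.L (re.LOCALE): assume the current 8-bit locale
re.U (re.UNICODE): assume the Unicode locale
re.M (re.MULTILINE): make anchors look for newlines
re.S (re.DOTALL): make the dot match newlines
re.X (re.VERBOSE): ignore whitespace and comments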
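
The effect of the most commonly used flags (re.I for ignoring case, re.M for multi-line anchors, re.S for letting the dot match newlines, re.X for verbose patterns) can be seen in a short sketch. The sample text and the patterns are illustrative assumptions.

```python
import re

text = "Hello\nworld"

# re.I (IGNORECASE): letters match regardless of case
ignore_case = re.findall(r"hello", text, re.I)

# re.M (MULTILINE): ^ and $ also match at the start and end of each line
first_only = re.findall(r"^\w+", text)
every_line = re.findall(r"^\w+", text, re.M)

# re.S (DOTALL): the dot also matches the newline
without_dotall = re.search(r"Hello.world", text)
with_dotall = re.search(r"Hello.world", text, re.S)

# re.X (VERBOSE): whitespace and comments inside the pattern are ignored
number = re.compile(r"""
    \d+   # integer part
    \.    # decimal point
    \d+   # fractional part
""", re.X)
found = number.search("pi is 3.14").group()

# Flags can be combined with |
combined = re.findall(r"^w.*", text, re.I | re.M)

print(ignore_case)             # ['Hello']
print(first_only, every_line)  # ['Hello'] ['Hello', 'world']
print(without_dotall)          # None
print(with_dotall is not None) # True
print(found)                   # 3.14
print(combined)                # ['world']
```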

IV. Raw strings (the r prefix)

Regular expressions use "\" as their escape character, which collides with the same use of "\" in string literals and leads to a pile-up of backslashes. To match a literal "\" in the text, a regular expression written as an ordinary string literal needs four backslashes, "\\\\": the programming language first turns each pair into a single backslash, leaving two, and the regular expression engine then reads those two as one escaped, literal backslash. Python's raw strings solve this problem neatly: the regular expression in this example can be written as r"\\". Likewise, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer have to worry about dropping a backslash, and the expressions are easier to read.
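
The equivalence described above can be verified directly. This is a minimal sketch; the variable names and the sample path are illustrative assumptions.

```python
import re

# Ordinary string literal: four backslashes in the source
plain = "\\\\"
# Raw string literal: two backslashes in the source
raw = r"\\"

# Both literals produce the same two-character regular expression
print(plain == raw)  # True
print(len(raw))      # 2

# The text C:\temp contains a single backslash
path = "C:\\temp"
hits = re.findall(raw, path)
print(len(hits))     # 1

# The same holds for character classes such as \d
print("\\d" == r"\d")             # True
print(re.findall(r"\d", "a1b2"))  # ['1', '2']
```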

Material adapted from: http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html
