Python Regular Expressions

Source: Internet
Author: User

1. Iterator: an object that implements the __iter__() method (together with __next__()); you can call next() on it to traverse its items one at a time.
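
As a minimal sketch of the iterator idea above: the class name Countdown and its behaviour are invented for illustration and do not come from the original notes. It shows that an object with __iter__() and __next__() can be advanced with next().

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        # Signal exhaustion with StopIteration once we reach zero.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
first = next(it)   # 3
rest = list(it)    # [2, 1], the remaining items
```

The same protocol is what re.finditer relies on when it hands back match objects one at a time.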

2. Python regular expressions

1. Python supports regular expressions through the re module.

2. To see which Python modules are available on the current system: help('modules')

help() supports two ways of calling: interactive mode and a direct function call.

Example: interactive invocation

>>> help()

Welcome to Python 3.5's help utility!

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/3.5/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, symbols, or topics, type
"modules", "keywords", "symbols", or "topics".  Each module also comes
with a one-line summary of what it does; to list the modules whose name
or summary contain a given string such as "spam", type "modules spam".

help> modules

Example: function call

>>> help('modules')

3. Meta-characters of regular expressions

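\s: whitespace character;
\S: non-whitespace character;
[\s\S]: any character, including a newline;
[\s\S]*: zero or more of any character;
[\s\S]*?: zero or more of any character, non-greedy: matches as few characters as possible before the rest of the pattern can match;

\d: digit;

\D: non-digit;

\w: word character, equivalent to [a-zA-Z0-9_] in ASCII mode (for str patterns, Python 3 also matches Unicode word characters by default);

\W: non-word character;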
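
A short sketch of the character classes listed above, using re.findall and re.match from the standard library. The sample strings are made up for illustration.

```python
import re

text = "room 42, floor_7!"

digits = re.findall(r"\d", text)     # ['4', '2', '7']
words = re.findall(r"\w+", text)     # ['room', '42', 'floor_7']
non_word = re.findall(r"\W", text)   # [' ', ',', ' ', '!']
chunks = re.findall(r"\S+", text)    # ['room', '42,', 'floor_7!']

# "." stops at a newline by default, while [\s\S] does not.
dot_match = re.match(r".*", "a\nb").group()        # 'a'
any_match = re.match(r"[\s\S]*", "a\nb").group()   # 'a\nb'
```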

Rules:

. matches any single character except a newline (any character at all when re.S is set);

* matches the previous character 0 or more times;

+ matches the previous character 1 or more times;

? matches the previous character 0 or 1 times;

{m} matches the previous character exactly m times;

{m,n} matches the previous character m to n times;

{m,} matches the previous character at least m times, with no upper limit;

{,n} matches the previous character 0 to n times;

\ escape character;

[...] character set. Example: [a-z];

*? +? ?? {m,n}? make *, +, ? and {m,n} non-greedy;
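
The quantifier rules above can be tried out as follows. This is only an illustrative sketch; the sample strings are assumptions, not taken from the original notes.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy: .* grabs as much as possible, so both tags end up in one match.
greedy = re.findall(r"<b>.*</b>", html)    # ['<b>one</b><b>two</b>']

# Non-greedy: .*? stops at the first closing tag it can.
lazy = re.findall(r"<b>.*?</b>", html)     # ['<b>one</b>', '<b>two</b>']

# {m} and {m,n} counts.
exact = re.fullmatch(r"\d{4}", "2024") is not None    # True
ranged = re.findall(r"a{2,3}", "a aa aaa aaaa")       # ['aa', 'aaa', 'aaa']

# ? makes the preceding character optional.
optional = re.findall(r"colou?r", "color colour")     # ['color', 'colour']
```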

Boundary matching (zero-width: these do not consume any characters of the string being matched)

^: matches the beginning of the string, and also the beginning of each line in multiline mode;

$: matches the end of the string (or just before a newline at the end of the string), and also the end of each line in multiline mode;

\b: matches a word boundary. It matches only a position, not a character: on one side of the position is a word character and on the other side is a non-word character or the beginning or end of the string, so \b has zero width. (A "word" is a run of characters matched by \w.) \b is equivalent to: (?<!\w)(?=\w)|(?<=\w)(?!\w);

\B: the opposite of \b; matches a position that is not a word boundary;

\A: matches only the beginning of the string;

\Z: matches only the end of the string;
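
A small sketch of the boundary anchors above. The sample text is invented for illustration.

```python
import re

text = "cat concat\ncat"

# \b requires a word boundary on both sides, so "concat" is skipped.
whole_words = re.findall(r"\bcat\b", text)    # ['cat', 'cat']

# \B requires that there is NO boundary before "cat": only "concat" qualifies.
inside = re.findall(r"\Bcat", text)           # ['cat']

# ^ matches only at the start of the string unless re.M is given.
start_only = re.findall(r"^cat", text)         # ['cat']
each_line = re.findall(r"^cat", text, re.M)    # ['cat', 'cat']

# \A and \Z ignore multiline mode and anchor to the whole string.
string_start = re.findall(r"\Acat", text, re.M)   # ['cat']
string_end = re.findall(r"cat\Z", text, re.M)     # ['cat']
```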

Group:

| alternation: matches either the expression on its left or the one on its right. It tries the left expression first and, if that succeeds, skips the right one. If | is not enclosed in (), its scope is the entire regular expression.

() grouping: groups are numbered from 1 in the order of their opening parentheses, counting from the left of the expression. A group is treated as a single unit and may be followed by a quantifier; a | inside a group applies only within that group. Examples: (abc){3}   (abc|def)123   (abc|def){3}123

\number: matches the same text that was matched by group number. Example: (\d)([a-z])\1\2
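
The grouping examples above can be checked with re.fullmatch. A minimal sketch; the input strings are assumptions chosen to fit the patterns.

```python
import re

# A quantifier after a group repeats the whole group.
repeated = re.fullmatch(r"(abc){3}", "abcabcabc") is not None   # True

# | inside a group only applies within that group.
alt = re.fullmatch(r"(abc|def)123", "def123")
alt_group = alt.group(1)                                        # 'def'

# Without parentheses, | splits the entire expression: "abc" OR "def123".
top_level_ok = re.fullmatch(r"abc|def123", "abc") is not None   # True
top_level_no = re.fullmatch(r"abc|def123", "abc123") is None    # True

# \1 and \2 must repeat exactly what groups 1 and 2 captured.
backref = re.fullmatch(r"(\d)([a-z])\1\2", "1a1a")
backref_groups = backref.groups()                               # ('1', 'a')
backref_fail = re.fullmatch(r"(\d)([a-z])\1\2", "1a2b") is None # True
```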

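Lookaround (lookahead and lookbehind)

(?=...): positive lookahead; succeeds if ... matches at the current position, looking forward

(?!...): negative lookahead; succeeds if ... does not match at the current position, looking forward

(?<=...): positive lookbehind; succeeds if ... matches immediately before the current position

(?<!...): negative lookbehind; succeeds if ... does not match immediately before the current position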
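
A hedged sketch of the four lookaround forms above. The sample string and the idea of filtering numbers by a "$" prefix or "kg" suffix are illustrative assumptions.

```python
import re

prices = "cost: $30, weight: 30kg, $45"

# Positive lookbehind: digits that come right after a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", prices)          # ['30', '45']

# Negative lookbehind: digits NOT preceded by "$" (or by another digit,
# which stops the match from starting in the middle of a number).
not_after_dollar = re.findall(r"(?<![$\d])\d+", prices)   # ['30']

# Positive lookahead: digits followed by "kg"; "kg" itself is not consumed.
before_kg = re.findall(r"\d+(?=kg)", prices)              # ['30']

# Negative lookahead: digits NOT followed by "kg" (or by another digit).
not_before_kg = re.findall(r"\d+(?!kg|\d)", prices)       # ['30', '45']
```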

4. Call the functions of the re module to perform regular expression matching

5. match objects:

match(pattern, string, flags=0)

Try to apply the pattern at the start of the string, returning a match object, or None if no match was found.

m = re.match('a', 'abc')

All attributes and methods of m:

m.end m.group m.lastgroup m.re m.start

m.endpos m.groupdict m.lastindex m.regs m.string

m.expand m.groups m.pos m.span

group([group1, ...]):

Returns the string captured by one or more groups; returns a tuple when multiple arguments are given. group1 can be a group number or a group name; number 0 stands for the entire matched substring, and group() with no arguments returns group(0). A group that did not capture anything returns None; a group that captured several times returns the last substring captured.

groups([default]):

Returns the strings captured by all groups as a tuple. Equivalent to calling group(1, 2, ..., last). default is the value used for groups that did not capture anything; it defaults to None.

m.pos: the position in the string at which the search started

m.endpos: the position in the string at which the search stops

m.start([group]): the index in the original string where the substring captured by the group starts (group defaults to 0, the whole match)

m.end([group]): the index in the original string where the substring captured by the group ends (group defaults to 0, the whole match)
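
The match object methods described above, in a short sketch. The e-mail-like sample string and the pattern are assumptions made up for this example.

```python
import re

text = "user@example.com and more"
m = re.match(r"(\w+)@(\w+)\.com", text)

whole = m.group()              # 'user@example.com' (same as m.group(0))
both = m.group(1, 2)           # ('user', 'example')
all_groups = m.groups()        # ('user', 'example')
span = (m.start(), m.end())    # (0, 16)
domain_span = m.span(2)        # (5, 12), where group 2 sits in text
search_window = (m.pos, m.endpos)   # (0, len(text)) by default

# A group that did not take part in the match yields None, or the default.
partial = re.match(r"(a)(b)?", "a")
with_none = partial.groups()        # ('a', None)
with_default = partial.groups("")   # ('a', '')

# A group that captured several times keeps only the last capture.
last_capture = re.match(r"(\d)+", "123").group(1)   # '3'
```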

6. search: scans the string for the pattern and returns a match object for the first match only

search(pattern, string, flags=0)

Scan through string looking for a match to the pattern, returning a match object, or None if no match was found.

m.group()

m.groups()
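
To contrast search with match, here is a minimal sketch. It reuses the www.baidu.com string from the examples in this article; the patterns are illustrative.

```python
import re

text = "www.baidu.com"

# match only looks at the start of the string, so this finds nothing.
at_start = re.match(r"baidu", text)        # None

# search scans forward and stops at the first match.
anywhere = re.search(r"baidu", text)
found = anywhere.group()                   # 'baidu'
where = anywhere.span()                    # (4, 9)

# Only the first of several possible matches is returned.
first_word = re.search(r"\w+", text).group()   # 'www'
```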

7. findall: finds all matches and returns them as a list

findall(pattern, string, flags=0)

Return a list of all non-overlapping matches in the string. If one or more capturing groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

The result is a plain list, so it can be printed directly.
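
How the shape of the findall result depends on the number of groups can be sketched like this. The sample string is an assumption.

```python
import re

text = "x=1, y=22, z=333"

# No groups: a list of whole matches.
no_groups = re.findall(r"\d+", text)           # ['1', '22', '333']

# One group: a list of that group's captures.
one_group = re.findall(r"(\w)=\d+", text)      # ['x', 'y', 'z']

# Two or more groups: a list of tuples.
two_groups = re.findall(r"(\w)=(\d+)", text)   # [('x', '1'), ('y', '22'), ('z', '333')]
```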

8. finditer (not used much)

finditer(pattern, string, flags=0)

Return an iterator over all non-overlapping matches in the string. For each match, the iterator returns a match object. Empty matches are included in the result.
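
Unlike findall, finditer yields match objects, so positions are available too. A small sketch with an invented sample string:

```python
import re

text = "x=1, y=22"

# Each item produced by finditer is a match object, not a plain string.
spans = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]
# [('1', (2, 3)), ('22', (7, 9))]
```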

9. split

split(pattern, string, maxsplit=0, flags=0)

Split the source string by the occurrences of the pattern, returning a list containing the resulting substrings. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.

Example: a = re.split(r'\.', 'www.baidu.com')

The dot must be escaped here: an unescaped '.' matches any character, so every character would be treated as a separator. The result is a list and can be printed directly.
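
A sketch of re.split on the same www.baidu.com string, showing the effect of escaping the dot, of a capturing group, and of maxsplit. The "abc" input is an extra illustrative assumption.

```python
import re

# Escaped dot: split on literal "." characters.
escaped = re.split(r"\.", "www.baidu.com")       # ['www', 'baidu', 'com']

# Unescaped dot: every character is a separator, leaving only empty strings.
unescaped = re.split(r".", "abc")                # ['', '', '', '']

# A capturing group keeps the separators in the result.
with_group = re.split(r"(\.)", "www.baidu.com")  # ['www', '.', 'baidu', '.', 'com']

# maxsplit limits the number of splits; the rest stays in the last element.
limited = re.split(r"\.", "www.baidu.com", maxsplit=1)   # ['www', 'baidu.com']
```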

10. sub: find and replace

sub(pattern, repl, string, count=0, flags=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.

Example:

In [47]: re.sub('baidu', 'BAIDU', 'www.baidu.com')

Out[47]: 'www.BAIDU.com'
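
Beyond a plain replacement string, repl may contain backreferences or be a callable. A minimal sketch; the helper name upper and the extra sample strings are hypothetical.

```python
import re

# Plain string replacement, as in the example above.
simple = re.sub(r"baidu", "BAIDU", "www.baidu.com")      # 'www.BAIDU.com'

# Backreferences in repl refer to captured groups.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@host")  # 'host@user'


def upper(match):
    """Hypothetical callback: return the matched text in upper case."""
    return match.group().upper()


# A callable receives each match object and returns the replacement text.
shouted = re.sub(r"\b[a-z]{3}\b", upper, "www.baidu.com")   # 'WWW.baidu.COM'

# count limits how many occurrences are replaced.
limited = re.sub(r"o", "0", "foo boo", count=2)          # 'f00 boo'
```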

11. subn: find and replace, and also report the number of replacements made

Example:

In [48]: re.subn('baidu', 'BAIDU', 'www.baidu.com')

Out[48]: ('www.BAIDU.com', 1)
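
Because subn returns a (new_string, count) tuple, it can be unpacked directly. A short sketch with invented inputs:

```python
import re

# Three digits are replaced, so the count is 3.
result, n = re.subn(r"\d", "#", "a1b22c")     # ('a#b##c', 3)

# With no match the string comes back unchanged and the count is 0.
unchanged, zero = re.subn(r"x", "y", "abc")   # ('abc', 0)
```

Checking the count is a simple way to tell whether anything was replaced at all.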

Flags

re.I or re.IGNORECASE: ignore character case

re.M or re.MULTILINE: multi-line matching; ^ and $ also match at the beginning and end of each line

re.A or re.ASCII: make \w, \W, \b, \B, \d, \D, \s and \S perform ASCII-only matching

re.U or re.UNICODE: make \w, \W, \b, \B, \d, \D, \s and \S follow Unicode rules (already the default for str patterns in Python 3)

re.S or re.DOTALL: "." matches any character at all, including the newline \n

re.X or re.VERBOSE: ignore whitespace and comments for nicer looking REs. Comments can be added to the pattern, but whitespace in the pattern is ignored unless it is escaped or inside a character class.
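
The flags above can be compared side by side. This is only a sketch; the sample strings and the phone-number-like verbose pattern are assumptions for illustration.

```python
import re

# re.I: case-insensitive matching.
ignore = re.findall(r"python", "Python PYTHON python", re.I)
# ['Python', 'PYTHON', 'python']

# re.M: ^ matches at the start of every line.
single = re.findall(r"^\w+", "one\ntwo")         # ['one']
multi = re.findall(r"^\w+", "one\ntwo", re.M)    # ['one', 'two']

# re.S: "." also matches a newline.
dot_default = re.match(r"a.b", "a\nb")           # None
dot_all = re.match(r"a.b", "a\nb", re.S)         # a match object

# re.A: \w is limited to ASCII; by default it also accepts "\u00e9".
ascii_only = re.findall(r"\w+", "caf\u00e9", re.A)   # ['caf']
unicode_default = re.findall(r"\w+", "caf\u00e9")    # ['caf\u00e9']

# re.X: whitespace and comments inside the pattern are ignored.
verbose = re.compile(r"""
    (\d{3})   # first part
    -         # separator
    (\d{4})   # second part
""", re.X)
parts = verbose.match("555-1234").groups()       # ('555', '1234')
```

Flags can be combined with |, for example re.I | re.M.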

12. Non-capturing groups:

xxx(?:...)xxx

?: at the start of a group turns off capturing: the parentheses still group, but the matched text is not saved or numbered
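
The difference between a capturing and a non-capturing group is easiest to see with findall. A minimal sketch; the patterns and inputs are illustrative assumptions.

```python
import re

# With a capturing group, findall returns the group's (last) capture.
capturing = re.findall(r"(ab)+c", "ababc")         # ['ab']

# With (?:...), there is no group, so findall returns the whole match.
non_capturing = re.findall(r"(?:ab)+c", "ababc")   # ['ababc']

# Non-capturing groups do not take up a group number.
m = re.match(r"(?:https?)://(\w+)", "https://example")
only_group = m.groups()                            # ('example',)
```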

Named groups:

(?P<name>...)

Similar to regular parentheses, but the substring matched by the group is accessible via the symbolic group name name. Group names must be valid Python identifiers, and each group name must be defined only once within a regular expression. A symbolic group is also a numbered group, just as if the group were not named.

Named groups can be referenced in three contexts. If the pattern is (?P<quote>['"]).*?(?P=quote) (i.e. matching a string quoted with either single or double quotes):

Context of reference to group "quote": ways to reference it

In the same pattern itself: (?P=quote) (as shown), \1

When processing match object m: m.group('quote'), m.end('quote') (etc.)

In a string passed to the repl argument of re.sub(): \g<quote>, \g<1>, \1
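
The three ways of referring to the named group quote can be sketched as follows. The pattern is the one quoted above; the sample sentence and the "..." replacement text are assumptions for illustration.

```python
import re

pattern = re.compile(r"""(?P<quote>['"]).*?(?P=quote)""")
text = 'say "hello" and \'bye\''

# 1. Inside the pattern: (?P=quote) must repeat the opening quote character.
m = pattern.search(text)
matched = m.group()                 # '"hello"'

# 2. On the match object: by name or by number.
by_name = m.group("quote")          # '"'
by_number = m.group(1)              # '"'
quote_span = (m.start("quote"), m.end("quote"))   # (4, 5)
as_dict = m.groupdict()             # {'quote': '"'}

# 3. In the repl string of sub(): \g<quote>, \g<1> and \1 are equivalent.
named_repl = pattern.sub(r"\g<quote>...\g<quote>", text)
numbered_repl = pattern.sub(r"\1...\1", text)
# both: say "..." and '...'
```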
