Common usage of Python's re regular expression module

Source: Internet
Author: User

1. Introduction to re

The Python re module cannot handle every complex matching situation, but in most cases it is enough to parse complex strings and extract the relevant information. Python compiles a regular expression into bytecode and runs it on a matching engine written in C, which performs depth-first matching.

The code is as follows:

    import re
    print(re.__doc__)


This prints the documentation for the functions of the re module, several of which are illustrated in the examples below.

2. Regular expression syntax in re

The regular expression syntax is as follows:

Syntax      Meaning                                                   Example
.           Any character except a newline (unless re.S is set)
^           Start of string                                           '^hello' matches 'helloworld' but not 'aaaahellobbb'
$           End of string                                             Likewise, for the end of the string
*           0 or more of the preceding item (greedy)                  '<.*>' matches all of '<title>chinaunix</title>'
+           1 or more of the preceding item (greedy)                  Likewise
?           0 or 1 of the preceding item (greedy)                     Likewise
*?, +?, ??  Non-greedy forms of the three above (shortest match)      '<.*?>' matches only '<title>'
{m,n}       m to n repetitions of the preceding item; {m} also works  'a{6}' matches six 'a'; 'a{2,4}' matches 2 to 4 'a'
{m,n}?      m to n repetitions, as few as possible                    In 'aaaaaa', 'a{2,4}?' matches only 2 'a'
\           Escapes a special character or starts a special sequence
[]          Character set                                             [0-9], [a-z], [A-Z], [^0]
|           Alternation (or)                                          'A|B' matches A or B
(...)       Group; matches the expression inside the parentheses
(?#...)     Comment; ignored
(?=...)     Matches if ... matches next, without consuming it         'hello(?=test)' matches 'hello' in 'hellotest'
(?!...)     Matches if ... does not match next                        'hello(?!test)' matches 'hello' only when 'test' does not follow
(?<=...)    Matches if preceded by ... (must be fixed length)         '(?<=hello)test' matches 'test' in 'hellotest'
(?<!...)    Matches if not preceded by ... (must be fixed length)     '(?<!hello)test' does not match 'test' in 'hellotest'
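The difference between greedy and non-greedy quantifiers, and the behaviour of the lookaround assertions, is easier to see when run. The following is a minimal sketch; the variable names are only illustrative.

```python
import re

html = "<title>chinaunix</title>"

# Greedy: .* takes as much as possible, so the match spans both tags.
greedy = re.search(r"<.*>", html).group()

# Non-greedy: .*? stops at the first '>' that lets the pattern succeed.
lazy = re.search(r"<.*?>", html).group()

# Lookahead: 'hello' matches only when 'test' follows; 'test' is not consumed.
ahead = re.search(r"hello(?=test)", "hellotest").group()

# Negative lookbehind: 'test' is rejected because 'hello' precedes it.
behind = re.search(r"(?<!hello)test", "hellotest")

print(greedy)   # <title>chinaunix</title>
print(lazy)     # <title>
print(ahead)    # hello
print(behind)   # None
```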

The special sequences are listed below (the character-class equivalents hold when Unicode matching is not in effect):

Sequence  Meaning
\A        Matches only at the beginning of the string
\Z        Matches only at the end of the string
\b        Matches the empty string at the beginning or end of a word
\B        Matches the empty string that is not at the beginning or end of a word
\d        Equivalent to [0-9]
\D        Equivalent to [^0-9]
\s        Matches any whitespace character: [ \t\n\r\f\v]
\S        Matches any non-whitespace character: [^ \t\n\r\f\v]
\w        Matches any letter, digit, or underscore: [a-zA-Z0-9_]
\W        Matches any character other than a letter, digit, or underscore: [^a-zA-Z0-9_]
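A short sketch of a few special sequences in action follows. The sample text is made up for illustration. Note that in Python's re the \w class also includes the underscore, which affects where \b finds a word boundary.

```python
import re

text = "room 42, floor_3"

digits = re.findall(r"\d+", text)      # runs of digits
words = re.findall(r"\w+", text)       # runs of word characters (underscore included)
non_space = re.findall(r"\S+", text)   # runs of non-whitespace characters

# '_' is a word character, so there is no word boundary right after 'floor'.
whole_word = re.search(r"\bfloor\b", text)

# \A anchors the pattern to the very start of the string.
anchored = re.search(r"\Aroom", text)

print(digits)            # ['42', '3']
print(words)             # ['room', '42', 'floor_3']
print(non_space)         # ['room', '42,', 'floor_3']
print(whole_word)        # None
print(anchored.group())  # room
```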

3. Main functions of re

Common functions include: compile, search, match, split, findall (finditer), and sub (subn).

compile

re.compile(pattern[, flags])

Function: compiles a regular expression pattern into a regular expression object. The flags include:

re.I: ignore case
re.L: make the special character sets \w, \W, \b, \B, \s, \S depend on the current locale
re.M: multi-line mode
re.S: make '.' match any character, including a newline (by default, '.' does not match a newline)
re.U: make the special character sets \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character properties database
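The effect of the most common flags can be sketched as below. The sample text and variable names are assumptions chosen for illustration; several flags can be combined with the | operator.

```python
import re

text = "Hello\nworld"

# re.I: case-insensitive matching.
ignore_case = re.search(r"hello", text, re.I)

# re.M: ^ and $ also match at the start and end of each line.
multi = re.findall(r"^\w+$", text, re.M)

# Without re.S, '.' does not match the newline between the two words.
no_dotall = re.search(r"Hello.world", text)

# With re.S, '.' matches the newline as well.
dotall = re.search(r"Hello.world", text, re.S)

# Flags are combined with bitwise OR.
combined = re.compile(r"^WORLD$", re.I | re.M)

print(ignore_case.group())            # Hello
print(multi)                          # ['Hello', 'world']
print(no_dotall)                      # None
print(dotall is not None)             # True
print(combined.search(text).group())  # world
```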

search

re.search(pattern, string[, flags])
search(string[, pos[, endpos]])

Function: scans the string for the first position where the regular expression pattern matches and returns a MatchObject instance. If no position matches, None is returned.

match

re.match(pattern, string[, flags])
match(string[, pos[, endpos]])

Function: match() tries to match the regular expression only at the beginning of the string, that is, a match is reported only if it starts at position 0, whereas search() scans the whole string for a match. To look for a match anywhere in the string, use search().
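The returned match object and the optional pos and endpos arguments of the compiled-pattern methods can be explored with a small sketch like this one (the pattern and text are only examples):

```python
import re

pattern = re.compile(r"world")
text = "helloworld"

# search() scans forward and reports where the match was found.
found = pattern.search(text)
print(found.group(), found.start(), found.end())  # world 5 10

# match() only tries at the starting position, which is 0 by default.
at_start = pattern.match(text)
print(at_start)  # None

# With pos=5 the attempt starts at index 5, so match() now succeeds.
from_pos = pattern.match(text, 5)
print(from_pos.span())  # (5, 10)

# endpos=8 cuts the searched region before 'world' is complete.
limited = pattern.search(text, 0, 8)
print(limited)  # None
```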

Here are a few examples. The most basic usage is to call the methods of a re.RegexObject object:

    #!/usr/bin/env python
    import re

    r1 = re.compile(r'World')

    # match() only tries at the start of the string, so this fails
    if r1.match('HelloWorld'):
        print('match succeeds')
    else:
        print('match fails')

    # search() scans the whole string, so this succeeds
    if r1.search('HelloWorld'):
        print('search succeeds')
    else:
        print('search fails')

Note: the r prefix stands for raw. Ordinary string literals interpret escape sequences such as the newline '\n', so a literal backslash has to be written as '\\'. To represent a backslash followed by 'n' without the r prefix, you must write '\\n'; with the r prefix, r'\n' says the same thing much more clearly.
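The difference between the three spellings can be checked directly. This is a small sketch with made-up variable names:

```python
import re

plain = "\n"      # one character: a newline
escaped = "\\n"   # two characters: a backslash and 'n'
raw = r"\n"       # the same two characters, written as a raw string

print(len(plain), len(escaped), len(raw))  # 1 2 2
print(raw == escaped)                      # True

# As a pattern, the two-character sequence \n is read by re as "newline".
newline_match = re.search(raw, "a\nb")
print(newline_match is not None)           # True

# A literal backslash in a pattern is written r"\\" (or "\\\\" without r).
backslashes = re.findall(r"\\", r"C:\dir\file")
print(len(backslashes))                    # 2
```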

Example: setting a flag

    import re

    # r2 = re.compile(r'\n$', re.S)
    # r2 = re.compile('\n$', re.S)
    r2 = re.compile('world$', re.I)
    if r2.search('helloworld\n'):
        print('search succeeds')
    else:
        print('search fails')

Example: calling the module-level function directly

    import re

    if re.search(r'abc', 'helloaaabcdworldn'):
        print('search succeeds')
    else:
        print('search fails')

split

re.split(pattern, string[, maxsplit=0, flags=0])
split(string[, maxsplit=0])

Function: splits the string at every part that matches the regular expression and returns the pieces as a list. If the pattern contains a capturing group, the matched separators are included in the list.

Example: simple parsing of an IP address

    #!/usr/bin/env python
    import re

    r1 = re.compile(r'\W+')
    print(r1.split('192.168.1.1'))
    print(re.split(r'(\W+)', '192.168.1.1'))
    print(re.split(r'(\W+)', '192.168.1.1', 1))

The results are as follows:

    ['192', '168', '1', '1']
    ['192', '.', '168', '.', '1', '.', '1']
    ['192', '.', '168.1.1']

findall

re.findall(pattern, string[, flags])
findall(string[, pos[, endpos]])

Function: finds all substrings of the string that match the regular expression and returns them as a list.

Example: find the content enclosed in [] (greedy and non-greedy matching)

    #!/usr/bin/env python
    import re

    r1 = re.compile(r'(\[.*\])')       # greedy
    print(re.findall(r1, "hello[hi]heldfsdsf[iwonder]lo"))   # ['[hi]heldfsdsf[iwonder]']
    r1 = re.compile(r'(\[.*?\])')      # non-greedy
    print(re.findall(r1, "hello[hi]heldfsdsf[iwonder]lo"))   # ['[hi]', '[iwonder]']

    print(re.findall('[0-9]{2}', "fdskfj1323jfkdj"))         # ['13', '23']
    print(re.findall('([0-9][a-z])', "fdskfj1323jfkdj"))     # ['3j']
    # Lookarounds match positions, so each result is an empty string
    print(re.findall('(?=www)', "afdsfwwwfkdjfsdfsdwww"))    # ['', '']
    print(re.findall('(?<=www)', "afdsfwwwfkdjfsdfsdwww"))   # ['', '']

finditer

re.finditer(pattern, string[, flags])

RegexObject has the same method: finditer(string[, pos[, endpos]])

Description: similar to findall, it finds all substrings of the string that match the regular expression, but returns them through an iterator of match objects instead of a list.
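Because finditer yields match objects rather than plain strings, the position of each match is available as well. A minimal sketch, with an invented sample string:

```python
import re

pattern = re.compile(r"\d+")
text = "a1bb22ccc333"

# Each item produced by finditer() is a match object.
for m in pattern.finditer(text):
    print(m.group(), m.span())

# Collect the matched text together with its start and end offsets.
spans = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]
print(spans)  # [('1', 1, 2), ('22', 4, 6), ('333', 9, 12)]
```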

sub

re.sub(pattern, repl, string[, count, flags])
sub(repl, string[, count=0])

Description: finds the substrings of string that match the regular expression pattern and replaces them with another string, repl. If nothing in the string matches the pattern, the string is returned unchanged. repl can be either a string or a function. Example:

    #!/usr/bin/env python
    import re

    p = re.compile('(one|two|three)')
    # count=2: only the first two matches are replaced
    print(p.sub('num', 'one word two words three words apple', 2))
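Since repl may also be a function, here is a hedged sketch of that form. The helper name double and the sample strings are hypothetical. The function receives each match object and returns the replacement text; a string repl can refer to groups with backreferences such as \1.

```python
import re

def double(match):
    # Called once per match; the return value replaces the matched text.
    return str(int(match.group()) * 2)

result = re.sub(r"\d+", double, "3 apples and 10 pears")
print(result)   # 6 apples and 20 pears

# A string repl can reorder captured groups with backreferences.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@host")
print(swapped)  # host@user
```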

subn

re.subn(pattern, repl, string[, count, flags])

RegexObject has the same method: subn(repl, string[, count=0])

Description: works the same way as sub(), but returns a tuple containing the new string and the number of substitutions made.
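A brief sketch of subn follows, reusing a pattern like the one in the sub example. The variable names are illustrative only.

```python
import re

p = re.compile(r"(one|two|three)")
text = "one word two words three words apple"

# subn() returns a (new_string, number_of_substitutions) tuple.
new_text, count = p.subn("num", text)
print(new_text)  # num word num words num words apple
print(count)     # 3

# The optional count argument limits how many matches are replaced.
limited = p.subn("num", text, 2)
print(limited)   # ('num word num words three words apple', 2)
```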
