Python Learning Notes, Basics, Week 5: The Difference Between re.match and re.search

Source: Internet
Author: User

Regular expression syntax:

    import re  # import the module

    # Build the compiled pattern object. ^ anchors the match at the beginning and
    # [0-9] matches any single digit from 0 to 9, so the pattern matches when the
    # first character of the string is a digit.
    p = re.compile("^[0-9]")

    # Match the string against the compiled pattern. If the match succeeds, m is a
    # match object; otherwise m is None.
    m = p.match('14534ABC')

    if m:  # not None means the pattern matched
        print(m.group())  # m.group() returns the matched text, here '1', because only the first character is matched
    else:
        print("doesn't match.")

The compile and match calls above can also be combined into one line:

    m = re.match("^[0-9]", '14534ABC')

The effect is the same. The difference is that the first form compiles the pattern once, in advance, so later matches reuse the compiled object. The shorthand form has to process the pattern on every call (the re module keeps a cache of recently compiled patterns, so this is mostly a lookup, but it is still extra work per call). So if you need to find every line that begins with a digit in a 50,000-line file, compile the pattern first and then match with the compiled object; that is faster.
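As a rough sketch of that advice, the pattern can be compiled once outside the loop and reused for every line. The helper name count_digit_lines and the sample lines are made up for illustration; they are not from the original notes.

```python
import re

# Compile once, outside the loop, and reuse the pattern object.
DIGIT_START = re.compile(r"^[0-9]")

def count_digit_lines(lines):
    """Count the lines whose first character is a digit (hypothetical helper)."""
    return sum(1 for line in lines if DIGIT_START.match(line))

sample = ["14534Abc", "abc123", "9 lives", "", " 7 starts with a space"]
print(count_digit_lines(sample))  # 2
```

For a real file, the same function can be fed the open file object directly, since iterating over a file yields its lines.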

Pattern syntax

Pattern      Description
^            Matches the beginning of the string.
$            Matches the end of the string.
.            Matches any character except a newline. When the re.DOTALL flag is specified, it matches any character, including a newline.
[...]        A set of characters, listed individually: [amk] matches 'a', 'm', or 'k'.
[^...]       Any character not in the set: [^abc] matches any character other than a, b, or c.
re*          Matches 0 or more repetitions of the preceding expression.
re+          Matches 1 or more repetitions of the preceding expression.
re?          Matches 0 or 1 occurrence of the preceding expression. Placed after another quantifier (as in re*?), it makes that quantifier non-greedy.
re{n}        Matches exactly n repetitions of the preceding expression.
re{n,}       Matches n or more repetitions of the preceding expression.
re{n,m}      Matches n to m repetitions of the preceding expression, greedily.
a|b          Matches a or b.
(re)         Matches the expression in parentheses and captures it as a group.
(?imx)       Turns on the optional flags i, m, or x. In Python's re module these inline flags apply to the whole pattern and should appear at its start.
(?-imx)      Turns off the optional flags i, m, or x. Python's re module accepts this only in the scoped form (?-imx:re).
(?:re)       Like (...), but does not create a group.
(?imx:re)    Turns on the i, m, or x flags inside the parentheses only.
(?-imx:re)   Turns off the i, m, or x flags inside the parentheses only.
(?#...)      A comment.
(?=re)       Positive lookahead. Succeeds if re matches at the current position, without consuming any characters; the rest of the pattern is then tried from the same position.
(?!re)       Negative lookahead. The opposite of positive lookahead: succeeds if re does not match at the current position.
(?>re)       Atomic (independent) group: matches re and then eliminates backtracking into it. Python's re module supports this from version 3.11.
\w           Matches a letter, digit, or underscore.
\W           Matches any character that is not a letter, digit, or underscore.
\s           Matches any whitespace character, equivalent to [ \t\n\r\f\v].
\S           Matches any non-whitespace character.
\d           Matches any digit, equivalent to [0-9].
\D           Matches any non-digit.
\A           Matches the start of the string.
\Z           Matches the end of the string. In Python's re module it matches only at the very end; in Perl-style engines it also matches just before a final newline.
\z           Matches the very end of the string in Perl-style engines.
\G           Matches the position where the last match ended (Perl-style; not supported by Python's re module).
\b           Matches a word boundary, the position between a word character and a non-word character. For example, 'er\b' matches the 'er' in 'never' but not the 'er' in 'verb'.
\B           Matches a non-word boundary. 'er\B' matches the 'er' in 'verb' but not the 'er' in 'never'.
\n, \t, etc. Match a newline, a tab, and so on.
\1...\9      Matches the content of the nth captured group.
\10          Matches the content of the nth captured group if that group exists; otherwise it is read as an octal character code.
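A few of the entries in this table are easier to see in action. The following is a small sketch using Python's re module; the sample strings are invented for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the last '>' it can reach.
greedy = re.search(r"<.*>", text).group()
# Non-greedy: .*? stops at the first '>'.
lazy = re.search(r"<.*?>", text).group()

# Positive lookahead: a word only if a colon follows; the colon is not consumed.
key = re.search(r"\w+(?=:)", "name: alex").group()

# Negative lookahead: 'foo' only when it is not followed by 'bar'.
not_bar = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# Backreference: \1 repeats whatever group 1 captured.
doubled = re.search(r"(\w)\1", "hello").group()

# Non-capturing group: (?:...) groups without creating a numbered group.
groups = re.search(r"(?:ab)+(c)", "ababc").groups()

print(greedy)   # <b>bold</b> and <i>italic</i>
print(lazy)     # <b>
print(key)      # name
print(not_bar)  # [7]
print(doubled)  # ll
print(groups)   # ('c',)
```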

  

Five commonly used regular expression operations

re.match(pattern, string)  # match from the beginning of the string

re.search(pattern, string)  # scan the whole string until a match is found

re.split()  # split the string into a list, using every match of the pattern as a separator

>>> m = re.split("[0-9]", "alex1rain2jack3helen rachel8")
>>> print(m)

Output: ['alex', 'rain', 'jack', 'helen rachel', '']

re.findall()  # find every match and return them as a list

>>> m = re.findall("[0-9]", "alex1rain2jack3helen rachel8")
>>> print(m)

Output: ['1', '2', '3', '8']

re.sub(pattern, repl, string, count, flags)  # replace the matched text

>>> m = re.sub("[0-9]", "|", "alex1rain2jack3helen rachel8", count=2)
>>> print(m)

Output: alex|rain|jack3helen rachel8
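The empty string at the end of the re.split() result can be surprising. A short sketch of where it comes from, and of the maxsplit argument that limits the number of splits, follows; the variable names are illustrative only.

```python
import re

s = "alex1rain2jack3helen rachel8"

parts = re.split(r"[0-9]", s)
# The string ends with a digit, so split() reports the empty text after it.
print(parts)  # ['alex', 'rain', 'jack', 'helen rachel', '']

# Drop empty pieces if they are not wanted.
names = [p for p in parts if p]
print(names)  # ['alex', 'rain', 'jack', 'helen rachel']

# maxsplit limits the number of splits, much as count limits re.sub().
first_two = re.split(r"[0-9]", s, maxsplit=2)
print(first_two)  # ['alex', 'rain', 'jack3helen rachel8']
```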

Regular expression examples

Character matching
Example       Description
python        Matches "python".

Character classes
Example       Description
[Pp]ython     Matches "Python" or "python".
rub[ye]       Matches "ruby" or "rube".
[aeiou]       Matches any one of the letters within the brackets.
[0-9]         Matches any digit. Same as [0123456789].
[a-z]         Matches any lowercase letter.
[A-Z]         Matches any uppercase letter.
[a-zA-Z0-9]   Matches any letter or digit.
[^aeiou]      Matches any character except the letters a, e, i, o, u.
[^0-9]        Matches any character except a digit.

Special character classes
Example       Description
.             Matches any single character except "\n". To match any character including "\n", use a pattern like '[\s\S]' or the re.DOTALL flag.
\d            Matches a digit. Equivalent to [0-9].
\D            Matches a non-digit character. Equivalent to [^0-9].
\s            Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S            Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\w            Matches any word character, including the underscore. Equivalent to '[a-zA-Z0-9_]'.
\W            Matches any non-word character. Equivalent to '[^a-zA-Z0-9_]'.
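To see how these classes behave, here is a minimal sketch that runs several of them over made-up sample strings with re.findall().

```python
import re

sample = "Room 42, floor_3\tEast\n"

digits = re.findall(r"\d", sample)     # every digit
words = re.findall(r"\w+", sample)     # runs of letters, digits, underscore
spaces = re.findall(r"\s", sample)     # every whitespace character
non_word = re.findall(r"\W", sample)   # everything that is not a word character
vowels = re.findall(r"[aeiou]", "Python")
upper = re.findall(r"[A-Z]", "Hello World")

# '.' skips the newline unless re.DOTALL is given.
dot_default = re.findall(r".", "a\nb")
dot_all = re.findall(r".", "a\nb", re.DOTALL)

print(digits)       # ['4', '2', '3']
print(words)        # ['Room', '42', 'floor_3', 'East']
print(spaces)       # [' ', ' ', '\t', '\n']
print(non_word)     # [' ', ',', ' ', '\t', '\n']
print(vowels)       # ['o']
print(upper)        # ['H', 'W']
print(dot_default)  # ['a', 'b']
print(dot_all)      # ['a', '\n', 'b']
```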

The difference between re.match and re.search

re.match matches only at the beginning of the string: if the start of the string does not fit the regular expression, the match fails and the function returns None. re.search scans the entire string until it finds a match.
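The difference is easiest to see by running both functions on the same input. This is a small sketch with an invented sample string; it also shows that re.MULTILINE does not change where re.match starts.

```python
import re

text = "abc 123"

at_start = re.match(r"\d+", text)          # digits are not at the start
anywhere = re.search(r"\d+", text)         # search finds them later on
letters = re.match(r"[a-z]+", text)        # letters are at the start

print(at_start)          # None
print(anywhere.group())  # 123
print(letters.group())   # abc

# Even with re.MULTILINE, match() only tries the very start of the string,
# while search() with ^ can succeed at the start of any line.
two_lines = "abc\n123"
m_match = re.match(r"^\d+", two_lines, re.MULTILINE)
m_search = re.search(r"^\d+", two_lines, re.MULTILINE)
print(m_match)           # None
print(m_search.group())  # 123
```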

Regular expression modifiers: option flags

The functions in the re module accept optional modifiers that control various aspects of matching. The modifiers are specified as an optional flags argument. You can combine multiple modifiers using bitwise OR (|). Each modifier is one of the following:

Modifier   Description
re.I       Performs case-insensitive matching.
re.L       Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B). In Python 3 it can be used only with bytes patterns.
re.M       Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string).
re.S       Makes a period (dot) match any character, including a newline.
re.U       Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B. In Python 3 this is already the default for str patterns.
re.X       Permits "cuter" (more readable) regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash) and treats an unescaped # as a comment marker.
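The flags can be tried out directly. The sketch below uses made-up sample text and shows re.I, re.M, re.S, a combination with |, and a verbose pattern with re.X.

```python
import re

text = "First line\nsecond LINE"

# re.I: ignore case.
ci = re.findall(r"line", text, re.I)

# re.M: ^ matches at the start of every line.
starts = re.findall(r"^\w+", text, re.M)

# re.S: '.' also matches the newline between the two lines.
no_dotall = re.search(r"First.*LINE", text)
dotall = re.search(r"First.*LINE", text, re.S)

# Flags are combined with bitwise OR.
both = re.findall(r"^second.line$", text, re.I | re.M)

# re.X: whitespace and # comments inside the pattern are ignored.
phone = re.compile(r"""
    (\d{3})   # first three digits
    -         # literal hyphen
    (\d{4})   # last four digits
""", re.X)
parts = phone.search("call 555-1212").groups()

print(ci)         # ['line', 'LINE']
print(starts)     # ['First', 'second']
print(no_dotall)  # None
print(dotall.group() == text)  # True
print(both)       # ['second LINE']
print(parts)      # ('555', '1212')
```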

Several common regular expression examples:

Match phone number

    import re

    phone_str = "hey my name is alex, and my phone number is 13651054607, please call me if you are pretty!"
    phone_str2 = "hey my name is alex, and my phone number is 18651054604, please call me if you are pretty!"

    # 1, then one of 3, 5, or 8, then nine more digits: 11 digits in total
    m = re.search(r"(1)([358]\d{9})", phone_str2)
    if m:
        print(m.group())

Match IP V4

    ip_addr = "inet 192.168.60.223 netmask 0xffffff00 broadcast 192.168.60.255"

    # four groups of 1 to 3 digits separated by literal dots; search() returns the first one found
    m = re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", ip_addr)
    print(m.group())
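The \d{1,3} pattern also accepts out-of-range values such as 999.999.999.999. One possible tightening, offered here as a sketch and not as part of the original notes, restricts each octet to 0 through 255. The names OCTET and IPV4 are chosen for this example only.

```python
import re

# One octet in the range 0-255 (assumption: no leading zeros are wanted).
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
# Four octets separated by dots, delimited by word boundaries.
IPV4 = re.compile(r"\b" + OCTET + r"(?:\." + OCTET + r"){3}\b")

line = "inet 192.168.60.223 netmask 0xffffff00 broadcast 192.168.60.255"
found = IPV4.findall(line)
print(found)  # ['192.168.60.223', '192.168.60.255']

print(IPV4.search("999.1.1.1"))  # None
print(IPV4.search("256.1.1.1"))  # None
```

This is still only a textual check: a longer dotted run such as 1.2.3.4.5 would yield 1.2.3.4 as a match. For strict validation of a whole string, the standard library's ipaddress module is usually a better fit.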

Group matching of contact details

    # \w+ matches a single word, so each name field must be one word; a multi-word
    # field such as 'Oldboy School' would need a pattern like [^,]+ instead.
    contactInfo = 'Doe, John: 555-1212'
    match = re.search(r'(\w+), (\w+): (\S+)', contactInfo)  # grouping
    """
    >>> match.group(1)
    'Doe'
    >>> match.group(2)
    'John'
    >>> match.group(3)
    '555-1212'
    """
    match = re.search(r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)', contactInfo)
    """
    >>> match.group('last')
    'Doe'
    >>> match.group('first')
    'John'
    >>> match.group('phone')
    '555-1212'
    """
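Besides group(), a match object can return all groups at once. The sketch below uses a sample string chosen to fit the pattern and shows groups() and groupdict().

```python
import re

pattern = r"(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)"
match = re.search(pattern, "Doe, John: 555-1212")

whole = match.group(0)       # the entire matched text
as_tuple = match.groups()    # all groups, in order
as_dict = match.groupdict()  # named groups as a dictionary

print(whole)     # Doe, John: 555-1212
print(as_tuple)  # ('Doe', 'John', '555-1212')
print(as_dict)   # {'last': 'Doe', 'first': 'John', 'phone': '555-1212'}
```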

Match Email

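    # The address in this sample is redacted in the source text; substitute a real address to get a match.
    email = "[email protected]   http://www.oldboyedu.com"

    # local part, @, domain, a literal dot (escaped), then the final label
    m = re.search(r"[0-9.a-z]{0,26}@[0-9.a-z]{0,20}\.[0-9a-z]{0,8}", email)
    if m:  # search() returns None when nothing matches
        print(m.group())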
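To see the email pattern produce a match, here is a sketch with a placeholder address. The address someone@example.com is hypothetical and stands in for the redacted one; the pattern is the one from the example, with the dot before the last label escaped.

```python
import re

EMAIL = re.compile(r"[0-9.a-z]{0,26}@[0-9.a-z]{0,20}\.[0-9a-z]{0,8}")

# Hypothetical placeholder address, not taken from the original notes.
line = "contact someone@example.com http://www.oldboyedu.com"
m = EMAIL.search(line)
address = m.group() if m else None
print(address)  # someone@example.com

# No '@' in the text means no match at all.
print(EMAIL.search("http://www.oldboyedu.com"))  # None
```

Because every part uses {0,n}, the pattern also accepts degenerate text such as "@."; it is a loose filter for lowercase addresses and not a full validator.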

JSON and Pickle

Two modules for serialization

    • json: converts between strings and Python data types
    • pickle: converts between a Python-specific byte format and Python data types

The json module provides four functions: dumps, dump, loads, load

The pickle module provides four functions: dumps, dump, loads, load
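A minimal sketch of the four functions in each module follows. It uses in-memory buffers from the io module in place of real files so that it runs anywhere; the sample dictionary is invented for illustration.

```python
import io
import json
import pickle

data = {"name": "alex", "scores": [90, 85], "active": True}

# dumps/loads work on in-memory values: a str for json, bytes for pickle.
text = json.dumps(data)
from_text = json.loads(text)

blob = pickle.dumps(data)
from_blob = pickle.loads(blob)

# dump/load work on file-like objects: text mode for json, binary for pickle.
text_file = io.StringIO()
json.dump(data, text_file)
text_file.seek(0)
from_text_file = json.load(text_file)

binary_file = io.BytesIO()
pickle.dump(data, binary_file)
binary_file.seek(0)
from_binary_file = pickle.load(binary_file)

# pickle handles Python-specific types that json cannot encode, such as set.
restored_set = pickle.loads(pickle.dumps({1, 2, 3}))
try:
    json.dumps({1, 2, 3})
    json_accepts_set = True
except TypeError:
    json_accepts_set = False

print(type(text).__name__, type(blob).__name__)  # str bytes
print(from_text == data, from_blob == data)      # True True
print(restored_set, json_accepts_set)            # {1, 2, 3} False
```

Note that pickle.load and pickle.loads can execute arbitrary code while loading, so they should only be used on data from a trusted source.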

Reference address

http://www.cnblogs.com/alex3714/articles/5143440.html

Other common module learning

http://www.cnblogs.com/wupeiqi/articles/4963027.html
