Python regular expressions

Source: Internet
Author: User
Tags: control characters

Python regular expression learning summary:

1. First, some recommended learning websites:

Cainiao learning: http://www.runoob.com/python/python-reg-expressions.html

MOOC: http://www.imooc.com/learn/550

Self-improvement school: http://code.ziqiangxuetang.com/regexp/regexp-tutorial.html

2. Recommended books: Basic Python Tutorial and Core Python Programming (for Python basics).

3. My personal summary:

The most important part of learning Python is reading source code!

Python has included the re module since version 1.5. It provides Perl-style regular expression patterns and gives Python full regular expression support.

Regular expression matching mainly covers: single-character matching, boundary matching, character set matching, restriction and negation, group matching, and extended notation. The following is a summary:

1. Single-character matching
\cx Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be in A-Z or a-z; otherwise, c is treated as a literal 'c' character. This escape comes from other regex dialects; Python's re module does not support it, so use \r or \x0d instead.
\d Matches a digit. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\f Matches a form feed. Equivalent to \x0c (Control-L).
\n Matches a linefeed. Equivalent to \x0a (Control-J).
\r Matches a carriage return. Equivalent to \x0d (Control-M).
\s Matches any whitespace character, including spaces, tabs, and form feeds. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab. Equivalent to \x09 (Control-I).
\v Matches a vertical tab. Equivalent to \x0b (Control-K).
\w Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn Matches n, where n is a hexadecimal escape value of exactly two hex digits. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num Matches num, where num is a positive integer. This is a backreference to a previously captured group. For example, '(.)\1' matches two consecutive identical characters.
\n Identifies an octal escape value or a backreference. If at least n subexpressions were captured before \n, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm Identifies an octal escape value or a backreference. If at least nm subexpressions were captured before \nm, nm is a backreference. If at least n were captured, n is a backreference followed by the literal m. If neither condition holds and n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml If n is an octal digit (0-3) and m and l are octal digits (0-7), matches the octal escape value nml.
\un Matches n, where n is a Unicode character written as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
Note for Python: with str patterns in Python 3, \d, \s, and \w also match the corresponding Unicode characters unless the re.ASCII flag is set. For numeric escapes, Python's re treats a leading 0 or three octal digits as an octal escape; any other \number is a group reference, and it is an error if that group does not exist.
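The escapes above can be tried directly with the re module. The following is a minimal sketch; the sample string and variable names are made up for illustration.

```python
import re

# Illustrative sample text (not from the original article)
text = "Order 66 shipped\tto Area_51 on day 7"

# \d matches a digit; \d+ collects runs of digits
digits = re.findall(r"\d+", text)

# \w matches word characters (letters, digits, underscore)
words = re.findall(r"\w+", text)

# \s matches whitespace such as spaces and tabs
parts = re.split(r"\s+", text)

# \x41 is the two-digit hexadecimal escape for "A"
hex_match = re.search(r"\x41", text)

# (.)\1 uses a backreference to find two identical consecutive characters
doubled = re.search(r"(.)\1", text)

print(digits)             # ['66', '51', '7']
print(words == parts)     # True
print(hex_match.group())  # A
print(doubled.group())    # 66
```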
2. Boundary matching
\b Matches a word boundary, that is, the position between a word character and a non-word character (such as a space) or the start or end of the string. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B Matches a non-word boundary. 'er\B' matches the 'er' in "verb" but not the 'er' in "never".
^ | \A Matches the start of the input string. If the re.MULTILINE flag is set, ^ also matches the position immediately after each '\n'; \A always matches only at the very start.
$ | \Z Matches the end of the input string ($ also matches just before a trailing '\n'). If the re.MULTILINE flag is set, $ also matches the position before each '\n'; \Z always matches only at the very end.
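A short sketch of the boundary anchors in Python follows. The "never"/"verb" strings come from the examples above; the two-line string is an illustrative assumption.

```python
import re

# \b matches at a word boundary, \B everywhere else
end_of_word = re.search(r"er\b", "never")  # matches the final 'er'
no_boundary = re.search(r"er\b", "verb")   # None: 'er' is followed by 'b'
inside_word = re.search(r"er\B", "verb")   # matches 'er' inside the word

lines = "first line\nsecond line"

# Without re.MULTILINE, ^ matches only at the very start of the string
starts_default = re.findall(r"^\w+", lines)
# With re.MULTILINE, ^ also matches right after each newline
starts_multiline = re.findall(r"^\w+", lines, flags=re.MULTILINE)
# \A ignores re.MULTILINE and still matches only at the start
starts_anchor = re.findall(r"\A\w+", lines, flags=re.MULTILINE)

print(end_of_word.span())  # (3, 5)
print(no_boundary)         # None
print(inside_word.span())  # (1, 3)
print(starts_default)      # ['first']
print(starts_multiline)    # ['first', 'second']
print(starts_anchor)       # ['first']
```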
3. Character set matching
[xyz] Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz] Negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p' in "plain".
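The "plain" example can be checked as follows; this is a small sketch, and the variable names are illustrative.

```python
import re

# [abc] matches the first character that is 'a', 'b', or 'c'
first_in_set = re.search(r"[abc]", "plain")

# [^abc] matches the first character that is none of 'a', 'b', 'c'
first_not_in_set = re.search(r"[^abc]", "plain")

# findall shows every character accepted by the negated set
all_not_in_set = re.findall(r"[^abc]", "plain")

print(first_in_set.group())      # a
print(first_not_in_set.group())  # p
print(all_not_in_set)            # ['p', 'l', 'i', 'n']
```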
4. Restriction and negation
[a-z] Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from 'a' to 'z'.
[^a-z] Negated character range. Matches any character that is not within the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
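Character ranges behave the same way in Python. The sketch below uses an illustrative sample string and also shows that several ranges can be combined inside one pair of brackets.

```python
import re

# Illustrative sample text
text = "Python 3 re"

# [a-z] accepts only lowercase ASCII letters
lowercase = re.findall(r"[a-z]", text)

# [^a-z] accepts everything else, including spaces and digits
not_lowercase = re.findall(r"[^a-z]", text)

# Several ranges can be combined in one set
alphanumeric = re.findall(r"[A-Za-z0-9]+", text)

print(lowercase)      # ['y', 't', 'h', 'o', 'n', 'r', 'e']
print(not_lowercase)  # ['P', ' ', '3', ' ']
print(alphanumeric)   # ['Python', '3', 're']
```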
5. Group matching
\d+(\.\d*)? Matches a string that represents a simple floating-point number.
<(\w+)>\w+</\1> Matches a simple html or xml element whose closing tag repeats the opening tag name, for example <span>python</span>.
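A hedged sketch of both group patterns follows. The floating-point pattern is taken from the text; the tag pattern with explicit angle brackets and a closing slash is an assumption about what the text intends, and the test strings are illustrative.

```python
import re

# Simple floating-point number: digits, then an optional fractional part
float_pattern = re.compile(r"\d+(\.\d*)?")

with_fraction = float_pattern.fullmatch("3.14")
without_fraction = float_pattern.fullmatch("42")

# Assumed reconstruction: the closing tag must repeat the captured tag name
tag_pattern = re.compile(r"<(\w+)>\w+</\1>")

matching_tags = tag_pattern.fullmatch("<span>python</span>")
mismatched_tags = tag_pattern.fullmatch("<span>python</div>")

print(with_fraction.group(1))     # .14
print(without_fraction.group(1))  # None (the optional group did not take part)
print(matching_tags.group(1))     # span
print(mismatched_tags)            # None
```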
6. Extended notation
(?:pattern) Matches pattern but does not capture the result; this is a non-capturing group whose match is not stored for later use. For example, 'industr(?:y|ies)' is a more concise expression than 'industry|industries'.
(?=pattern) Positive lookahead: matches at any position where the text that follows matches pattern, without consuming that text.
(?!pattern) Negative lookahead: matches at any position where the text that follows does not match pattern, without consuming that text.
x|y Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
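The extended notations can be sketched in Python as below. The 'industr' and 'food' patterns come from the text; the "Windows" strings are illustrative assumptions chosen to show lookahead.

```python
import re

# (?:...) groups alternatives without creating a capture group
non_capturing = re.search(r"industr(?:y|ies)", "heavy industries")

# (?=...) succeeds only if the lookahead pattern follows; it consumes nothing
positive = re.search(r"Windows(?=95|98)", "Windows98")

# (?!...) succeeds only if the lookahead pattern does NOT follow
negative = re.search(r"Windows(?!95|98)", "Windows2000")
rejected = re.search(r"Windows(?!95|98)", "Windows98")

# Alternation: grouping changes what the | applies to
grouped = re.fullmatch(r"(z|f)ood", "zood")
ungrouped = re.fullmatch(r"z|food", "zood")

print(non_capturing.group())     # industries
print(non_capturing.groups())    # ()
print(positive.group())          # Windows
print(negative.group())          # Windows
print(rejected)                  # None
print(grouped is not None)       # True
print(ungrouped)                 # None
```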

The most important features are named groups and non-greedy matching:

(*|+|?|{})? Appending ? to any of the repetition operators above (*, +, ?, {}) gives its non-greedy version, which matches as few characters as possible.

(?P<name>...) Names a capture group; name must be a valid identifier.
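A minimal sketch contrasting greedy and non-greedy repetition, and showing a named group, is given here. The sample strings and the group names year and month are illustrative assumptions.

```python
import re

# Illustrative sample markup
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last '>' in the string
greedy = re.search(r"<.+>", html).group()

# Non-greedy: .+? stops at the first '>' it can
lazy = re.search(r"<.+?>", html).group()

# Named groups can be read back by name instead of by number
date = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2015-09")

print(greedy)              # <b>bold</b> and <i>italic</i>
print(lazy)                # <b>
print(date.group("year"))  # 2015
print(date.groupdict())    # {'year': '2015', 'month': '09'}
```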

In addition:

    It is best to define patterns in Python as raw strings: pattern = r'pattern'.

Attached: Python documentation for the re module: https://docs.python.org/3/library/re.html
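The reason for preferring raw strings can be sketched as follows: in an ordinary string literal Python interprets "\b" as a backspace character before the re module ever sees it. The sample text is illustrative.

```python
import re

# Illustrative sample text
text = "cat concat cat."

# Raw string: the backslashes reach the re module unchanged
raw = r"\bcat\b"

# Ordinary string: every backslash has to be doubled to get the same pattern
escaped = "\\bcat\\b"

# Ordinary string without doubling: "\b" becomes a backspace character
backspace = "\bcat\b"

print(raw == escaped)               # True
print(re.findall(raw, text))        # ['cat', 'cat']
print(re.findall(backspace, text))  # []
```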
