Python regular expressions: the re module

Source: Internet
Author: User

Today's content:

Knowledge point one: regular expressions
What is a regular expression:
A regular expression uses a series of characters with special meanings to form a rule that describes a set of strings.
It is used to pick the small strings that fit the rule out of a larger string.

Why use regular expressions:
1. User registration (checking that input has the expected form)
2. Crawler programs (extracting data from fetched pages)
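
As an illustration of the user registration case, here is a minimal sketch. The username rule (3 to 16 letters, digits, or underscores) and the names USERNAME_PATTERN and is_valid_username are assumptions made for this example, not something from the original article.

```python
import re

# Hypothetical rule: 3 to 16 characters, each a letter, digit, or underscore.
USERNAME_PATTERN = re.compile(r'[A-Za-z0-9_]{3,16}')

def is_valid_username(name):
    """Return True only if the whole string satisfies the hypothetical rule."""
    return USERNAME_PATTERN.fullmatch(name) is not None

print(is_valid_username('yangzz_18'))   # True
print(is_valid_username('yangzz:age'))  # False, ':' is not allowed
print(is_valid_username('ab'))          # False, too short
```

fullmatch is used so that the rule must cover the entire input rather than just a part of it.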

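How to use:
import re
re.findall(pattern, string) returns a list of every match of pattern in string.
Patterns containing a backslash are written as raw strings (r'...') so that the backslash reaches the re module unchanged.

The re.findall function: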

\w matches letters, digits, and underscores
print(re.findall(r'\w', 'yangzz:age_18'))  # ['y', 'a', 'n', 'g', 'z', 'z', 'a', 'g', 'e', '_', '1', '8']
\W matches anything that is not a letter, digit, or underscore
print(re.findall(r'\W', 'yangzz:age_18'))  # [':']

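\s matches any whitespace character; for ASCII text this is equivalent to [ \t\n\r\f\v] (\t is the tab key)
print(re.findall(r'\s', '\tyangzz\n age_18'))  # ['\t', '\n', ' ']

\S matches any non-whitespace character
print(re.findall(r'\S', '\tyangzz\n age_18'))  # ['y', 'a', 'n', 'g', 'z', 'z', 'a', 'g', 'e', '_', '1', '8']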

\d matches any digit
print(re.findall(r'\d', '\tyangzz\n age_18'))  # ['1', '8']
\D matches any non-digit
print(re.findall(r'\D', '\tyangzz\n age_18'))  # ['\t', 'y', 'a', 'n', 'g', 'z', 'z', '\n', ' ', 'a', 'g', 'e', '_']

\n matches a line break
print(re.findall(r'\n', '\tyangzz\n age_18 aaa bb\nb'))  # ['\n', '\n']

\t matches a tab
print(re.findall(r'\t', '\tyangzz\n age_18 aaa bb\nb'))  # ['\t']


^ matches at the beginning of the string (only a match that starts at the very beginning counts; otherwise an empty list is returned)
print(re.findall('^yang', 'yangzz\n age_18 yang aaa bb\nb'))  # ['yang']

$ matches at the end of the string
print(re.findall('b$', 'yangzz\n age_18 yang aaa bb\nb'))  # ['b']

. matches any single character except a line break
print(re.findall('.', 'yangzz\n age_18 *%& yang aaa bb\nb'))

Example 2: a.g is a three-character pattern in which . can be any character other than a line break (with re.DOTALL as the last argument, . also matches \n)
print(re.findall('a.g', 'yangzz\n age_18 ang *%& yang aaa bb\nb'))  # ['ang', 'ang', 'ang']
print(re.findall('a.g', 'ya\nngzz\n age_18 a\ng *%& yang aaa bb\nb', re.DOTALL))  # ['a\ng', 'ang']

[a,e,b] matches any one character listed inside the brackets (the commas are literal too, so this set also matches ','; [aeb] is enough for the three letters)
print(re.findall(r'[\n]', 'yangzz\n age_18 yang aaa bb\nb'))  # ['\n', '\n']
print(re.findall('[a,e,b]', 'yangzz\n age_18 yangzhizong aaa bb\nb'))  # ['a', 'a', 'e', 'a', 'a', 'a', 'a', 'b', 'b', 'b']

print(re.findall('y[a-z]n', 'yangzz\n age_18 yan y1n yang aaa bb\nb'))  # ['yan', 'yan', 'yan'] ([a-z] is a range; 'y1n' does not match)

[^...] matches any one character that is not inside the brackets
print(re.findall('[^z,a]', 'yangzz'))  # ['y', 'n', 'g']


* means the character on its left appears 0 or more times (in an*, a is required and n is optional)
print(re.findall('an*', 'yangzhizong mawenjie annd'))  # ['an', 'a', 'ann']


? means the character on its left appears 0 or 1 times (in an?, a is required and at most one n is taken)
print(re.findall('an?', 'yangzhizong mawenjie annd'))  # ['an', 'a', 'an']


+ means the character on its left appears 1 or more times (in an+, a is required and at least one n must follow)
print(re.findall('an+', 'yangzhizong mawenjie annd'))  # ['an', 'ann']

# Match all numbers, including decimals
print(re.findall(r'\d+\.?\d*', 'asdfasdf123as1.13dfa12adsf1asdf3'))  # ['123', '1.13', '12', '1', '3']

Idea: \d+ finds one or more digits, \.? allows an optional decimal point, and \d* takes zero or more digits after it.
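
To see what the optional decimal part contributes, the sketch below runs the same input through a digits-only pattern and through the pattern above. The variable names are chosen just for this illustration.

```python
import re

text = 'asdfasdf123as1.13dfa12adsf1asdf3'

# Digits only: the decimal point splits 1.13 into two separate matches.
integers_only = re.findall(r'\d+', text)

# Digits, an optional literal dot, then any further digits.
with_decimals = re.findall(r'\d+\.?\d*', text)

print(integers_only)  # ['123', '1', '13', '12', '1', '3']
print(with_decimals)  # ['123', '1.13', '12', '1', '3']
```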


.* is a greedy match by default (it takes as much as possible, here up to the last b)
print(re.findall('a.*b', 'a1b22222222b'))  # ['a1b22222222b']

.*? is a non-greedy match (it stops at the first possible b): recommended
print(re.findall('a.*?b', 'a1b22222222b'))  # ['a1b']
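
The difference matters most when a delimiter occurs more than once, which is common in crawler programs. A small sketch, using a made-up HTML fragment as input:

```python
import re

html = '<b>one</b> and <b>two</b>'  # hypothetical sample text

# Greedy: .* runs to the last </b>, so both tags end up in one match.
greedy = re.findall(r'<b>.*</b>', html)

# Non-greedy: .*? stops at the first </b>, giving one match per tag.
lazy = re.findall(r'<b>.*?</b>', html)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
```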

{n,m} means the character on its left appears at least n and at most m times; as many as possible are taken together as one match
{0,2} allows 0 occurrences, so positions without an a also produce a match, but that match is the empty string
print(re.findall('a{0,2}', 'yangzzage18yangaaabbb'))  # a list mixing 'a', 'aa' and many empty strings ''
print(re.findall('ab{1,}', 'abbb'))  # ['abbb'] ({1,} means 1 or more)

print(re.findall('a[1*-]b', 'a1b a*b a-b'))  # ['a1b', 'a*b', 'a-b']


[+-*/] When a minus sign appears inside brackets, it must be placed at the far left or far right; in the middle it has to be escaped as \-, as in [+\-*/], otherwise - is read as a range between the characters on either side
print(re.findall(r'a[+\-*/]b', 'a12ba a++b a*b a-+b'))  # ['a*b']
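
A short sketch of the three spellings, assuming an arithmetic string as sample input. The unescaped form is compiled inside try/except because '+' to '*' is not a valid range and the re module rejects it.

```python
import re

expr = '1+2-3*4/5'  # hypothetical sample input

# Hyphen escaped in the middle of the set.
operators = re.findall(r'[+\-*/]', expr)

# Hyphen placed first, so no escape is needed.
same_with_leading_hyphen = re.findall(r'[-+*/]', expr)

# Unescaped hyphen in the middle is read as the range '+' to '*'.
try:
    re.compile('[+-*/]')
    unescaped_ok = True
except re.error:
    unescaped_ok = False

print(operators)                 # ['+', '-', '*', '/']
print(same_with_leading_hyphen)  # ['+', '-', '*', '/']
print(unescaped_ok)              # False
```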


The re.search() function
Finds only the first match and returns a match object containing the match information; call the object's group() method
to get the matched string. Returns None if nothing in the string matches.

print(re.findall('e', 'alex make love'))  # ['e', 'e', 'e']
print(re.search('e', 'alex make love'))  # <re.Match object; span=(2, 3), match='e'> (Python 3.7+)
print(re.search('e', 'alex make love').group())  # e

re.match only matches at the beginning of the string; it returns None if the string does not start with the pattern
print(re.match('e', 'alex make love'))  # None
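
Because both functions return None when nothing matches, calling .group() directly on the result can raise AttributeError. A minimal sketch of a guard; the helper name first_match is an assumption for this example:

```python
import re

def first_match(pattern, text):
    """Return the first matched string, or None when there is no match."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first_match('e', 'alex make love'))  # e
print(first_match('z', 'alex make love'))  # None

# re.match succeeds only when the pattern matches at position 0.
starts_with_alex = re.match('alex', 'alex make love') is not None
starts_with_make = re.match('make', 'alex make love') is not None
print(starts_with_alex, starts_with_make)  # True False
```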


The re.split() function: splitting
The result below is ['', '', 'cd']: the string is first split at 'a' into '' and 'bcd', then 'bcd' is split at 'b' into '' and 'cd'
print(re.split('[ab]', 'abcd'))  # ['', '', 'cd']
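
A more typical use is splitting on several delimiters at once. The sketch below assumes a made-up input line with mixed separators, and also shows one way to drop the empty strings from the example above.

```python
import re

line = 'yang, 18;beijing  python'  # hypothetical input with mixed separators

# One or more commas, semicolons, or whitespace characters act as a single separator.
fields = re.split(r'[,;\s]+', line)
print(fields)  # ['yang', '18', 'beijing', 'python']

# Empty strings appear when separators sit next to each other or at the edges.
non_empty = [part for part in re.split('[ab]', 'abcd') if part]
print(non_empty)  # ['cd']
```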


The re.sub() function: replacing
1. Without the count argument, every match is replaced; count=1 replaces only the first match
print(re.sub('a', 'A', 'abcdeaedaer', count=1))  # Abcdeaedaer

re.subn() also replaces, and additionally returns the total number of replacements
print(re.subn('a', 'A', 'abcdeaedaer'))  # ('AbcdeAedAer', 3)
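
The pattern passed to re.sub and re.subn can use any of the special characters described earlier. As a sketch, the following collapses runs of whitespace in a made-up string:

```python
import re

messy = 'alex \t make\n\nlove'  # hypothetical text with irregular whitespace

# Replace every run of whitespace with a single space.
cleaned = re.sub(r'\s+', ' ', messy)
print(cleaned)  # alex make love

# subn returns the new string together with the number of replacements made.
cleaned_again, replaced = re.subn(r'\s+', ' ', messy)
print(replaced)  # 2
```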

The re.compile() function
Compiles a pattern once into a reusable object whose methods can then be called directly
obj = re.compile(r'\d{2}')
print(obj.search('ab123ee').group())  # 12
print(obj.findall('ab123ee'))  # ['12']
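
Compiling pays off when the same pattern is applied to many strings. A minimal sketch, with sample strings invented for this example:

```python
import re

two_digits = re.compile(r'\d{2}')
samples = ['ab123ee', 'x9y', 'id 2024']  # hypothetical inputs

# Reuse the compiled object for every string; search may return None.
results = {}
for s in samples:
    m = two_digits.search(s)
    results[s] = m.group() if m else None

print(results)  # {'ab123ee': '12', 'x9y': None, 'id 2024': '20'}
print(two_digits.findall('id 2024'))  # ['20', '24']
```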
