Python Regular Expressions

Source: Internet
Author: User

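The re module: Python's regular expressions

Searching for a pattern in a text

Python ships with a regular expression library called re:

import re

In a regular expression, a dot (.) matches any single character.

[a-z] means that this position must hold a lowercase letter from a to z.

The length of the list returned by re.findall gives the number of matches:

print(len(result))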


#!/usr/bin/python
import re

# read the whole file into one string
text = ''
file = open('shi.txt')
for line in file:
    text = text + line
file.close()

# an a followed by two lowercase letters
result = re.findall('a[a-z][a-z]', text)
print(result)

  

result = re.findall(' (a.[a-z]) ', text)  # with parentheses, only the part inside them is returned

This way the spaces to the left and right are dropped from each result.
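To make the effect of the parentheses concrete, here is a minimal sketch. The sample string is made up for illustration and merely stands in for the contents of shi.txt.

```python
import re

# Hypothetical sample text, standing in for the contents of shi.txt.
text = "we ate an apple and ate a pear "

# Without parentheses, findall returns the whole match, spaces included.
whole = re.findall(' a.[a-z] ', text)

# With parentheses, findall returns only what the group captured.
inner = re.findall(' (a.[a-z]) ', text)

print(whole)   # [' ate ', ' and ']
print(inner)   # ['ate', 'and']
```

The second "ate" is not reported because the space in front of it was already consumed by the match for " and "; findall never returns overlapping matches.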

To remove duplicate results, convert the list to a set:

result = re.findall(' (a.[a-z]) ', text)
result = set(result)
print(result)

Every result starts with a lowercase a.

To allow uppercase as well, use [Aa]: the first letter can then be either uppercase A or lowercase a.

* matches the preceding item zero or more times.

a* matches all of the following:

(the empty string)
a
aa
aaaaaaaaaaaaaaaa

The same works for spaces: ' *' matches no space, one space, or any number of spaces.
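A quick way to check these claims is re.fullmatch, which succeeds only when the entire string fits the pattern. The candidate strings below are illustrative.

```python
import re

# fullmatch succeeds only if the entire string fits the pattern.
candidates = ['', 'a', 'aa', 'aaaaaaaaaaaaaaaa', 'b']
a_star = [re.fullmatch('a*', c) is not None for c in candidates]
print(a_star)      # [True, True, True, True, False]

# ' *' accepts no space at all, one space, or many spaces.
spaces_ok = [re.fullmatch(' *', s) is not None for s in ['', ' ', '     ']]
print(spaces_ok)   # [True, True, True]
```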

result = re.findall(' *([Aa].[a-z]) ', text)

This unexpectedly matches the afe inside safe. The reason: ' *' also accepts zero spaces, so afe is matched with no space in front of it as long as a space follows it.

Solution: two-stage filtering.

result = re.findall(' *([Aa].[a-z]) |([A].[a-z]) ', text)

Each result is now a pair, with one entry per pair of parentheses.

Looking at the code: there are two pairs of parentheses. When the first pair does not take part in a match, its slot holds an empty string and the matched text appears in the right-hand slot, and vice versa.
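The pair structure is easiest to see on a small input. The sketch below uses a simplified two-group pattern and a made-up sample string (both are assumptions for illustration, not the pattern from the text above); the point is only how findall fills the unused slot with an empty string.

```python
import re

# Hypothetical sample text and a simplified two-group alternation.
text = "Ape and ant "

pairs = re.findall('(a.[a-z]) |(A.[a-z]) ', text)
print(pairs)   # [('', 'Ape'), ('and', ''), ('ant', '')]

# Keep only the non-empty member of each pair.
words = {left or right for left, right in pairs}
print(sorted(words))   # ['Ape', 'and', 'ant']
```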


  

#!/usr/bin/python
import re

text = ''
file = open('shi.txt')
for line in file:
    text = text + line
file.close()

result = re.findall(' *([Aa].[a-z]) |([A].[a-z]) ', text)

final_result = set()  # set() creates an empty set
for pair in result:
    if pair[0] not in final_result:
        final_result.add(pair[0])  # the match from the left-hand rule is pair[0]
    if pair[1] not in final_result:
        final_result.add(pair[1])  # the match from the right-hand rule is pair[1]

final_result.discard('')  # drop the empty placeholder; discard does not fail if it is absent
print(final_result)

 

A short summary:

A dot means that this position holds exactly one character, and it can be any character.

\d matches exactly one digit.

\d+ matches one or more digits.

(The difference from *: a* can also match the empty string.)
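The difference between \d, \d+ and a starred pattern can be sketched as follows; the sample text is invented for the example.

```python
import re

text = "room 7, floor 12, year 2024"   # hypothetical sample text

single_digits = re.findall(r'\d', text)    # one digit per match
digit_runs = re.findall(r'\d+', text)      # whole runs of digits
print(single_digits)   # ['7', '1', '2', '2', '0', '2', '4']
print(digit_runs)      # ['7', '12', '2024']

# a* is satisfied by zero occurrences, a+ is not.
star = re.match('a*', 'bbb')   # matches the empty string at position 0
plus = re.match('a+', 'bbb')   # None: at least one a is required
print(repr(star.group()), plus)
```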

Trying it out:

#!/usr/bin/python
import re

text = ''
file = open('shi.txt')
for line in file:
    text = text + line
file.close()

# every run of one or more digits in the file
result = re.findall(r'\d+', text)
print(result)

  

\d{2} matches exactly two digits.

\d{2,3} matches two to three digits.

\w matches one word character: a letter, a digit, or an underscore ([a-zA-Z0-9_] for ASCII text).

\w{2,3} matches two or three word characters.
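A small table of checks illustrates the counted forms. The candidate strings are made up; re.fullmatch is used so that extra characters cause a failure.

```python
import re

# (pattern, candidate) pairs; the candidates are made-up examples.
cases = [
    (r'\d{2}', '42'),      # exactly two digits
    (r'\d{2}', '421'),     # three digits is one too many
    (r'\d{2,3}', '421'),   # two or three digits
    (r'\w{2,3}', 'a_1'),   # letter, underscore, digit are all word characters
    (r'\w{2,3}', 'a-1'),   # a hyphen is not a word character
]

results = [re.fullmatch(p, s) is not None for p, s in cases]
for (p, s), ok in zip(cases, results):
    print(p, repr(s), ok)
```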

Lines that start with a given string

f = open('imooc.txt')
for line in f:
    if line.startswith('imooc'):
        print(line)

Lines that both start and end with a given string

#!/usr/bin/python
import re

def find_start_imooc(fname):
    f = open(fname)
    for line in f:
        if line.startswith('imooc'):
            print(line)

#find_start_imooc('imooc.txt')

def find_in_mooc(fname):
    f = open(fname)
    for line in f:
        # every line read from the file ends with \n
        if line.startswith('imooc') and line.endswith('imooc\n'):
            print(line)

find_in_mooc('imooc.txt')

The same check can be written with a slice that strips the trailing newline first:

def find_in_mooc(fname):
    f = open(fname)
    for line in f:
        # slicing: line[:-1] drops the last character, the newline
        if line.startswith('imooc') and line[:-1].endswith('imooc'):
            print(line)

find_in_mooc('imooc.txt')

  

Matching a variable name that begins with an underscore or a letter
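The text gives no pattern for this case, so the following is only one possible sketch. It assumes the rule "first character is a letter or underscore, the rest are word characters"; NAME_RE and is_variable_name are hypothetical names introduced for the example.

```python
import re

# Assumed rule: a letter or underscore first, then any number of
# word characters (letters, digits, underscores).
NAME_RE = re.compile(r'[A-Za-z_]\w*')

def is_variable_name(candidate):
    """Return True if the whole candidate string fits the assumed rule."""
    return NAME_RE.fullmatch(candidate) is not None

for candidate in ['_count', 'total2', '2total', 'my-var']:
    print(candidate, is_variable_name(candidate))
```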

s3 = '1 dsf se'

s3.split() returns the list of pieces, split on whitespace: ['1', 'dsf', 'se'].

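res = r't[io]p' matches tip or top: square brackets list the characters allowed at that position.

Inside square brackets, a leading ^ means "none of these": res = r't[^io]p'

^ (caret) anchors the beginning of a line: r"^hello" matches only when the line begins with hello.

$ anchors the end of a line: r"hello$" matches only when the line ends with hello.

r"t[abc$]" does not mean "a, b, or c at the end of the line": inside square brackets, $ is just a literal dollar sign.

In the same way, ^ changes meaning inside brackets: in [^abc] it means any character except a, b, or c.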
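The bracket and anchor rules above can be checked with a short sketch. The word list and sample strings are invented for the example.

```python
import re

words = ['tip', 'top', 'tap', 'tup']
in_class = [w for w in words if re.fullmatch(r't[io]p', w)]
not_in_class = [w for w in words if re.fullmatch(r't[^io]p', w)]
print(in_class)       # ['tip', 'top']
print(not_in_class)   # ['tap', 'tup']

# ^ and $ outside brackets anchor the match to the start or end.
starts = re.search(r'^hello', 'hello world') is not None   # True
ends = re.search(r'hello$', 'say hello') is not None       # True
middle = re.search(r'^hello', 'say hello') is not None     # False
print(starts, ends, middle)

# Inside brackets, $ is an ordinary character.
dollar = re.findall(r't[abc$]', 'ta t$ tz')
print(dollar)         # ['ta', 't$']
```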

\d a decimal digit, [0-9]

\D a non-digit character, [^0-9]

\s any whitespace character, [ \t\n\r\f\v]

\S any non-whitespace character, [^ \t\n\r\f\v]

\w any alphanumeric character or underscore, [a-zA-Z0-9_]

\W any character that is not alphanumeric or an underscore, [^a-zA-Z0-9_]
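As an illustration, each shorthand can be applied to the same short string. The sample is made up and pure ASCII, so the bracket equivalents listed above apply exactly.

```python
import re

sample = "ab_1 \t9!"   # hypothetical sample: letters, underscore, digits, space, tab, punctuation

digits = re.findall(r'\d', sample)        # ['1', '9']
whitespace = re.findall(r'\s', sample)    # [' ', '\t']
word_chars = re.findall(r'\w', sample)    # ['a', 'b', '_', '1', '9']
non_word = re.findall(r'\W', sample)      # [' ', '\t', '!']

print(digits, whitespace, word_chars, non_word)

# Each uppercase class is the complement of its lowercase partner,
# so together they cover every character exactly once.
non_digits = re.findall(r'\D', sample)
non_space = re.findall(r'\S', sample)
print(len(digits) + len(non_digits) == len(sample))
print(len(whitespace) + len(non_space) == len(sample))
```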

010-12345656

r = r"^010-\d{8}"

{8} repeats the preceding rule exactly eight times; for example, a{8} means a repeated 8 times.
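A minimal sketch of the phone-number pattern in use. The name phone_re and the test numbers are assumptions for illustration; the last two checks show that without a trailing $ the pattern also accepts longer digit strings.

```python
import re

phone_re = re.compile(r'^010-\d{8}')

ok = phone_re.match('010-12345656') is not None          # eight digits
too_short = phone_re.match('010-1234') is not None       # only four digits
too_long = phone_re.match('010-123456567') is not None   # nine digits, still matches
print(ok, too_short, too_long)   # True False True

# Adding $ requires the string to end right after the eighth digit.
strict = re.match(r'^010-\d{8}$', '010-123456567') is not None
print(strict)                    # False
```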

*

r = r"ab*" matches a followed by b zero or more times (b may not appear at all).

+ matches one or more times; unlike *, the item must appear at least once.

? means the preceding item is optional: it appears zero times or once.

Greedy and non-greedy matching

r = r"ab+?" returns the shortest possible match: on abbbbbbbb it yields ab rather than the whole string.

{} Curly braces: {m,n} repeats the preceding item at least m times and at most n times.

r = r"a{1,3}"
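The quantifiers above can be compared side by side. This is a small sketch on made-up strings.

```python
import re

text = 'abbbbbbbb'
greedy = re.match(r'ab+', text).group()    # takes as many b as possible
lazy = re.match(r'ab+?', text).group()     # takes as few b as possible
print(greedy, lazy)                        # abbbbbbbb ab

star = re.match(r'ab*', 'a').group()       # b may be absent entirely
print(star)                                # a

optional = re.findall(r'ab?', 'a ab abb')  # at most one b per match
print(optional)                            # ['a', 'ab', 'ab']

bounded = re.findall(r'a{1,3}', 'a aa aaaa')  # one to three a per match
print(bounded)                                # ['a', 'aa', 'aaa', 'a']
```

In the last line, aaaa is split into aaa and a because {1,3} is greedy and never takes more than three.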

match() returns a match object if the pattern matches at the beginning of the string, and None otherwise.

csvt_re = re.compile(r'csvt', re.I)

csvt_re.match('csvt hello')

search() finds a match no matter where in the string it occurs.

finditer() returns an iterator of match objects.
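The difference between the three methods shows up as soon as the pattern is not at the start of the string. The sketch below reuses csvt_re from the text; the input strings are invented.

```python
import re

csvt_re = re.compile(r'csvt', re.I)   # re.I makes the match case-insensitive

at_start = csvt_re.match('CSVT hello')       # match object
not_at_start = csvt_re.match('hello csvt')   # None: match only looks at the start
anywhere = csvt_re.search('hello csvt')      # match object found at position 6
print(at_start.group(), not_at_start, anywhere.start())

# finditer yields one match object per occurrence, lazily.
found = [(m.group(), m.start()) for m in csvt_re.finditer('csvt Csvt CSVT')]
print(found)   # [('csvt', 0), ('Csvt', 5), ('CSVT', 10)]
```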

rs = r'c..t'

re.sub(rs, 'python', 'csvt scat') replaces every substring that matches rs with 'python'.

re.split(r'[\+\-\*]', s) splits the string s at every +, -, or *.
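A short sketch of both calls on made-up input. Note that c..t needs exactly two characters between c and t, so scat (which contains only cat) is left untouched.

```python
import re

rs = r'c..t'
replaced = re.sub(rs, 'python', 'csvt caat scat')
print(replaced)   # python python scat

# Split an expression at every +, - or * sign.
s = '1+2-3*4'     # hypothetical input
parts = re.split(r'[\+\-\*]', s)
print(parts)      # ['1', '2', '3', '4']
```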
