Regular expressions
Usage:
>>> import re
>>> s = r'abc'
>>> re.findall(s, 'abcabc')
['abc', 'abc']
Regular expression syntax:
Metacharacters: . ^ $ * + ? {} [] \ | ()
[] selects a set of characters: [abc] matches a or b or c
[^] negates the set: [^abc] matches any character other than a, b, or c
^ matches the beginning of a line: r'^abc' matches 'abcd' but not 'dabc'
$ matches the end of a line, analogous to ^
\ is used for escaping.
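The set, anchor, and escape rules above can be tried directly in the interpreter. This is a minimal sketch, assuming Python 3 and the standard re module; the sample strings are made up for illustration.

```python
import re

# [abc] matches any one of the listed characters.
in_set = re.findall(r'[abc]', 'a1b2c3d4')        # ['a', 'b', 'c']

# [^abc] matches any one character that is NOT listed.
not_in_set = re.findall(r'[^abc]', 'a1b2c3d4')   # ['1', '2', '3', 'd', '4']

# ^ anchors the pattern to the start of the string.
starts = re.findall(r'^abc', 'abcd')             # ['abc']
no_start = re.findall(r'^abc', 'dabc')           # []

# $ anchors the pattern to the end of the string.
ends = re.findall(r'abc$', 'dabc')               # ['abc']

# A backslash escapes a metacharacter so it is matched literally.
literal_dot = re.findall(r'\.', 'a.b.c')         # ['.', '.']
```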
\d = digit [0-9]
\D is the inverse of \d: [^0-9]
\s = whitespace character [ \t\n\r\f\v]
\S is the inverse of \s
\w = word character (letter, digit, or underscore) [a-zA-Z0-9_]
\W is the inverse of \w
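Each lowercase class and its uppercase inverse can be compared on one short string. This is a small sketch assuming Python 3; the sample text is invented, and it is pure ASCII so the bracket equivalents listed above apply exactly.

```python
import re

text = 'ab 12_c!'

digits = re.findall(r'\d', text)       # ['1', '2']
non_digits = re.findall(r'\D', text)   # ['a', 'b', ' ', '_', 'c', '!']

spaces = re.findall(r'\s', text)       # [' ']
non_spaces = re.findall(r'\S', text)   # ['a', 'b', '1', '2', '_', 'c', '!']

word_chars = re.findall(r'\w', text)   # ['a', 'b', '1', '2', '_', 'c']
non_word = re.findall(r'\W', text)     # [' ', '!']
```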
{n} repeats exactly n times: \d{8} matches 8 digits
* repeats 0 or more times: ab* matches abbbbbbb
+ repeats 1 or more times
? repeats 0 or 1 times
Adding ? after a repetition makes it match as little as possible (minimal, non-greedy match)
r'ab+?' is a minimal match
{m,n} repeats m to n times
{0,} equals *
{1,} equals +
{0,1} equals ?
re module usage
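As a first use of the re module, the repetition operators listed above can be checked one by one. This is a hedged sketch assuming Python 3; the input strings are made up, and the last two lines contrast the default greedy behaviour with the minimal match.

```python
import re

# {n}: exactly n repetitions.
eight = re.findall(r'\d{8}', 'id 20240131 ok')   # ['20240131']

# *: zero or more b's after a.
star = re.findall(r'ab*', 'a ab abbb')           # ['a', 'ab', 'abbb']

# +: one or more b's after a.
plus = re.findall(r'ab+', 'a ab abbb')           # ['ab', 'abbb']

# ?: zero or one b after a.
opt = re.findall(r'ab?', 'a ab abbb')            # ['a', 'ab', 'ab']

# {m,n}: between m and n repetitions.
rng = re.findall(r'ab{2,3}', 'ab abb abbbb')     # ['abb', 'abbb']

# A trailing ? makes the repetition match as little as possible.
greedy = re.findall(r'ab+', 'abbbb')             # ['abbbb']
minimal = re.findall(r'ab+?', 'abbbb')           # ['ab']
```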
1. Compiling
import re
p = re.compile(r'xxx')  # compile a frequently used pattern once; reusing it is faster
p.findall('abcabc')
For a one-off match you don't need to compile:
re.findall(r'a', 'xxxx')
Case insensitive: re.compile(r'xxx', re.I)
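The three variants above (compiled pattern, one-off call, case-insensitive flag) might look like this side by side. It is a sketch assuming Python 3; the pattern r'abc' stands in for the placeholder r'xxx'.

```python
import re

# Compile once, reuse many times.
p = re.compile(r'abc')
first = p.findall('abcabc')          # ['abc', 'abc']
second = p.findall('xxabcxx')        # ['abc']

# One-off use: module-level function, no explicit compile step.
one_off = re.findall(r'a', 'xxxx')   # []

# re.I (alias re.IGNORECASE) makes matching case-insensitive.
ci = re.compile(r'abc', re.I)
mixed = ci.findall('abc ABC aBc')    # ['abc', 'ABC', 'aBc']
```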
2. Common Functions
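match() matches only at the start of the string; returns a match object (which carries the position), or None
search() scans the entire string; returns a match object for the first match, or None
findall() returns a list of all matches
finditer() returns an iterator of match objects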
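The difference between the four functions shows up most clearly when they are run on the same input. This is a minimal sketch assuming Python 3; the pattern and text are invented for illustration.

```python
import re

p = re.compile(r'\d+')
text = 'ab 12 cd 345'

# match() only tries at position 0, so it fails here.
m = p.match(text)                                # None

# search() scans forward and stops at the first match.
s = p.search(text)
first_text, first_pos = s.group(), s.start()     # '12', 3

# findall() collects every matching substring into a list.
all_found = p.findall(text)                      # ['12', '345']

# finditer() yields one match object per match, lazily.
spans = [mm.span() for mm in p.finditer(text)]   # [(3, 5), (9, 12)]
```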
match:
m.group() returns the matched text
search:
Same as above.
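Because both functions return None when nothing matches, calling group() on the result without a check raises AttributeError. One way to guard against that is sketched below; the helper names first_number and leading_number are hypothetical, chosen only for this example.

```python
import re

def first_number(text):
    """Return the first run of digits anywhere in text, or None."""
    m = re.search(r'\d+', text)
    # Check for None before calling group().
    return m.group() if m else None

def leading_number(text):
    """Return the run of digits at the very start of text, or None."""
    m = re.match(r'\d+', text)
    return m.group() if m else None

print(first_number('ab 12 cd'))    # 12
print(leading_number('ab 12 cd'))  # None
print(leading_number('12 cd'))     # 12
```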
String substitution:
rs = r'c..t'
p = re.compile(rs)
p.sub('python', 'csvt caat cvvt cccc')
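A runnable version of the substitution, assuming the intended pattern is r'c..t' (a c, any two characters, then a t). The count argument and subn() are standard parts of the re module, added here only to show how to limit or count replacements.

```python
import re

p = re.compile(r'c..t')
text = 'csvt caat cvvt cccc'

# Replace every match; 'cccc' does not end in t, so it is left alone.
replaced = p.sub('python', text)              # 'python python python cccc'

# count limits how many matches are replaced.
only_first = p.sub('python', text, count=1)   # 'python caat cvvt cccc'

# subn() also reports how many replacements were made.
new_text, n = p.subn('python', text)          # ('python python python cccc', 3)
```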
Splitting:
split
s = "123+234-344"
re.split(r'[\+\-]', s)
Groups
group(0) and group() both return the whole match
group(n) returns the content matched by the nth pair of parentheses
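Splitting and groups can both be tried on the same string. This is a sketch assuming Python 3; the grouping pattern r'(\d+)\+(\d+)' is an illustrative choice, not taken from the notes.

```python
import re

s = '123+234-344'

# Split on either + or - (inside [] neither needs escaping when - comes last).
parts = re.split(r'[+-]', s)      # ['123', '234', '344']

# Two parenthesised groups around the numbers on each side of the +.
m = re.search(r'(\d+)\+(\d+)', s)
whole = m.group()      # '123+234'
same = m.group(0)      # '123+234', identical to group()
left = m.group(1)      # '123'
right = m.group(2)     # '234'
both = m.groups()      # ('123', '234')
```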
Python Learning Note (4): Regular expressions