Regular expressions in Python

Source: Internet
Author: User
http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html

Raw strings (written with an r prefix, such as r'\d+') are recommended when writing regular expressions in Python, because backslashes in them do not need to be doubled.
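The following is a minimal sketch of why the r prefix matters. The sample text and variable names are made up for illustration.

```python
import re

# Without a raw string, every backslash meant for the regex engine must be doubled.
plain = "\\d+\\.\\d+"
raw = r"\d+\.\d+"

# Both spellings produce exactly the same pattern text.
same_pattern = plain == raw

# In a normal string "\b" is a backspace character; in a raw string it
# reaches the regex engine as the word-boundary escape.
backspace_hits = re.findall("\bcat\b", "cat concat cat")
boundary_hits = re.findall(r"\bcat\b", "cat concat cat")

print(same_pattern)    # True
print(backspace_hits)  # []
print(boundary_hits)   # ['cat', 'cat']
```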

Greedy and non-greedy quantifiers

Regular expressions are typically used to find matching strings in text. Quantifiers in Python are greedy by default (a few languages may default to non-greedy): they always try to match as many characters as possible. Non-greedy quantifiers do the opposite and try to match as few characters as possible. For example, the regular expression "ab*" finds "abbb" when it is used to search "abbbc", while the non-greedy form "ab*?" finds only "a".
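The "abbbc" case can be checked directly. The second half of this sketch uses a made-up HTML fragment to show where the difference usually matters in practice.

```python
import re

text = "abbbc"
greedy = re.search(r"ab*", text).group()   # takes as many b's as possible
lazy = re.search(r"ab*?", text).group()    # takes as few b's as possible

print(greedy)  # abbb
print(lazy)    # a

# Illustrative input: a greedy .* runs to the last closing tag,
# a non-greedy .*? stops at the first one.
html = "<b>one</b> and <b>two</b>"
greedy_tags = re.findall(r"<b>.*</b>", html)
lazy_tags = re.findall(r"<b>.*?</b>", html)

print(greedy_tags)  # ['<b>one</b> and <b>two</b>']
print(lazy_tags)    # ['<b>one</b>', '<b>two</b>']
```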

Regular expression metacharacters

^ matches the start of the string
$ matches the end of the string
? repeats the previous character 0 or 1 times
* repeats the previous character 0 or more times
+ repeats the previous character 1 or more times
{m} repeats the previous character exactly m times
{m,n} repeats the previous character m to n times
\d matches any digit, equivalent to [0-9]
\D matches any non-digit character, equivalent to [^0-9]
\s matches any whitespace character, equivalent to [ \t\r\n\f\v]
\S matches any non-whitespace character, equivalent to [^ \t\r\n\f\v]
\w matches any alphanumeric character or underscore, equivalent to [a-zA-Z0-9_]
\W matches any character that is not alphanumeric or underscore, equivalent to [^a-zA-Z0-9_]
. matches any character other than a line break
[...] Character set. Inside a set, special characters lose their special meaning, except ], - and ^, which can be escaped with \
The bracketed equivalents above hold for ASCII text; with Python 3 str patterns, \d, \s and \w also match the corresponding Unicode characters.
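As a rough illustration of several metacharacters from the list, here is a short sketch. The sample line and the variable names are invented for this example.

```python
import re

line = "Order #42: 3 apples, total_cost = 7.50 USD"

digits = re.findall(r"\d+", line)                 # runs of digits
starts = bool(re.search(r"^Order", line))         # ^ anchors at the start
ends = bool(re.search(r"USD$", line))             # $ anchors at the end
price = re.search(r"\d+\.\d{2}", line).group()    # {2}: exactly two digits
codes = re.findall(r"[A-Z]{3}", line)             # set with a range
symbols = re.findall(r"[^\w\s]", line)            # negated set

print(digits)   # ['42', '3', '7', '50']
print(starts)   # True
print(ends)     # True
print(price)    # 7.50
print(codes)    # ['USD']
print(symbols)  # ['#', ':', ',', '=', '.']
```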

Matching modes (flags)

re.I (re.IGNORECASE): Ignore case (the full name is given in parentheses, same below)
re.M (re.MULTILINE): Multiline mode; changes the behavior of '^' and '$' so that they also match at the start and end of each line
re.S (re.DOTALL): Dot-matches-all mode; changes the behavior of '.' so that it also matches line breaks
re.L (re.LOCALE): Makes the predefined character classes \w \W \b \B \s \S depend on the current locale setting (in Python 3, allowed only with bytes patterns)
re.U (re.UNICODE): Makes the predefined character classes \w \W \b \B \s \S \d \D depend on Unicode character properties (in Python 3, this is already the default for str patterns)
re.X (re.VERBOSE): Verbose mode. In this mode the regular expression can span multiple lines, whitespace in the pattern is ignored, and comments can be added.
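A minimal sketch of the I, M, S and X flags in action follows. The sample text is made up, and flags can be combined with `|` (for example `re.I | re.M`).

```python
import re

text = "Hello\nhello world\nHELLO again"

# re.I: letter case is ignored.
any_case = re.findall(r"hello", text, re.I)

# re.M: ^ matches at the start of every line, not only at the start of the string.
first_words = re.findall(r"^\w+", text, re.M)
first_word_only = re.findall(r"^\w+", text)

# re.S: . also matches the line break between "Hello" and "hello".
dot_default = re.search(r"Hello.hello", text)
dot_all = re.search(r"Hello.hello", text, re.S)

# re.X: whitespace in the pattern is ignored and comments are allowed.
number = re.compile(r"""
    \d+   # integer part
    \.    # decimal point
    \d*   # fractional digits, possibly none
""", re.X)
found = number.search("pi is about 3.14").group()

print(any_case)             # ['Hello', 'hello', 'HELLO']
print(first_words)          # ['Hello', 'hello', 'HELLO']
print(first_word_only)      # ['Hello']
print(dot_default)          # None
print(dot_all is not None)  # True
print(found)                # 3.14
```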

Common regular expression processing functions

1. re.search(pattern, string, flags=0)
re.search scans the string for the pattern and returns a match object for the first match found; it returns None if nothing in the string matches.

First parameter: the pattern (rule)
Second parameter: the string to search
Third parameter: flags, used to control how the regular expression is matched

    import re

    name = "Hello, my name is kuangl, nice to meet ..."
    # group(0) is the whole match, group(1) the first parenthesized group
    k = re.search(r'k(uan)gl', name)
    if k:
        print(k.group(0), k.group(1))
    else:
        print("Sorry, not found!")
    -------------------------
    kuangl uan

2. re.match(pattern, string, flags=0)
re.match tries to match the pattern only at the beginning of the string.

    import re

    name = "Hello, my name is kuangl, nice to meet ..."
    # "H" followed by any four characters, anchored at the start of the string
    k = re.match(r"(H....)", name)
    if k:
        print(k.group(0), k.group(1))
    else:
        print("Sorry, not match!")
    --------------------------
    Hello Hello

The difference between re.match and re.search: re.match matches only at the beginning of the string. If the start of the string does not conform to the regular expression, the match fails and the function returns None. re.search scans the entire string until it finds a match.
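The difference can be seen side by side in a small sketch; the sample string is made up for illustration.

```python
import re

text = "say hello"

# "hello" is not at the beginning, so match fails but search succeeds.
at_start = re.match(r"hello", text)
anywhere = re.search(r"hello", text)

# match succeeds when the pattern fits the beginning of the string.
prefix = re.match(r"say", text)

# search with an explicit ^ anchor behaves like match here.
anchored = re.search(r"^hello", text)

print(at_start)         # None
print(anywhere.span())  # (4, 9)
print(prefix.group())   # say
print(anchored)         # None
```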

3. re.findall(pattern, string, flags=0)
Returns a list of all the strings that match the pattern, or an empty list if nothing matches.

    mail = '<[email protected]> <[email protected]> [email protected]'
    re.findall(r'(\[email protected][a-z]{3})', mail)
    ----------------------------------------------------
    ['[email protected]', '[email protected]', '[email protected]']
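As a runnable sketch of re.findall on a similar task, the example below uses made-up addresses and a deliberately simple pattern. Both are assumptions for illustration, not the original author's data, and the pattern is not a complete e-mail validator.

```python
import re

# Hypothetical sample data.
mail = "<alice@example.com> <bob@example.org> carol@example.net"

# No groups: findall returns the full text of every match.
addresses = re.findall(r"\w+@\w+\.[a-z]{3}", mail)

# One group: findall returns only the group's text.
users = re.findall(r"(\w+)@", mail)

# Several groups: findall returns a list of tuples.
pairs = re.findall(r"(\w+)@(\w+)", mail)

# No match at all: the result is an empty list, not None.
numbers = re.findall(r"\d+", mail)

print(addresses)  # ['alice@example.com', 'bob@example.org', 'carol@example.net']
print(users)      # ['alice', 'bob', 'carol']
print(pairs)      # [('alice', 'example'), ('bob', 'example'), ('carol', 'example')]
print(numbers)    # []
```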

4. re.sub(pattern, repl, string, count=0)
re.sub replaces the parts of a string that match the pattern.
First parameter: the pattern (rule)
Second parameter: the replacement string
Third parameter: the string to process
Fourth parameter: the maximum number of replacements. The default is 0, which means that every match is replaced

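    import re

    test = "Hi, nice to meet you where are you from?"
    print(re.sub(r'\s', '-', test))           # replace every whitespace character
    print(re.sub(r'\s', '-', test, count=5))  # replace only the first five
    ---------------------------------------
    Hi,-nice-to-meet-you-where-are-you-from?
    Hi,-nice-to-meet-you-where are you from?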

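5. re.split(pattern, string, maxsplit=0)
re.split splits the string at every match of the pattern. maxsplit limits the number of splits; the default 0 means no limit.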

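    import re

    test = "Hi, nice to meet you where are you from?"
    print(re.split(r"\s+", test))              # split on every run of whitespace
    print(re.split(r"\s+", test, maxsplit=3))  # at most three splits
    --------------------------------------------------
    ['Hi,', 'nice', 'to', 'meet', 'you', 'where', 'are', 'you', 'from?']
    ['Hi,', 'nice', 'to', 'meet you where are you from?']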

6. re.compile(pattern, flags=0)
Compiles the regular expression into a pattern object, which can then be reused.

    import re

    pattern = re.compile(r'hello')
    match = pattern.match('hello world!')
    print(match.group())
    -------------------------------------
    hello
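A compiled pattern object offers the same operations as the module-level functions, so one compiled object can be reused across calls. A small sketch, with made-up input strings and the flag supplied once at compile time:

```python
import re

# Compile once; the flag is stored with the pattern object.
word = re.compile(r"hello", re.I)

first = word.match("Hello world!").group()
position = word.search("say hello").start()
every = word.findall("Hello, hello, HELLO")
replaced = word.sub("bye", "hello there")

print(first)     # Hello
print(position)  # 4
print(every)     # ['Hello', 'hello', 'HELLO']
print(replaced)  # bye there
```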

2015-05-09
