Regular Expressions in Python

Last updated: 2015-05-09 Source: Internet

Uploaded by: User

http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html

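When writing regular expressions in Python, raw strings (r"...") are recommended, so that backslashes reach the regex engine without first being interpreted as Python string escapes.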
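To illustrate why raw strings matter, here is a minimal sketch assuming Python 3. The sample text and variable names are illustrative only. It shows that "\b" in a normal string literal is a backspace character, while in a raw string it reaches the re module as a word-boundary escape.

```python
import re

text = "a cat sat"

# In a normal string literal, "\b" is the backspace character (ASCII 8),
# so the regex engine never sees a word-boundary escape.
plain = "\bcat\b"
# In a raw string, the backslashes are passed to the re module unchanged.
raw = r"\bcat\b"
# Without a raw string, every backslash has to be doubled.
escaped = "\\bcat\\b"

plain_result = re.search(plain, text)  # None: there is no backspace in text
raw_result = re.search(raw, text)      # matches the word "cat"

print(plain_result)          # None
print(raw_result.group())    # cat
print(escaped == raw)        # True
```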

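Greedy and non-greedy quantifiers

Regular expressions are typically used to find matching strings in text. In Python, quantifiers are greedy by default (in a few languages the default may be non-greedy): they always try to match as many characters as possible. Non-greedy quantifiers do the opposite and try to match as few characters as possible. For example, the pattern "ab*" applied to "abbbc" finds "abbb", whereas the non-greedy pattern "ab*?" finds "a".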
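The example from the paragraph above can be checked directly. The following is a small sketch assuming Python 3. The html string is a made-up sample added to show a case where the difference matters in practice.

```python
import re

text = "abbbc"
greedy = re.search(r"ab*", text).group()   # b* takes as many "b" as it can
lazy = re.search(r"ab*?", text).group()    # b*? takes as few "b" as it can

print(greedy)  # abbb
print(lazy)    # a

# Hypothetical sample: greedy .* runs to the last closing tag,
# while non-greedy .*? stops at the first one.
html = "<b>one</b><b>two</b>"
greedy_tags = re.findall(r"<b>.*</b>", html)
lazy_tags = re.findall(r"<b>.*?</b>", html)

print(greedy_tags)  # ['<b>one</b><b>two</b>']
print(lazy_tags)    # ['<b>one</b>', '<b>two</b>']
```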

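Regex metacharacters

^ Matches the start of the string
$ Matches the end of the string
? Repeats the preceding character 0 or 1 times
* Repeats the preceding character 0 or more times
+ Repeats the preceding character 1 or more times
{m} Repeats the preceding character exactly m times
{m,n} Repeats the preceding character between m and n times
\d Matches a digit; equivalent to [0-9]
\D Matches any non-digit character; equivalent to [^0-9]
\s Matches any whitespace character; equivalent to [ \t\n\r\f\v]
\S Matches any non-whitespace character; equivalent to [^ \t\n\r\f\v]
\w Matches any word character (letter, digit, or underscore); equivalent to [a-zA-Z0-9_]
\W Matches any non-word character; equivalent to [^a-zA-Z0-9_]
. Matches any single character except a newline
[...] Character set; special characters lose their special meaning inside it (except ], -, and ^, which can be escaped with \)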
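The metacharacters listed above can be tried out with a short sketch, assuming Python 3. The sample string and the variable names are invented for illustration.

```python
import re

sample = "Order 66 shipped on 2015-05-09 to room_7!"

digits = re.findall(r"\d+", sample)                     # runs of digits
date = re.search(r"\d{4}-\d{2}-\d{2}", sample).group()  # {m} repetition
words = re.findall(r"\w+", sample)                      # runs of word characters
starts = bool(re.search(r"^Order", sample))             # ^ anchors at the start
ends = bool(re.search(r"!$", sample))                   # $ anchors at the end
optional = re.findall(r"colou?r", "color colour")       # ? makes "u" optional
pieces = re.split(r"\s+", "a \t\n b")                   # \s covers space, tab, newline
literal_dot = re.findall(r"[.]", "a.b")                 # "." is literal inside [...]

print(digits)       # ['66', '2015', '05', '09', '7']
print(date)         # 2015-05-09
print(words)        # ['Order', '66', 'shipped', 'on', '2015', '05', '09', 'to', 'room_7']
print(starts, ends) # True True
print(optional)     # ['color', 'colour']
print(pieces)       # ['a', 'b']
print(literal_dot)  # ['.']
```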

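Matching modes

re.I(re.IGNORECASE): ignore case (the full name is given in parentheses; the same applies below)
re.M(re.MULTILINE): multi-line mode; changes the behavior of '^' and '$' so that they also match at the start and end of each line
re.S(re.DOTALL): dot-matches-all mode; changes the behavior of '.' so that it also matches a newline and can match across lines
re.L(re.LOCALE): makes the predefined character classes \w \W \b \B \s \S depend on the current locale
re.U(re.UNICODE): makes the predefined character classes \w \W \b \B \s \S \d \D depend on the character properties defined by Unicode
re.X(re.VERBOSE): verbose mode. In this mode a regular expression can span multiple lines, whitespace is ignored, and comments can be added.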
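A brief sketch of how some of these flags change matching, assuming Python 3. The sample text and names are illustrative. Flags are passed as the last argument and can be combined with the | operator.

```python
import re

text = "first line\nSecond LINE"

# re.I: case-insensitive matching
ignore_case = re.findall(r"line", text, re.I)

# re.M: ^ matches at the start of every line, not only at the start of the string
no_multiline = re.findall(r"^\w+", text)
multiline = re.findall(r"^\w+", text, re.M)

# re.S: "." also matches the newline, so the match can span lines
no_dotall = re.search(r"first.*LINE", text)
dotall = re.search(r"first.*LINE", text, re.S)

# re.X: whitespace in the pattern is ignored and comments are allowed
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.X)
date_parts = date_pattern.search("2015-05-09").groups()

# Flags can be combined with |
combined = re.findall(r"^second", text, re.I | re.M)

print(ignore_case)               # ['line', 'LINE']
print(no_multiline, multiline)   # ['first'] ['first', 'Second']
print(no_dotall)                 # None
print(dotall.group() == text)    # True
print(date_parts)                # ('2015', '05', '09')
print(combined)                  # ['Second']
```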

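Commonly used regex functions

1.re.search(pattern, string, flags=0)
re.search scans the string for the pattern and returns as soon as it finds the first match. If nothing in the string matches, it returns None.

First argument: the pattern
Second argument: the string to search
Third argument: flags that control how the regular expression is matched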

import re

name = "Hello,My name is kuangl,nice to meet you..."
k = re.search(r'k(uan)gl', name)
if k:
    # group(0) is the whole match; group(1) is the first captured group
    print(k.group(0), k.group(1))
else:
    print("Sorry,not search!")
-------------------------
kuangl uan

2.re.match(pattern, string, flags=0)
re.match tries to match a pattern starting at the beginning of the string. It does not look for matches further into the string.

import re

name = "Hello,My name is kuangl,nice to meet you..."
# "H" followed by any four characters, captured as group 1
k = re.match(r"(H....)", name)
if k:
    print(k.group(0), k.group(1))
else:
    print("Sorry,not match!")
--------------------------
Hello Hello

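The difference between re.match and re.search: re.match only matches at the beginning of the string. If the beginning of the string does not fit the regular expression, the match fails and the function returns None. re.search scans the whole string until it finds a match.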
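The difference can be seen side by side in a minimal sketch, assuming Python 3. It reuses the sample string from the examples above. The variable names are illustrative.

```python
import re

name = "Hello,My name is kuangl,nice to meet you..."

# "kuangl" is not at the start of the string, so match fails but search succeeds.
match_result = re.match(r"kuangl", name)
search_result = re.search(r"kuangl", name)

# "Hello" is at the start, so both succeed.
match_start = re.match(r"Hello", name)

# Anchoring a search with ^ makes it behave like match on this single-line string.
anchored_search = re.search(r"^kuangl", name)

print(match_result)            # None
print(search_result.group())   # kuangl
print(match_start.group())     # Hello
print(anchored_search)         # None
```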

3.re.findall(pattern, string, flags=0)
The result is a list containing every substring that matches the pattern. If no matching substring is found, an empty list is returned.

mail = '<[email protected]> <[email protected]> [email protected]'
re.findall(r'(\[email protected][a-z]{3})', mail)
----------------------------------------------------
['[email protected]', '[email protected]', '[email protected]']
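The addresses in the example above appear only in obscured form, so the following sketch uses hypothetical addresses and a hypothetical pattern rather than the original ones. It assumes Python 3 and shows the three typical outcomes of re.findall: a list of strings, a list of tuples when the pattern has several groups, and an empty list when nothing matches.

```python
import re

# Hypothetical addresses, not the ones from the original article
mail = "<alice@example.com> <bob@example.org> carol@example.net"

# One group (or none): findall returns a list of strings
addresses = re.findall(r"(\w+@\w+\.[a-z]{3})", mail)

# Several groups: findall returns a list of tuples, one tuple per match
parts = re.findall(r"(\w+)@(\w+)\.[a-z]{3}", mail)

# No match: findall returns an empty list, not None
nothing = re.findall(r"\d+", mail)

print(addresses)  # ['alice@example.com', 'bob@example.org', 'carol@example.net']
print(parts)      # [('alice', 'example'), ('bob', 'example'), ('carol', 'example')]
print(nothing)    # []
```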

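4.re.sub(pattern, repl, string, count=0)
re.sub replaces the matches found in a string.
First argument: the pattern
Second argument: the replacement string
Third argument: the string to process
Fourth argument: the maximum number of replacements. The default is 0, which means every match is replaced.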

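import re

test = "Hi, nice to meet you where are you from?"
# Replace every whitespace character with "-"
print(repr(re.sub(r'\s', '-', test)))
# Replace only the first 5 whitespace characters
print(repr(re.sub(r'\s', '-', test, count=5)))
---------------------------------------
'Hi,-nice-to-meet-you-where-are-you-from?'
'Hi,-nice-to-meet-you-where are you from?'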

5.re.split(pattern, string, maxsplit=0)

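import re

test = "Hi, nice to meet you where are you from?"
# Split on every run of whitespace
print(re.split(r"\s+", test))
# Split at most 3 times; the remainder stays in the last element
print(re.split(r"\s+", test, maxsplit=3))
--------------------------------------------------
['Hi,', 'nice', 'to', 'meet', 'you', 'where', 'are', 'you', 'from?']
['Hi,', 'nice', 'to', 'meet you where are you from?']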

6.re.compile(pattern, flags=0)
Compiles a regular expression into a pattern object, which can then be reused for matching.

import re

pattern = re.compile(r'hello')
match = pattern.match('hello world!')
print(match.group())
-------------------------------------
hello
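A compiled pattern object offers the same operations as the module-level functions covered earlier (match, search, findall, sub, split), which is convenient when one pattern is used many times. The sketch below assumes Python 3. The sample strings and the name word are illustrative.

```python
import re

# Flags are given once, at compile time
word = re.compile(r"hello", re.I)

matched = word.match("Hello world!").group()
searched = word.search("Say hello").group()
found = word.findall("hello Hello HELLO")
replaced = word.sub("bye", "Hello, hello")
parts = word.split("aHELLOb")

print(matched)   # Hello
print(searched)  # hello
print(found)     # ['hello', 'Hello', 'HELLO']
print(replaced)  # bye, bye
print(parts)     # ['a', 'b']
```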
