Python's re module: regular expressions

Source: Internet
Author: User
Tags: function, examples, expression, engine

I. Brief introduction

Regular expressions are a small, highly specialized programming language. They are not unique to Python; they are a fundamental and important part of many programming languages. In Python, they are available mainly through the re module.

The regular expression pattern is compiled into a sequence of bytecode, which is then executed by a matching engine written in C. So what are the typical usage scenarios for regular expressions?

    • Specifying rules for the set of strings you want to match;

    • That set might contain e-mail addresses, Internet addresses, phone numbers, or anything else you define;

    • Checking whether a string conforms to the matching rules you defined;

    • Finding the part of a string that matches a rule;

    • Modifying, splitting, and otherwise processing text;

    • ......
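The scenarios above can be sketched in a few lines. The pattern here is a deliberately simplified e-mail-like rule chosen only for illustration (real address validation is far more involved), and the names EMAIL_RULE and candidates are hypothetical.

```python
import re

# Hypothetical, deliberately simplified rule: word characters, dots or
# hyphens, then "@", then a dotted domain. Not a full e-mail validator.
EMAIL_RULE = re.compile(r"[\w.-]+@[\w-]+(\.[\w-]+)+")

candidates = ["bugs@example.com", "not-an-address", "daffy@example.org"]

# Check whether each whole string conforms to the rule.
conforming = [s for s in candidates if EMAIL_RULE.fullmatch(s)]
print(conforming)

# Find the part of a longer text that matches the rule.
text = "Write to bugs@example.com for details."
found = EMAIL_RULE.search(text)
print(found.group())
```

fullmatch() requires the entire string to satisfy the rule, while search() locates a conforming part inside a larger text.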

II. Special symbols and characters (metacharacters)

Metacharacters are what give regular expressions their power and flexibility. Table 2-1 lists the more common symbols and characters.

[Table 2-1 (image): http://images2015.cnblogs.com/blog/1094291/201701/1094291-20170124143013816-1984740724.png]
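Since the table is only available as an image, here is a rough sketch of a few widely used metacharacters in action. Which symbols the table actually lists is an assumption; the sample strings are made up for illustration.

```python
import re

# Each entry: (pattern, text, what the metacharacters mean).
samples = [
    (r"b.gs", "bugs bags", ". matches any character except newline"),
    (r"^bugs", "bugs bunny", "^ anchors the match at the start"),
    (r"bunny$", "bugs bunny", "$ anchors the match at the end"),
    (r"bu*gs", "bgs buugs", "* means zero or more of the preceding item"),
    (r"bu+gs", "bgs buugs", "+ means one or more of the preceding item"),
    (r"bugs?", "bug bugs", "? means zero or one of the preceding item"),
    (r"[bm]ugs", "bugs mugs", "[...] matches any one character in the set"),
    (r"\d{3}", "abc-123", "\\d is a digit, {3} repeats exactly 3 times"),
    (r"bugs|bunny", "bugs bunny", "| matches either alternative"),
]

# Collect every match of each pattern in its sample text.
results = {pattern: re.findall(pattern, text) for pattern, text, _ in samples}

for pattern, text, meaning in samples:
    print(f"{pattern!r:14} on {text!r:14} -> {results[pattern]}   # {meaning}")
```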

III. Regular expressions

1. Using the compile() function to compile regular expressions

Python code is translated into bytecode before the interpreter executes it, and a regular expression pattern likewise has to be compiled before it can be matched. It is therefore worth precompiling the regular expressions that your code uses often.

Most functions in the re module share their names and behavior with methods of compiled regular expression objects.

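Example:

>>> import re
>>> r1 = r'bugs'   # the r prefix keeps backslashes from being treated specially; a good habit, though not needed here
>>> re.findall(r1, 'bugsbunny')   # match directly through the re module function
['bugs']
>>>
>>> r2 = re.compile(r1)   # if you use the rule r1 often, precompile it for efficiency
>>> r2                    # the compiled regular expression object (the repr varies by Python version)
<_sre.SRE_Pattern object at 0x7f5d7db99bb0>
>>>
>>> r2.findall('bugsbunny')   # the object's findall method gives the same result as above
['bugs']
>>> # so most re module functions share their names and behavior with methods of compiled pattern objects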

The re.compile() function also accepts optional flags, which enable special behavior and syntax changes. The same flags can be passed to most re module functions, and several flags can be combined with the bitwise OR operator (|).

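Example:

>>> import re
>>> r1 = r'bugs'
>>> r2 = re.compile(r1, re.I)   # the ignore-case flag; the full name is re.IGNORECASE, abbreviated re.I
>>> r2.findall('BugsBunny')
['Bugs']
>>> # re.S makes . match every character, including newline
>>> # re.M enables multi-line matching, which affects ^ and $
>>> # re.X lets the pattern be laid out more clearly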

For a complete list of flags and their usage, see the official documentation.
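The three flags named in the comments of the example can be tried out with a short sketch like the following. The sample text and variable names are made up for illustration.

```python
import re

text = "bugs\nbunny"

# Without re.S, "." stops at the newline; with it, "." matches "\n" too.
dot_default = re.findall(r"bugs.bunny", text)
dot_all = re.findall(r"bugs.bunny", text, re.S)
print(dot_default, dot_all)

# With re.M, "^" and "$" also match at the start and end of each line.
lines_default = re.findall(r"^\w+$", text)
lines_multi = re.findall(r"^\w+$", text, re.M)
print(lines_default, lines_multi)

# re.X allows whitespace and comments inside the pattern for readability.
# Flags are combined with |, here adding case-insensitive matching.
code = re.compile(r"""
    ([a-z]{3})   # three letters
    -            # a literal hyphen
    (\d{3})      # three digits
""", re.X | re.I)
parts = code.match("ABC-123").groups()
print(parts)
```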

2. Using regular expressions

The re module provides an interface to the regular expression engine. This section describes some of its commonly used functions and methods.

    • Match objects and the group() and groups() methods

Besides regular expression objects, there is one more object type to know when working with regular expressions: the match object. A match object is what a successful call to match() or search() returns. Match objects have two main methods: group() and groups().

group() returns either the entire match or a specific subgroup, as requested. groups() returns a tuple containing all subgroups, whether there is one or several. If the pattern has no subgroups, group() still returns the entire match, while groups() returns an empty tuple. Some of the function examples that follow demonstrate these methods.
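The no-subgroup and single-subgroup cases described here can be checked with a minimal sketch. The pattern and the sample string are arbitrary choices for illustration.

```python
import re

# No parentheses in the pattern, so there are no subgroups.
plain = re.match(r"\w+-\d+", "abc-123")
print(plain.group())     # the entire match
print(plain.groups())    # an empty tuple

# One subgroup around the letters.
single = re.match(r"(\w+)-\d+", "abc-123")
print(single.group())    # still the entire match
print(single.group(1))   # the first (and only) subgroup
print(single.groups())   # a one-element tuple
```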

    • Match a string using the match() function

The match() function tries to match the pattern at the start of the string. If the match succeeds, it returns a match object; if it fails, it returns None. The match object's group() method displays the successful match.

Example:

>>> m = re.match('Bugs', 'Bugsbunny')   # match the pattern against the string
>>> if m is not None:                   # if the match succeeds, show it
...     m.group()
...
'Bugs'
>>> m                                   # confirm that a match object was returned (the repr varies by Python version)
<_sre.SRE_Match object at 0x7f5d7da1f168>

    • Use search() to find a pattern in a string

search() works exactly like match(), except that it looks for the first occurrence of the pattern anywhere in the string. In other words, a match can succeed at any position, not just at the start of the string. That is the difference from match(), and it makes search() the more widely applicable of the two.

Example:

>>> m = re.search('bugs', 'hello bugsbunny')
>>> if m is not None:
...     m.group()
...
'bugs'
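To see the difference between the two functions side by side, the same pattern can be applied to the same string with each. This is a small sketch; the variable names are illustrative.

```python
import re

text = "hello bugsbunny"

at_start = re.match(r"bugs", text)    # only tries the start of the string
anywhere = re.search(r"bugs", text)   # scans forward for the first match

print(at_start)            # match() fails because the text starts with "hello"
print(anywhere.group())    # the matched text
print(anywhere.start())    # index where the first match begins
```

Because match() returns None on failure, it is worth checking the result before calling group() on it.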

    • Use findall() and finditer() to find every occurrence

findall() finds all non-overlapping occurrences of the regular expression pattern in a string and returns them as a list. finditer() differs from findall() in that it returns an iterator that yields a match object for each match.

Example:

>>> m = re.findall('bugs', 'bugsbunnybugs')
>>> m
['bugs', 'bugs']
>>> m = re.finditer('bugs', 'bugsbunnybugs')
>>> next(m)             # next() returns the next match object from the iterator (the repr varies by Python version)
<_sre.SRE_Match object at 0x7f5d7da71a58>
>>> next(m).group()     # use group() to display the match
'bugs'
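Since finditer() yields match objects, it can also report where each occurrence sits in the string, which findall() cannot. A minimal sketch, with illustrative variable names:

```python
import re

text = "bugsbunnybugs"

# Each match object knows its own start and end index.
locations = [(m.group(), m.start(), m.end()) for m in re.finditer(r"bugs", text)]
print(locations)

# Matches never overlap: scanning resumes after the end of the previous match.
pairs = re.findall(r"aa", "aaaa")
print(pairs)
```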

    • Use sub() and subn() to search and replace

Both functions replace the parts of a string that match a regular expression. sub() returns the string after replacement; an optional count argument limits the number of replacements, and by default all occurrences are replaced. subn() works the same way as sub(), but it returns a two-element tuple containing the new string and the total number of replacements made.

Example:

>>> r = 'a.b'
>>> m = 'acb abc aab aac'
>>> re.sub(r, 'hello', m)
'hello abc hello aac'
>>> re.subn(r, 'hello', m)
('hello abc hello aac', 2)

Strings also have a replace() method, but it only handles fixed text. For pattern-based (fuzzy) search and replace you need the more flexible sub().
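The contrast with str.replace(), and the effect of limiting the number of replacements, can be sketched as follows. The count keyword is the optional argument of re.sub(); the variable names are illustrative.

```python
import re

text = "acb abc aab aac"

# str.replace() only handles a fixed substring.
fixed = text.replace("acb", "hello")
print(fixed)

# re.sub() replaces everything that matches the pattern.
every = re.sub(r"a.b", "hello", text)
print(every)

# The optional count argument limits the number of replacements.
first_only = re.sub(r"a.b", "hello", text, count=1)
print(first_only)
```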

    • Use split() to split a string

Similarly, strings have a split() method, but it cannot split on regular expression matches. re.split() treats every match of the pattern as a delimiter, splits the string at those points, and returns the resulting pieces as a list.

Example:

>>> s = '1+2-3*4'
>>> re.split(r'[\+\-\*]', s)
['1', '2', '3', '4']
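A short sketch comparing str.split() with re.split() on the same input follows. The last two calls go slightly beyond the text: maxsplit limits the number of splits, and a capturing group in the pattern keeps the delimiters in the result.

```python
import re

s = "1+2-3*4"

# str.split() accepts only one fixed delimiter.
by_plus = s.split("+")
print(by_plus)

# re.split() accepts a pattern, here any one of +, - or *.
by_operator = re.split(r"[+\-*]", s)
print(by_operator)

# maxsplit limits how many splits are made.
first_split = re.split(r"[+\-*]", s, maxsplit=1)
print(first_split)

# A capturing group keeps the delimiters in the returned list.
with_operators = re.split(r"([+\-*])", s)
print(with_operators)
```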

    • Grouping

Sometimes we want to extract only certain pieces of information, or to classify what we extract. In that case we need to group parts of the pattern, which only requires wrapping them in parentheses ().

Example:

>>> m = re.match(r'(\w{3})-(\d{3})', 'abc-123')
>>> m.group()      # full match
'abc-123'
>>> m.group(1)     # subgroup 1
'abc'
>>> m.group(2)     # subgroup 2
'123'
>>> m.groups()     # all subgroups
('abc', '123')

As the example above shows, group() is normally used to display the entire match, but it can also retrieve each matching subgroup. Use groups() to get a tuple containing all the matched subgroups.
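Grouping also helps when classifying information extracted from a longer text. As a rough sketch, findall() returns one tuple of subgroups per match when the pattern contains more than one group; the sample data and names here are invented for illustration.

```python
import re

codes = "abc-123 xyz-456 foo-789"

# With two groups in the pattern, findall() returns a list of 2-tuples.
pairs = re.findall(r"(\w{3})-(\d{3})", codes)
print(pairs)

# Classify the extracted pieces: map each prefix to its number.
by_prefix = {prefix: int(number) for prefix, number in pairs}
print(by_prefix)
```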

