Python regular expression syntax explained with examples

Source: Internet
Author: User
The previous article gave a general introduction to Python regular expressions. A regular expression is a special sequence of characters that lets you conveniently check whether a string matches a pattern. Python has included the re module since version 1.5; it provides Perl-style regular expression patterns and gives Python full regular expression functionality. The compile function builds a regular expression object from a pattern string and an optional flags argument, and that object has a set of methods for matching and substitution. The re module also provides module-level functions that mirror these methods and take the pattern string as their first argument. This section focuses on the commonly used regular expression functions.
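As a minimal sketch of the two calling styles described above, the following compares a compiled pattern object with the equivalent module-level functions. The sample strings and variable names are illustrative assumptions and do not come from the original article.

```python
import re

# Compile once, then reuse the pattern object's methods.
digits = re.compile(r'\d+')
m1 = digits.match('2024 release')

# The module-level function takes the pattern string as its first argument.
m2 = re.match(r'\d+', '2024 release')

print(m1.group(), m2.group())

# Substitution is available in both forms as well.
sub_compiled = digits.sub('#', 'room 12, floor 3')
sub_module = re.sub(r'\d+', '#', 'room 12, floor 3')
print(sub_compiled)
print(sub_module)
```

Compiling is mainly worthwhile when the same pattern is used many times; for a one-off check the module-level functions are usually enough.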

Strings are the data structure most often involved in programming, and the need to manipulate them is almost everywhere. Take checking whether a string is a valid email address: you could extract the substrings before and after the @ and then check whether they are a word and a domain name, but that is cumbersome and the code is hard to reuse.

A regular expression is a powerful tool for matching strings. The idea is to use a descriptive language to define a rule for strings: any string that conforms to the rule is considered a match, and any other string is considered invalid.

So to check whether a string is a valid email address, we:

1. Create a regular expression that matches email addresses;

2. Use that regular expression to match the user's input and decide whether it is valid.
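The two steps above can be sketched as follows. The pattern is a deliberately simplified assumption for illustration (real email address rules are considerably more involved), and the names EMAIL_PATTERN and is_valid_email are hypothetical.

```python
import re

# Step 1: a simplified email pattern (an assumption, not a complete validator).
EMAIL_PATTERN = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)+')


def is_valid_email(text):
    """Step 2: return True if the whole input matches the pattern."""
    return EMAIL_PATTERN.fullmatch(text) is not None


print(is_valid_email('someone@example.com'))
print(is_valid_email('not-an-email'))
```

The pattern uses character classes and quantifiers that are explained later in this article.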

Because a regular expression is itself written as a string, we first need to know how to describe characters using characters.

In regular expressions, a character given directly is matched exactly. \d matches a digit, and \w matches a letter, a digit, or an underscore, so:

1. '00\d' can match '007', but cannot match '00A';

2. '\d\d\d' can match '010';

3. '\w\w\d' can match 'py3';

. can match any single character (except a newline, by default), so:

4. 'py.' can match 'pyc', 'pyo', 'py!', and so on.
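The examples in the list above can be checked directly. This is a small sketch that uses re.fullmatch so that the whole test string has to match; the variable names are illustrative.

```python
import re

# (pattern, text) pairs taken from the examples above
checks = [
    (r'00\d', '007'),    # \d matches the digit 7
    (r'00\d', '00A'),    # 'A' is not a digit
    (r'\d\d\d', '010'),
    (r'\w\w\d', 'py3'),
    (r'py.', 'pyc'),
    (r'py.', 'py!'),     # . also matches punctuation
]
results = [re.fullmatch(p, t) is not None for p, t in checks]
for (pattern, text), ok in zip(checks, results):
    print(pattern, repr(text), ok)
```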

To match a variable number of characters, use * for any number of characters (including zero), + for at least one character, ? for zero or one character, {n} for exactly n characters, and {n,m} for between n and m characters.
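A short sketch of each quantifier, applied to the letter b between a and c. The patterns and test strings are made up for illustration.

```python
import re

cases = [
    (r'ab*c', 'ac'),          # * allows zero repetitions of 'b'
    (r'ab+c', 'ac'),          # + requires at least one 'b'
    (r'ab?c', 'abc'),         # ? allows zero or one 'b'
    (r'ab{2}c', 'abbc'),      # {2} requires exactly two
    (r'ab{2,3}c', 'abbbbc'),  # four b's fall outside the 2 to 3 range
]
quantifier_results = [re.fullmatch(p, t) is not None for p, t in cases]
for (pattern, text), ok in zip(cases, quantifier_results):
    print(pattern, repr(text), ok)
```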

Take a look at a complex example: \d{3}\s+\d{3,8}.

Let's read from left to right:

1. \d{3} matches exactly 3 digits, e.g. '010';

2. \s matches a space (or a tab or other whitespace character), so \s+ means at least one whitespace character, such as ' ', '  ', and so on;

3. \d{3,8} matches 3 to 8 digits, such as '1234567'.

Taken together, the regular expression above matches a telephone number whose area code is separated from the number by any amount of whitespace.
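The walkthrough above can be tried out as follows. The sample inputs are assumptions chosen to show both matching and non-matching cases.

```python
import re

phone = re.compile(r'\d{3}\s+\d{3,8}')

samples = ['010 12345', '010\t 1234567', '010-12345', '01 12345']
matched = {s: phone.fullmatch(s) is not None for s in samples}
for text, ok in matched.items():
    print(repr(text), ok)
```

The third sample fails because a hyphen is not whitespace, and the fourth fails because the area code has only two digits.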

What if we want to match a number like '010-12345'? We can escape '-' with '\' so that it is taken literally, which gives \d{3}\-\d{3,8}. (Strictly speaking, '-' is only special inside [], so the escape is optional here, but it is harmless.)

However, this still cannot match '010 - 12345', because that string contains spaces. So we need more flexible ways of matching.
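One possible way to tolerate spaces around the hyphen is to allow optional whitespace with \s*. This is only a sketch of one option, not necessarily the approach the article has in mind, and the names strict and flexible are illustrative.

```python
import re

strict = re.compile(r'\d{3}\-\d{3,8}')
# \s* allows zero or more whitespace characters on each side of the hyphen.
flexible = re.compile(r'\d{3}\s*-\s*\d{3,8}')

print(strict.fullmatch('010 - 12345'))                # None: spaces are not allowed
print(flexible.fullmatch('010 - 12345') is not None)  # True
print(flexible.fullmatch('010-12345') is not None)    # True
```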

Advanced

To make a more accurate match, you can use [] to represent a range, such as:

1. [0-9a-zA-Z\_] can match a digit, a letter, or an underscore;

2. [0-9a-zA-Z\_]+ can match a string of at least one digit, letter, or underscore, such as 'a100', '0_Z', 'Py3000', and so on;

3. [a-zA-Z\_][0-9a-zA-Z\_]* can match a letter or underscore followed by any number of digits, letters, or underscores, which is the form of a valid Python variable name;

4. [a-zA-Z\_][0-9a-zA-Z\_]{0,19} limits the length of the variable name more precisely to 1-20 characters (1 leading character plus at most 19 following characters).
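The last pattern in the list above can be wrapped in a small helper. The function name is_short_identifier is hypothetical, and the check covers only the shape of the name: it does not exclude Python keywords such as 'for'.

```python
import re

# A plain '_' is used here; the backslash before it is not required.
# Note that there must be no space inside {0,19}.
IDENTIFIER = re.compile(r'[a-zA-Z_][0-9a-zA-Z_]{0,19}')


def is_short_identifier(name):
    """Return True if name is 1-20 characters and shaped like a variable name."""
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ['Py3000', '_tmp', '0_Z', 'a' * 20, 'a' * 21]:
    print(candidate, is_short_identifier(candidate))
```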

A|B can match A or B, so (P|p)ython can match 'Python' or 'python'.

^ represents the beginning of a line, and ^\d indicates that it must begin with a digit.

$ represents the end of the line, and \d$ indicates that it must end with a number.

You may have noticed that py can also match 'python', but ^py$ has to match the entire line, so it only matches 'py'.
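Alternation and the two anchors can be sketched as follows. re.search is used for the anchor examples so that ^ and $ alone decide where the match may sit; the sample strings are assumptions.

```python
import re

# Alternation: either capitalisation is accepted.
alt_upper = re.fullmatch(r'(P|p)ython', 'Python') is not None
alt_lower = re.fullmatch(r'(P|p)ython', 'python') is not None

# Anchors: ^ pins the match to the start, $ to the end.
starts_with_digit = re.search(r'^\d', '3 apples') is not None
ends_with_digit = re.search(r'\d$', 'route 66') is not None

# Without anchors 'py' is found inside 'python'; with both anchors it is not.
unanchored = re.search(r'py', 'python') is not None
anchored = re.search(r'^py$', 'python') is not None
anchored_exact = re.search(r'^py$', 'py') is not None

print(alt_upper, alt_lower, starts_with_digit, ends_with_digit)
print(unanchored, anchored, anchored_exact)
```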

The re module

With that groundwork in place, we can use regular expressions in Python. Python provides the re module, which contains all of the regular expression functionality. Because Python strings themselves also use \ for escaping, pay special attention to:

    s = 'abc\\-001'  # Python string
    # the corresponding regular expression string becomes:
    # 'abc\-001'

Therefore, we strongly recommend using Python's r prefix, so that you do not have to think about escaping:

    s = r'abc\-001'  # Python string
    # the corresponding regular expression string is unchanged:
    # 'abc\-001'
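A quick check that the two spellings above produce the same pattern string. The variable names escaped and raw are illustrative.

```python
import re

escaped = 'abc\\-001'  # ordinary string: the backslash must be doubled
raw = r'abc\-001'      # raw string: the backslash is kept as written

same = escaped == raw
print(same)      # True: both are the same 8-character string
print(len(raw))  # 8
print(re.match(raw, 'abc-001') is not None)
```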

Let's look at how to tell if a regular expression matches:

    >>> import re
    >>> re.match(r'^\d{3}\-\d{3,8}$', '010-12345')
    <_sre.SRE_Match object; span=(0, 9), match='010-12345'>
    >>> re.match(r'^\d{3}\-\d{3,8}$', '010 12345')
    >>>

The match() method checks whether the pattern matches: it returns a Match object if the match succeeds and None otherwise. The usual way to test the result is:

    import re

    test = 'user input string'
    # re.match returns None on failure, which is falsy
    if re.match(r'regular expression', test):
        print('ok')
    else:
        print('failed')
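The Match object returned on success carries the details that appear in the interactive session above (the span and the matched text). A brief sketch of reading them, with illustrative variable names:

```python
import re

m = re.match(r'^\d{3}\-\d{3,8}$', '010-12345')
if m:
    print(m.span())   # start and end offsets of the match
    print(m.group())  # the matched text

no_match = re.match(r'^\d{3}\-\d{3,8}$', '010 12345')
print(no_match is None)
```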
