Detailed Python3 Regular Expressions (iii)

Source: Internet
Author: User
Tags character classes locale

Previous: detailed Python3 regular expression (ii)

This article is translated from: https://docs.python.org/3.4/howto/regex.html

The blogger has added some comments and changes along the way ^_^

Module-level functions

Using regular expressions does not necessarily mean creating a pattern object and then calling its matching methods, because the re module also provides module-level functions such as match(), search(), findall(), sub(), and so on. The first argument of these functions is the regular expression string; the remaining arguments are the same as those of the pattern object method with the same name, and so is the return value (for match() and search(), either None or a match object).
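As a small sketch of this equivalence (the sample string and variable names here are made up for illustration), the module-level function and the pattern method give the same result:

```python
import re

text = "tel: 555-1234"  # hypothetical sample input

# Module-level function: the pattern string is the first argument.
m1 = re.search(r"\d+", text)

# Equivalent call through a compiled pattern object.
pattern = re.compile(r"\d+")
m2 = pattern.search(text)

print(m1.group(), m2.group())      # 555 555
print(re.findall(r"\d+", text))    # ['555', '1234']
print(re.sub(r"\d", "#", text))    # tel: ###-####
print(re.match(r"\d+", text))      # None, because the string does not start with a digit
```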

In fact, these functions simply create a pattern object for you and call the corresponding method on it (covered in the previous article, remember?). They also store the compiled pattern object in a cache, so that later calls using the same regular expression can skip compilation.

So should we just use these module-level functions, or compile a pattern object and call its methods? That depends on how often the regular expression is used. If the program uses regular expressions only occasionally, the module-level functions are more convenient. If the program uses a regular expression heavily (for example, inside a loop), the latter approach is recommended, because precompiling saves a few function calls. Outside of loops, the two are about equally efficient thanks to the internal cache.
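A minimal sketch of the "compile once, use in a loop" approach might look like this. The sample data and the names lines and error_code are assumptions for illustration only:

```python
import re

lines = ["error 404", "ok", "error 500"]  # hypothetical sample data

# Compile once, outside the loop, then reuse the pattern object.
error_code = re.compile(r"error (\d+)")

codes = []
for line in lines:
    m = error_code.search(line)
    if m:
        codes.append(int(m.group(1)))

print(codes)  # [404, 500]
```

Calling re.search(r"error (\d+)", line) inside the loop would give the same result; it just goes through the module-level function and its cache lookup on every iteration.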

Compile flags

Compile flags let you modify how regular expressions work. In the re module, each compile flag has two names, a full name and a short form; for example, IGNORECASE can be shortened to I. (If you are a Perl fan, you are in luck: these short forms are the same as Perl's; for example, the short form of re.VERBOSE is re.X.) Below is a list of the supported compile flags:

Flag            Meaning
ASCII, A        Makes the escapes \w, \b, \s and \d match only ASCII characters
DOTALL, S       Makes . match any character, including newlines
IGNORECASE, I   Makes matching case-insensitive
LOCALE, L       Makes matching depend on the current locale
MULTILINE, M    Multi-line matching, affecting ^ and $
VERBOSE, X      (for 'extended') Enables verbose regular expressions
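To illustrate the two names, here is a short sketch showing that the full name and the short form are the same flag, and that several flags can be combined with the | operator (the pattern itself is just an illustrative assumption):

```python
import re

# The full name and the short form refer to the same flag value.
print(re.I == re.IGNORECASE)   # True
print(re.X == re.VERBOSE)      # True

# Several flags can be combined with the bitwise OR operator.
pattern = re.compile(r"^hello.world$", re.I | re.S)
matched = bool(pattern.search("HELLO\nWORLD"))
print(matched)                 # True
```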

Let's take a look at what they mean in detail:

A

ASCII

Makes \w, \W, \b, \B, \s and \S match only ASCII characters instead of the full Unicode range. This flag is only meaningful for Unicode (str) patterns and is ignored for bytes patterns.
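A quick sketch of the difference, using a made-up sample string that contains one non-ASCII letter:

```python
import re

text = "caf\u00e9 42"  # "café 42", written with an escape to avoid encoding surprises

unicode_words = re.findall(r"\w+", text)
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(unicode_words)  # ['café', '42']
print(ascii_words)    # ['caf', '42']
```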

S

DOTALL

Makes . match any character, including a newline. Without this flag, . matches any character except a newline.
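For example, with a two-line string (an illustrative sample), the greedy pattern .+ stops at the newline unless DOTALL is set:

```python
import re

text = "first\nsecond"

without_flag = re.match(r".+", text).group()
with_flag = re.match(r".+", text, re.DOTALL).group()

print(repr(without_flag))  # 'first'
print(repr(with_flag))     # 'first\nsecond'
```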

I

IGNORECASE

Makes character classes and literal strings case-insensitive when matching. For example, the regular expression [A-Z] will also match lowercase letters, and Fanfan will match Fanfan, fanfan, FANFAN, and so on. Unless the LOCALE flag is also set, this case folding does not take the current locale into account.
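A small sketch of both cases mentioned above, a character class and a literal string (the sample strings are illustrative):

```python
import re

# Character class: [A-Z] also matches lowercase letters under IGNORECASE.
insensitive = re.findall(r"[A-Z]+", "Fanfan FANFAN fanfan", re.IGNORECASE)
sensitive = re.findall(r"[A-Z]+", "Fanfan")

print(insensitive)  # ['Fanfan', 'FANFAN', 'fanfan']
print(sensitive)    # ['F']

# Literal string: any mix of upper and lower case is accepted.
literal = re.compile(r"Fanfan", re.I)
print(bool(literal.fullmatch("fAnFaN")))  # True
```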

L

LOCALE

Makes \w, \W, \b and \B depend on the current locale instead of the Unicode database.

Locales are a feature of the C library intended to help write programs that account for language differences. For example, suppose you are working with French text and want to use \w+ to match words, but in a bytes pattern \w only matches ASCII word characters such as [A-Za-z] and does not match accented French letters such as é or ç. If your system has the French locale configured correctly, a C function tells the program that those letters should also be treated as word characters. When the LOCALE flag is set at compile time, \w+ is able to recognize French words, at some cost in speed.
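One caveat for newer interpreters: the article follows the Python 3.4 documentation, but since Python 3.6 the LOCALE flag is only accepted for bytes patterns. The sketch below assumes Python 3.6 or later and deliberately uses only ASCII input, because what happens with accented bytes depends on the locale configured on your system:

```python
import re

# With a bytes pattern, the LOCALE flag is accepted.
word = re.compile(rb"\w+", re.LOCALE)
words = word.findall(b"bonjour monde")
print(words)  # [b'bonjour', b'monde']

# With a str pattern, Python 3.6+ refuses the flag.
rejected = False
try:
    re.compile(r"\w+", re.LOCALE)
except ValueError as exc:
    rejected = True
    print("str pattern rejected:", exc)
```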

M

MULTILINE

(We haven't covered ^ and $ yet; don't worry, we will talk about them a little later ...)

Normally ^ matches only at the beginning of the string, and $ matches only at the end of the string (or just before a newline at the very end). When this flag is set, ^ also matches at the beginning of each line, and $ also matches at the end of each line.
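Here is a short sketch of the effect on ^ and $, using a made-up three-line string:

```python
import re

text = "one\ntwo\nthree"

starts_default = re.findall(r"^\w+", text)
starts_multi = re.findall(r"^\w+", text, re.MULTILINE)
ends_default = re.findall(r"\w+$", text)
ends_multi = re.findall(r"\w+$", text, re.MULTILINE)

print(starts_default)  # ['one']
print(starts_multi)    # ['one', 'two', 'three']
print(ends_default)    # ['three']
print(ends_multi)      # ['one', 'two', 'three']
```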

X

VERBOSE

This flag lets you write regular expressions that are more readable and better organized. When it is set, whitespace in the pattern is ignored (except whitespace inside a character class or escaped with a backslash). It also lets you put comments in the regular expression string: everything from a # to the end of the line is a comment and is not passed to the matching engine (except a # inside a character class or escaped with a backslash).

Below is an example that uses re.VERBOSE; see how much more readable the regular expression becomes:

import re

charref = re.compile(r"""
 &[#]                # Start of a numeric entity reference
 (
     0[0-7]+         # Octal form
   | [0-9]+          # Decimal form
   | x[0-9a-fA-F]+   # Hexadecimal form
 )
 ;                   # Trailing semicolon
""", re.VERBOSE)

If the VERBOSE flag is not set, the same regular expression will be written as:

charref = re.compile("&#(0[0-7]+|[0-9]+|x[0-9a-fA-F]+);")

Blogger's note: which one is more readable? I'm sure you already know the answer.
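If you want to convince yourself that the two forms really are the same pattern, a quick check along these lines should do. The names verbose, compact and samples are chosen here for illustration:

```python
import re

verbose = re.compile(r"""
 &[#]                # Start of a numeric entity reference
 (
     0[0-7]+         # Octal form
   | [0-9]+          # Decimal form
   | x[0-9a-fA-F]+   # Hexadecimal form
 )
 ;                   # Trailing semicolon
""", re.VERBOSE)

compact = re.compile("&#(0[0-7]+|[0-9]+|x[0-9a-fA-F]+);")

samples = ["&#65;", "&#x41;", "&#0101;", "&amp;"]  # hypothetical test inputs
results = [(bool(verbose.match(s)), bool(compact.match(s))) for s in samples]

for sample, (v, c) in zip(samples, results):
    print(sample, v, c)
```

The first three samples are matched by both patterns and the last one by neither, so the only difference between the two forms is readability.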

(End of this article)

Next: Detailed Python3 regular expression (iv)

If you like this article, please give me encouragement through the "comments" below ^_^
