Python 3: How to Use Regular Expressions Gracefully (Part 3)

Source: Internet
Author: User
Tags: character classes, locale

Module-level functions

Using regular expressions does not necessarily require creating a pattern object and then calling its matching methods, because the re module also provides a number of module-level functions, such as match(), search(), findall(), sub(), and so on. The first argument of each function is a regular expression string; the remaining arguments are the same as those of the pattern-object method with the same name, and the return value is likewise either None or a match object.

>>> import re
>>> print(re.match(r'From\s+', 'From_fishc.com'))
None
>>> re.match(r'From\s+', 'From fishc.com')
<_sre.SRE_Match object; span=(0, 5), match='From '>
In fact, these functions simply create a pattern object for you automatically and call the corresponding method on it (as covered in the previous article, remember?). They also keep the compiled pattern object in a cache, so later calls that use the same regular expression are faster.
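As a minimal sketch of the equivalence described above (the sample strings are made up for illustration), the module-level call and the compile-then-call form give the same result:

```python
import re

text = 'From fishc.com'  # hypothetical sample text

# Module-level call: the pattern string is the first argument.
m1 = re.match(r'From\s+', text)

# Two-step form: compile a pattern object, then call its method.
pattern = re.compile(r'From\s+')
m2 = pattern.match(text)

print(m1.group() == m2.group(), m1.span() == m2.span())  # True True

# The other module-level functions follow the same convention.
found = re.findall(r'\d+', '3 apples, 12 pears')
squeezed = re.sub(r'\s+', ' ', 'a   b \t c')
print(found)     # ['3', '12']
print(squeezed)  # a b c
```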


So should we use these module-level functions directly, or compile a pattern object first and then call its methods? It depends on how often the regular expression is used. If the program uses regular expressions only occasionally, the module-level functions are more convenient. If the program uses a regular expression heavily (for example, inside a loop), the second approach is recommended, because precompiling saves some function calls. Outside a loop, the two approaches perform almost the same, thanks to the internal cache.
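A small sketch of the "compile once, reuse in a loop" advice. The list of lines and the names from_re and senders are assumptions made up for this example:

```python
import re

lines = ['From alice', 'To bob', 'From  carol']  # hypothetical input

# Compile once, outside the loop, then reuse the pattern object.
from_re = re.compile(r'From\s+(\w+)')

senders = []
for line in lines:
    m = from_re.match(line)
    if m:
        senders.append(m.group(1))

print(senders)  # ['alice', 'carol']
```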


Compile flags

Compile flags let you modify how regular expressions work. In the re module, each flag has two names: a full name and a short form, such as IGNORECASE and I (if you are a Perl fan, you are in luck, because these short forms are the same as Perl's; for example, the short form of re.VERBOSE is re.X). Multiple flags can also be set at the same time by combining them with "|"; for example, re.I | re.M sets both the I and M flags. The supported compile flags are listed below:

Flag            Meaning
ASCII, A        Makes escapes such as \w, \b, \s and \d match only ASCII characters
DOTALL, S       Makes . match any character, including newlines
IGNORECASE, I   Makes matching case-insensitive
LOCALE, L       Makes matching depend on the current locale settings
MULTILINE, M    Multi-line matching, affecting ^ and $
VERBOSE, X      (for 'extended') Enables verbose regular expressions
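A brief sketch of the two naming forms and of combining flags with "|". The sample text is an assumption for illustration only:

```python
import re

# Each short name is an alias for the full name.
same_flag = (re.I == re.IGNORECASE) and (re.X == re.VERBOSE)
print(same_flag)  # True

text = 'From a\nfrom b'  # hypothetical two-line sample

# Case-insensitive AND multi-line: ^ matches at the start of each line.
both = re.findall(r'^from', text, re.I | re.M)
print(both)  # ['From', 'from']

# Case-insensitive only: ^ matches at the start of the string only.
only_i = re.findall(r'^from', text, re.I)
print(only_i)  # ['From']
```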


Let's go over what they mean in detail:

A
ASCII
Makes \w, \W, \b, \B, \s and \S match only ASCII characters instead of the full Unicode range. This flag is only meaningful for Unicode (str) patterns and is ignored for bytes patterns.
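To illustrate, here is a minimal sketch with a made-up string containing one non-ASCII letter (written as the escape \u00e9, which is "é"):

```python
import re

text = 'caf\u00e9 123'  # 'café 123'; hypothetical sample

# Default for str patterns: \w matches Unicode word characters.
unicode_words = re.findall(r'\w+', text)

# With re.ASCII, \w matches only [a-zA-Z0-9_], so the accented letter is excluded.
ascii_words = re.findall(r'\w+', text, re.ASCII)

print(unicode_words)  # ['café', '123']
print(ascii_words)    # ['caf', '123']
```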

S
DOTALL
Makes . match any character, including newlines. Without this flag, . matches any character except a newline.
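A short sketch of the difference, using a made-up two-line string:

```python
import re

text = 'first\nsecond'  # hypothetical sample

# Without DOTALL, . stops at the newline.
without_flag = re.match(r'.+', text).group()

# With DOTALL, . also matches the newline, so .+ consumes everything.
with_flag = re.match(r'.+', text, re.DOTALL).group()

print(repr(without_flag))  # 'first'
print(repr(with_flag))     # 'first\nsecond'
```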

I
IGNORECASE
Character classes and literal strings are matched without regard to case. For example, the regular expression [A-Z] will also match the corresponding lowercase letters, and FishC will match FishC, fishc, or FISHC. Unless the LOCALE flag is also set, the current locale is not taken into account when ignoring case.
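A minimal sketch of both effects (literal text and character classes), with sample strings invented for illustration:

```python
import re

# A literal pattern matches any capitalisation.
variants = re.findall(r'fishc', 'FishC fishc FISHC', re.IGNORECASE)
print(variants)  # ['FishC', 'fishc', 'FISHC']

# A character class is affected too: [A-Z] also matches lowercase letters.
strict = re.findall(r'[A-Z]+', 'FishC')
relaxed = re.findall(r'[A-Z]+', 'FishC', re.I)
print(strict)   # ['F', 'C']
print(relaxed)  # ['FishC']
```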

L
LOCALE
Makes \w, \W, \b and \B depend on the current locale rather than on the Unicode database.

Locales are a feature of the C library whose main purpose is to handle differences between languages. For example, if you are processing French text, you would want to use \w+ to match words, but in bytes patterns \w only matches the characters in [A-Za-z] and does not match 'é' or 'ç'. If your system is properly configured for a French locale, the C library functions will tell the program that 'é' and 'ç' should also be treated as letters. When the LOCALE flag is set at compile time, \w+ can recognize French words, at some cost in speed.
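The actual matches under LOCALE depend on how the system locale is configured, so the sketch below only shows how the flag is applied. It assumes Python 3.6 or later, where re.LOCALE is accepted only for bytes patterns and raises ValueError for str patterns; the sample bytes are made up:

```python
import re

# LOCALE works with a bytes pattern; \w then follows the current locale.
word_re = re.compile(rb'\w+', re.LOCALE)
first_word = word_re.match(b'bonjour le monde').group()
print(first_word)  # b'bonjour' (plain ASCII letters are word characters)

# With a str pattern the flag is rejected (assumption: Python 3.6+).
try:
    re.compile(r'\w+', re.LOCALE)
    str_pattern_ok = True
except ValueError:
    str_pattern_ok = False
print(str_pattern_ok)
```

To see locale-specific letters matched, the process locale would first have to be set with the locale module, which depends on which locales are installed on the machine.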

M
MULTILINE
(We haven't covered ^ and $ yet; don't worry, we will get to them shortly.)

Normally ^ matches only at the beginning of the string, and $ matches only at the end of the string. When this flag is set, ^ matches at the beginning of the string and also at the beginning of each line; $ matches at the end of the string and also at the end of each line.
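A small sketch of how the flag changes ^ and $, using an invented three-line string:

```python
import re

text = 'one\ntwo\nthree'  # hypothetical sample

# Without MULTILINE: ^ and $ anchor to the whole string only.
start_default = re.findall(r'^\w+', text)
end_default = re.findall(r'\w+$', text)

# With MULTILINE: ^ and $ also anchor to every line.
start_multi = re.findall(r'^\w+', text, re.MULTILINE)
end_multi = re.findall(r'\w+$', text, re.MULTILINE)

print(start_default, end_default)  # ['one'] ['three']
print(start_multi)                 # ['one', 'two', 'three']
print(end_multi)                   # ['one', 'two', 'three']
```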

X
VERBOSE
This flag lets you write regular expressions that are more readable and better organized. When it is set, whitespace in the pattern is ignored (except inside a character class or when escaped with a backslash). It also lets you put comments in the regular expression string: everything from a # to the end of the line is a comment and is not passed to the matching engine (except when the # appears inside a character class or is escaped with a backslash).

Below is an example that uses re.VERBOSE; see how much more readable the regular expression becomes:

charref = re.compile(r"""
 &[#]                # Start of a numeric character reference
 (
     0[0-7]+         # Octal form
   | [0-9]+          # Decimal form
   | x[0-9a-fA-F]+   # Hexadecimal form
 )
 ;                   # Trailing semicolon
""", re.VERBOSE)

Without the VERBOSE flag, the same regular expression would be written as:

charref = re.compile("&#(0[0-7]+|[0-9]+|x[0-9a-fA-F]+);")

Which one is more readable? You already know the answer.
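As a quick check that the two spellings behave the same, here is a sketch that runs both against a few made-up inputs. The names charref_verbose, charref_compact and samples are introduced only for this example:

```python
import re

charref_verbose = re.compile(r"""
 &[#]                # Start of a numeric character reference
 (
     0[0-7]+         # Octal form
   | [0-9]+          # Decimal form
   | x[0-9a-fA-F]+   # Hexadecimal form
 )
 ;                   # Trailing semicolon
""", re.VERBOSE)

charref_compact = re.compile("&#(0[0-7]+|[0-9]+|x[0-9a-fA-F]+);")

samples = ['&#65;', '&#x41;', '&#0101;']  # hypothetical inputs

verbose_groups = [charref_verbose.search(s).group(1) for s in samples]
compact_groups = [charref_compact.search(s).group(1) for s in samples]

print(verbose_groups)                    # ['65', 'x41', '0101']
print(verbose_groups == compact_groups)  # True
```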

