Three notes on regular expressions

Source: Internet
Author: User
Case-insensitive matching of the first letter
There was a time when I wrote regular expressions to match drug keywords, and I often wrote expressions like /viagra|cialis|anti-ed/. To make them tidier I would sort the keywords, and to speed up matching I would write /[Vv]iagra/ instead of /viagra/i, so that only the parts that need it are case-insensitive. To be exact, I needed a case-insensitive match for the first letter of each word.

I wrote a function dedicated to this batch conversion. The code is as follows:

# Convert a regex to a sorted list, then provide both lower and upper case
# for the first letter of each word.
# luf means lower/upper first

sub luf {
    # split the regex on the delimiter |
    my @arr = sort(split(/\|/, shift));

    # provide both the lower and upper case for the
    # first letter of each word
    foreach (@arr) { s/\b([a-zA-Z])/[\l$1\u$1]/g; }

    # join the keywords into a regex again
    join('|', @arr);
}

print luf "sex pill|viagra|cialis|anti-ed";
# The output is: [aA]nti-[eE]d|[cC]ialis|[sS]ex [pP]ill|[vV]iagra
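
For comparison, here is a minimal Python sketch of the same idea. It is not part of the original notes: the function name luf is reused only for illustration, and the helper both_cases is a hypothetical name. It assumes the keywords are plain text separated by |, with no nested alternation.

```python
import re

def luf(regex):
    """Sort the alternatives and let the first letter of each word match either case."""
    # Split on the | delimiter and sort the keywords
    words = sorted(regex.split("|"))

    def both_cases(mo):
        # Hypothetical helper: turn a captured letter such as "v" into "[vV]"
        ch = mo.group(1)
        return "[" + ch.lower() + ch.upper() + "]"

    # \b([a-zA-Z]) matches a letter at the start of each word
    words = [re.sub(r"\b([a-zA-Z])", both_cases, w) for w in words]

    # Join the keywords into a single regex again
    return "|".join(words)

pattern = luf("sex pill|viagra|cialis|anti-ed")
print(pattern)  # [aA]nti-[eE]d|[cC]ialis|[sS]ex [pP]ill|[vV]iagra
```

The resulting pattern matches "Viagra" and "viagra" but not "VIAGRA", which is the point of limiting case-insensitivity to the first letter.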

Controlling where the next global match starts

I remember JYF once asked me how to control where a match starts. Now I can answer that question. Perl provides the pos function, which gets or sets the position at which the next match begins in a /g global match. For example:

$_ = "ABCDEFG";
while (/../g)
{
    print $&;
}

It outputs the string two letters at a time: AB, CD, EF

You can use pos($_) to reposition where the next match starts, for example:

$_ = "ABCDEFG";
while (/../g)
{
    pos($_)--;    # or: pos($_)++;
    print $&;
}

Output:

pos($_)--: AB, BC, CD, DE, EF, FG.
pos($_)++: AB, DE.

See the entry for pos in the Perl documentation (perldoc -f pos) for more information.
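
Python's re module has no direct counterpart to pos, but a compiled pattern's search method accepts a start position, which allows a similar effect. The following is only a sketch under that assumption. The function name pairs and the step parameter are hypothetical and were chosen to mirror the Perl loop.

```python
import re

def pairs(text, step=0):
    """Collect two-character matches, shifting the restart position by `step`
    after each match (step=-1 mimics pos($_)--, step=1 mimics pos($_)++)."""
    if step <= -2:
        # The restart position would never advance past the previous match
        raise ValueError("step must be greater than -2")
    pattern = re.compile(r"..")
    found = []
    pos = 0
    while True:
        # search(text, pos) starts looking at index pos
        mo = pattern.search(text, pos)
        if mo is None:
            break
        found.append(mo.group())
        pos = mo.end() + step
    return found

print(pairs("ABCDEFG"))      # ['AB', 'CD', 'EF']
print(pairs("ABCDEFG", -1))  # ['AB', 'BC', 'CD', 'DE', 'EF', 'FG']
print(pairs("ABCDEFG", 1))   # ['AB', 'DE']
```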

Hashes and regular expression substitution
Chapter 3 of "Effective Perl Programming" (2nd edition) has an example (see the code below) that escapes special symbols:

my %ent = ('&' => 'amp', '<' => 'lt', '>' => 'gt');
$html =~ s/([&<>])/&$ent{$1};/g;

This example is ingenious. It makes flexible use of the hash: the text to be replaced is the key, and its replacement is the value. Whenever the pattern matches, the captured text is used as the key, and the value looked up in the hash is interpolated into the replacement. It shows how economical a high-level language can be.

But can such Perl code be ported to Python? Python also supports regular expressions and hashes (called dictionaries in Python), but it does not seem to support embedding such lookups in the replacement string (that is, inline variable interpolation in the replacement).

Check the Python documentation (run python in a shell, then import re, then help(re)):

sub(pattern, repl, string, count=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl. repl can be either a string or a callable;
    if a string, backslash escapes in it are processed. If it is
    a callable, it's passed the match object and must return
    a replacement string to be used.

It turns out that Python, like PHP, supports using a callback function during substitution. The function receives a single argument, the match object. With that, the problem is simple:

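import re

ent = {'<': "lt",
       '>': "gt",
       '&': "amp",
       }

def rep(mo):
    # mo.group(1) is the captured character; wrap the entity name
    # in & and ; so the result matches the Perl version
    return "&" + ent[mo.group(1)] + ";"

html = re.sub(r"([&<>])", rep, html)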

The key point of a Python substitution callback is that its argument is a match object. With that in mind, check the manual to see which attributes and methods the object has, and use them to write flexible and efficient replacement code.
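
As an illustration of what the match object offers inside a callback, the sketch below uses group(0) and start(), both standard methods of Python match objects. The names ENTITIES and escape_and_log are hypothetical and are not taken from the original notes.

```python
import re

# Hypothetical lookup table, mirroring the entity hash used earlier
ENTITIES = {"&": "amp", "<": "lt", ">": "gt"}

def escape_and_log(text):
    """Escape &, < and >, and record where each replacement happened."""
    log = []

    def rep(mo):
        # mo.group(0) is the matched text, mo.start() its offset in the input
        log.append((mo.start(), mo.group(0)))
        return "&" + ENTITIES[mo.group(0)] + ";"

    return re.sub(r"[&<>]", rep, text), log

escaped, log = escape_and_log("a < b && c")
print(escaped)  # a &lt; b &amp;&amp; c
print(log)      # [(2, '<'), (6, '&'), (7, '&')]
```

Because the callback is an ordinary function, it can keep state or branch on the match position, which a plain replacement string cannot do.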
