Perl Regular Expressions

Source: Internet
Author: User


By default, the m// operator tries to match the specified pattern against the text in $_. For example, the code below searches the text entered by the user for the string exit (the i modifier after the second slash makes the match case-insensitive). If exit is found in $_, m// returns true:

print "\n---------------------------(m//) Demo------------------------\n";
print "Enter:";
while (<>) {
    if (m/exit/i) {exit;}
}


The =~ operator specifies the string that the m// operator searches. Here the match is performed against the scalar $line instead of $_. This code does not change the value of $line:

print "\n---------------------------(m//) and (=~) Demo------------------------\n";
print "Enter:";
while ($line = <>) {
    if ($line =~ m/exit/i) {exit;}
}


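The !~ operator negates the return value of =~. The code below exits as soon as a line does not contain exit:

print "\n---------------------------(m//) and (=~) (!) Demo------------------------\n";
print "Enter:";
while ($line = <>) {
    # !($line =~ m/exit/i) is equivalent to $line !~ m/exit/i
    if (!($line =~ m/exit/i)) {exit;}
}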
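
To make the equivalence of the two negated forms concrete, here is a minimal non-interactive sketch. The sample lines and the variable names (@lines, @with_bind, @with_not) are assumptions chosen for illustration; they stand in for text typed by the user.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for user input.
my @lines = ("keep going", "time to Exit", "EXIT");

my (@with_bind, @with_not);
for my $line (@lines) {
    # Negated binding operator: true when the pattern does NOT match.
    push @with_bind, ($line !~ /exit/i) ? 1 : 0;
    # Explicit negation of an ordinary =~ match.
    push @with_not, (!($line =~ /exit/i)) ? 1 : 0;
}

print "@with_bind\n";
print "@with_not\n";
```

Both lists come out identical, so which form to use is a matter of readability.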


Because the m// operator is used so often, the leading m can be omitted; most programs use the following shortcut:

print "\n---------------------------(//) and (=~) Demo------------------------\n";
print "Enter:";
while ($line = <>) {
    if ($line =~ /exit/i) {exit;}
}


As with other Perl quote-like operators, if you do not like slashes you can choose your own delimiters, in which case the leading m is required:

print "\n---------------------------User-Defined Delimiters------------------------\n";
print "Enter:";
while ($line = <>) {
    if ($line =~ m{exit}i) {exit;}
}


Note: In scalar context, m// returns true or false. In list context, if the g modifier is used for a global search, m// returns a list of all matched values.

For example, create an array @a that holds the lowercase text in $_:

print "\n-----use 'g' to find list match value------\n";
$_ = "Here is the text";
@a = m/\b[^A-Z]+\b/g;
print "@a";
print "\n-----END------\n";


The program above is interpreted as follows:

\b: matches a word boundary;

[^A-Z]: matches any character except uppercase letters;

+: matches one or more such characters in a row;

g modifier: makes the search global, so that every successive match is found.

Because [^A-Z] also matches spaces, this pattern returns the run " is the text" as a single element rather than as three separate words.
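
The difference between a negated class and a positive one shows up clearly in list context. The sketch below compares \b[^A-Z]+\b (every character except uppercase letters) with \b[a-z]+\b, a variant not used in the original example that collects each lowercase word separately. The variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "Here is the text";

# [^A-Z] also matches spaces, so the whole lowercase run is one element.
my @run = $text =~ /\b[^A-Z]+\b/g;

# [a-z] matches lowercase letters only, so each word is its own element.
my @words = $text =~ /\b[a-z]+\b/g;

print scalar(@run), " element: '$run[0]'\n";
print scalar(@words), " elements: @words\n";
```

"Here" is skipped in both cases: it starts with an uppercase letter, and there is no word boundary between H and e.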


The s/// operator replaces one string with another.

Example: replacing the string young with the string old

print "\n--------------------------(s///) used-----------------------------\n";
$text_young = "Pretty young.";
print "$text_young\n";
$text_young =~ s/young/old/;
print "$text_young\n";
print "\n------------------------END (s///) used---------------------------\n";


Note: m// and s/// both search from the left, so the leftmost match is found first.
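
A short sketch of what left-to-right matching means for s///: without the g modifier only the leftmost occurrence is replaced. The sample sentence and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $first_only = "young and young at heart";

# Copy the string, then replace every occurrence in the copy.
(my $all = $first_only) =~ s/young/old/g;

# Without g, only the leftmost occurrence is replaced.
$first_only =~ s/young/old/;

# s/// returns the number of substitutions it made.
my $copy  = "young and young at heart";
my $count = ($copy =~ s/young/old/g);

print "$first_only\n";
print "$all\n";
print "$count substitutions\n";
```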

The tr/// operator translates characters one for one. Example: replacing every o with j:

print "\n--------------------------(tr///) used-----------------------------\n";
$text = "His name is Tom.";
$text =~ tr/o/j/;
print $text . "\n";
print "\n------------------------END (tr///) used---------------------------\n";
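
Beyond single characters, tr/// also accepts ranges and returns the number of characters it matched. The following is a minimal sketch with illustrative variable names; an empty replacement list leaves the string unchanged, so it can be used purely for counting.

```perl
use strict;
use warnings;

my $text = "His name is Tom.";

# Translate a single character in a copy of the string.
(my $swapped = $text) =~ tr/o/j/;

# Ranges translate position by position: a->A, b->B, and so on.
(my $upper = $text) =~ tr/a-z/A-Z/;

# With an empty replacement list the string is unchanged;
# the return value is the number of matching characters.
my $lowercase_count = ($text =~ tr/a-z//);

print "$swapped\n$upper\n$lowercase_count\n";
```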


The regular expression \b([A-Za-z]+)\b matches a word in a text string:

print "\n--------------------------(\\b([A-Za-z]+)\\b) used-----------------------------\n";
$text = "Pretty young.";
print "$text\n";
$text =~ /\b([A-Za-z]+)\b/;
print "$1\n";
print "\n------------------------END (\\b([A-Za-z]+)\\b) used---------------------------\n";


Example analysis:

The expression \b([A-Za-z]+)\b contains the grouping metacharacters ( and ), the boundary metacharacter \b, the character class [A-Za-z] (which matches all uppercase and lowercase letters), and the quantifier +, which specifies that one or more characters from the character class must be found.

Perl remembers what the group matched; the preceding code refers to it as $1 and prints the first word in the string.
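
When a pattern has more than one group, the groups are numbered from the left: $1, $2, and so on. A minimal sketch, with a two-group pattern and variable names that are assumptions for illustration; in list context the match also returns the captured values directly.

```perl
use strict;
use warnings;

my $text = "Pretty young.";

# Two groups: $1 holds the first word, $2 the second.
if ($text =~ /\b([A-Za-z]+)\b\s+\b([A-Za-z]+)\b/) {
    print "first: $1, second: $2\n";
}

# In list context the match returns the captured groups.
my ($first, $second) = $text =~ /([A-Za-z]+)\s+([A-Za-z]+)/;
print "$first / $second\n";
```

Checking that the match succeeded before reading $1 matters, because after a failed match $1 still holds the value from the last successful one.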

In a regular expression, any single character matches itself unless it is a metacharacter with a special meaning (for example, ^ and $). In the code below, ^ anchors the match at the start of the line and $ at the end, so only a line consisting of exit alone (in any letter case) ends the program:

print "\n--------------------------(\$ and ^) used-----------------------------\n";
while (<>) {
    if (m/^exit$/i) {exit;}
}
print "\n------------------------END (\$ and ^) used---------------------------\n";


Special characters in Perl:

\077-------------Octal character

\a-------------Alarm (bell)

\c[-------------Control character

\D-------------Match a non-digit character

\d-------------Match a digit character

\E-------------End \L, \U, or \Q (re-enables pattern metacharacters)

\e-------------Escape

\f-------------Form feed

\L-------------Lowercase until \E is encountered

\l-------------Lowercase next character

\n-------------Newline

\Q-------------Quote (disable) pattern metacharacters until \E is encountered

\r-------------Carriage return

\S-------------Match a non-whitespace character

\s-------------Match a whitespace character

\t-------------Tab

\U-------------Uppercase until \E is encountered

\u-------------Uppercase next character

\W-------------Match a non-word character

\w-------------Match a word character (alphanumeric characters and "_")

\x1B-------------Hexadecimal character
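
A few of these escapes are easier to understand in use. The sketch below shows the case-modifying escapes in a double-quoted string, \Q...\E around an interpolated variable, and \d in a global match. All sample strings and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $name = "perl";

my $shout = "\U$name\E rocks";   # \U uppercases until \E
my $cap   = "\u$name";           # \u uppercases only the next character

# Without \Q the period in $literal is a metacharacter and matches "x".
my $literal = "3.50";
my $plain   = ("3x50" =~ /$literal/)      ? 1 : 0;
my $quoted  = ("3x50" =~ /\Q$literal\E/)  ? 1 : 0;

# \d matches one digit at a time; g collects every digit.
my @digits = "a1b22c333" =~ /\d/g;

print "$shout, $cap, $plain, $quoted, ", scalar(@digits), "\n";
```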

Note: \w matches only a single word character, not a whole word; to match a word, use \w+ (one or more word characters):

print "\n--------------------------(\\w+) used-----------------------------\n";
$text = "Pretty young.";
print "$text\n";
$text =~ s/\w+/there/;
print "$text\n";
print "\n------------------------END (\\w+) used---------------------------\n";


Matching any character: the . character matches any character except a newline (if the s modifier is used with m// or s///, the period matches newlines as well).

The following code replaces every character in a string with *; the g modifier makes the substitution global.

print "\n--------------------------(.) used-----------------------------\n";
$text = "Pretty young.";
print "$text\n";
$text =~ s/./*/g;
print "$text\n";
print "\n------------------------END (.) used---------------------------\n";
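
The effect of the s modifier on the period is visible only when the string contains a newline. A minimal sketch, using a two-line sample string and variable names that are assumptions for illustration:

```perl
use strict;
use warnings;

my $text = "line one\nline two";

# Without s, the period skips the newline, so it survives.
(my $no_s = $text) =~ s/./*/g;

# With s, the period matches the newline too.
(my $with_s = $text) =~ s/./*/gs;

print "$no_s\n";
print "$with_s\n";
```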


Characters such as the period are called metacharacters in a regular expression (the metacharacters are: \ | ( ) [ { ^ $ * + ? .). To have one interpreted literally rather than as a metacharacter, precede it with a backslash.

^ matches the beginning of the line. The following example lets the user know that a sentence should not start with a period:

print "\n--------------------------(^) used-----------------------------\n";
$text = ".Pretty young.";
print "$text\n";
if ($text =~ m/^\./) {
    print "Shouldn't start a sentence with a period!";
}
print "\n------------------------END (^) used---------------------------\n";


Removing comments from C code: use the * quantifier together with . to match any number of arbitrary characters, so that the pattern matches everything between the delimiters /* and */:

print "\n--------------------------(* and .) used-----------------------------\n";
$text = "count++;/*increment count*/";
$text =~ s/\/\*.*\*\///g;


Or, with | as the delimiter so that the slashes need no escaping:

$text =~ s|/\*.*\*/||g;
print $text;
print "\n------------------------END (* and .) used---------------------------\n";
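
One limitation of this pattern is worth seeing in practice: .* is greedy, so on a line with two comments it matches from the first /* to the last */ and removes the code in between. The sketch below contrasts it with the non-greedy quantifier .*?, which is not part of the original example and is shown here only as one possible remedy. The sample line and variable names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical line of C code containing two comments.
my $code = "a++; /* one */ b++; /* two */";

# Greedy: .* runs to the LAST */, swallowing "b++;" as well.
(my $greedy = $code) =~ s{/\*.*\*/}{}g;

# Non-greedy: .*? stops at the FIRST */ after each /*.
(my $lazy = $code) =~ s{/\*.*?\*/}{}g;

print "greedy: '$greedy'\n";
print "lazy:   '$lazy'\n";
```

Neither form handles comments that span several lines unless the s modifier is added, and neither knows about /* inside string literals, so this remains a sketch rather than a full comment stripper.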


You can combine several characters into a character class, which matches any one of those characters. A character class is enclosed in square brackets: [character class]. You can also use the - character to specify a range of characters.

print "\n--------------------------([character class]) used-----------------------------\n";
$text = "count++;/*increment count*/";
if ($text =~ /[aeiou]/) {
    print "Yep, we got vowels.\n";
}
print "\n------------------------END ([character class]) used---------------------------\n";


If you use ^ as the first character in a character class, the class matches any character that is not listed in it. In the following example, only characters that are neither letters nor whitespace are matched:

print "\n--------------------------([^A-Za-z\\s]+) used-----------------------------\n";
$text = "count200 Increment count";
$text =~ s/[^A-Za-z\s]+/521/;
print $text;
print "\n------------------------END ([^A-Za-z\\s]+) used---------------------------\n";


Extract the lowercase text from $_ and store it in the new array @a:

print "\n--------------------------(\\b[^A-Z]+\\b) used-----------------------------\n";
$_ = "Here is the text";
@a = m/\b[^A-Z]+\b/g;
print "@a";
print "\n--------------------------END (\\b[^A-Z]+\\b) used-----------------------------\n";


Note: \b matches word boundaries.

A regular expression matches a specific character or sequence of characters when you write it in the pattern as a literal or list it in a character class.

Alternation: you can specify a series of alternatives in a pattern, separated by |. For example, you can check whether the user input is exit, quit, or stop:

print "\n--------------------------(|) used-----------------------------\n";
print "Enter exit|quit|stop:";
while (<>) {
    if (m/exit|quit|stop/) {exit;}
}
print "\n--------------------------END (|) used--------------------------\n";

# Anchored version: matches only when the whole line is exit, quit, or stop.
print "\n--------------------------(|) used-----------------------------\n";
print "Enter exit|quit|stop:";
while (<>) {
    if (m/^(exit|quit|stop)$/) {exit;}
}
print "\n--------------------------END (|) used--------------------------\n";
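
The practical difference between the unanchored and anchored alternations can be checked without typing input. A minimal sketch, where the sample inputs and the names @inputs, @loose, and @anchored are assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical inputs standing in for lines typed by the user.
my @inputs = ("exit", "quit", "stop", "nonstop service", "Exit");

my (@loose, @anchored);
for my $input (@inputs) {
    # Unanchored: any line that merely contains one of the words.
    push @loose, $input if $input =~ /exit|quit|stop/;
    # Anchored: the whole line must be exactly one of the words.
    push @anchored, $input if $input =~ /^(exit|quit|stop)$/;
}

print "loose:    ", join(", ", @loose), "\n";
print "anchored: ", join(", ", @anchored), "\n";
```

"nonstop service" passes the unanchored test because it contains stop, and "Exit" fails both because neither pattern uses the i modifier.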
