I. Introduction
II. Matching operators
III. Special characters in patterns
1. The + character
2. The [] and [^] characters
3. The * and ? characters
4. Escape characters
5. Matching any letter or digit
6. Anchors
7. Variable substitution in patterns
8. Character range escapes
9. Matching any character
10. Matching a specified number of characters
11. Specifying alternatives
12. Reusing parts of a pattern
13. Precedence of escapes and special characters
14. Specifying the pattern delimiter
15. Pattern sequence variables
IV. Pattern-matching options
1. Matching all possible patterns (g option)
2. Ignoring case (i option)
3. Treating the string as multiple lines (m option)
4. Performing variable substitution only once (o option)
5. Treating the string as a single line (s option)
6. Ignoring whitespace in the pattern (x option)
V. Substitution operator
VI. Translation operator
VII. Extended pattern matching
1. Not storing the contents matched in parentheses
2. Inline options
3. Positive and negative lookahead
4. Pattern comments
I. Introduction
A pattern is a particular sequence of characters to be searched for in a string. It is enclosed in forward slashes: /def/ is the pattern def. Patterns are often combined with functions such as split, which splits a string into words wherever the pattern matches: @array = split(/ /, $line);
II. Matching operators =~ and !~
=~ tests whether a match succeeds: $result = $var =~ /abc/; If the pattern is found in the string, it returns a nonzero value (true); if it is not found, it returns a false value (the empty string, which counts as 0 in numeric context). !~ is the opposite.
These two operators are suitable for conditional control, such as:
if ($question =~ /please/) {
    print("Thank you for being polite!\n");
}
else {
    print("That is not very polite!\n");
}
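As a minimal sketch of how the two operators relate (the sample string and variable names are illustrative, not from the original), the following checks the same pattern with both:

```perl
use strict;
use warnings;

# Hypothetical sample input for illustration.
my $question = "Could you pass the salt?";

# =~ is true when the pattern is found somewhere in the string.
my $found = ($question =~ /please/) ? 1 : 0;

# !~ is true exactly when the pattern is NOT found.
my $missing = ($question !~ /please/) ? 1 : 0;

print "found=$found missing=$missing\n";   # prints: found=0 missing=1
```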
III. Special characters in patterns
Perl supports a number of characters that have a special meaning inside a pattern.
1. The + character
+ means one or more of the preceding character, e.g. /de+f/ matches def, deef, deeeeef, etc. It matches as many characters as possible: /ab+/ applied to the string abbc matches abb, not ab.
When the words in a line are separated by more than one space, the line can be split as follows:
@array = split(/ +/, $line);
Note: split starts a new word every time it encounters the split pattern, so if $line begins with a space, the first element of @array is an empty element. It can still tell whether there are any real words: if $line contains only spaces, @array is an empty array. Also note that in the example above a tab character is treated as part of a word, not as a separator, so adjust the pattern if tabs should separate words too.
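The edge cases in this note can be checked with a short sketch. The sample strings are made up for illustration, and the /\s+/ variant is one possible way to treat tabs as separators, not something the original prescribes:

```perl
use strict;
use warnings;

# Leading spaces produce an empty first field: ("", "two", "words").
my @lead = split(/ +/, "  two words");

# A string containing only spaces yields an empty list.
my @blank = split(/ +/, "    ");

# A tab is not a space, so "a\tb" stays one word under / +/ ...
my @tab = split(/ +/, "a\tb c");

# ... whereas /\s+/ treats any run of whitespace as a separator.
my @ws = split(/\s+/, "a\tb c");

print scalar(@lead), " ", scalar(@blank), " ", scalar(@tab), " ", scalar(@ws), "\n";
# prints: 3 0 2 3
```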
2. The [] and [^] characters
[] matches any one of a set of characters, e.g. /a[0123456789]c/ matches a, followed by a digit, followed by c. Combined with +: /d[eE]+f/ matches def, dEf, deef, dEEf, deeeeeeeef, etc. ^ at the start of the set means every character except those listed, e.g. /d[^deE]f/ matches d, followed by one character that is not d, e, or E, followed by f.
3. The * and ? characters
These are similar to +, except that * matches zero, one, or more of the preceding character, and ? matches zero or one. For example, /de*f/ matches df, def, deeeef; /de?f/ matches df or def.
4. Escape characters
To include in a pattern a character that is normally special, put a backslash (\) before it. For example, in /\*+/, \* denotes the literal character *, not the "zero or more" meaning described above. A literal backslash is written /\\/. In Perl 5, \Q and \E can be used to escape a whole run of characters.
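A small sketch of \Q and \E in use, assuming a hypothetical literal string that happens to contain pattern metacharacters:

```perl
use strict;
use warnings;

# Hypothetical literal text containing the metacharacters + and *.
my $literal = "a+b*c";
my $text    = "solve a+b*c now";

# Unescaped, + and * act as quantifiers, so the literal text is not found.
my $raw = ($text =~ /$literal/) ? 1 : 0;

# \Q ... \E quotes every metacharacter in between.
my $quoted = ($text =~ /\Q$literal\E/) ? 1 : 0;

print "raw=$raw quoted=$quoted\n";   # prints: raw=0 quoted=1
```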
5. Matching any letter or digit
The pattern /a[0123456789]c/ mentioned above matches the letter a, followed by any digit, followed by c. It can also be written /a[0-9]c/. Similarly, [a-z] denotes any lowercase letter and [A-Z] any uppercase letter. Any letter of either case or any digit is written /[0-9a-zA-Z]/.
6. Anchors
Anchor      Description
^ or \A     Match only at the start of the string
$ or \Z     Match only at the end of the string
\b          Match at a word boundary
\B          Match inside a word (not at a boundary)
Example 1: /^def/ matches only strings that begin with def, /def$/ matches only strings that end with def, and the combination /^def$/ matches only the string def (optionally followed by a trailing newline). \A and \Z differ from ^ and $ when matching multiple lines.
Example 2: Checking the type of a variable name:
if ($varname =~ /^\$[a-zA-Z][_0-9a-zA-Z]*$/) {
    print("$varname is a legal scalar variable\n");
} elsif ($varname =~ /^@[a-zA-Z][_0-9a-zA-Z]*$/) {
    print("$varname is a legal array variable\n");
} elsif ($varname =~ /^[a-zA-Z][_0-9a-zA-Z]*$/) {
    print("$varname is a legal file variable\n");
} else {
    print("I don't understand what $varname is.\n");
}
Example 3: \b matches at a word boundary: /\bdef/ matches def and defghi, i.e. words that start with def, but not abcdef. /def\b/ matches def and abcdef, i.e. words that end with def, but not defghi. /\bdef\b/ matches only the word def. Note: /\bdef/ matches $defghi, because $ is not considered part of a word.
Example 4: \B matches inside a word: /\Bdef/ matches abcdef but not def; /def\B/ matches defghi; /\Bdef\B/ matches cdefg and abcdefghi, but not def, defghi, or abcdef.
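The boundary cases from Examples 3 and 4 can be verified with a short sketch. The helper name matching is hypothetical and introduced here only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the words that match a pattern, comma-joined.
sub matching {
    my ($re, @words) = @_;
    return join(",", grep { /$re/ } @words);
}

my @words = ("def", "defghi", "abcdef", "abcdefghi", '$defghi');

print matching(qr/\bdef/,   @words), "\n";   # prints: def,defghi,$defghi
print matching(qr/def\b/,   @words), "\n";   # prints: def,abcdef
print matching(qr/\bdef\b/, @words), "\n";   # prints: def
print matching(qr/\Bdef\B/, @words), "\n";   # prints: abcdefghi
```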
7. Variable substitution in patterns
A variable can be interpolated into a pattern. For example, to divide a sentence into words separated by tabs or spaces:
$pattern = "[\\t ]+";
@words = split(/$pattern/, $line);
8. Character range escapes
Escape   Description                 Range
\d       Any digit                   [0-9]
\D       Any character but a digit   [^0-9]
\w       Any word character          [_0-9a-zA-Z]
\W       Any non-word character      [^_0-9a-zA-Z]
\s       Whitespace                  [ \r\t\n\f]
\S       Non-whitespace              [^ \r\t\n\f]
Example: /[\da-z]/ matches any digit or lowercase letter.
9. Matching any character
The character . matches any character except newline. It is usually used together with *.
10. Matching a specified number of characters
The character pair {} specifies how many times the preceding character must occur. For example: /de{1,3}f/ matches def, deef, and deeef; /de{3}f/ matches deeef; /de{3,}f/ matches at least three e's between d and f; /de{0,3}f/ matches at most three e's between d and f.
11. Specifying alternatives
The character | separates two or more alternatives, any of which may match. For example, /def|ghi/ matches def or ghi.
Example: Checking that a number is legal
if ($number =~ /^-?\d+$|^-?0[xX][\da-fA-F]+$/) {
    print("$number is a legal integer.\n");
} else {
    print("$number is not a legal integer.\n");
}
Here ^-?\d+$ matches decimal integers and ^-?0[xX][\da-fA-F]+$ matches hexadecimal integers.
12. Reusing parts of a pattern
When the same part of a pattern appears several times, enclose it in parentheses and refer back to whatever it matched with \1, \2, and so on:
/\d{2}([\W])\d{2}\1\d{2}/ matches:
12-05-92
26.11.87
07 04 92, etc.
Note: /\d{2}([\W])\d{2}\1\d{2}/ differs from /(\d{2})([\W])\1\2\1/. The latter matches only strings of the form 17-17-17 and does not match 17-05-91.
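A sketch comparing the two patterns on a few sample dates. The ^ and $ anchors and the extra test strings are additions made here for illustration; they are not part of the original patterns:

```perl
use strict;
use warnings;

# Separator remembered as \1 and required again later.
my $same_sep = qr/^\d{2}([\W])\d{2}\1\d{2}$/;
# Here \1 is the first two digits and \2 is the separator.
my $all_same = qr/^(\d{2})([\W])\1\2\1$/;

for my $date ("12-05-92", "26.11.87", "12-05.92", "17-17-17") {
    my $sep_ok = ($date =~ $same_sep) ? "yes" : "no";
    my $all_ok = ($date =~ $all_same) ? "yes" : "no";
    print "$date: same separator=$sep_ok, all parts equal=$all_ok\n";
}
# prints:
# 12-05-92: same separator=yes, all parts equal=no
# 26.11.87: same separator=yes, all parts equal=no
# 12-05.92: same separator=no, all parts equal=no
# 17-17-17: same separator=yes, all parts equal=yes
```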
13. Precedence of escapes and special characters
Like operators, escapes and special characters have an order of precedence, listed here from highest to lowest:
Special character    Description
()                   Pattern memory
+ * ? {}             Number of occurrences
^ $ \b \B            Anchors
|                    Alternatives
14. Specifying the pattern delimiter
By default the pattern delimiter is the forward slash /, but a different one can be chosen by prefixing the pattern with the letter m, for example:
m!/u/jqpublic/perl/prog1! is equivalent to /\/u\/jqpublic\/perl\/prog1/
Note: When the single quote ' is used as the delimiter, no variable substitution is performed. When a special character is used as the delimiter, its escape or special function cannot be used inside the pattern.
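A sketch of alternative delimiters, including the single quote ' which suppresses variable interpolation. The variable $name is hypothetical and used only for illustration:

```perl
use strict;
use warnings;

my $path = "/u/jqpublic/perl/prog1";

# The same pattern with the default and with an alternative delimiter.
my $slash = ($path =~ /\/u\/jqpublic\/perl\/prog1/) ? 1 : 0;
my $bang  = ($path =~ m!/u/jqpublic/perl/prog1!)    ? 1 : 0;

# Hypothetical variable to interpolate into the pattern.
my $name = "perl";

# With ! as the delimiter, $name is interpolated and the pattern is /perl/.
my $interpolated = ($path =~ m!$name!) ? 1 : 0;

# With ' as the delimiter, the pattern is the literal text $name, where $
# is the end-of-string anchor, so it cannot match here.
my $not_interpolated = ($path =~ m'$name') ? 1 : 0;

print "$slash $bang $interpolated $not_interpolated\n";   # prints: 1 1 1 0
```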
15. Pattern sequence variables
After a pattern match, the text matched by each parenthesized (reusable) part is available in the variables $1, $2, and so on ($n), and the whole match is available in $&.
$string = "This string contains the number 25.11.";
$string =~ /-?(\d+)\.?(\d+)/;   # match result is 25.11
$integerpart = $1;              # now $integerpart = 25
$decimalpart = $2;              # now $decimalpart = 11
$totalpart = $&;                # now $totalpart = 25.11
IV. Pattern-matching options
Option   Description
g        Match all possible patterns
i        Ignore case
m        Treat the string as multiple lines
o        Perform variable substitution only once
s        Treat the string as a single line
x        Ignore whitespace in the pattern
1. Matching all possible patterns (g option)
@matches = "balata" =~ /.a/g;   # now @matches = ("ba", "la", "ta")
Matching in a loop:
while ("balata" =~ /.a/g) {
    $match = $&;
    print("$match\n");
}
The output is:
ba
la
ta
When the g option is used, the pos function can read or set the offset at which the next match starts:
$offset = pos($string);
pos($string) = $newoffset;
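A minimal sketch of reading and setting pos, reusing the balata string from above (the variable names @offsets and $next are illustrative):

```perl
use strict;
use warnings;

my $string = "balata";
my @offsets;

# In scalar context, each /g match resumes where the previous one ended.
while ($string =~ /.a/g) {
    push @offsets, pos($string);   # offset just past the current match
}
print "@offsets\n";   # prints: 2 4 6

# Assigning to pos() moves the starting point of the next /g match.
pos($string) = 2;
$string =~ /.a/g;
my $next = $&;
print "$next\n";      # prints: la
```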
2. Ignoring case (i option)
/de/i matches de, dE, De, and DE.
3. Treating the string as multiple lines (m option)
With this option, ^ matches at the start of the string or at the start of any new line, and $ matches at the end of any line.
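A sketch of the effect of the m option on ^, with \A shown for contrast. The two-line sample text is made up for illustration:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, ^ matches only at the very start of the string.
my @plain = $text =~ /^(\w+)/g;

# With /m, ^ also matches just after each embedded newline.
my @multi = $text =~ /^(\w+)/mg;

# \A matches only at the start of the string, even under /m.
my @start = $text =~ /\A(\w+)/mg;

print "@plain | @multi | @start\n";   # prints: first | first second | first
```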
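4. Performing variable substitution only once (o option)
$var = 1;
$line = <STDIN>;
while ($var < 10) {
    $result = $line =~ /$var/o;
    $line = <STDIN>;
    $var++;
}
The pattern /1/ is matched every time, because with the o option $var is substituted into the pattern only once.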
5. Treating the string as a single line (s option)
/a.*bc/s matches the string axxxxx\nxxxxbc, but /a.*bc/ does not match it.
6. Ignoring whitespace in the pattern (x option)
/\d{2} ([\W]) \d{2} \1 \d{2}/x is equivalent to /\d{2}([\W])\d{2}\1\d{2}/.
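A sketch of the s and x options together. The multi-line layout and the # comments inside the pattern are an illustration of what the x option permits, not taken from the original:

```perl
use strict;
use warnings;

my $text = "axxxxx\nxxxxbc";

my $without_s = ($text =~ /a.*bc/)  ? 1 : 0;   # . stops at the newline
my $with_s    = ($text =~ /a.*bc/s) ? 1 : 0;   # . also matches the newline

# Under /x, whitespace and # comments inside the pattern are ignored.
my $date_ok = ("12-05-92" =~ /
    \d{2}    # two digits
    ([\W])   # a separator, remembered as group 1
    \d{2}    # two digits
    \1       # the same separator again
    \d{2}    # two digits
/x) ? 1 : 0;

print "$without_s $with_s $date_ok\n";   # prints: 0 1 1
```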
V. Substitution operator
The syntax is s/pattern/replacement/. It replaces the part of the string that matches pattern with replacement. For example:
$string = "abc123def";
$string =~ s/123/456/;   # now $string = "abc456def"
The pattern sequence variables $n can be used in the replacement part, e.g. s/(\d+)/[$1]/, but pattern special characters such as {}, *, and + have no special meaning there: s/abc/[def]/ replaces abc with the literal text [def].
The options for the substitution operator are as follows:
Option   Description
g        Replace all matches of the pattern
i        Ignore case in the pattern
e        Treat the replacement string as an expression
m        Treat the string to be matched as multiple lines
o        Perform variable substitution only once
s        Treat the string to be matched as a single line
x        Ignore whitespace in the pattern
Note: The e option treats the replacement part as an expression and evaluates it before substituting the result, for example:
$string = "0abc1";
$string =~ s/[a-zA-Z]+/$& x 2/e;   # now $string = "0abcabc1"
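To make the effect of the e option concrete, this sketch runs the same substitution with and without it. The sample string and the doubling expression are illustrative assumptions:

```perl
use strict;
use warnings;

my $string = "width=10 height=20";

# With /e the replacement is evaluated as Perl code: each number is doubled.
(my $doubled = $string) =~ s/(\d+)/$1 * 2/ge;

# Without /e the same replacement text is inserted literally.
(my $literal = $string) =~ s/(\d+)/$1 * 2/g;

print "$doubled\n";   # prints: width=20 height=40
print "$literal\n";   # prints: width=10 * 2 height=20 * 2
```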
VI. Translation operator
This is another way of replacing text. The syntax is tr/string1/string2/. Again string2 is the replacement part, but here the first character of string1 is replaced with the first character of string2, the second character of string1 with the second character of string2, and so on. For example:
$string = "abcdefghicba";
$string =~ tr/abc/def/;   # now $string = "defdefghifed"
When string1 is longer than string2, its extra characters are replaced with the last character of string2. When the same character occurs more than once in string1, the first replacement character is used.
The options for the translation operator are as follows:
Option   Description
c        Translate all characters not specified
d        Delete all specified characters
s        Squeeze runs of identical output characters into one
For example, $string =~ tr/0-9/ /c; replaces every non-digit character with a space (tr takes literal characters and ranges, not pattern escapes such as \d). $string =~ tr/\t //d; deletes tabs and spaces. $string =~ tr/0-9/ /cs; replaces each run of other characters between digits with a single space.
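The three options can be compared side by side in a short sketch. The input string is made up for illustration, and the digit range is spelled 0-9 because tr does not interpret pattern escapes such as \d:

```perl
use strict;
use warnings;

my $input = "a1\t b22  c333";

# c: complement the search list, so every non-digit becomes a space.
(my $comp = $input) =~ tr/0-9/ /c;

# d: delete the listed characters (tab and space) outright.
(my $del = $input) =~ tr/\t //d;

# cs: as c, but squeeze each run of replaced characters into one space.
(my $squeezed = $input) =~ tr/0-9/ /cs;

print "[$comp]\n";       # prints: [ 1   22   333]
print "[$del]\n";        # prints: [a1b22c333]
print "[$squeezed]\n";   # prints: [ 1 22 333]
```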
VII. Extended pattern matching
Perl 5 supports some pattern-matching capabilities that Perl 4 and standard UNIX pattern matching do not have. The syntax is (?cpattern), where c is a single character and pattern is the pattern or subpattern it acts on.
1. Not storing the contents matched in parentheses
In a Perl pattern, the text matched by a subpattern in parentheses is stored in memory. The (?:...) form cancels this storage for the parentheses it introduces. For example, in /(?:a|b|c)(d|e)f\1/, \1 refers to the matched d or e, not to a, b, or c.
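A sketch contrasting a non-capturing group with an ordinary one. The test string adfd is chosen here for illustration:

```perl
use strict;
use warnings;

my $text = "adfd";

# (?:a|b|c) groups without storing, so (d|e) is group 1 and \1 means "d".
my $first_group = ($text =~ /(?:a|b|c)(d|e)f\1/) ? $1 : "none";

# With ordinary parentheses, \1 would be "a", so "adfd" does not match.
my $capturing = ($text =~ /(a|b|c)(d|e)f\1/) ? 1 : 0;

print "$first_group $capturing\n";   # prints: d 0
```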
2. Inline options
Options are normally placed after the pattern. Four of them, i, m, s, and x, can also be used inline. The syntax is /(?option)pattern/, which is equivalent to /pattern/option.
3. Positive and negative lookahead
The positive lookahead syntax is /pattern(?=string)/, which matches pattern only when it is followed by string. Conversely, (?!string) matches pattern only when it is not followed by string. For example:
$string = "25abc8";
$string =~ /abc(?=[0-9])/;
$matched = $&;   # $& is the matched text: here abc, not abc8
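A sketch showing positive and negative lookahead on two inputs. The second test string and the variable names are illustrative additions:

```perl
use strict;
use warnings;

my @report;
for my $string ("25abc8", "25abcx") {
    # (?=[0-9]) requires a digit after abc; (?![0-9]) forbids one.
    # Neither lookahead becomes part of the matched text in $&.
    my $ahead     = ($string =~ /abc(?=[0-9])/) ? $& : "-";
    my $not_ahead = ($string =~ /abc(?![0-9])/) ? $& : "-";
    push @report, "$string: positive=$ahead negative=$not_ahead";
}
print "$_\n" for @report;
# prints:
# 25abc8: positive=abc negative=-
# 25abcx: positive=- negative=abc
```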
4. Pattern comments
In Perl 5, you can use (?#...) to add a comment inside a pattern, for example:
if ($string =~ /(?i)[a-z]{2,3}(?# match two or three alphabetic characters)/) {
    ...
}