The first metacharacter is the dot (.). In a regular expression, a dot matches any single character except a newline.

The simplest quantifier is the + metacharacter. + matches the preceding character at least once, and as many more times as it can. The * metacharacter matches the preceding character zero or more times. The ? metacharacter matches the preceding character zero times or once (but never more than once). A combination that appears often in regular expressions is .*, which can be used to match anything (any run of characters other than newlines).

For finer control over the number of repetitions, use pat{n,m}. Here, n is the minimum number of matches, m is the maximum number of matches, and pat is the character or group of characters you are trying to quantify.

/x{5,10}/ x occurs at least 5 times, but no more than 10 times.
/x{9,}/ x occurs at least 9 times, possibly more.
/x{0,4}/ x occurs up to 4 times, or possibly not at all.
/x{8}/ x must occur exactly 8 times, no more and no less.
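The quantifiers above can be tried out with a short script. This is only a sketch: the helper `whole_match` and the test strings are made up for illustration, and the helper wraps each pattern in the anchors ^ and $ so that the whole string has to be accounted for (otherwise an upper limit such as {5,10} would not be visible, because the pattern could match just part of a longer run).

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only if $pattern accounts for the whole
# string, 0 otherwise.
sub whole_match {
    my ($string, $pattern) = @_;
    return $string =~ /^(?:$pattern)$/ ? 1 : 0;
}

# Each entry is [ test string, pattern ]; all values are illustrative.
my @checks = (
    [ 'abc',          'a.c'     ],   # . stands for any one character
    [ 'abbb',         'ab+'     ],   # + needs at least one b
    [ 'a',            'ab+'     ],   # no b at all, so + fails
    [ 'a',            'ab*'     ],   # * accepts zero b's
    [ 'abb',          'ab?'     ],   # ? allows at most one b
    [ 'xxxxxxx',      'x{5,10}' ],   # 7 lies between 5 and 10
    [ 'xxxxxxxxxxxx', 'x{5,10}' ],   # 12 is more than 10
    [ 'xxxxxxxx',     'x{8}'    ],   # exactly 8
);

for my $check (@checks) {
    my ($string, $pattern) = @$check;
    my $verdict = whole_match($string, $pattern) ? 'match' : 'no match';
    printf "%-13s against %-8s : %s\n", $string, $pattern, $verdict;
}
```

Without the anchors, /x{5,10}/ would also succeed on a string of twelve x characters, because ten of them are enough to satisfy the pattern.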
-
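Perl's regular expressions provide a tool for matching one character out of a set of characters: the character class, written as a list of characters inside square brackets.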
-
[abcde] matches any one of a, b, c, d, or e.
[a-e] is the same as above: it matches any one of a, b, c, d, or e.
[Gg] matches an uppercase or lowercase G.
[0-9] matches a digit.
[0-9]+ matches one or more digits in a row.
[A-Za-z]{5} matches any group of five letters.
[*!@#$%&()] matches any one of these symbols.
-
The last example is interesting because, inside a character class, most metacharacters lose their special meaning and behave like any other ordinary character. So the * there really stands for a literal * character.
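A few of these character classes can be exercised as follows. This is a minimal sketch with made-up sample strings; the parentheses are used only to capture what the class matched (the captured text ends up in $1), and the class [*.] is an illustrative one that is not in the list above.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only to exercise the classes.
my $either_case = ('Gum' =~ /[Gg]um/ && 'gum' =~ /[Gg]um/) ? 1 : 0;

# Parentheses capture what the class matched; $1 holds the captured text.
my $digits  = 'room 101'  =~ /([0-9]+)/      ? $1 : '';
my $letters = 'say hello' =~ /([A-Za-z]{5})/ ? $1 : '';

# Inside [...] the characters * and . are ordinary characters.
my $star_found = '2*3' =~ /[*.]/ ? 1 : 0;   # a literal * is present
my $dot_found  = 'a.b' =~ /[*.]/ ? 1 : 0;   # a literal . is present
my $neither    = 'abc' =~ /[*.]/ ? 1 : 0;   # neither * nor . is present

print "either case: $either_case\n";
print "digits:      $digits\n";
print "letters:     $letters\n";
print "star/dot:    $star_found $dot_found $neither\n";
```

Note that [*.] does not match 'abc': outside a class, a dot would match any character, but inside the brackets it only matches a literal dot.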
-
If a caret (^) appears as the first character in a character class, the character class is negated. That is, the class then matches any single character that is not listed in it. The following is an example:
-
/[^A-Z]/ matches any character other than an uppercase letter A through Z.
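A short sketch of a negated class in use, with made-up strings. The last two lines go slightly beyond the text: they assume the standard Perl behavior that a caret which is not the first character in the class is just a literal caret.

```perl
use strict;
use warnings;

# The first character that is NOT an uppercase letter is captured into $1.
my $first_non_upper = 'ABCd' =~ /([^A-Z])/ ? $1 : '';

# A string made only of uppercase letters has nothing for [^A-Z] to match.
my $all_upper_hit = 'ABCD' =~ /[^A-Z]/ ? 1 : 0;

# A caret that is not first in the class is an ordinary character.
my $literal_caret = 'x^y' =~ /[a^]/ ? 1 : 0;
my $no_caret      = 'xy'  =~ /[a^]/ ? 1 : 0;

print "first non-uppercase character: $first_non_upper\n";
print "match in 'ABCD': $all_upper_hit\n";
print "literal caret: $literal_caret, without caret: $no_caret\n";
```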
Perl contains shortcuts for some common character classes. They are written as a backslash followed by a letter, as shown in Table 6-2.

Pattern   Matches
\w        A word character; same as [a-zA-Z0-9_]
\W        A non-word character (the opposite of \w)
\d        A digit; same as [0-9]
\D        A non-digit
\s        A white space character; same as [ \t\f\r\n]
\S        A non-white space character

The following are some examples:
/\d{5}/ matches five digits
/\s\w+\s/ matches a group of word characters surrounded by white space
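The shortcuts from Table 6-2 can be tried on a few strings. This is only a sketch; the sample strings are invented, and parentheses are added around each shortcut so that the matched text can be inspected in $1.

```perl
use strict;
use warnings;

# Each line captures the first stretch of text the shortcut matches.
my $five_digits = 'zip 90210 ok'  =~ /(\d{5})/   ? $1 : '';   # five digits
my $middle_word = 'one two three' =~ /\s(\w+)\s/ ? $1 : '';   # word between white space
my $non_digits  = 'abc123'        =~ /(\D+)/     ? $1 : '';   # leading non-digits
my $non_word    = 'a-b'           =~ /(\W)/      ? $1 : '';   # first non-word character
my $non_space   = '  padded  '    =~ /(\S+)/     ? $1 : '';   # first run without white space

print "five digits: $five_digits\n";
print "middle word: $middle_word\n";
print "non-digits:  $non_digits\n";
print "non-word:    $non_word\n";
print "non-space:   $non_space\n";
```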
-
In list context, the match operator returns a list of the portions of the string that matched the parenthesized parts of the expression. Each pair of parentheses contributes one value to the returned list. If the pattern contains no parentheses, a successful match returns 1. See the following example:
-
$_ = "apple is red";
($fruit, $color) = /(.*)\sis\s(.*)/;        # note that = is used here, not =~; the match is applied to $_
($fruit, $color) = $_ =~ m/(.*)\sis\s(.*)/; # the same match written with an explicit =~ binding to $_
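The three outcomes described above (captured groups, no parentheses, and a failed match) can be compared side by side. This is a minimal sketch under `use strict`, so it uses a lexical variable `$sentence` (a name chosen for this illustration) instead of $_.

```perl
use strict;
use warnings;

my $sentence = "apple is red";

# Two pairs of parentheses, so the list has two values.
my ($fruit, $color) = $sentence =~ /(.*)\sis\s(.*)/;

# No parentheses: a successful match in list context yields the list (1).
my @no_groups = $sentence =~ /is/;

# A pattern that does not match yields an empty list.
my @failed = $sentence =~ /(.*)\swas\s(.*)/;

print "fruit=$fruit color=$color\n";
print "without parentheses: @no_groups\n";
print "failed match returned ", scalar(@failed), " values\n";
```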
-
-
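The last two metacharacters (you may have thought the list would never end) are the anchors, which match a position rather than a character: the caret (^) matches at the start of a line, and the dollar sign ($) matches at the end of a line.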
-
/^help/ matches only lines that start with help.
/^frankly.*darn$/ matches only lines that start with frankly and end with darn. Everything in between is matched as well.
/^hysteria$/ matches only lines that contain exactly the word hysteria.
/^$/ matches the start of a line followed immediately by the end of the line. It matches only empty lines.
/^/ matches the start of a line, so it matches every line. /$/ has the same effect.
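The anchor patterns above can be checked against a handful of lines. The sketch below uses invented sample lines and sorts them into arrays according to which pattern matches; the array names are arbitrary.

```perl
use strict;
use warnings;

# Hypothetical input lines, including one empty line.
my @lines = (
    'help me',
    'no help here',
    'frankly, my dear, I do not give a darn',
    'hysteria',
    'mass hysteria',
    '',
);

my (@help, @frankly, @hysteria, @empty, @any);
for my $line (@lines) {
    push @help,     $line if $line =~ /^help/;            # starts with help
    push @frankly,  $line if $line =~ /^frankly.*darn$/;  # starts and ends as required
    push @hysteria, $line if $line =~ /^hysteria$/;       # exactly the word hysteria
    push @empty,    $line if $line =~ /^$/;               # empty line
    push @any,      $line if $line =~ /^/;                # every line has a start
}

print scalar(@help), " line(s) start with help\n";
print scalar(@frankly), " line(s) start with frankly and end with darn\n";
print scalar(@hysteria), " line(s) are exactly hysteria\n";
print scalar(@empty), " empty line(s)\n";
print scalar(@any), " of ", scalar(@lines), " lines match /^/\n";
```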
The substitution operator s///

The substitution operator replaces the text matched by a pattern with other text. For example, the following changes 0.wav to 1.wav in $ch2name:

$ch2name =~ s/0\.wav/1\.wav/;

Modifiers and multiple matches

The substitution operator (s///) and the match operator (m//) can ignore the difference between uppercase and lowercase letters when matching a regular expression, if the operator is followed by the letter i:

/luckydog/i

Another modifier for matches and substitutions is the global-match modifier g. With it, the match (or substitution) is not performed just once; it is repeated through the entire string, and each new match (or substitution) starts immediately after the previous one ended.

In list context, the global-match modifier makes the match return a list of every portion of the string matched by the parenthesized parts of the regular expression:

$_ = "one fish, two frog, three fred, red foul";
@F = m/\W(f\w\w\w)/g;

This pattern first matches a non-word character, then the letter f, and then three word characters. The letter f and the three word characters are grouped by the parentheses. After the expression is evaluated, the variable @F contains four elements: fish, frog, fred, and foul.

A common operation in Perl is to search an array for elements that match a certain pattern. Perl has a special function for this operation, called grep. The syntax of the grep function is as follows:

grep expression, list
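As a small illustration of that syntax, here is a sketch that searches an array with grep. The array `@dogs` and its contents are made up for this example; the block form `grep { ... } list` is an alternative spelling that Perl also accepts.

```perl
use strict;
use warnings;

# Hypothetical list to search.
my @dogs = qw(greyhound bloodhound terrier mutt Foxhound);

# Expression form, as in the syntax line: grep expression, list
my @hounds = grep /hound/, @dogs;

# Block form, here combined with the i modifier to ignore case.
my @any_case = grep { /^fox/i } @dogs;

# In scalar context grep returns the number of matching elements.
my $hound_count = grep /hound/, @dogs;

print "hounds: @hounds\n";
print "fox, any case: @any_case\n";
print "count: $hound_count\n";
```

For each element of the list, grep sets $_ to that element and evaluates the expression; the elements for which it is true are returned.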