Metacharacters |
Description |
\ |
Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline. The sequence "\\" matches "\", and "\(" matches "(". This is equivalent to the concept of an "escape character" in many programming languages. |
^ |
Matches the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r". |
$ |
Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r". |
* |
Matches the preceding subexpression zero or more times. For example, "zo*" can match "z", "zo", and "zoo". * is equivalent to {0,}. |
+ |
Matches the preceding subexpression one or more times. For example, "zo+" can match "zo" and "zoo", but cannot match "z". + is equivalent to {1,}. |
? |
Matches the preceding subexpression zero or one time. For example, "do(es)?" can match "do" or "does". ? is equivalent to {0,1}. |
{n} |
n is a non-negative integer. Matches exactly n times. For example, "o{2}" cannot match the "o" in "Bob", but can match the two o's in "food". |
{n,} |
n is a non-negative integer. Matches at least n times. For example, "o{2,}" cannot match the "o" in "Bob", but can match all the o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*". |
{n,m} |
m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood" as one match and the last three o's as another. "o{0,1}" is equivalent to "o?". Note that there must be no space between the comma and the two numbers. |
? |
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching mode is non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much as possible. For example, for the string "oooo", "o+" matches as many "o" as possible and yields ['oooo'], while "o+?" matches as few as possible and yields ['o', 'o', 'o', 'o']. |
. |
Matches any single character except "\n". To match any character including "\n", use a pattern like "[\s\S]". |
(pattern) |
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection: use the SubMatches collection in VBScript, and the $0…$9 properties in JScript. To match literal parentheses, use "\(" or "\)". |
(?:pattern) |
Non-capturing match: matches pattern but does not capture the result, so it is not stored for later use. This is useful when the alternation character "|" is used to combine parts of a pattern. For example, "industr(?:y|ies)" is a more concise expression than "industry|industries". |
(?=pattern) |
Non-capturing positive lookahead: matches the search string at any position where a string matching pattern begins. The match is not captured for later use. For example, "Windows(?=95|98|NT|2000)" can match "Windows" in "Windows2000", but cannot match "Windows" in "Windows3.1". A lookahead does not consume characters: after a match, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead. |
(?!pattern) |
Non-capturing negative lookahead: matches the search string at any position where a string not matching pattern begins. The match is not captured for later use. For example, "Windows(?!95|98|NT|2000)" can match "Windows" in "Windows3.1", but cannot match "Windows" in "Windows2000". |
(?<=pattern) |
Non-capturing positive lookbehind: similar to positive lookahead, but in the opposite direction. For example, "(?<=95|98|NT|2000)Windows" can match "Windows" in "2000Windows", but cannot match "Windows" in "3.1Windows". |
(?<!pattern) |
Non-capturing negative lookbehind: similar to negative lookahead, but in the opposite direction. For example, "(?<!95|98|NT|2000)Windows" can match "Windows" in "3.1Windows", but cannot match "Windows" in "2000Windows". Note that this example is rejected by some engines (Python's re module, for example), which require every alternative inside a lookbehind to have the same length. There, "(?<!95|98|NT|20)Windows" is accepted because every alternative is two characters long, while "(?<!95|980|NT|20)Windows" reports an error. A single pattern without alternation has no such restriction; for example, "(?<!2000)Windows" works as expected. |
x|y |
Matches x or y. For example, "z|food" can match "z" or "food" (be careful here: the alternation covers all of "food", not only the "f"). "[zf]ood" matches "zood" or "food". |
[xyz] |
Character set. Matches any one of the enclosed characters. For example, "[abc]" can match the "a" in "plain". |
[^xyz] |
Negated character set. Matches any one character that is not enclosed. For example, "[^abc]" can match each of "p", "l", "i", and "n" in "plain". |
[a-z] |
Character range. Matches any character in the specified range. For example, "[a-z]" can match any lowercase letter from "a" to "z". Note: a hyphen denotes a range only when it is inside a character set and appears between two characters. If it appears at the start of the set, it represents only the hyphen character itself. |
[^a-z] |
Negated character range. Matches any character that is not within the specified range. For example, "[^a-z]" can match any character that is not in the range "a" to "z". |
\b |
Matches a word boundary, that is, the position between a word and a space. (Matching in a regular expression has two notions, matching characters and matching positions; \b matches a position.) For example, "er\b" can match "er" in "never", but cannot match "er" in "verb". |
\B |
Matches a non-word boundary. "er\B" can match "er" in "verb", but cannot match "er" in "never". |
\cx |
Matches the control character specified by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal "c" character. |
\d |
Matches a digit character. Equivalent to [0-9]. With grep, the -P option (Perl regular expressions) is required. |
\D |
Matches a non-digit character. Equivalent to [^0-9]. With grep, the -P option (Perl regular expressions) is required. |
\f |
Matches a form feed. Equivalent to \x0c and \cL. |
\n |
Matches a line feed. Equivalent to \x0a and \cJ. |
\r |
Matches a carriage return. Equivalent to \x0d and \cM. |
\s |
Matches any whitespace character, including spaces, tabs, and form feeds. Equivalent to [ \f\n\r\t\v]. |
\S |
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
\t |
Matches a tab. Equivalent to \x09 and \cI. |
\v |
Matches a vertical tab. Equivalent to \x0b and \cK. |
\w |
Matches any word character, including the underscore. Similar but not equivalent to "[A-Za-z0-9_]", because "word" characters here are drawn from the Unicode character set. |
\W |
Matches any non-word character, the complement of \w. For ASCII-only matching this is "[^A-Za-z0-9_]". |
\xn |
Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions. |
\num |
Matches num, where num is a positive integer: a backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters. |
\n |
Identifies an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value. |
\nm |
Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds and n and m are both octal digits (0-7), \nm matches the octal escape value nm. |
\nml |
If n is an octal digit (0-7) and m and l are both octal digits (0-7), matches the octal escape value nml. |
\un |
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©). |
\p{P} |
Lowercase p stands for "property" and introduces a Unicode property class. The "P" in braces denotes one of the seven character categories of the Unicode character set: punctuation. The other six are: L: letters; M: marks (these usually do not appear alone); Z: separators (such as spaces and line breaks); S: symbols (such as mathematical and currency symbols); N: numbers (such as Arabic and Roman numerals); C: other characters. Note: this syntax is not supported in some languages, for example JavaScript before ES2018 (later versions accept it only with the u flag). |
\< \> |
Match the start (\<) and end (\>) of a word. For example, the regular expression \<the\> can match "the" in the string "for the wise", but cannot match "the" in the string "otherwise". Note: these metacharacters are not supported by all software. |
() |
Defines the expression between ( and ) as a "group" and saves the characters matched by it to a temporary area (a regular expression can save up to 9 groups), which can be referenced with \1 through \9. |
| |
Performs a logical OR on two match conditions. For example, the regular expression (him|her) matches "it belongs to him" and "it belongs to her", but does not match "it belongs to them.". Note: this metacharacter is not supported by all software. |
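
Several rows in the table are easier to follow when run against a real engine. The sketch below uses Python's `re` module, an assumption of this example: the table itself refers to the RegExp object, VBScript, JScript, and grep, and other engines may behave differently, particularly for lookbehind. The variable names are illustrative only.

```python
import re

# Greedy vs. non-greedy: "o+" takes the longest run, "o+?" the shortest.
greedy = re.findall(r"o+", "oooo")
lazy = re.findall(r"o+?", "oooo")

# Bounded repetition {n,m}: the six o's are split into two runs of three.
bounded = re.findall(r"o{1,3}", "fooooood")

# Lookahead checks what follows without consuming it.
pos_ahead = re.search(r"Windows(?=95|98|NT|2000)", "Windows2000")
neg_ahead = re.search(r"Windows(?!95|98|NT|2000)", "Windows2000")

# Lookbehind checks what precedes the match.
pos_behind = re.search(r"(?<=2000)Windows", "2000Windows")

# Python's re requires every lookbehind alternative to have the same width,
# so mixing 2-character and 4-character alternatives fails to compile.
try:
    re.compile(r"(?<!95|98|NT|2000)Windows")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
same_width = re.compile(r"(?<!95|98|NT|20)Windows")

# Backreference: \1 repeats whatever group 1 captured.
doubled = re.search(r"(.)\1", "food")

# Word boundary (\b) and non-boundary (\B) match positions, not characters.
at_boundary = re.search(r"er\b", "never")
inside_word = re.search(r"er\B", "verb")

print(greedy, lazy, bounded)
print(pos_ahead.group(), pos_ahead.end(), neg_ahead)
print(pos_behind.group(), variable_width_ok)
print(doubled.group(), at_boundary.span(), inside_word.span())
```

The lookahead match ends at index 7, immediately after "Windows", which shows that the characters checked by the lookahead are not consumed.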