A regular expression is a text pattern consisting of ordinary characters (such as the letters a through z) and special characters (called metacharacters). The pattern describes one or more strings to match when searching a body of text. A regular expression serves as a template for matching a character pattern against the string being searched.
The following is a complete list of metacharacters and their behavior in the context of regular expressions:
\
Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character 'n', while '\n' matches a newline character. The sequence '\\' matches '\' and '\(' matches '('.
^
Matches the start position of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$
Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
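As a rough illustration of the Multiline behavior just described, here is a minimal sketch in JavaScript, where the `m` flag plays the role of the Multiline property. The sample text and variable names are assumptions made for the example.

```javascript
// Two lines of text separated by a line feed.
const anchorText = "first line\nsecond line";

// Without the m flag, ^ matches only at the very start of the input.
const startsSingle = anchorText.match(/^\w+/g);

// With the m flag, ^ also matches right after each line break.
const startsMulti = anchorText.match(/^\w+/gm);

// Likewise, with the m flag $ matches just before each line break.
const endsMulti = anchorText.match(/\w+$/gm);

console.log(startsSingle); // [ 'first' ]
console.log(startsMulti);  // [ 'first', 'second' ]
console.log(endsMulti);    // [ 'line', 'line' ]
```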
*
Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+
Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?
Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
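The three quantifiers can be checked side by side. The following is a small JavaScript sketch; the object name `quantifierChecks` and the anchored test patterns are illustrative assumptions.

```javascript
const quantifierChecks = {
  // * allows the "o" to be absent or repeated.
  starZero: /^zo*$/.test("z"),                 // true
  starMany: /^zo*$/.test("zoo"),               // true
  // + requires at least one "o".
  plusZero: /^zo+$/.test("z"),                 // false
  plusMany: /^zo+$/.test("zoo"),               // true
  // ? makes the group "(es)" optional.
  optionalAbsent: "do".match(/do(es)?/)[0],    // "do"
  optionalPresent: "does".match(/do(es)?/)[0], // "does"
};

console.log(quantifierChecks);
```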
{n}
n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,}
n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob" but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}
m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the numbers.
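The brace quantifiers can be exercised the same way. This is a sketch in JavaScript under the assumption that its RegExp engine behaves like the one described here; the test strings are built with `repeat` so the number of o's is explicit.

```javascript
const fiveOs = "f" + "o".repeat(5) + "d"; // "foooood"
const sixOs = "f" + "o".repeat(6) + "d";  // "fooooood"

const braceChecks = {
  exactInBob: /o{2}/.test("Bob"),              // false: only one o
  exactInFood: "food".match(/o{2}/)[0],        // "oo"
  atLeastTwo: fiveOs.match(/o{2,}/)[0].length, // 5: all the o's
  oneToThree: sixOs.match(/o{1,3}/)[0],        // "ooo": the first three
  // With a space after the comma, JavaScript no longer sees a
  // quantifier and treats the braces as literal text.
  spacedIsQuantifier: /^o{1, 3}$/.test("ooo"), // false
};

console.log(braceChecks);
```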
?
When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
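The difference between greedy and non-greedy matching is easiest to see on the same input. A minimal JavaScript sketch follows; the tag example is an added illustration, not taken from the text.

```javascript
// The example from the text: greedy takes everything, non-greedy one "o".
const greedyOs = "oooo".match(/o+/)[0]; // "oooo"
const lazyOs = "oooo".match(/o+?/)[0];  // "o"

// An illustrative case: extracting the first tag from markup.
const markup = "<b>bold</b>";
const greedyTag = markup.match(/<.+>/)[0]; // "<b>bold</b>"
const lazyTag = markup.match(/<.+?>/)[0];  // "<b>"

console.log(greedyOs, lazyOs, greedyTag, lazyTag);
```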
.
Matches any single character except '\n'. To match any character including '\n', use a pattern such as '[\s\S]'. (A dot inside square brackets is a literal dot, so '[.\n]' matches only a period or a newline.)
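A short JavaScript sketch of how the dot treats a line feed. It also shows that a dot placed inside square brackets is a literal period, which is why `[\s\S]` is used here as the "any character" class; the variable names are assumptions.

```javascript
const dotText = "a\nb";

const dotChecks = {
  // The dot refuses to match the line feed between "a" and "b".
  dotSkipsNewline: /a.b/.test(dotText),         // false
  // [\s\S] (whitespace or non-whitespace) matches every character.
  anyCharClass: /a[\s\S]b/.test(dotText),       // true
  // Inside brackets the dot is literal, so "x" is not accepted.
  dotInBracketsLiteral: /a[.\n]b/.test("axb"),  // false
};

console.log(dotChecks);
```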
(pattern)
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)
Matches pattern but does not capture the match, that is, it is a non-capturing match and is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
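To make the capturing and non-capturing distinction concrete, here is a JavaScript sketch. It uses the array returned by `String.prototype.match` rather than the VBScript SubMatches collection, which is an assumption of this example.

```javascript
// A capturing group stores what it matched at index 1 of the result.
const withCapture = "industries".match(/industr(y|ies)/);
console.log(withCapture[0], withCapture[1]); // industries ies

// A non-capturing group matches the same text but stores nothing extra.
const withoutCapture = "industries".match(/industr(?:y|ies)/);
console.log(withoutCapture.length); // 1

// Escaped parentheses match literal "(" and ")".
const literalParens = "f(x)".match(/\(x\)/)[0];
console.log(literalParens); // (x)
```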
(?=pattern)
Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)
Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
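The lookahead examples can be tried directly. In this JavaScript sketch the `lastIndex` property of a global RegExp is used to show that the lookahead text is not consumed; the variable names are illustrative.

```javascript
// Positive lookahead: the version must follow, but is not part of the match.
const supported = /Windows (?=95|98|NT|2000)/g;
const hit = supported.exec("Windows 2000");
console.log(JSON.stringify(hit[0])); // "Windows "
// The next search would resume at index 8, right before "2000".
console.log(supported.lastIndex); // 8

const supportedOnce = /Windows (?=95|98|NT|2000)/;
console.log(supportedOnce.test("Windows 3.1")); // false

// Negative lookahead: match only when none of the versions follows.
const unsupported = /Windows (?!95|98|NT|2000)/;
console.log(unsupported.test("Windows 3.1"));  // true
console.log(unsupported.test("Windows 2000")); // false
```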
x|y
Matches either x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]
A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]
A negated character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]
A range of characters. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' through 'z'.
[^a-z]
A negated character range. Matches any character not in the specified range. For example, '[^a-z]' matches any character not in the range 'a' through 'z'.
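Alternation and the four bracket forms can be compared on small inputs. This is a JavaScript sketch; the strings "Plain" and "plain7" are made-up inputs chosen so that each class has something to find.

```javascript
const setChecks = {
  // Alternation inside a group: either "z" or "f" before "ood".
  zood: /^(z|f)ood$/.test("zood"),             // true
  good: /^(z|f)ood$/.test("good"),             // false
  // Character set and its negation on the word "plain".
  firstOfSet: "plain".match(/[abc]/)[0],       // "a"
  firstOutsideSet: "plain".match(/[^abc]/)[0], // "p"
  // Range and negated range.
  firstLowercase: "Plain".match(/[a-z]/)[0],     // "l"
  firstNonLowercase: "plain7".match(/[^a-z]/)[0], // "7"
};

console.log(setChecks);
```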
\b
Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B
Matches a non-word boundary. For example, 'er\B' matches the 'er' in "verb" but not the 'er' in "never".
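A brief JavaScript sketch of the two boundary assertions. The whole-word replacement at the end is an added illustration of a common use of `\b`, not something taken from the text.

```javascript
const boundaryChecks = {
  erAtWordEnd: /er\b/.test("never"),     // true: "er" ends the word
  erInsideWord: /er\b/.test("verb"),     // false: "b" follows "er"
  erNotAtEnd: /er\B/.test("verb"),       // true
  erNotAtEndNever: /er\B/.test("never"), // false
};
console.log(boundaryChecks);

// \b on both sides restricts a replacement to whole words.
const replaced = "cat concat".replace(/\bcat\b/g, "dog");
console.log(replaced); // dog concat
```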
\cx
Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\d
Matches a digit character. Equivalent to [0-9].
\D
Matches a non-digit character. Equivalent to [^0-9].
\f
Matches a form feed character. Equivalent to \x0c and \cL.
\n
Matches a line feed character. Equivalent to \x0a and \cJ.
\r
Matches a carriage return character. Equivalent to \x0d and \cM.
\s
Matches any white space character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S
Matches any non-white-space character. Equivalent to [^ \f\n\r\t\v].
\t
Matches a tab character. Equivalent to \x09 and \cI.
\v
Matches a vertical tab character. Equivalent to \x0b and \cK.
\w
Matches any word character, including underscore. Equivalent to '[A-Za-z0-9_]'.
\W
Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
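The shorthand classes and control-character escapes above can be sketched in JavaScript as follows. The sample string and property names are assumptions made for the example.

```javascript
// A sample containing a space, a tab, digits, and an underscore.
const classSample = "Order 42:\tready_now!";

const classChecks = {
  digits: classSample.match(/\d+/g),            // ["42"]
  words: classSample.match(/\w+/g),             // ["Order", "42", "ready_now"]
  fields: classSample.split(/\s+/),             // ["Order", "42:", "ready_now!"]
  nonDigitsRemoved: "a1 b2".replace(/\D/g, ""), // "12"
  // \cM and \x0d both denote the carriage return character.
  controlM: /^\cM$/.test("\r"),                 // true
  hexCarriageReturn: /^\x0d$/.test("\r"),       // true
};

console.log(classChecks);
```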
\xn
Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". You can use ASCII codes in regular expressions ...
\num
Matches num, where num is a positive integer. A backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n
Identifies either an octal escape value or a backreference. \n is a backreference if it is preceded by at least n captured subexpressions. Otherwise, n is an octal escape value if n is an octal digit (0-7).
\nm
Identifies either an octal escape value or a backreference. \nm is a backreference if it is preceded by at least nm captured subexpressions. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither of the preceding conditions holds, \nm matches the octal escape value nm when n and m are octal digits (0-7).
\nml
Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
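The numeric escapes and backreferences can be sketched in JavaScript as below. Note the assumption about octal escapes: JavaScript accepts them only as legacy syntax in patterns without the `u` flag, so that line may not carry over to other engines.

```javascript
const escapeChecks = {
  // \x41 is the two-digit hexadecimal code for "A".
  hexA: /^\x41$/.test("A"),                      // true
  // \x041 is read as \x04 followed by the literal digit "1".
  hexThenDigit: /^\x041$/.test("\x04" + "1"),    // true
  // \1 refers back to whatever group 1 captured.
  repeatedChar: "hello".match(/(.)\1/)[0],       // "ll"
  // \u00A9 is the copyright symbol.
  copyright: /\u00A9/.test("\u00A9 2000"),       // true
  // With no capturing groups, \101 is a legacy octal escape for "A".
  legacyOctalA: /^\101$/.test("A"),              // true
};

console.log(escapeChecks);
```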
Let's look at a few examples:
"^The": matches any string that starts with "The" ("There", "The cat", etc.);
"of despair$": matches a string that ends with "of despair";
"^abc$": matches a string that both starts and ends with "abc", which can only be "abc" itself;
"notice": matches any string that contains "notice".
'*', '+', and '?' specify how many times a character or a sequence of characters may occur. They mean "zero or more", "one or more", and "zero or one", respectively. Here are a few examples:
"ab*": matches a string that has an a followed by zero or more b's ("a", "ab", "abbb", ...);
"ab+": matches a string that has an a followed by one or more b's;
"ab?": matches a string that has an a followed by zero or one b;
"a?b+$": matches zero or one a followed by one or more b's at the end of the string.
You can also use bounds, enclosed in braces, to indicate a range of repetitions:
"ab{2}": matches a string that has an a followed by exactly two b's ("abb");
"ab{2,}": matches a string that has an a followed by at least two b's;
"ab{3,5}": matches a string that has an a followed by three to five b's.
Note that you must always specify the lower bound of a range (for example, "{0,2}" rather than "{,2}"). Also, as you may have noticed, '*', '+', and '?' are equivalent to "{0,}", "{1,}", and "{0,1}", respectively.
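The repetition examples can be checked with a small helper. The sketch below is in JavaScript; `matchesWhole` is a hypothetical helper introduced only for this example, and the treatment of a missing lower bound as literal text is JavaScript's behavior, which other engines may not share.

```javascript
// Hypothetical helper: true when the pattern matches the entire input.
function matchesWhole(source, input) {
  return new RegExp("^(?:" + source + ")$").test(input);
}

const repeatChecks = {
  starNoB: matchesWhole("ab*", "a"),                         // true
  plusNoB: matchesWhole("ab+", "a"),                         // false
  optionalTwoBs: matchesWhole("ab?", "abb"),                 // false
  exactlyTwo: matchesWhole("ab{2}", "abb"),                  // true
  fourInRange: matchesWhole("ab{3,5}", "a" + "b".repeat(4)), // true
  sixTooMany: matchesWhole("ab{3,5}", "a" + "b".repeat(6)),  // false
  // "a?b+$" only cares about how the string ends.
  endsWithBs: /a?b+$/.test("cabbb"),                         // true
  bsNotAtEnd: /a?b+$/.test("abbbc"),                         // false
  // With the lower bound given, the braces form a quantifier...
  explicitLower: matchesWhole("ab{0,2}", "abb"),             // true
  // ...without it, JavaScript reads "{,2}" as literal characters.
  missingLower: matchesWhole("ab{,2}", "abb"),               // false
};

console.log(repeatChecks);
```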
There is also '|', which works as an "or" operator:
"hi|hello": matches a string that contains either "hi" or "hello";
"(b|cd)ef": matches a string that contains either "bef" or "cdef";
"(a|b)*c": matches a string that has a sequence of mixed a's and b's followed by a c;
'.' stands for any single character:
"a.[0-9]": matches a string that has an a followed by any one character and a digit;
"^.{3}$": matches a string of exactly three characters.
Square brackets specify which characters are allowed at a single position in a string:
"[ab]": matches a string that has either an a or a b (equivalent to "a|b");
"[a-d]": matches a string that has one of the lowercase letters 'a' through 'd' (equivalent to "a|b|c|d" and "[abcd]");
"^[a-zA-Z]": matches a string that starts with a letter;
"[0-9]%": matches a string that has a single digit before a percent sign;
",[a-zA-Z0-9]$": matches a string that ends with a comma followed by a letter or digit.
You can also list characters that you do not want by placing '^' as the first character inside the square brackets (for example, "%[^a-zA-Z]%" matches a string in which a character that is not a letter appears between two percent signs).
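The alternation, dot, and bracket examples above can be verified together. This is a JavaScript sketch with made-up test strings; the "or" operator is written as `|` in code.

```javascript
const exampleChecks = {
  eitherWord: /hi|hello/.test("say hello"),      // true
  groupedAlt: /^(b|cd)ef$/.test("cdef"),         // true
  mixedRun: /^(a|b)*c$/.test("abbac"),           // true
  anyThenDigit: /a.[0-9]/.test("ax7"),           // true
  exactlyThree: /^.{3}$/.test("abcd"),           // false: four characters
  startsWithLetter: /^[a-zA-Z]/.test("9lives"),  // false: starts with a digit
  digitPercent: /[0-9]%/.test("50% off"),        // true
  commaAlnumEnd: /,[a-zA-Z0-9]$/.test("a,b"),    // true
  nonLetterBetween: /%[^a-zA-Z]%/.test("%5%"),   // true
  letterBetween: /%[^a-zA-Z]%/.test("%a%"),      // false
};

console.log(exampleChecks);
```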
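To match any of the characters "^.[$()|*+?{\" literally, you must escape it with a preceding backslash ('\').
Note that inside square brackets most of these characters lose their special meaning and need no escape (in the syntax described above, the exceptions are '\', ']', and, depending on position, '^' and '-').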
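When a pattern is built from arbitrary text, the escaping rule can be applied mechanically. The JavaScript sketch below uses a hypothetical helper, `escapeRegExp`, which is not a built-in function; it prefixes each special character with a backslash.

```javascript
// Hypothetical helper: escape every regex metacharacter in a string.
function escapeRegExp(literal) {
  // "$&" in the replacement stands for the character that was matched.
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const price = "$4.99 (sale)";
const escapedPrice = escapeRegExp(price);
console.log(escapedPrice); // \$4\.99 \(sale\)

// Escaped, the text is matched literally.
console.log(new RegExp(escapedPrice).test("now $4.99 (sale)!")); // true
// Unescaped, "$" is an end-of-input anchor, so nothing can match.
console.log(new RegExp(price).test("now $4.99 (sale)!"));        // false

// Inside square brackets these characters need no backslash.
console.log(/[.*+?]/.test("+")); // true
```

Escaping matters most when the text comes from user input, since an unescaped metacharacter silently changes what the pattern means.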