Regular expression syntax
A regular expression is a pattern of text that consists of ordinary characters (for example, the letters a through z) and special characters, known as metacharacters. The pattern describes one or more strings to match when searching a body of text. The regular expression serves as a template for matching a character pattern against the string being searched.
Here are some examples of regular expressions that you might encounter:
JScript             VBScript            Matches
/^[ \t]*$/          "^[ \t]*$"          A blank line.
/\d{2}-\d{5}/       "\d{2}-\d{5}"       An ID number consisting of two digits, a hyphen, and five digits.
/<(.*)>.*<\/\1>/    "<(.*)>.*<\/\1>"    An HTML tag.
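The three example patterns can be tried directly. The following is a minimal sketch in JavaScript, assuming a modern engine such as Node.js rather than the original JScript host. The blank-line pattern is written as ^[ \t]*$, and the variable names and the anchored variant of the ID pattern are illustrative assumptions, not part of the original examples.

```javascript
// Example patterns written as JavaScript regex literals.
const blankLine = /^[ \t]*$/;        // only spaces and tabs between start and end
const idNumber = /\d{2}-\d{5}/;      // two digits, a hyphen, five digits (anywhere in the input)
const idNumberStrict = /^\d{2}-\d{5}$/; // assumption: anchored so the whole input must be the ID
const htmlTag = /<(.*)>.*<\/\1>/;    // \1 refers back to the captured tag name

const blankOk = blankLine.test("  \t ");           // true
const blankNo = blankLine.test("  x ");            // false

const idOk = idNumber.test("12-34567");            // true
const idLoose = idNumber.test("123-45678");        // true: matches the substring "23-45678"
const idStrict = idNumberStrict.test("123-45678"); // false: extra digits are rejected

const tag = htmlTag.exec("<b>bold</b>");
const tagName = tag[1];                            // "b"

console.log(blankOk, blankNo, idOk, idLoose, idStrict, tagName);
```

Without the ^ and $ anchors, the ID pattern only checks that a matching substring exists, so validation code usually wants the anchored form.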
The following table contains the complete list of metacharacters and their behavior in the context of regular expressions:
Character Description
\ Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".
^ Matches the position at the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position following '\n' or '\r'.
$ Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position preceding '\n' or '\r'.
* Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+ Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
? Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches the "do" in "do" or "does". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there can be no space between the comma and the numbers.
? When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
. Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
(pattern) Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match the parenthesis characters themselves, use '\(' or '\)'.
(?:pattern) Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
(?=pattern) Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match; that is, the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern) Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match; that is, the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y Matches either x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz] A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz] A negative character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z] A range of characters. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' through 'z'.
[^a-z] A negative range of characters. Matches any character not in the specified range. For example, '[^a-z]' matches any character not in the range 'a' through 'z'.
\b Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B Matches a non-word boundary. 'er\B' matches the 'er' in "verb" but not the 'er' in "never".
\cx Matches the control character indicated by x. For example, \cM matches a Control-M, or carriage return, character. The value of x must be in the range A-Z or a-z. If it is not, c is treated as a literal 'c' character.
\d Matches a digit character. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\f Matches a form-feed character. Equivalent to \x0c and \cL.
\n Matches a newline character. Equivalent to \x0a and \cJ.
\r Matches a carriage return character. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab character. Equivalent to \x09 and \cI.
\v Matches a vertical tab character. Equivalent to \x0b and \cK.
\w Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num Matches num, where num is a positive integer. A reference back to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, n is an octal escape value if n is an octal digit (0-7).
\nm Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds, \nm matches the octal escape value nm when n and m are both octal digits (0-7).
\nml Matches the octal escape value nml when n is an octal digit (0-3) and m and l are both octal digits (0-7).
\un Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
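Several of the table entries can be checked with a short script. The following is a sketch in JavaScript, assuming a modern engine such as Node.js; the variable names are illustrative, and the m flag is used on the assumption that it plays the role of the Multiline property described above.

```javascript
// Quantifiers: greedy by default, non-greedy when followed by ?
const greedy = "oooo".match(/o+/)[0];            // "oooo"
const lazy = "oooo".match(/o+?/)[0];             // "o"
const bounded = "fooooood".match(/o{1,3}/)[0];   // "ooo"

// Grouping: capturing, non-capturing, and backreferences
const nonCapturing = /industr(?:y|ies)/.exec("industries"); // no group 1 is stored
const swapped = "John Smith".replace(/(\w+) (\w+)/, "$2, $1"); // "Smith, John"
const doubled = /(.)\1/.exec("hello")[0];        // "ll"

// Lookahead: the looked-ahead text is tested but not consumed
const positive = /Windows (?=95|98|NT|2000)/.exec("Windows 2000")[0]; // "Windows "
const negativeHit = /Windows (?!95|98|NT|2000)/.test("Windows 3.1");  // true
const negativeMiss = /Windows (?!95|98|NT|2000)/.test("Windows 2000"); // false

// Word and non-word boundaries
const boundaryNever = /er\b/.test("never");      // true
const boundaryVerb = /er\b/.test("verb");        // false
const nonBoundaryVerb = /er\B/.test("verb");     // true

// The dot does not match a newline; a class such as [\s\S] does
const dotNewline = /a.b/.test("a\nb");           // false
const anyNewline = /a[\s\S]b/.test("a\nb");      // true

// Multiline mode: ^ and $ also match around line breaks
const singleLine = /^two$/.test("one\ntwo");     // false
const multiLine = /^two$/m.test("one\ntwo");     // true

// Escapes: hexadecimal, Unicode, and control characters
const hexA = /\x41/.test("A");                   // true
const copyright = /\u00A9/.test("\u00A9");       // true
const controlM = /\cM/.test("\r");               // true

console.log(greedy, lazy, bounded, swapped, doubled, positive);
```

Note that the positive lookahead result is "Windows " with its trailing space and without "2000", which shows that the lookahead text is not part of the match.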