Constructing regular expressions
You construct regular expressions the same way you build arithmetic expressions: by using metacharacters and operators to combine small expressions into larger ones. You construct a regular expression by placing the components of the expression pattern between a pair of delimiters. For JScript, the delimiters are a pair of forward slash (/) characters, for example: /expression/. For VBScript, a pair of quotation marks ("") marks the boundaries of the regular expression, for example: "expression".
In both examples above, the regular expression pattern (expression) is stored in the Pattern property of the RegExp object. The components of a regular expression can be individual characters, sets of characters, ranges of characters, choices between characters, or any combination of these.
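As a minimal sketch of the delimiter syntax, assuming a present-day JavaScript engine rather than the original JScript host, the same pattern can be written as a slash-delimited literal or passed to the RegExp constructor as a string. JavaScript exposes the pattern text through the source property; the VBScript Pattern property is not shown here. The sample pattern and input are illustrative only.

```javascript
// Two ways to build the same pattern; the pattern text is illustrative only.
const fromLiteral = /expression/;             // delimited by forward slashes
const fromString = new RegExp("expression");  // pattern supplied as a string

console.log(fromLiteral.source);                      // expression
console.log(fromString.source);                       // expression
console.log(fromLiteral.test("a small expression"));  // true
```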
Order of Precedence
Once constructed, a regular expression is evaluated much like an arithmetic expression: from left to right, following an order of precedence. The following table lists the regular expression operators in order of precedence, from highest to lowest:
Operator | Description |
\ | Escape character |
(), (?:), (?=), [] | Parentheses and brackets |
*, +, ?, {n}, {n,}, {n,m} | Quantifiers |
^, $, \anymetacharacter | Anchors and sequences |
| | Alternation ("or") |
Ordinary characters
Ordinary characters consist of all printing and nonprinting characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols. The simplest regular expression is a single ordinary character, which matches itself in the searched string. For example, the single-character pattern 'a' matches the letter 'a' wherever it appears in the searched string. Here are some examples of single-character patterns:
/a/
/7/
/m/
The equivalent VBScript single-character expressions are:
"a"
"7"
"m"
You can combine several single characters to form a larger expression. For example, the following JScript regular expression is simply the single-character expressions 'a', '7', and 'm' placed together.
/a7m/
The equivalent VBScript expression is: "a7m"
Note that there is no concatenation operator. You simply place one character after another.
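A short sketch of ordinary characters placed side by side, written for a present-day JavaScript engine (an assumption); the input strings are made up for illustration.

```javascript
// Three ordinary characters in sequence; no operator joins them.
const sequencePattern = /a7m/;

const foundInside = sequencePattern.test("xxa7mxx");    // true: the sequence appears intact
const foundSpaced = sequencePattern.test("a 7 m");      // false: spaces break the sequence
const position = "xxa7mxx".search(sequencePattern);     // 2: zero-based index of the match

console.log(foundInside, foundSpaced, position);
```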
Special characters
A number of metacharacters require special treatment when you want to match them literally. To match one of these special characters, you must first escape it, that is, precede it with a backslash (\). The following table lists these special characters and their meanings:
Special character | Description |
$ | Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$. |
( ) | Mark the start and end of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \). |
* | Matches the preceding subexpression zero or more times. To match the * character, use \*. |
+ | Matches the preceding subexpression one or more times. To match the + character, use \+. |
. | Matches any single character except the newline character \n. To match the . character, use \. |
[ | Marks the beginning of a bracket expression. To match [, use \[. |
? | Matches the preceding subexpression zero or one time, or indicates a non-greedy quantifier. To match the ? character, use \?. |
\ | Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character 'n', whereas '\n' matches a newline character. The sequence '\\' matches '\', and '\(' matches '('. |
^ | Matches the start of the input string, unless it is used in a bracket expression, where it negates the character set. To match the ^ character itself, use \^. |
{ | Marks the beginning of a quantifier expression. To match {, use \{. |
| | Indicates a choice between two items. To match |, use \|. |
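The escaping rule in the table can be sketched as follows for a present-day JavaScript engine. The helper escapeForRegExp is a hypothetical name introduced here for illustration; it is not part of the original text or of any built-in API.

```javascript
// Unescaped, $ and . keep their special meaning.
const unescapedPrice = /$9.99/;
// Escaped, they match the literal characters.
const escapedPrice = /\$9\.99/;

const unescapedHit = unescapedPrice.test("$9.99");  // false: $ anchors to the end of input
const escapedHit = escapedPrice.test("$9.99");      // true
const escapedMiss = escapedPrice.test("$9x99");     // false: \. matches only a period

// Hypothetical helper: prefixes each metacharacter in arbitrary text with a backslash.
function escapeForRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literalPlus = new RegExp(escapeForRegExp("a+b")).test("a+b");  // true
const quantifierPlus = new RegExp("a+b").test("a+b");                // false

console.log(unescapedHit, escapedHit, escapedMiss, literalPlus, quantifierPlus);
```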
Nonprinting characters
A number of nonprinting characters are occasionally useful in patterns. The following table shows the escape sequences that represent them:
Character | Meaning |
\cx | Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character. |
\f | Matches a form feed character. Equivalent to \x0c and \cL. |
\n | Matches a line feed (newline) character. Equivalent to \x0a and \cJ. |
\r | Matches a carriage return character. Equivalent to \x0d and \cM. |
\s | Matches any white space character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v]. |
\S | Matches any non-white-space character. Equivalent to [^ \f\n\r\t\v]. |
\t | Matches a tab character. Equivalent to \x09 and \cI. |
\v | Matches a vertical tab character. Equivalent to \x0b and \cK. |
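The equivalences in the table can be checked with a brief sketch in present-day JavaScript (an assumption). Note that current engines treat \s as covering additional Unicode space characters beyond the bracket list given above, so the listed set is best read as a subset.

```javascript
// Each pair holds a pattern and the single character it should match.
const equivalences = [
  [/\cM/, "\r"], [/\x0d/, "\r"],
  [/\cJ/, "\n"], [/\x0a/, "\n"],
  [/\cI/, "\t"], [/\x09/, "\t"],
  [/\cL/, "\f"], [/\x0c/, "\f"],
  [/\cK/, "\v"], [/\x0b/, "\v"],
];
const allEquivalent = equivalences.every(([pattern, ch]) => pattern.test(ch));  // true

// \s splits on white space; \S picks out everything else.
const words = "a\tb\nc d".split(/\s/);     // ["a", "b", "c", "d"]
const nonSpaces = "a \tb".match(/\S/g);    // ["a", "b"]

console.log(allEquivalent, words, nonSpaces);
```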
Character Matching
A period (.) matches any single printing or nonprinting character in a string, except a newline character (\n). The following JScript regular expression matches 'aac', 'abc', 'acc', 'adc', and so on, as well as 'a1c', 'a2c', 'a-c', and 'a#c':
/a.c/
The equivalent VBScript regular expression is:
"a.c"
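A minimal sketch of the period pattern in present-day JavaScript (an assumption), using the sample strings from the text plus two non-matching inputs added for contrast.

```javascript
const dotPattern = /a.c/;

// Every sample from the text has exactly one character between 'a' and 'c'.
const samples = ["aac", "abc", "acc", "adc", "a1c", "a2c", "a-c", "a#c"];
const allSamplesMatch = samples.every((s) => dotPattern.test(s));  // true

const newlineBetween = dotPattern.test("a\nc");  // false: the period does not match \n
const nothingBetween = dotPattern.test("ac");    // false: the period requires one character

console.log(allSamplesMatch, newlineBetween, nothingBetween);
```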
If you want to match a string that contains a file name, where a period (.) is part of the input string, precede the period in the regular expression with a backslash (\) character. For example, the following JScript regular expression matches 'filename.ext':
/filename\.ext/
For VBScript, the equivalent expression is:
"filename\.ext"
These expressions are still quite limited. They only let you match any single character. In many cases, it is useful to match specific characters from a list. For example, if your input text contains chapter headings that are expressed numerically as Chapter 1, Chapter 2, and so on, you may want to find those chapter headings.
Bracket expression
You can create a list of characters to match by placing one or more individual characters within square brackets ([ and ]). A list enclosed in brackets is called a bracket expression. Within brackets, as anywhere else, ordinary characters represent themselves; that is, they match an occurrence of themselves in the input text. Most special characters lose their meaning inside a bracket expression. There are some exceptions: the ']' character ends the list unless it is the first item. To match the ']' character in a list, place it first, immediately after the opening '['. The '\' character remains the escape character. To match the '\' character, use '\\'.
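The bracket rules can be sketched as follows, assuming a present-day ECMAScript engine rather than the JScript or VBScript hosts described here. One behavior differs in such engines: an empty pair of brackets is an empty set, so placing ']' first does not include it in the list, and escaping it with a backslash is the portable form.

```javascript
// Inside brackets, . * + are ordinary characters.
const punctuationSet = /[.*+]/;
const starMatches = punctuationSet.test("*");    // true
const letterMatches = punctuationSet.test("a");  // false: the period is literal here

// The backslash is still the escape character inside brackets.
const backslashSet = /[\\]/;
const backslashMatches = backslashSet.test("\\");       // true
const escapedBracketMatches = /[\]]/.test("]");         // true

// ECMAScript-specific (assumption stated above): [] is an empty set, followed by a literal ].
const bracketFirstMatches = /[]]/.test("]");            // false in ECMAScript engines

console.log(starMatches, letterMatches, backslashMatches, escapedBracketMatches, bracketFirstMatches);
```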
A character in a bracket expression matches only a single character, at the position in the regular expression where the bracket expression appears. The following JScript regular expression matches 'Chapter 1', 'Chapter 2', 'Chapter 3', 'Chapter 4', and 'Chapter 5':
/Chapter [12345]/
To match the same chapter headings in VBScript, use the following expression:
"Chapter [12345]"
Note that the word 'Chapter' and the space that follows it are fixed in position relative to the characters in the brackets.
The bracket expression therefore specifies only the set of characters that can occupy the single character position immediately following the word 'Chapter' and a space. This is the ninth character position.
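A minimal sketch of the chapter-heading pattern in present-day JavaScript (an assumption); the surrounding sentence in the input is invented for illustration.

```javascript
const chapterList = /Chapter [12345]/;

const found = "See Chapter 3 for details".match(chapterList);
const matchedText = found[0];                      // "Chapter 3"
const ninthCharacter = matchedText.charAt(8);      // "3": index 8 is the ninth position
const outsideList = chapterList.test("Chapter 7"); // false: 7 is not in the list

console.log(matchedText, ninthCharacter, outsideList);
```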
If you want to express the characters to match as a range instead of listing them individually, separate the first and last characters of the range with a hyphen (-). The character value of each character determines its relative order within a range. The following JScript regular expression contains a range expression that is equivalent to the bracketed list shown above:
/Chapter [1-5]/
The equivalent VBScript expression is:
"Chapter [1-5]"
When you specify a range in this way, both the starting and ending values are included in the range. Note that the starting value must precede the ending value in Unicode sort order.
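The range rules can be sketched as follows in present-day JavaScript (an assumption). The reversed range is built through the RegExp constructor so that the resulting error can be caught at run time; engines of this kind report it as a SyntaxError.

```javascript
const listedDigits = /Chapter [12345]/;
const rangedDigits = /Chapter [1-5]/;

// Both end points (1 and 5) are inside the range; 0 and 6 are not.
const samples = ["Chapter 1", "Chapter 5", "Chapter 0", "Chapter 6"];
const formsAgree = samples.every((s) => listedDigits.test(s) === rangedDigits.test(s));  // true
const endPointsIncluded = rangedDigits.test("Chapter 1") && rangedDigits.test("Chapter 5");  // true

// A range whose start follows its end is rejected when the pattern is compiled.
let reversedRangeError = null;
try {
  new RegExp("[5-1]");
} catch (error) {
  reversedRangeError = error;
}

console.log(formsAgree, endPointsIncluded, reversedRangeError instanceof SyntaxError);
```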
If you want to include the hyphen character in a bracket expression, you must do one of the following:
Escape it with a backslash: [\-]
Place the hyphen at the beginning or end of the bracketed list. The following expressions match all lowercase letters and the hyphen: [-a-z] and [a-z-]
Create a range in which the starting character's value is lower than the hyphen and the ending character's value is equal to or greater than the hyphen. Both of the following regular expressions satisfy this requirement: [!--] and [!-~]
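The three ways of including a hyphen can be sketched as follows in present-day JavaScript (an assumption). The extra test characters are chosen here only to show what else each set does or does not contain.

```javascript
// All five forms from the text accept a literal hyphen.
const hyphenPatterns = [/[\-]/, /[-a-z]/, /[a-z-]/, /[!--]/, /[!-~]/];
const allAcceptHyphen = hyphenPatterns.every((p) => p.test("-"));  // true

// [!--] is the range from '!' (0x21) to '-' (0x2D), so '*' (0x2A) falls inside it.
const starInRange = /[!--]/.test("*");      // true
const letterInRange = /[!--]/.test("a");    // false: 'a' (0x61) is above the range

// A leading hyphen is literal; it does not widen the a-z range.
const upperInSet = /[-a-z]/.test("A");      // false

console.log(allAcceptHyphen, starInRange, letterInRange, upperInSet);
```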
Similarly, by placing a caret (^) at the beginning of the list, you can match any character that is not in the list or range. If the caret appears anywhere else in the list, it matches itself and has no special meaning. The following JScript regular expression matches chapter headings with chapter numbers greater than 5:
/Chapter [^12345]/
For VBScript, use:
"Chapter [^12345]"
In the examples shown above, the expression matches any character in the ninth position except 1, 2, 3, 4, or 5. So 'Chapter 7' is a match, as is 'Chapter 9'.
The expression above can also be written using a hyphen (-). For JScript:
/Chapter [^1-5]/
For VBScript:
"Chapter [^1-5]"
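A minimal sketch of the negated forms in present-day JavaScript (an assumption). It also shows that the negated set accepts any character other than 1 through 5, not only larger digits, and that a caret placed later in the list is literal.

```javascript
const negatedList = /Chapter [^12345]/;
const negatedRange = /Chapter [^1-5]/;

const sevenMatches = negatedList.test("Chapter 7") && negatedRange.test("Chapter 7");   // true
const threeMatches = negatedList.test("Chapter 3") || negatedRange.test("Chapter 3");   // false
const letterMatches = negatedRange.test("Chapter X");  // true: any character outside 1-5 qualifies
const emptyMatches = negatedRange.test("Chapter ");    // false: a ninth character is still required

// A caret that is not the first item matches itself.
const literalCaret = /[1^]/.test("^");                 // true

console.log(sevenMatches, threeMatches, letterMatches, emptyMatches, literalCaret);
```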
A typical use of a bracket expression is to specify a match for any uppercase or lowercase letter or any digit. The following JScript expression specifies such a match:
/[A-Za-z0-9]/
The equivalent VBScript expression is:
"[A-Za-z0-9]"
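The alphanumeric set can be sketched as follows in present-day JavaScript (an assumption). The g flag on the second pattern is added here only to collect every match from the illustrative input.

```javascript
const alphanumeric = /[A-Za-z0-9]/;

// Collect every letter or digit from a mixed string.
const collected = "a-1 B!".match(/[A-Za-z0-9]/g);    // ["a", "1", "B"]

const underscoreMatches = alphanumeric.test("_");    // false: not a letter or digit
const accentedMatches = alphanumeric.test("é");      // false: outside the three ASCII ranges

console.log(collected, underscoreMatches, accentedMatches);
```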