Locator character
Until now, the examples have looked for chapter headings wherever they appear. Any occurrence of the string 'Chapter' followed by a space and a number may be a true chapter heading, or it may be a cross-reference to another chapter. Because true chapter headings always appear at the beginning of a line, you need a way to find only the headings and skip the cross-references.
Locator characters, more commonly called anchors, provide this capability. A locator fixes a regular expression to the beginning or end of a line. Locators also let you create regular expressions that match only within a word, or only at the beginning or end of a word. The following table lists the locators and their meanings:
Character | Description |
^ | Matches the start position of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'. |
$ | Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. |
\b | Matches a word boundary, that is, the position between a word and a space. |
\B | Matches a non-word boundary. |
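The effect of the multiline setting described in the table can be sketched in JavaScript, where the `m` flag plays the role of the Multiline property. The sample text and variable names are made up for illustration.

```javascript
// Hypothetical sample text: two lines separated by '\n'.
const text = "Introduction\nChapter 1";

// Without the m flag, ^ matches only at the start of the whole input.
const startSingle = /^Chapter/.test(text);

// With the m flag, ^ also matches right after a line break.
const startMulti = /^Chapter/m.test(text);

// Likewise, $ matches just before the line break only when m is set.
const endSingle = /Introduction$/.test(text);
const endMulti = /Introduction$/m.test(text);

console.log(startSingle, startMulti, endSingle, endMulti);
// prints: false true false true
```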
Quantifiers (qualifiers) cannot be applied to a locator. Because there cannot be more than one position immediately before or after a newline or word boundary, an expression such as '^*' is not allowed.
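As a rough check of this rule in JavaScript, a pattern that repeats '^' is rejected when it is compiled. The sketch builds each pattern with the `RegExp` constructor so that the error can be caught. The helper name `compiles` is hypothetical, and other regular expression engines may handle such a pattern differently.

```javascript
// Hypothetical helper: reports whether a pattern string compiles.
function compiles(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (e) {
    // An invalid pattern raises a SyntaxError at construction time.
    return false;
  }
}

const plainAnchor = compiles("^Chapter"); // a locator on its own is fine
const repeatedAnchor = compiles("^*");    // a quantifier applied to a locator

console.log(plainAnchor, repeatedAnchor);
```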
To match text at the start of a line, use the '^' character at the beginning of the regular expression. Do not confuse this use of '^' with its use inside a bracket expression, where it negates the character set. The two meanings are fundamentally different.
To match text at the end of a line, use the '$' character at the end of the regular expression.
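To illustrate the difference, the following sketch contrasts '^' used as a locator with '^' used as the first character inside a bracket expression. The sample strings and variable names are made up.

```javascript
// ^ outside brackets: the match must start at the beginning of the input.
const anchored = /^C/;
// ^ as the first character inside brackets: any character except 'C'.
const negated = /[^C]/;

const startsWithC = anchored.test("Chapter 1");    // input starts with 'C'
const cInMiddle = anchored.test("See Chapter 1");  // 'C' is not at the start
const onlyCs = negated.test("CCC");                // every character is 'C'
const firstNonC = "Chapter".match(negated)[0];     // first character that is not 'C'

console.log(startsWithC, cInMiddle, onlyCs, firstNonC);
// prints: true false false h
```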
To use a locator when finding chapter headings, the following JScript regular expression matches a chapter heading with up to two digits at the beginning of a line:
/^Chapter [1-9][0-9]{0,1}/
The equivalent regular expression in VBScript is:
"^Chapter [1-9][0-9]{0,1}"
A true chapter heading not only appears at the beginning of a line, it is also the only text on that line, so it must end at the end of the line as well. The following expression ensures that the match finds only chapter headings and not cross-references. It does so by matching only when the text occupies a whole line, from its start position to its end position.
/^Chapter [1-9][0-9]{0,1}$/
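A minimal JavaScript sketch of the difference between anchoring only the start and anchoring both ends follows. The sample lines and variable names are invented for illustration, and each line is tested as its own string, so no multiline flag is needed.

```javascript
// Hypothetical sample lines: one real heading and two cross-references.
const lines = [
  "Chapter 12",
  "Chapter 3 covers this in more detail.",
  "As explained in Chapter 7",
];

const startOnly = /^Chapter [1-9][0-9]{0,1}/;
const startAndEnd = /^Chapter [1-9][0-9]{0,1}$/;

// Anchoring only the start still accepts a sentence that begins with "Chapter 3".
const byStart = lines.filter((line) => startOnly.test(line));

// Anchoring both ends keeps only the line that is nothing but a heading.
const byBoth = lines.filter((line) => startAndEnd.test(line));

console.log(byStart.length); // 2
console.log(byBoth);         // [ 'Chapter 12' ]
```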
For VBScript use:
"^Chapter [1-9][0-9]{0,1}$"
Matching word boundaries works a little differently, but it adds an important capability to regular expressions. A word boundary is the position between a word and a space. A non-word boundary is any other position. The following JScript expression matches the first three characters of the word 'Chapter' because they appear after a word boundary:
/\bCha/
For VBScript:
"\bCha"
The position of the '\b' operator is critical here. If it is at the beginning of the string to be matched, the expression looks for the match at the beginning of a word. If it is at the end of the string, the expression looks for the match at the end of a word. For example, the following expression matches 'ter' in the word 'Chapter' because it appears before a word boundary:
/ter\b/
For VBScript:
"ter\b"
The following expression matches 'apt' in 'Chapter', where it occurs in the middle of the word, but does not match 'apt' in 'aptitude':
/\Bapt/
For VBScript:
"\Bapt"
This is because 'apt' appears at a non-word boundary in the word 'Chapter' but at a word boundary in the word 'aptitude'. Unlike '\b', the non-word boundary operator does not tie the match to the beginning or end of a word. It only requires that the position where it appears is not a word boundary.
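The word boundary and non-word boundary expressions above can be checked with a short JavaScript sketch. The variable names are illustrative only.

```javascript
const word = "Chapter";

// \b before the text: the match must begin at the start of a word.
const atStart = word.match(/\bCha/);
console.log(atStart[0], atStart.index); // Cha 0

// \b after the text: the match must finish at the end of a word.
const atEnd = word.match(/ter\b/);
console.log(atEnd[0], atEnd.index); // ter 4

// Moving \b to the other side makes both patterns fail on this word.
const movedStart = /\bter/.test(word);
const movedEnd = /Cha\b/.test(word);
console.log(movedStart, movedEnd); // false false

// \B requires that the position is not a word boundary.
const inner = /\Bapt/;
const inChapter = inner.test("Chapter");   // 'apt' is preceded by 'h'
const inAptitude = inner.test("aptitude"); // 'apt' starts the word
console.log(inChapter, inAptitude); // true false
```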