Detailed usage of the VBScript RegExp object

Source: Internet
Author: User
http://blog.csdn.net/yan11cn/article/details/5004279

Refer to the blog of Tom

 

The RegExp object provides simple regular expression support in VBScript. All properties and methods related to regular expressions in VBScript are associated with this object.

Dim re
Set re = New RegExp

This object has three properties and three methods, as shown in Table 9-1.

Table 9-1

Properties    Methods
Global        Execute
IgnoreCase    Replace
Pattern       Test

The following sections describe these properties and methods in depth, along with the regular expression characters that you will use in patterns.

1. Global property

The Global property sets or returns a Boolean value that indicates whether the pattern should match every occurrence in the entire search string or only the first one (see Table 9-2).

Table 9-2

Code      object.Global [= value]
Object    Always a RegExp object
Value     True or False

If Global is True, the search applies to the entire string; if it is False, the search stops after the first match. The default value is False, not True as some Microsoft documentation states.

The following example uses the Global property so that every "in" at the start of a word is replaced.

Dim re, s
Set re = New RegExp
re.Pattern = "\bin"
re.Global = True
s = "The rain in Spain falls mainly on the plains."
MsgBox re.Replace(s, "in the country of")
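
As a minimal sketch of the difference the Global property makes, the following runs the same replacement twice, once with each setting. The pattern "ain" and the replacement text "AIN" are illustrative choices, not part of the example above.

```vbscript
Dim re, s
Set re = New RegExp
re.Pattern = "ain"   ' illustrative pattern (an assumption for this sketch)
s = "The rain in Spain falls mainly on the plains."

re.Global = False    ' the default: only the first match is replaced
MsgBox re.Replace(s, "AIN")
' Shows: The rAIN in Spain falls mainly on the plains.

re.Global = True     ' every match in the string is replaced
MsgBox re.Replace(s, "AIN")
' Shows: The rAIN in SpAIN falls mAINly on the plAINs.
```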

2. IgnoreCase property

The IgnoreCase property sets or returns a Boolean value that indicates whether the pattern match is case-sensitive (see Table 9-3).

Table 9-3

Code      object.IgnoreCase [= value]
Object    Always a RegExp object
Value     True or False

If IgnoreCase is False, the search is case-sensitive; if it is True, the search ignores case. The default value is False, not True as some Microsoft documentation states.

Let's continue with the example from the Global property. If the string to be matched contains a capitalized "In", we must tell VBScript to ignore case when matching.

Dim re, s
Set re = New RegExp
re.Pattern = "\bin"
re.Global = True
re.IgnoreCase = True
s = "The rain in Spain falls mainly on the plains."
MsgBox re.Replace(s, "in the country of")
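
To make the effect of IgnoreCase visible, here is a small sketch that counts matches with each setting. The sample string is hypothetical and was chosen so that it contains both "In" and "in" at the start of a word.

```vbscript
Dim re, s
Set re = New RegExp
re.Pattern = "\bin"
re.Global = True
s = "In Spain, the rain falls in spring."   ' hypothetical sample string

re.IgnoreCase = False             ' the default: "In" does not match "\bin"
MsgBox re.Execute(s).Count        ' 1 (only the lowercase word "in")

re.IgnoreCase = True              ' now the capitalized "In" matches as well
MsgBox re.Execute(s).Count        ' 2
```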

3. Pattern property

The Pattern property sets or returns the regular expression used for the search (see Table 9-4).

All of the preceding examples use Pattern.

Dim re, s
Set re = New RegExp
re.Pattern = "\bin"
re.Global = True
s = "The rain in Spain falls mainly on the plains."
MsgBox re.Replace(s, "in the country of")

Table 9-4

Code            object.Pattern [= "searchstring"]
Object          Always a RegExp object
searchstring    The regular expression string to search for. It may contain any of the regular expression characters. Optional.

4. Regular expression characters

The power of regular expressions does not come from using plain strings as patterns, but from using special characters in the pattern. Table 9-5 lists these characters and what they do.

An uppercase special character means the opposite of its lowercase counterpart (for example, \D is the opposite of \d).

Table 9-5

Character      Description
\              Marks the next character as either a special character or a literal.
^              Matches the beginning of input.
$              Matches the end of input.
*              Matches the preceding character zero or more times.
+              Matches the preceding character one or more times.
?              Matches the preceding character zero or one time.
.              Matches any single character except a newline.
(pattern)      Matches pattern and remembers the match. The matched substring can be retrieved from the resulting Matches collection using Item [0]...[n]. To match the parentheses themselves, precede them with a backslash: "\(" or "\)".
(?:pattern)    Matches pattern but does not capture the match, that is, the match is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, "anomal(?:y|ies)" is more economical than "anomaly|anomalies".
(?=pattern)    Positive lookahead: matches at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, "Windows (?=95|98|NT|2000|XP|Vista)" matches "Windows " in "Windows Vista" but not in "Windows 3.1".
(?!pattern)    Negative lookahead, the opposite of the previous entry: matches at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, "Windows (?!95|98|NT|2000|XP|Vista)" matches "Windows " in "Windows 3.1" but not in "Windows Vista".
x|y            Matches either x or y.
{n}            Matches exactly n times (n must be a nonnegative integer).
{n,}           Matches at least n times (n must be a nonnegative integer; note the trailing comma).
{n,m}          Matches at least n and at most m times (m and n must both be nonnegative integers).
[xyz]          Matches any one of the enclosed characters (xyz represents a character set).
[^xyz]         Matches any character that is not enclosed (^xyz represents the complement of a character set).
[a-z]          Matches any character in the specified range (a-z represents a range of characters).
[^m-z]         Matches any character outside the specified range (^m-z represents the complement of the specified range).
\b             Matches a word boundary, that is, the position between a word and a space.
\B             Matches a nonword boundary.
\d             Matches a digit. Equivalent to "[0-9]".
\D             Matches a nondigit. Equivalent to "[^0-9]".
\f             Matches a form feed.
\n             Matches a newline.
\r             Matches a carriage return.
\s             Matches any whitespace, including space, tab, and form feed. Equivalent to "[ \f\n\r\t\v]".
\S             Matches any nonwhitespace character. Equivalent to "[^ \f\n\r\t\v]".
\t             Matches a tab.
\v             Matches a vertical tab.
\w             Matches any letter, digit, or underscore. Equivalent to "[A-Za-z0-9_]".
\W             Matches any character that is not a letter, digit, or underscore. Equivalent to "[^A-Za-z0-9_]".
\.             Matches .
\|             Matches |
\{             Matches {
\}             Matches }
\\             Matches \
\[             Matches [
\]             Matches ]
\(             Matches (
\)             Matches )
$num           Refers to remembered match num, where num is a positive integer. A reference back to a remembered match.
\n             Matches n, where n is an octal escape value. Octal escape values must be 1, 2, or 3 digits long.
\uxxxx         Matches the character whose Unicode value is given by the four hexadecimal digits xxxx.
\xn            Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long.
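
The non-capturing and lookahead entries in Table 9-5 are the hardest to picture, so here is a minimal sketch that exercises them with the Test method. It assumes a script engine recent enough to support these constructs (VBScript 5.5 or later); the test strings are illustrative.

```vbscript
Dim re
Set re = New RegExp

' Non-capturing group: one pattern covers both spellings.
re.Pattern = "anomal(?:y|ies)"
MsgBox re.Test("several anomalies")     ' True
MsgBox re.Test("nothing unusual")       ' False

' Positive lookahead: "Windows " must be followed by one of the listed names.
re.Pattern = "Windows (?=95|98|NT|2000|XP|Vista)"
MsgBox re.Test("Windows Vista")         ' True
MsgBox re.Test("Windows 3.1")           ' False

' Negative lookahead: "Windows " must NOT be followed by one of the listed names.
re.Pattern = "Windows (?!95|98|NT|2000|XP|Vista)"
MsgBox re.Test("Windows Vista")         ' False
MsgBox re.Test("Windows 3.1")           ' True
```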

Most of these codes need little explanation, but a few examples will make some of them easier to understand.

Matching a class of characters

You have already seen a simple pattern:

re.Pattern = "in"

You will often need to match a whole class of characters. To do so, place the characters to be matched inside square brackets. For example, the following code replaces a single digit with a more general term.

Dim re, s
Set re = New RegExp
re.Pattern = "[123456789]"
s = "Spain received 3 millimeters of rain last week."
MsgBox re.Replace(s, "success")

The output of this code is shown in Figure 9-11.

Figure 9-11

In this example, the digit "3" is replaced with the text "success". As you would expect, you can shorten this pattern by specifying a range. The following pattern does the same job as the previous one.

Dim re, s
Set re = New RegExp
re.Pattern = "[1-9]"
s = "Spain received 3 millimeters of rain last week."
MsgBox re.Replace(s, "success")

Replacing digits and non-digits

Replacing digits is a common task. In fact, the pattern [0-9] (which covers all digits) is used so often that it has an equivalent shortcut: \d.

Dim re, s
Set re = New RegExp
re.Pattern = "\d"
s = "a b c d e f 1 g 2 h ... 10 z"
MsgBox re.Replace(s, "a number")

The resulting string is shown in Figure 9-12.

Figure 9-12

What if you want to match non-digit characters? Use the ^ symbol inside square brackets.

Outside square brackets, ^ has a completely different meaning, which is discussed later.

The following patterns can therefore be used to match non-digit characters:

re.Pattern = "[^0-9]"   ' the hard way
re.Pattern = "[^\d]"    ' a little shorter
re.Pattern = "\D"       ' another of those special characters

The last pattern uses another special character. In most cases these special characters merely save typing (or serve as a handy mnemonic), but some of them are genuinely useful, for example those that match tabs and other nonprintable characters.

Anchoring and shortening patterns

Three special characters are used to anchor a pattern. They do not match any characters themselves, but they require the rest of the pattern to appear at the beginning of the input (^ used outside []), at the end of the input ($), or at a word boundary (\b, which you have already seen).
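
As a small sketch of anchoring, the following uses ^ and $ with the Test method. The patterns and test strings are illustrative assumptions, not taken from the original examples.

```vbscript
Dim re
Set re = New RegExp

' ^ and $ together: the whole input must consist of digits.
re.Pattern = "^\d+$"
MsgBox re.Test("12345")                 ' True
MsgBox re.Test("123 45")                ' False: the space breaks the run of digits

' ^ alone: the pattern must appear at the beginning of the input.
re.Pattern = "^Spain"
MsgBox re.Test("Spain received rain")   ' True
MsgBox re.Test("Rain in Spain")         ' False: "Spain" is not at the start
```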

Another way to shorten a pattern is to use a repeat count. The basic idea is to specify the number of repetitions after the pattern. For example, the following pattern matches a multi-digit number and replaces it, as shown in Figure 9-13.

Dim re, s
Set re = New RegExp
re.Pattern = "\d{3}"
s = "Spain received 100 millimeters of rain in the last 2 weeks."
MsgBox re.Replace(s, "a whopping number of")

 

Figure 9-13

If the code does not use a repeat count, "00" is left in the resulting string, as Figure 9-14 shows.

 

Figure 9-14

Dim re, s
Set re = New RegExp
re.Pattern = "\d"
s = "Spain received 100 millimeters of rain in the last 2 weeks."
MsgBox re.Replace(s, "a whopping number of")

Note that you cannot simply set re.Global = True here, because the result would then contain four copies of "a whopping number of". The result is shown in Figure 9-15.

 

Figure 9-15

Dim re, s
Set re = New RegExp
re.Global = True
re.Pattern = "\d"
s = "Spain received 100 millimeters of rain in the last 2 weeks."
MsgBox re.Replace(s, "a whopping number of")

Specifying a minimum or a range for the match count

You can specify a minimum number of matches {min,} or a range {min,max}. Some frequently used repeat counts also have special shortcuts.

re.Pattern = "\d+"   ' one or more digits, same as \d{1,}
re.Pattern = "\d*"   ' zero or more digits, same as \d{0,}
re.Pattern = "\d?"   ' optional: zero or one digit, same as \d{0,1}

Dim re, s
Set re = New RegExp
re.Global = True
re.Pattern = "\d+"
s = "Spain received 100 millimeters of rain in the last 2 weeks."
MsgBox re.Replace(s, "a number")

The output of this code is shown in Figure 9-16. Note that the whole string "100" is replaced.

 

Figure 9-16

Dim re, s
Set re = New RegExp
re.Global = True
re.Pattern = "\d*"
s = "Spain received 100 millimeters of rain in the last 2 weeks."
MsgBox re.Replace(s, "a number")

The output of this code is shown in Figure 9-17. Because \d* also matches zero digits, the replacement text is inserted between non-digit characters, and the numbers themselves are replaced.

 

Figure 9-17

Dim re, s
Set re = New RegExp
re.Global = True
re.Pattern = "\d?"
s = "Spain received 100 millimeters of rain in the last 2 weeks."
MsgBox re.Replace(s, "a number")

The output of this code is shown in Figure 9-18. "a number" is inserted between non-digit characters, and the digits themselves are replaced.

 

Figure 9-18
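
The examples in this subsection use only the shortcut forms. As a minimal sketch of the explicit {min,max} range form, the following lists every run of two or three digits; the sample string and the bracketed output format are illustrative.

```vbscript
Dim re, s, match, found
Set re = New RegExp
re.Global = True
re.Pattern = "\d{2,3}"           ' at least two and at most three digits
s = "1 22 333 4444"              ' hypothetical sample string

For Each match In re.Execute(s)
    found = found & "[" & match.Value & "] "
Next
MsgBox found
' Shows: [22] [333] [444]
```

The single digit "1" is too short to match, and "4444" yields only "444" because the one digit left over does not reach the minimum of two.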

Remembering matches

The last special characters to discuss are those that remember matches. They are useful when you want to reuse part or all of the matched text in the replacement text; see the Replace method for an example that uses remembered matches.

To demonstrate this, and to bring together everything covered so far about special characters, let's do something practical: search a string and find the URLs in it. To keep the complexity and size of the example under control, we look only for the "http:" protocol, but we do handle a wide variety of DNS names, including any number of domain levels. Don't worry if you don't "speak DNS"; knowing how to type a URL into a browser is enough.

The code below uses another method of the RegExp object, which the next section covers in more detail. For now, you only need to know that Execute runs the pattern match and returns each match through a collection. Here is the code:

Dim re, s, colMatches, match
Set re = New RegExp
re.Global = True
re.Pattern = "http://(\w+[\w-]*\w+\.)*\w+"
s = "http://www.kingsley-hughes.com is a valid Web address. And so is"
s = s & vbCrLf & "http://www.wrox.com. And"
s = s & vbCrLf & "http://www.pc.ibm.com - even with 4 levels."
Set colMatches = re.Execute(s)
For Each match In colMatches
    MsgBox "Found valid URL: " & match.Value
Next

As you would expect, the real work is done by the line that sets the pattern. It looks a little daunting, but it is easy to follow. Let's break it down:

1. The pattern starts with the fixed string http://. The main part of the pattern is then enclosed in parentheses. Inside the parentheses, \w[\w-]*\w\. matches one level of a DNS name, including the trailing dot:

re.Pattern = "http://(\w[\w-]*\w\.)*\w+"

This part begins with the special character \w that you saw earlier, which matches [A-Za-z0-9_], that is, any English letter, any digit, or the underscore.

2. Next, square brackets are used to match letters, digits, or hyphens, because a DNS name may contain hyphens: [\w-]*. Why not use the same pattern as before? Simply because a valid DNS name cannot start or end with a hyphen. The * repeats this character class zero or more times.

3. Then comes strictly a letter or digit (\w), so that the name cannot end with a hyphen. The last element inside the parentheses, \., matches the dot that separates the levels of a DNS name.

The dot cannot be used on its own, because it is a special character that normally matches any character except a newline. Escape it with a backslash to match a literal dot.

4. With all of this wrapped in parentheses, you only need to repeat the group with *. The group (\w[\w-]*\w\.)* therefore matches every valid domain-name level together with the dot that follows it. In other words, it works through the complete DNS name one level at a time.

5. Finally, the pattern ends with \w+, the one or more characters required for the top-level domain (such as com, org, or edu).

5. Execute method

This method applies the regular expression to a string and returns a Matches collection. It is the trigger in the code that actually runs the pattern against the string. For details, see Table 9-6.

Table 9-6

Code      object.Execute(string)
Object    Always a RegExp object
string    The string to be searched. Required.

The Pattern property of the RegExp object supplies the regular expression for the search.

Dim re, s, colMatches, match
Set re = New RegExp
re.Global = True
re.Pattern = "http://(\w+[\w-]*\w+\.)*\w+"
s = "http://www.kingsley-hughes.com is a valid Web address. And so is"
s = s & vbCrLf & "http://www.wrox.com. And"
s = s & vbCrLf & "http://www.pc.ibm.com - even with 4 levels."
Set colMatches = re.Execute(s)
For Each match In colMatches
    MsgBox "Found valid URL: " & match.Value
Next

Note that some languages handle the result of a regular expression differently: there, the equivalent of Execute returns a Boolean value that indicates whether the pattern was found. Because of this difference, you will often see regular expression code ported from other languages that does not work in VBScript.

Some Microsoft documentation contained this error as well, but most of it has since been corrected.

Remember that the result of Execute is always a collection, even if it is an empty one. You can check for this with If re.Execute(s).Count = 0, or use the Test method, which is designed for exactly this purpose.
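
The following sketch shows one way to work with the collection that Execute returns: it checks Count for the empty case and then reads the Value, FirstIndex, and Length properties of each Match object. The pattern, the sample string, and the variable name report are illustrative; FirstIndex is zero-based.

```vbscript
Dim re, s, colMatches, match, report
Set re = New RegExp
re.Global = True
re.Pattern = "\d+"
s = "Spain received 100 millimeters of rain in the last 2 weeks."

Set colMatches = re.Execute(s)
If colMatches.Count = 0 Then
    ' Execute still returned a collection; it is simply empty.
    MsgBox "No match found."
Else
    For Each match In colMatches
        ' FirstIndex is the zero-based position of the match in s.
        report = report & match.Value & " at position " & match.FirstIndex _
               & ", length " & match.Length & vbCrLf
    Next
    MsgBox report
End If
```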

6. Replace method

This method replaces the text found by a regular expression search. For details, see Table 9-7.

Table 9-7

Code       object.Replace(string1, string2)
Object     Always a RegExp object
string1    The text string in which the replacement is to occur. Required.
string2    The replacement text string. Required.

The Replace method returns a copy of string1 in which the text matching RegExp.Pattern has been replaced with string2. If no match is found in the string, string1 is returned unchanged.

Dim re, s
Set re = New RegExp
re.Pattern = "http://(\w+[\w-]*\w+\.)*\w+"
s = "http://www.kingsley-hughes.com is a valid Web address. And so is"
s = s & vbCrLf & "http://www.wrox.com. And"
s = s & vbCrLf & "http://www.pc.ibm.com - even with 4 levels."
MsgBox re.Replace(s, "** top secret! **")

The output of this code is shown in Figure 9-19.

 

Figure 9-19

The Replace method can also substitute subexpressions of the pattern. To do this, use the special characters $1, $2, and so on in the replacement text. These "parameters" refer to the remembered matches.

7. Backreferencing

A remembered match is simply a part of the pattern that has been stored. This is known as backreferencing. Use parentheses to mark the parts that you want stored in a temporary buffer. Each captured submatch is stored in the order in which it is encountered (left to right in the regular expression pattern). The buffers are numbered starting from 1, up to a maximum of 99, and can be accessed in order with $1, $2, and so on.

You can use the non-capturing metacharacters ("?:", "?=", or "?!") to skip storing parts of the regular expression.

In the following example, the first five words (each consisting of one or more non-whitespace characters) are remembered, but only four of them appear in the replacement text:

Dim re, s
Set re = New RegExp
re.Pattern = "(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)"
s = "VBScript is not very cool."
MsgBox re.Replace(s, "$1 $2 $4 $5")

The output of this code is shown in Figure 9-20.

 

Figure 9-20

Note that this code adds one (\S+)\s+ pair for each word in the string. This gives the code better control over how the string is processed, and it prevents the tail of the string from simply being appended to the displayed result. When you use backreferencing, always check that the output is what you intended!
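
As a shorter sketch of the same idea, the following swaps two words by reversing $1 and $2 in the replacement text, and then reads the same remembered values through the SubMatches collection of a Match object. The sample string is illustrative, and SubMatches is assumed to be available (it was introduced with VBScript 5.5).

```vbscript
Dim re, s, firstMatch
Set re = New RegExp
re.Pattern = "(\S+)\s+(\S+)"
s = "hello world"                 ' hypothetical sample string

' $1 and $2 refer to the first and second parenthesized groups.
MsgBox re.Replace(s, "$2 $1")
' Shows: world hello

' The same remembered values are exposed on the Match object.
' SubMatches is zero-based, so SubMatches(0) corresponds to $1.
Set firstMatch = re.Execute(s)(0)
MsgBox firstMatch.SubMatches(0)   ' hello
MsgBox firstMatch.SubMatches(1)   ' world
```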

8. Test method

The Test method runs a regular expression search against a string and returns a Boolean value that indicates whether a match was found. See Table 9-8.

Table 9-8

Code      object.Test(string)
Object    Always a RegExp object
string    The text string on which the regular expression search is performed. Required.

The Test method returns True if a match is found and False otherwise. It is well suited to determining whether a string contains a given pattern. Note that you often need to make the pattern case-insensitive, as the following example does:

Dim re, s
Set re = New RegExp
re.IgnoreCase = True
re.Pattern = "http://(\w+[\w-]*\w+\.)*\w+"
s = "Some long string with http://www.wrox.com buried in it."
If re.Test(s) Then
    MsgBox "Found a URL."
Else
    MsgBox "No URL found."
End If

The output of this code is shown in Figure 9-21.
