VBS Tutorial: Introduction to Regular Expressions - Backreferences

Source: Internet
Author: User

Backreferences

One of the most important features of regular expressions is the ability to store part of a matched pattern for later reuse. Recall that placing parentheses around a regular expression pattern, or around part of a pattern, causes that part of the match to be stored in a temporary buffer. You can use the non-capturing metacharacters '?:', '?=', or '?!' to suppress the storage of that part of the regular expression.

Each captured submatch is stored in the order in which it is encountered, from left to right, in the regular expression pattern. The buffers that store the submatches are numbered consecutively starting at 1, up to a maximum of 99 subexpressions. Each buffer can be accessed with '\n', where n is a one- or two-digit decimal number that identifies a specific buffer.
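As a rough illustration of this numbering rule, the following JavaScript sketch (JScript-style syntax, assuming a modern runtime that provides console.log) uses two capturing groups and one non-capturing group. The sample name and the variable names are made up for the example.

```javascript
// Capturing groups are numbered by their opening parenthesis, left to right.
// (?:Mr|Ms) groups the alternatives but does not use up a buffer number.
var re = /(?:Mr|Ms)\. (\w+) (\w+), \2 \1/;

// \2 must repeat the second captured word and \1 the first.
var m = re.exec("Ms. Ada Lovelace, Lovelace Ada");

console.log(m[1]);     // "Ada"      (buffer 1)
console.log(m[2]);     // "Lovelace" (buffer 2)
console.log(m.length); // 3: the whole match plus two submatches
```

Because the title group is non-capturing, the first name lands in buffer 1 rather than buffer 2.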

One of the simplest and most useful applications of backreferences is locating two identical words that occur next to each other in a text. Consider the following sentence:

Is is the cost of of gasoline going up up?

As written, the sentence clearly contains several duplicated words. It would be useful to have a way to fix the sentence without searching for duplicates of every individual word. The following JScript regular expression uses a single subexpression to do this.

/\b([a-z]+) \1\b/gi

The equivalent VBScript expression is:

"\b([a-z]+) \1\b"

In this example, the subexpression is everything between the parentheses. The captured expression consists of one or more alphabetic characters, as specified by '[a-z]+'. The second part of the regular expression is a reference to the previously captured submatch, that is, the second occurrence of the word just matched by the parenthesized expression. '\1' specifies the first submatch. The word boundary metacharacters ensure that only whole words are detected. Without them, phrases such as "is issued" or "this is" would be incorrectly identified by this expression.
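The effect of the word boundaries can be checked with a small JavaScript sketch. The sample text and variable names below are illustrative only, and console.log assumes a modern runtime rather than a classic JScript host.

```javascript
var withBoundaries = /\b([a-z]+) \1\b/gi;
var withoutBoundaries = /([a-z]+) \1/gi;

// No word is actually repeated in this text.
var text = "This is issued today";

// With \b on both sides, nothing matches, so match() returns null.
console.log(text.match(withBoundaries));

// Without \b, the "is" inside "This" pairs with the following "is",
// giving a false positive.
console.log(text.match(withoutBoundaries)); // [ 'is is' ]
```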

In the JScript expression, the global flag ('g') following the regular expression indicates that the expression is applied to as many matches as it can find in the input string. Case insensitivity is specified by the case-insensitivity flag ('i') at the end of the expression. The multiline flag ('m') makes '^' and '$' match at the start and end of each line rather than only at the start and end of the whole string; the pattern here uses neither, so the flag does not change its result. In VBScript, flags cannot be set in the expression itself; they must be set explicitly through properties of the RegExp object.
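To see what the 'g' and 'i' flags contribute, the following JavaScript sketch applies the same pattern to the sample sentence with different flag combinations. The variable names are chosen for this example only.

```javascript
var ss = "Is is the cost of of gasoline going up up?";

// No 'g': only the first repeated pair is replaced.
var firstOnly = ss.replace(/\b([a-z]+) \1\b/i, "$1");

// 'g' and 'i': every repeated pair is replaced, ignoring case.
var all = ss.replace(/\b([a-z]+) \1\b/gi, "$1");

// 'g' without 'i': "Is is" is skipped because 'I' is not in [a-z].
var caseSensitive = ss.replace(/\b([a-z]+) \1\b/g, "$1");

console.log(firstOnly);     // "Is the cost of of gasoline going up up?"
console.log(all);           // "Is the cost of gasoline going up?"
console.log(caseSensitive); // "Is is the cost of gasoline going up?"
```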

Using the regular expression shown above, the following JScript code uses the submatch information to replace each pair of consecutive identical words in a text string with a single occurrence of that word:

var ss = "Is is the cost of of gasoline going up up?.\n";
var re = /\b([a-z]+) \1\b/gim;   // Create the regular expression pattern.
var rv = ss.replace(re, "$1");   // Replace each pair of identical words with one word.

The closest equivalent VBScript code is as follows:

Dim ss, re, rv
ss = "Is is the cost of of gasoline going up up?." & vbNewLine
Set re = New RegExp
re.Pattern = "\b([a-z]+) \1\b"
re.Global = True
re.IgnoreCase = True
re.MultiLine = True
rv = re.Replace(ss, "$1")

Note that in the VBScript code, the global, case-insensitivity, and multiline flags are set using the correspondingly named properties of the RegExp object.

In the Replace method, $1 refers to the first saved submatch. If there is more than one submatch, refer to the others consecutively as $2, $3, and so on.
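A minimal JavaScript sketch of a replacement that uses more than one submatch follows. The "Last, First" input format and the variable names are assumptions made for illustration.

```javascript
// Two capturing groups: $1 is the word before the comma, $2 the word after it.
var re = /(\w+), (\w+)/;
var name = "Lovelace, Ada";

// Swap the two submatches in the replacement text.
var swapped = name.replace(re, "$2 $1");

console.log(swapped); // "Ada Lovelace"
```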

Another use of captured submatches is to break a Uniform Resource Identifier (URI) into its component parts. Assume that you want to break the following URI down into the protocol (ftp, http, and so on), the domain address, and the page/path:

http://msdn.microsoft.com:80/scripting/default.htm

The following regular expressions provide this functionality. For JScript:

/(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/

For VBScript:

"(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)"

The first parenthesized subexpression captures the protocol part of the web address. It matches any word that precedes a colon and two forward slashes. The second parenthesized subexpression captures the domain address. It matches one or more characters other than '/' or ':' (the leading '^' inside the brackets negates the character class; it is not itself one of the excluded characters). The third parenthesized subexpression captures the port number, if one is specified. It matches a colon followed by zero or more digits. Finally, the fourth parenthesized subexpression captures the path and/or page information specified by the web address. It matches zero or more characters other than '#' or the space character.

After applying the regular expression to the URI shown above, the submatches contain the following:

RegExp.$1 contains "http"

RegExp.$2 contains "msdn.microsoft.com"

RegExp.$3 contains ":80"

RegExp.$4 contains "/scripting/default.htm"
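The same decomposition can be sketched in JavaScript by reading the submatches from the array that exec returns instead of the RegExp.$n properties. The second URI (ftp://example.com/pub) and the variable names are hypothetical and serve only to show what happens when no port is given; the undefined result for the unmatched group assumes a modern JavaScript engine.

```javascript
var re = /(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/;

var m = re.exec("http://msdn.microsoft.com:80/scripting/default.htm");
console.log(m[1]); // "http"
console.log(m[2]); // "msdn.microsoft.com"
console.log(m[3]); // ":80"
console.log(m[4]); // "/scripting/default.htm"

// Without a port, the optional third group does not participate in the match.
var noPort = re.exec("ftp://example.com/pub");
console.log(noPort[2]); // "example.com"
console.log(noPort[3]); // undefined
console.log(noPort[4]); // "/pub"
```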
