.NET Framework Class Library: Regex Class

Article source: MSDN, .NET Framework Class Library, Regex Class: http://msdn.microsoft.com/zh-cn/library/system.text.regularexpressions.regex

 

Static and instance methods

Performing regular expression operations

Example

1. Use a regular expression to check for repeated words in a string

2. Use a regular expression to check whether a string represents a currency value or has the correct format to represent one

3. Extract information about repeated words in a string (self-written)

 

The Regex class represents the regular expression engine of the .NET Framework. It can be used to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete text substrings; or to add the extracted strings to a collection in order to generate a report.

 

Static and instance methods

After you define a regular expression pattern, you can provide it to the regular expression engine in either of two ways.

      1. Instantiate a Regex object that represents the regular expression. To do this, pass the regular expression pattern to a Regex constructor. A Regex object is immutable: once you instantiate it with a regular expression, you cannot change that object's regular expression.

      2. Supply both the regular expression and the text to search to a static (Shared in Visual Basic) Regex method. This lets you use a regular expression without explicitly creating a Regex object.

All Regex pattern-matching methods have both static and instance overloads.
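The two ways described above can be sketched as follows. This is only a minimal illustration: the class name, the helper method names, and the \d+ pattern are assumptions chosen for the example and do not come from the MSDN page.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticVersusInstance
{
    // Hypothetical pattern used only for illustration: a run of decimal digits.
    private const string DigitsPattern = @"\d+";

    // Way 1: pass the pattern to the Regex constructor, then call an instance method.
    public static string FirstNumberViaInstance(string input)
    {
        Regex rx = new Regex(DigitsPattern);
        return rx.Match(input).Value;
    }

    // Way 2: pass the pattern and the input together to the static overload.
    public static string FirstNumberViaStatic(string input)
    {
        return Regex.Match(input, DigitsPattern).Value;
    }

    public static void Main()
    {
        string input = "Order 66 shipped in 3 boxes";
        Console.WriteLine(FirstNumberViaInstance(input));   // prints 66
        Console.WriteLine(FirstNumberViaStatic(input));     // prints 66
    }
}
```

Both helpers return the same result; they differ only in who owns the Regex object.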

The regular expression engine must compile a particular pattern before that pattern can be used. Because Regex objects are immutable, this is a one-time procedure that occurs when a Regex class constructor or a static method is called. To avoid compiling the same regular expression repeatedly, the regular expression engine caches the compiled regular expressions used in static method calls. As a result, regular expression pattern matching offers comparable performance for static and instance methods.

Important

In the .NET Framework versions 1.0 and 1.1, all compiled regular expressions were cached, whether they were used in instance or static method calls. Starting with the .NET Framework 2.0, only regular expressions used in static method calls are cached.

However, the caching system implemented by the regular expression engine can adversely affect performance in the following two cases:

      1. When a large number of regular expressions are used in static method calls. By default, the regular expression engine caches the 15 most recently used static regular expressions. If the application uses more than 15 static regular expressions, some of them must be recompiled. To prevent this recompilation, you can increase the Regex.CacheSize property to an appropriate value.

      2. When the application instantiates new Regex objects with regular expressions that have previously been compiled. Because regular expressions used in instance methods are not cached, each new object compiles its pattern again; instantiating the Regex object once and reusing it avoids this cost.
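A hedged sketch of how both cases might be handled is shown below. Regex.CacheSize is the real static property mentioned above; the class name, the helper names, the value 50, and the \w+ pattern are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CacheSizeSketch
{
    // Case 1: raise the static-method cache limit when the application
    // is known to use many different static patterns.
    public static int EnsureCacheSize(int wanted)
    {
        if (Regex.CacheSize < wanted)
        {
            Regex.CacheSize = wanted;
        }
        return Regex.CacheSize;
    }

    // Case 2: keep one Regex instance in a field instead of constructing
    // a new object (and recompiling the pattern) on every call.
    private static readonly Regex WordRegex = new Regex(@"\w+");

    public static int CountWords(string input)
    {
        return WordRegex.Matches(input).Count;
    }

    public static void Main()
    {
        Console.WriteLine("Cache size before: {0}", Regex.CacheSize);
        Console.WriteLine("Cache size after:  {0}", EnsureCacheSize(50));
        Console.WriteLine("Words: {0}", CountWords("one two three"));
    }
}
```

The value 50 is arbitrary; an appropriate size depends on how many distinct patterns the application passes to static methods.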

 

Performing regular expression operations

Whether you decide to instantiate a Regex object and call its methods or to call static methods, the Regex class offers the following pattern-matching functionality:

      1. Validation of a match. You call the IsMatch method to determine whether a match is present.

      2. Retrieval of a single match. You call the Match method to retrieve a Match object that represents the first match in a string or in part of a string. Subsequent matches can be retrieved by calling the Match.NextMatch method.

      3. Retrieval of all matches. You call the Matches method to retrieve a System.Text.RegularExpressions.MatchCollection object that represents all the matches found in a string or in part of a string.

      4. Replacement of matched text. You call the Replace method to replace matched text. The replacement text can also be defined by a regular expression. In addition, some of the Replace overloads include a MatchEvaluator parameter that lets you define the replacement text programmatically.

      5. Creation of a string array formed from parts of an input string. You call the Split method to split an input string at the positions defined by the regular expression.
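The five operations listed above can be sketched with one small pattern. The class name, helper names, the \d+ pattern, and the sample input are assumptions for illustration; only the Regex, Match, and MatchCollection members come from the class library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OperationsSketch
{
    // Hypothetical pattern: a run of decimal digits.
    private static readonly Regex Number = new Regex(@"\d+");

    // 1. Validation of a match.
    public static bool HasNumber(string input)
    {
        return Number.IsMatch(input);
    }

    // 2. Retrieval of a single match, then the one that follows it.
    public static string FirstTwoNumbers(string input)
    {
        Match first = Number.Match(input);
        Match second = first.NextMatch();
        return first.Value + "," + second.Value;
    }

    // 3. Retrieval of all matches.
    public static int CountNumbers(string input)
    {
        return Number.Matches(input).Count;
    }

    // 4. Replacement, with the replacement text computed by a MatchEvaluator.
    public static string DoubleNumbers(string input)
    {
        return Number.Replace(input, m => (int.Parse(m.Value) * 2).ToString());
    }

    // 5. Splitting the input at each match.
    public static string[] SplitOnNumbers(string input)
    {
        return Number.Split(input);
    }

    public static void Main()
    {
        string input = "a1b22c333d";
        Console.WriteLine(HasNumber(input));                          // True
        Console.WriteLine(FirstTwoNumbers(input));                    // 1,22
        Console.WriteLine(CountNumbers(input));                       // 3
        Console.WriteLine(DoubleNumbers(input));                      // a2b44c666d
        Console.WriteLine(string.Join("|", SplitOnNumbers(input)));   // a|b|c|d
    }
}
```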

In addition to its pattern-matching methods, the Regex class includes several special-purpose methods:

      1. The Escape method escapes any characters that may be interpreted as regular expression operators in a regular expression or input string.

      2. The Unescape method removes these escape characters.

      3. The CompileToAssembly method creates an assembly that contains predefined regular expressions. The .NET Framework contains examples of these special-purpose assemblies in the System.Web.RegularExpressions namespace.
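As a small sketch of the first two methods, the code below searches for a literal string that happens to contain the + operator. The class name, helper name, and sample strings are assumptions for illustration; CompileToAssembly is left out of the sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSketch
{
    // Searches for 'literal' as plain text by escaping it before it is used as a pattern.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }

    public static void Main()
    {
        string literal = "1+1=2";
        string escaped = Regex.Escape(literal);

        // Unescaped, "+" is a quantifier, so the pattern does not match the literal text.
        Console.WriteLine(Regex.IsMatch("1+1=2", literal));        // False
        // Escaped, "+" is treated as an ordinary character.
        Console.WriteLine(ContainsLiteral("1+1=2", literal));      // True
        // Unescape reverses Escape.
        Console.WriteLine(Regex.Unescape(escaped) == literal);     // True
    }
}
```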

 

Example

1. The following example uses a regular expression to check for repeated occurrences of words in a string. The regular expression \b(?<word>\w+)\s+(\k<word>)\b can be interpreted as follows.

\b  Start the match at a word boundary.

(?<word>\w+)  Match one or more word characters, up to a word boundary. Name this captured group word.

\s+  Match one or more white-space characters.

(\k<word>)  Match the captured group that is named word.

\b  Match a word boundary.

    using System;
    using System.Text.RegularExpressions;

    public class Test
    {
        public static void Main()
        {
            // Define a regular expression for repeated words.
            Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b",
                                 RegexOptions.Compiled | RegexOptions.IgnoreCase);

            // Define a test string.
            string text = "The the quick brown fox  fox jumped over the lazy dog dog.";

            // Find matches.
            MatchCollection matches = rx.Matches(text);

            // Report the number of matches found.
            Console.WriteLine("{0} matches found in:\n   {1}", matches.Count, text);

            // Report on each match. Group 0 is the whole match; group 1 is the
            // unnamed group (\k<word>), that is, the second occurrence of the word.
            foreach (Match match in matches)
            {
                GroupCollection groups = match.Groups;
                Console.WriteLine("'{0}' repeated at positions {1} and {2}",
                                  groups["word"].Value, groups[0].Index, groups[1].Index);
            }
        }
    }
    // The example produces the following output to the console:
    //    3 matches found in:
    //       The the quick brown fox  fox jumped over the lazy dog dog.
    //    'The' repeated at positions 0 and 4
    //    'fox' repeated at positions 20 and 25
    //    'dog' repeated at positions 50 and 54

 

2. The following example uses a regular expression to check whether a string represents a currency value or has the correct format to represent one. In this case, the regular expression is built dynamically from the NumberFormatInfo.CurrencyDecimalSeparator, NumberFormatInfo.CurrencyDecimalDigits, NumberFormatInfo.CurrencySymbol, NumberFormatInfo.NegativeSign, and NumberFormatInfo.PositiveSign properties of the current culture. If the system's current culture is en-US, the resulting regular expression is ^\s*[\+-]?\s?\$?\s?(\d*\.?\d{2}?){1}$. This regular expression can be interpreted as follows.

^  Start at the beginning of the string.

\s*  Match zero or more white-space characters.

[\+-]?  Match zero or one occurrence of either the positive sign or the negative sign.

\s?  Match zero or one white-space character.

\$?  Match zero or one occurrence of the dollar sign.

\s?  Match zero or one white-space character.

\d*  Match zero or more decimal digits.

\.?  Match zero or one decimal point symbol.

\d{2}?  Match two decimal digits. The trailing ? makes the {2} quantifier lazy; it does not make the two digits optional, so a single-digit string such as 5 does not match, while -42 matches because its two digits satisfy \d{2}.

(\d*\.?\d{2}?){1}  Match the pattern of integral and fractional digits separated by a decimal point symbol exactly once.

$  Match the end of the string.

In this case, the regular expression assumes that a valid currency string does not contain group separator symbols, and that it has either no fractional digits or the number of fractional digits defined by the CurrencyDecimalDigits property of the current culture.

    using System;
    using System.Globalization;
    using System.Text.RegularExpressions;

    public class Example
    {
        public static void Main()
        {
            // Get the current NumberFormatInfo object to build the regular
            // expression pattern dynamically.
            NumberFormatInfo nfi = NumberFormatInfo.CurrentInfo;

            // Define the regular expression pattern.
            string pattern;
            pattern = @"^\s*[";
            // Get the positive and negative sign symbols.
            pattern += Regex.Escape(nfi.PositiveSign + nfi.NegativeSign) + @"]?\s?";
            // Get the currency symbol.
            pattern += Regex.Escape(nfi.CurrencySymbol) + @"?\s?";
            // Add integral digits to the pattern.
            pattern += @"(\d*";
            // Add the decimal separator.
            pattern += Regex.Escape(nfi.CurrencyDecimalSeparator) + "?";
            // Add the fractional digits.
            pattern += @"\d{";
            // Determine the number of fractional digits in currency values.
            pattern += nfi.CurrencyDecimalDigits.ToString() + "}?){1}$";

            Regex rgx = new Regex(pattern);

            // Define some test strings.
            string[] tests = { "-42", "19.99", "0.001", "100 USD",
                               ".34", "0.34", "1,052.21", "$10.62",
                               "+1.43", "-$0.23" };

            // Check each test string against the regular expression.
            foreach (string test in tests)
            {
                if (rgx.IsMatch(test))
                    Console.WriteLine("{0} is a currency value.", test);
                else
                    Console.WriteLine("{0} is not a currency value.", test);
            }
        }
    }
    // The example displays the following output (current culture en-US):
    //       -42 is a currency value.
    //       19.99 is a currency value.
    //       0.001 is not a currency value.
    //       100 USD is not a currency value.
    //       .34 is a currency value.
    //       0.34 is a currency value.
    //       1,052.21 is not a currency value.
    //       $10.62 is a currency value.
    //       +1.43 is a currency value.
    //       -$0.23 is a currency value.

Because the regular expression in this example is built dynamically, we do not know at design time whether the current culture's currency symbol, decimal sign, or positive and negative signs might be misinterpreted by the regular expression engine as regular expression language operators. To prevent any misinterpretation, the example passes each dynamically generated string to the Escape method.

 

3. Extract information about repeated words in a string (self-written)

    string text = "The the quick brown fox  fox over the lazy dog dog t2he 111 t2he 中文 2e 中文 _ef _ef.";
    Response.Write(text + "<br/>");

    // 1. Find the repeated words.
    Regex rx = new Regex(@"(?<word>\b\w+\b).*?(\k<word>)",
                         RegexOptions.Compiled | RegexOptions.IgnoreCase);
    MatchCollection matches = rx.Matches(text);
    foreach (Match match in matches)
    {
        string strResponse = "";
        string strRepeat = match.Groups["word"].Value;

        // 2. Search the whole text for the word found in step 1 and collect its positions.
        //    The word consists only of \w characters, so it is safe to use as a pattern.
        Regex rx2 = new Regex(strRepeat, RegexOptions.IgnoreCase);
        MatchCollection matches2 = rx2.Matches(text);
        strResponse += string.Format("The word {0} is repeated {1} times, at positions: ",
                                     strRepeat, matches2.Count);
        foreach (Match match2 in matches2)
        {
            strResponse += match2.Index + " ";
        }
        Response.Write("<br/>" + strResponse + "<br/>");
    }

    /*
     * Notes:
     *   \w   matches any word character.
     *   .    wildcard: matches any single character except \n.
     *   *?   repeats any number of times, but as few times as possible (lazy match).
     *
     * Output:
     *   The the quick brown fox  fox over the lazy dog dog t2he 111 t2he 中文 2e 中文 _ef _ef.
     *   The word The is repeated 3 times, at positions: 0 4 34
     *   The word fox is repeated 2 times, at positions: 20 25
     *   The word dog is repeated 2 times, at positions: 43 47
     *   The word t2he is repeated 2 times, at positions: 51 60
     *   The word 中文 is repeated 2 times, at positions: 65 71
     *   The word _ef is repeated 2 times, at positions: 74 78
     */
