C# Regular Expression Tutorial and Examples

For a while, learning regular expressions was very popular; at that time I could see several regular expression posts a day on CSDN. During that period I learned some basics from the forum and from the "C# String and Regular Expression Reference Manual" published by Wrox Press, and earned about 1000 points on CSDN. Today, when I went looking for the "C# String and Regular Expression Reference Manual", I found it was missing. I use regular expressions less often these days, so I am sorting out my old notes here so as not to forget them.

(1) "@" symbol
Although "@" is not a "member" of C# regular expression syntax, it usually appears together with C# regular expressions. "@" indicates that the string following it is a "verbatim string", which may not be easy to grasp at first. For example, the following two statements are equivalent:
    string x = "D:\\My Huang\\My Doc";
    string y = @"D:\My Huang\My Doc";
In fact, C# reports an error for the following declaration, because "\" is used in C# to introduce escape sequences such as "\n" (line feed), and "\M" is not a valid escape sequence:
    string x = "D:\My Huang\My Doc";

(2) Basic syntax characters.
\d Digits 0-9
\D The complement of \d (taking all characters as the universal set; the same applies below), that is, all non-digit characters
\w Word characters: uppercase and lowercase letters, digits 0-9, and the underscore
\W The complement of \w
\s Whitespace characters, including line feed \n, carriage return \r, tab \t, vertical tab \v, and form feed \f
\S The complement of \s
. Any character except line feed \n
[…] Matches any one of the characters listed in []
[^…] Matches any one character not listed in []
The following provides some simple examples:

    // The snippets in these notes assume:
    // using System; using System.Text.RegularExpressions;
    string i = "\n";
    string m = "3";
    Regex r = new Regex(@"\D");
    // same as Regex r = new Regex("\\D");
    // r.IsMatch(i) result: true
    // r.IsMatch(m) result: false

    // Separate snippet: character class
    string i = "%";
    string m = "3";
    Regex r = new Regex("[a-z0-9]");
    // matches lowercase letters or digits
    // r.IsMatch(i) result: false
    // r.IsMatch(m) result: true

(3) Positioning characters
A "positioning character" represents a virtual character, that is, a position. You can intuitively think of it as the tiny gap between one character and the next.
^ The characters after it must be at the beginning of the string
$ The characters before it must be at the end of the string
\b Matches a word boundary
\B Matches a non-word boundary
In addition, the characters after \A must be at the beginning of the string, the characters before \z must be at the very end of the string, and the characters before \Z must be at the end of the string or just before a final line break.
The following provides some simple examples:

    string i = "Live for nothing, die for something";
    Regex r1 = new Regex("^Live for nothing, die for something$");
    // r1.IsMatch(i) true
    Regex r2 = new Regex("^Live for nothing, die for some$");
    // r2.IsMatch(i) false
    Regex r3 = new Regex("^Live for nothing, die for some");
    // r3.IsMatch(i) true

    // Separate snippet: multiple lines. The line break is written explicitly as \r\n
    // so that the results do not depend on the line endings of the source file.
    string i = "Live for nothing,\r\ndie for something";
    Regex r1 = new Regex("^Live for nothing, die for something$");
    Console.WriteLine("r1 match count:" + r1.Matches(i).Count); // 0
    Regex r2 = new Regex("^Live for nothing, die for something$", RegexOptions.Multiline);
    Console.WriteLine("r2 match count:" + r2.Matches(i).Count); // 0
    Regex r3 = new Regex("^Live for nothing,\r\ndie for something$");
    Console.WriteLine("r3 match count:" + r3.Matches(i).Count); // 1
    Regex r4 = new Regex("^Live for nothing,$");
    Console.WriteLine("r4 match count:" + r4.Matches(i).Count); // 0
    Regex r5 = new Regex("^Live for nothing,$", RegexOptions.Multiline);
    Console.WriteLine("r5 match count:" + r5.Matches(i).Count); // 0
    Regex r6 = new Regex("^Live for nothing,\r\n$");
    Console.WriteLine("r6 match count:" + r6.Matches(i).Count); // 0
    Regex r7 = new Regex("^Live for nothing,\r\n$", RegexOptions.Multiline);
    Console.WriteLine("r7 match count:" + r7.Matches(i).Count); // 0
    Regex r8 = new Regex("^Live for nothing,\r$");
    Console.WriteLine("r8 match count:" + r8.Matches(i).Count); // 0
    Regex r9 = new Regex("^Live for nothing,\r$", RegexOptions.Multiline);
    Console.WriteLine("r9 match count:" + r9.Matches(i).Count); // 1
    Regex r10 = new Regex("^die for something$");
    Console.WriteLine("r10 match count:" + r10.Matches(i).Count); // 0
    Regex r11 = new Regex("^die for something$", RegexOptions.Multiline);
    Console.WriteLine("r11 match count:" + r11.Matches(i).Count); // 1
    Regex r12 = new Regex("^");
    Console.WriteLine("r12 match count:" + r12.Matches(i).Count); // 1
    Regex r13 = new Regex("$");
    Console.WriteLine("r13 match count:" + r13.Matches(i).Count); // 1
    Regex r14 = new Regex("^", RegexOptions.Multiline);
    Console.WriteLine("r14 match count:" + r14.Matches(i).Count); // 2
    Regex r15 = new Regex("$", RegexOptions.Multiline);
    Console.WriteLine("r15 match count:" + r15.Matches(i).Count); // 2
    Regex r16 = new Regex("^Live for nothing,\r$\n^die for something$", RegexOptions.Multiline);
    Console.WriteLine("r16 match count:" + r16.Matches(i).Count); // 1
    // For a multi-line string, once the Multiline option is set, ^ and $ can match multiple times.

    // Separate snippet: word boundaries
    string i = "Live for nothing, die for something";
    string m = "Live for nothing, die for some thing";
    Regex r1 = new Regex(@"\bthing\b");
    Console.WriteLine("r1 match count:" + r1.Matches(i).Count); // 0
    Regex r2 = new Regex(@"thing\b");
    Console.WriteLine("r2 match count:" + r2.Matches(i).Count); // 2
    Regex r3 = new Regex(@"\bthing\b");
    Console.WriteLine("r3 match count:" + r3.Matches(m).Count); // 1
    Regex r4 = new Regex(@"\bfor something\b");
    Console.WriteLine("r4 match count:" + r4.Matches(i).Count); // 1
    // \b is usually used to constrain a match to a complete word
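The \A, \z and \Z characters described above are not exercised by the examples. A minimal sketch of how they differ from ^ and $, assuming a made-up two-line input; the class name AnchorDemo and the helper Test are hypothetical and exist only for this illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Hypothetical sample input: two lines, ending with a final line feed.
    public const string Text = "line1\nline2\n";

    // Returns whether the pattern matches anywhere in Text.
    public static bool Test(string pattern, RegexOptions options = RegexOptions.None) =>
        Regex.IsMatch(Text, pattern, options);

    public static void Main()
    {
        // With Multiline, ^ also matches after a line feed, but \A still
        // means the very start of the string.
        Console.WriteLine(Test(@"^line2", RegexOptions.Multiline));   // True
        Console.WriteLine(Test(@"\Aline2", RegexOptions.Multiline));  // False

        // \Z tolerates one final line feed; \z requires the very end.
        Console.WriteLine(Test(@"line2\Z"));    // True
        Console.WriteLine(Test(@"line2\z"));    // False
        Console.WriteLine(Test(@"line2\n\z"));  // True
    }
}
```

In other words, \A and \z are useful when the Multiline option is set but a pattern must still be tied to the whole string rather than to a single line.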

(4) Repetition characters
"Repetition characters" are one of the places that show how powerful C# regular expressions are:
{n} Matches the preceding character exactly n times
{n,} Matches the preceding character n or more times
{n,m} Matches the preceding character between n and m times
? Matches the preceding character 0 or 1 time
+ Matches the preceding character 1 or more times
* Matches the preceding character 0 or more times
The following provides some simple examples:

    string x = "1024";
    string y = "+1024";
    string z = "1,024";
    string a = "1";
    string b = "-1024";
    string c = "10000";
    Regex r = new Regex(@"^\+?[1-9],?\d{3}$");
    Console.WriteLine("x match count:" + r.Matches(x).Count); // 1
    Console.WriteLine("y match count:" + r.Matches(y).Count); // 1
    Console.WriteLine("z match count:" + r.Matches(z).Count); // 1
    Console.WriteLine("a match count:" + r.Matches(a).Count); // 0
    Console.WriteLine("b match count:" + r.Matches(b).Count); // 0
    Console.WriteLine("c match count:" + r.Matches(c).Count); // 0
    // Matches an integer from 1000 to 9999.
    // http://www.cnblogs.com/sosoft/

(5) Select-one matching
The (|) construct in C# regular expressions does not seem to have a special name in my notes, so I call it "select-one matching"; it is more commonly known as alternation. In fact, something like [a-z] is also a kind of select-one matching, except that it can only match a single character, while (|) covers a wider range: (ab|xy) matches ab or xy. Note that "|" is normally used together with "()", which delimit the alternatives. The following provides some simple examples:

    string x = "0";
    string y = "0.23";
    string z = "100";
    string a = "100.01";
    string b = "9.9";
    string c = "99.9";
    string d = "99.";
    string e = "00.1";
    Regex r = new Regex(@"^\+?((100(\.0+)?)|([1-9]?[0-9])(\.\d+)?)$");
    Console.WriteLine("x match count:" + r.Matches(x).Count); // 1
    Console.WriteLine("y match count:" + r.Matches(y).Count); // 1
    Console.WriteLine("z match count:" + r.Matches(z).Count); // 1
    Console.WriteLine("a match count:" + r.Matches(a).Count); // 0
    Console.WriteLine("b match count:" + r.Matches(b).Count); // 1
    Console.WriteLine("c match count:" + r.Matches(c).Count); // 1
    Console.WriteLine("d match count:" + r.Matches(d).Count); // 0
    Console.WriteLine("e match count:" + r.Matches(e).Count); // 0
    // Matches a number from 0 to 100. The outer parentheses contain two parts,
    // "(100(\.0+)?)" and "([1-9]?[0-9])(\.\d+)?", in an "or" relationship: the
    // regular expression engine first tries to match 100; if that fails, it tries
    // the second expression (a number in the range [0, 100)).

(6) Matching of special characters
The following provides some simple examples:
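A minimal sketch, assuming this section is about matching characters that have a special meaning in a pattern (such as . $ + and \) literally. Such characters are escaped with a backslash, and Regex.Escape can escape a whole string. The class name SpecialCharDemo, its helper methods and the sample inputs are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecialCharDemo
{
    // "$" and "." are escaped so that they match literally, e.g. "$9.99".
    public static bool IsPrice(string s) => Regex.IsMatch(s, @"^\$\d+\.\d{2}$");

    // Regex.Escape escapes every metacharacter in the literal text.
    public static bool ContainsLiteral(string input, string literal) =>
        Regex.IsMatch(input, Regex.Escape(literal));

    public static void Main()
    {
        Console.WriteLine(IsPrice("$9.99"));                 // True
        Console.WriteLine(IsPrice("9x99"));                  // False
        Console.WriteLine(ContainsLiteral("a+b=c", "a+b"));  // True
        // Unescaped, "+" is a repetition character, so "a+b" means "one or more a, then b".
        Console.WriteLine(Regex.IsMatch("a+b=c", "a+b"));    // False
    }
}
```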

 

 

(7) Groups and non-capturing groups
The following provides some simple examples:

    string x = "Live for nothing, die for something";
    string y = "Live for nothing, die for somebody";
    Regex r = new Regex(@"^Live ([a-z]{3}) no([a-z]{5}), die \1 some\2$");
    Console.WriteLine("x match count:" + r.Matches(x).Count); // 1
    Console.WriteLine("y match count:" + r.Matches(y).Count); // 0
    // The regular expression engine remembers the content matched inside "()" as a
    // "group", which can be referenced by index. "\1" in the expression is a
    // backreference to the first group, that is, the content of the first pair of
    // parentheses, and "\2" likewise refers to the second group.

    // Separate snippet: reading a group's value
    string x = "Live for nothing, die for something";
    Regex r = new Regex(@"^Live for no([a-z]{5}), die for some\1$");
    if (r.IsMatch(x))
    {
        Console.WriteLine("group1 value:" + r.Match(x).Groups[1].Value); // output: thing
    }
    // Gets the content of the group. Note that this is Groups[1], because Groups[0]
    // is the entire matched string, here the whole content of the variable x.

    // Separate snippet: named group
    string x = "Live for nothing, die for something";
    Regex r = new Regex(@"^Live for no(?<g1>[a-z]{5}), die for some\1$");
    if (r.IsMatch(x))
    {
        Console.WriteLine("group1 value:" + r.Match(x).Groups["g1"].Value); // output: thing
    }
    // Indexes the group by name. A group name is declared with (?<groupname>...).

    // Separate snippet: referencing a group in a replacement
    string x = "Live for nothing nothing";
    Regex r = new Regex(@"([a-z]+) \1");
    if (r.IsMatch(x))
    {
        x = r.Replace(x, "$1");
        Console.WriteLine("var x:" + x); // output: Live for nothing
    }
    // Deletes the repeated "nothing" in the original string. Outside the expression,
    // "$1" references the first group. The following references the group by name:
    string x = "Live for nothing nothing";
    Regex r = new Regex(@"(?<g1>[a-z]+) \1");
    if (r.IsMatch(x))
    {
        x = r.Replace(x, "${g1}");
        Console.WriteLine("var x:" + x); // output: Live for nothing
    }

    // Separate snippet: non-capturing group
    string x = "Live for nothing";
    Regex r = new Regex(@"^Live for no(?:[a-z]{5})$");
    if (r.IsMatch(x))
    {
        Console.WriteLine("group1 value:" + r.Match(x).Groups[1].Value); // output: (empty string)
    }
    // Adding "?:" at the start of a group makes it a "non-capturing group", that is,
    // the engine does not save the content of this group.

(8) Greedy and non-greedy
The regular expression engine is greedy: as long as the pattern permits, it matches as many characters as possible. Adding "?" after a repetition character (*, +) changes the matching mode to non-greedy. See the following example:

    string x = "Live for nothing, die for something";
    Regex r1 = new Regex(@".*thing");
    if (r1.IsMatch(x))
    {
        Console.WriteLine("match:" + r1.Match(x).Value); // output: Live for nothing, die for something
    }
    Regex r2 = new Regex(@".*?thing");
    if (r2.IsMatch(x))
    {
        Console.WriteLine("match:" + r2.Match(x).Value); // output: Live for nothing
    }

(9) Backtracking and non-backtracking
Use "(?>...)" to disable backtracking. Because the regular expression engine is greedy, in some cases it has to backtrack in order to find a match. See the following example:

    string x = "Live for nothing, die for something";
    Regex r1 = new Regex(@".*thing,");
    if (r1.IsMatch(x))
    {
        Console.WriteLine("match:" + r1.Match(x).Value); // output: Live for nothing,
    }
    Regex r2 = new Regex(@"(?>.*)thing,");
    if (r2.IsMatch(x)) // does not match
    {
        Console.WriteLine("match:" + r2.Match(x).Value);
    }
    // In r1, ".*" is greedy and first matches all the way to the end of the string;
    // matching "thing," then fails. The engine backtracks, giving characters back
    // until "thing," can be matched.
    // In r2, backtracking is disabled, so the entire expression fails to match.

(10) Lookahead and lookbehind
Lookahead declaration format: positive declaration "(?=...)", negative declaration "(?!...)". The declaration itself is not part of the final matching result. See the following example:

    string x = "1024 used 2048 free";
    Regex r1 = new Regex(@"\d{4}(?= used)");
    if (r1.Matches(x).Count == 1)
    {
        Console.WriteLine("r1 match:" + r1.Match(x).Value); // output: 1024
    }
    Regex r2 = new Regex(@"\d{4}(?! used)");
    if (r2.Matches(x).Count == 1)
    {
        Console.WriteLine("r2 match:" + r2.Match(x).Value); // output: 2048
    }
    // The positive declaration in r1 requires the four digits to be followed by
    // " used"; the negative declaration in r2 requires that they not be followed
    // by " used".

Lookbehind declaration format: positive declaration "(?<=...)", negative declaration "(?<!...)". The declaration itself is not part of the final matching result. See the following example:

 

    string x = "used:1024 free:2048";
    Regex r1 = new Regex(@"(?<=used:)\d{4}");
    if (r1.Matches(x).Count == 1)
    {
        Console.WriteLine("r1 match:" + r1.Match(x).Value); // output: 1024
    }
    Regex r2 = new Regex(@"(?<!used:)\d{4}");
    if (r2.Matches(x).Count == 1)
    {
        Console.WriteLine("r2 match:" + r2.Match(x).Value); // output: 2048
    }
    // The positive lookbehind in r1 requires the four digits to be immediately
    // preceded by "used:"; the negative lookbehind in r2 requires that they be
    // preceded by something other than "used:".

(11) Hexadecimal character range
In a regular expression, you can use "\xXX" and "\uXXXX" to specify a character by its code ("X" stands for a hexadecimal digit):
\xXX A character whose code is in the range 0 to 255. For example, a space can be written as "\x20".
\uXXXX Any character can be written as "\u" followed by the 4-digit hexadecimal number of its code. For example, Chinese characters can be matched with "[\u4e00-\u9fa5]".
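A short sketch of both forms in use. The class name HexRangeDemo, its helper methods and the sample strings are hypothetical; the Chinese sample text is written with C# \u escapes so that the source file's encoding does not matter:

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexRangeDemo
{
    // \x20 is the space character (code 32).
    public static bool HasSpace(string s) => Regex.IsMatch(s, @"\x20");

    // True when every character lies in the range U+4E00 to U+9FA5.
    public static bool IsAllChinese(string s) => Regex.IsMatch(s, @"^[\u4e00-\u9fa5]+$");

    public static void Main()
    {
        Console.WriteLine(HasSpace("a b"));                                    // True
        Console.WriteLine(HasSpace("ab"));                                     // False
        Console.WriteLine(IsAllChinese("\u6b63\u5219\u8868\u8fbe\u5f0f"));     // True
        Console.WriteLine(IsAllChinese("regex\u6b63\u5219"));                  // False
    }
}
```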


(12) A relatively complete match for [0,100]
The following is a comprehensive example. For matching [0,100], the special cases to consider include:
* "00", "00.", "00.00" and "001.100" are all valid
* The empty string is invalid, a lone decimal point is invalid, and a value greater than 100 is invalid
* The value may carry a suffix; for example, "1.07f" indicates that the value is of type float (not considered here)

    Regex r = new Regex(@"^\+?0*(?:100(\.0*)?|(\d{0,2}(?=\.\d)|\d{1,2}(?=($|\.$)))(\.\d*)?)$");
    string x;
    // Read lines until "exit" is entered (or the input ends) and test each one
    while ((x = Console.ReadLine()) != null && x != "exit")
    {
        if (r.IsMatch(x))
        {
            Console.WriteLine(x + " succeed!");
        }
        else
        {
            Console.WriteLine(x + " failed!");
        }
    }

(13) Exact matching is sometimes difficult
In some cases it is difficult to achieve an exact match, for example for dates, URLs, and Email addresses; sometimes you even need to study specialized documents to write an accurate and complete expression. In such cases, you can only settle for second best and guarantee a reasonably exact match. For example, for dates you can restrict the expression to a short period of time based on the actual needs of the application, and for Email addresses you can consider only the most common forms.
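As a sketch of the "most common form" approach for Email addresses, assuming a deliberately simplified pattern that is illustrative only (it is not from these notes and is not a complete, standards-conforming address grammar). The class name EmailDemo and the method LooksLikeEmail are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailDemo
{
    // Simplified assumption: local part, "@", then two or more dot-separated labels.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$");

    public static bool LooksLikeEmail(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(LooksLikeEmail("user@example.com"));      // True
        Console.WriteLine(LooksLikeEmail("a.b@mail.example.org"));  // True
        Console.WriteLine(LooksLikeEmail("user@example"));          // False
        Console.WriteLine(LooksLikeEmail("no at sign"));            // False
    }
}
```

A pattern like this rejects some valid addresses and accepts some invalid ones, which is exactly the trade-off described above.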
