There was a time when learning regular expressions was all the rage, and on some days CSDN had several regular expression posts at once. Back then, with help from the forum and the "C# String and Regular Expression Reference Manual" published by Wrox Press, I picked up the basics and earned roughly 1000 points on CSDN along the way. When I went looking for the manual today, I could no longer find it. I rarely use regular expressions these days, so I have tidied up my old notes here so that I do not forget them.
(1) "@" symbol
Although "@" is not a "member" of C# regular expressions, the two very often appear together. "@" means that the string following it is a "verbatim string", in which the backslash is not treated as an escape character. This is easier to see with an example. The following two declarations are equivalent:
    string x = "D:\\My Huang\\My Doc";
    string y = @"D:\My Huang\My Doc";
In fact, C# reports a compile error for the following declaration, because "\" is the escape character in ordinary C# strings (for example, "\n" is a line break) and "\M" is not a valid escape sequence:
    string x = "D:\My Huang\My Doc";
(2) Basic syntax characters
\d a digit, 0-9
\D the complement of \d (so that the full character set is covered, likewise below), that is, any non-digit character
\w a word character: uppercase and lowercase letters, the digits 0-9, and the underscore
\W the complement of \w
\s a whitespace character, including line break \n, carriage return \r, tab \t, vertical tab \v, and form feed \f
\S the complement of \s
. any character other than the line break \n
[...] matches any one of the characters listed in []
[^...] matches any one character that is not listed in []
Some simple examples are provided below:
    string i = "\n";
    string m = "3";
    Regex r = new Regex(@"\D");
    // same as Regex r = new Regex("\\D");
    // r.IsMatch(i) result: true
    // r.IsMatch(m) result: false

    string i = "%";
    string m = "3";
    Regex r = new Regex("[a-z0-9]");
    // matches a lowercase letter or a digit
    // r.IsMatch(i) result: false
    // r.IsMatch(m) result: true
(3) Anchors (positioning characters)
An anchor represents a virtual character: it stands for a position rather than a real character. Intuitively, you can think of an anchor as the tiny gap between one character and the next.
^ the characters that follow must be at the beginning of the string
$ the characters that precede must be at the end of the string
\b matches a word boundary
\B matches a non-word boundary
In addition: \A means the characters that follow must be at the beginning of the string, \z means the characters that precede must be at the very end of the string, and \Z means the characters that precede must be at the end of the string or immediately before a final newline.
Some simple examples are provided below:
    string i = "Live for nothing,die for something";
    Regex r1 = new Regex("^Live for nothing,die for something$");
    // r1.IsMatch(i) true
    Regex r2 = new Regex("^Live for nothing,die for some$");
    // r2.IsMatch(i) false
    Regex r3 = new Regex("^Live for nothing,die for some");
    // r3.IsMatch(i) true

    // Two lines separated by CRLF. The line break is written explicitly
    // so that the results below do not depend on the source file's line endings.
    string i = "Live for nothing,\r\ndie for something";
    Regex r1 = new Regex("^Live for nothing,die for something$");
    Console.WriteLine("r1 match count:" + r1.Matches(i).Count); // 0
    Regex r2 = new Regex("^Live for nothing,die for something$", RegexOptions.Multiline);
    Console.WriteLine("r2 match count:" + r2.Matches(i).Count); // 0
    Regex r3 = new Regex("^Live for nothing,\r\ndie for something$");
    Console.WriteLine("r3 match count:" + r3.Matches(i).Count); // 1
    Regex r4 = new Regex("^Live for nothing,$");
    Console.WriteLine("r4 match count:" + r4.Matches(i).Count); // 0
    Regex r5 = new Regex("^Live for nothing,$", RegexOptions.Multiline);
    Console.WriteLine("r5 match count:" + r5.Matches(i).Count); // 0
    Regex r6 = new Regex("^Live for nothing,\r\n$");
    Console.WriteLine("r6 match count:" + r6.Matches(i).Count); // 0
    Regex r7 = new Regex("^Live for nothing,\r\n$", RegexOptions.Multiline);
    Console.WriteLine("r7 match count:" + r7.Matches(i).Count); // 0
    Regex r8 = new Regex("^Live for nothing,\r$");
    Console.WriteLine("r8 match count:" + r8.Matches(i).Count); // 0
    Regex r9 = new Regex("^Live for nothing,\r$", RegexOptions.Multiline);
    Console.WriteLine("r9 match count:" + r9.Matches(i).Count); // 1
    Regex r10 = new Regex("^die for something$");
    Console.WriteLine("r10 match count:" + r10.Matches(i).Count); // 0
    Regex r11 = new Regex("^die for something$", RegexOptions.Multiline);
    Console.WriteLine("r11 match count:" + r11.Matches(i).Count); // 1
    Regex r12 = new Regex("^");
    Console.WriteLine("r12 match count:" + r12.Matches(i).Count); // 1
    Regex r13 = new Regex("$");
    Console.WriteLine("r13 match count:" + r13.Matches(i).Count); // 1
    Regex r14 = new Regex("^", RegexOptions.Multiline);
    Console.WriteLine("r14 match count:" + r14.Matches(i).Count); // 2
    Regex r15 = new Regex("$", RegexOptions.Multiline);
    Console.WriteLine("r15 match count:" + r15.Matches(i).Count); // 2
    Regex r16 = new Regex("^Live for nothing,\r$\n^die for something$", RegexOptions.Multiline);
    Console.WriteLine("r16 match count:" + r16.Matches(i).Count); // 1
    // For a multiline string, once the Multiline option is set, ^ and $ match multiple times.
    // With Multiline, $ matches just before "\n", so the "\r" of a CRLF line break
    // still has to be matched explicitly (compare r5 and r9).

    string i = "Live for nothing,die for something";
    string m = "Live for nothing,die for some thing";
    Regex r1 = new Regex(@"\bthing\b");
    Console.WriteLine("r1 match count:" + r1.Matches(i).Count); // 0
    Regex r2 = new Regex(@"thing\b");
    Console.WriteLine("r2 match count:" + r2.Matches(i).Count); // 2
    Regex r3 = new Regex(@"\bthing\b");
    Console.WriteLine("r3 match count:" + r3.Matches(m).Count); // 1
    Regex r4 = new Regex(@"\bfor something\b");
    Console.WriteLine("r4 match count:" + r4.Matches(i).Count); // 1
    // \b is usually used to constrain a match to a complete word
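The examples above do not exercise \A, \Z, and \z. As a rough sketch of how they differ from ^ and $ (the class name AnchorDemo, the helper CountMultiline, and the sample text are made up for illustration and are not part of the original notes), the following counts matches with the Multiline option switched on:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Counts the matches of pattern in input with RegexOptions.Multiline enabled.
    public static int CountMultiline(string pattern, string input)
    {
        return Regex.Matches(input, pattern, RegexOptions.Multiline).Count;
    }

    public static void Main()
    {
        // Hypothetical sample: two lines, with a trailing newline.
        string text = "first\nsecond\n";

        // ^ matches at the start of every line when Multiline is set.
        Console.WriteLine(CountMultiline(@"^\w+", text));     // 2
        // \A matches only at the very start of the string, even with Multiline.
        Console.WriteLine(CountMultiline(@"\A\w+", text));    // 1
        // \Z matches at the end of the string or just before a final newline.
        Console.WriteLine(CountMultiline(@"second\Z", text)); // 1
        // \z matches only at the absolute end, so the trailing newline blocks it.
        Console.WriteLine(CountMultiline(@"second\z", text)); // 0
    }
}
```

Unlike ^ and $, these three anchors are not affected by RegexOptions.Multiline, which makes them a reasonable choice when a pattern must apply to the whole input regardless of options.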
(4) Quantifiers (repeat description characters)
Quantifiers are one of the places where C# regular expressions really show their power:
{n} matches the preceding character exactly n times
{n,} matches the preceding character n or more times
{n,m} matches the preceding character between n and m times
? matches the preceding character 0 or 1 times
+ matches the preceding character 1 or more times
* matches the preceding character 0 or more times
Here are a few simple examples:
    string x = "1024";
    string y = "+1024";
    string z = "1,024";
    string a = "1";
    string b = "-1024";
    string c = "10000";
    Regex r = new Regex(@"^\+?[1-9],?\d{3}$");
    Console.WriteLine("x match count:" + r.Matches(x).Count); // 1
    Console.WriteLine("y match count:" + r.Matches(y).Count); // 1
    Console.WriteLine("z match count:" + r.Matches(z).Count); // 1
    Console.WriteLine("a match count:" + r.Matches(a).Count); // 0
    Console.WriteLine("b match count:" + r.Matches(b).Count); // 0
    Console.WriteLine("c match count:" + r.Matches(c).Count); // 0
    // Matches integers from 1000 to 9999, with an optional leading "+" and an optional thousands separator.
(5) Alternation (choosing a match)
The (|) construct in C# regular expressions does not seem to have a dedicated name in my notes, so let's call it "choose-one matching". In fact, [a-z] is also a kind of choice, except that it only ever matches a single character, whereas (|) covers a wider range: (ab|xy) matches either ab or xy. Note that "|" and "()" work together here as a unit. Some simple examples are provided below:
    string x = "0";
    string y = "0.23";
    string z = "100";
    string a = "100.01";
    string b = "9.9";
    string c = "99.9";
    string d = "99.";
    string e = "00.1";
    // The dot after 100 is escaped so that it matches only a literal decimal point.
    Regex r = new Regex(@"^\+?((100(\.0+)*)|([1-9]?[0-9])(\.\d+)*)$");
    Console.WriteLine("x match count:" + r.Matches(x).Count); // 1
    Console.WriteLine("y match count:" + r.Matches(y).Count); // 1
    Console.WriteLine("z match count:" + r.Matches(z).Count); // 1
    Console.WriteLine("a match count:" + r.Matches(a).Count); // 0
    Console.WriteLine("b match count:" + r.Matches(b).Count); // 1
    Console.WriteLine("c match count:" + r.Matches(c).Count); // 1
    Console.WriteLine("d match count:" + r.Matches(d).Count); // 0
    Console.WriteLine("e match count:" + r.Matches(e).Count); // 0
    // Matches numbers from 0 to 100. The outermost parentheses contain two parts,
    // "(100(\.0+)*)" and "([1-9]?[0-9])(\.\d+)*", joined by "or": the engine first
    // tries to match 100 and, if that fails, tries the second expression
    // (which represents numbers in the range [0,100)).
    // Because the fractional part uses "*", input such as "1.2.3" is also accepted;
    // replacing "*" with "?" would be stricter.
(6) Matching of special characters
Some simple examples are provided below:
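The examples for this section did not survive in these notes, so what follows is only a minimal sketch with sample strings of my own (the class name EscapeDemo and the helper MatchesLiteral are hypothetical). The general idea is that characters with a special meaning in a pattern, such as \ . * + ? ( ) [ ] ^ $ and |, must be preceded by a backslash to be matched literally, and Regex.Escape can do that escaping for you:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // True when the whole input equals the literal text.
    // Regex.Escape puts a backslash in front of every metacharacter.
    public static bool MatchesLiteral(string literal, string input)
    {
        string pattern = "^" + Regex.Escape(literal) + "$";
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // A single backslash is matched by the pattern \\ (verbatim string).
        Console.WriteLine(Regex.IsMatch(@"\", @"^\\$"));       // True
        // An escaped dot matches only a literal dot.
        Console.WriteLine(Regex.IsMatch("1.5", @"^1\.5$"));    // True
        Console.WriteLine(Regex.IsMatch("1x5", @"^1\.5$"));    // False
        // An unescaped dot matches any character except \n.
        Console.WriteLine(Regex.IsMatch("1x5", @"^1.5$"));     // True
        // Regex.Escape handles "+", "(" and ")" without manual escaping.
        Console.WriteLine(MatchesLiteral("a+b(c)", "a+b(c)")); // True
    }
}
```

Writing patterns as verbatim strings, as in section (1), keeps the number of backslashes manageable: without "@", the first pattern would have to be written as "^\\\\$".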
(7) Group and non-capturing group
Here are a few simple examples:
    string x = "Live for nothing,die for something";
    string y = "Live for nothing,die for somebody";
    Regex r = new Regex(@"^Live ([a-z]{3}) no([a-z]{5}),die \1 some\2$");
    Console.WriteLine("x match count:" + r.Matches(x).Count); // 1
    Console.WriteLine("y match count:" + r.Matches(y).Count); // 0
    // The regular expression engine remembers what each "()" matched as a "group",
    // which can then be referenced by index. "\1" in the expression is a backreference
    // to the first group in the expression, and "\2" refers to the second in the same way.

    string x = "Live for nothing,die for something";
    Regex r = new Regex(@"^Live for no([a-z]{5}),die for some\1$");
    if (r.IsMatch(x))
    {
        Console.WriteLine("group1 value:" + r.Match(x).Groups[1].Value); // output: thing
    }
    // Gets the contents of the group. Note that this is Groups[1], because Groups[0]
    // is the entire matched string, that is, the whole of variable x.

    string x = "Live for nothing,die for something";
    Regex r = new Regex(@"^Live for no(?<g1>[a-z]{5}),die for some\1$");
    if (r.IsMatch(x))
    {
        Console.WriteLine("group1 value:" + r.Match(x).Groups["g1"].Value); // output: thing
    }
    // Groups can also be indexed by name. A group is named with the format (?<groupname>...).

    string x = "Live for nothing nothing";
    Regex r = new Regex(@"([a-z]+) \1");
    if (r.IsMatch(x))
    {
        x = r.Replace(x, "$1");
        Console.WriteLine("var x:" + x); // output: Live for nothing
    }
    // Removes the repeated "nothing" from the original string. Outside the expression,
    // "$1" refers to the first group. The next snippet refers to it by group name instead:

    string x = "Live for nothing nothing";
    Regex r = new Regex(@"(?<g1>[a-z]+) \1");
    if (r.IsMatch(x))
    {
        x = r.Replace(x, "${g1}");
        Console.WriteLine("var x:" + x); // output: Live for nothing
    }

    string x = "Live for nothing";
    Regex r = new Regex(@"^Live for no(?:[a-z]{5})$");
    if (r.IsMatch(x))
    {
        Console.WriteLine("group1 value:" + r.Match(x).Groups[1].Value); // output: (empty string)
    }
    // Putting "?:" at the start of a group marks it as a "non-capturing group",
    // that is, the engine does not save the contents of that group.
(8) Greedy and non-greedy matching
The regular expression engine is greedy: as long as the pattern allows it, it matches as many characters as possible. You can switch to non-greedy (lazy) matching by adding "?" after a quantifier (repeat description character) such as * or +. Take a look at the following example:
    string x = "Live for nothing,die for something";
    Regex r1 = new Regex(@".*thing");
    if (r1.IsMatch(x))
    {
        Console.WriteLine("match:" + r1.Match(x).Value); // output: Live for nothing,die for something
    }
    Regex r2 = new Regex(@".*?thing");
    if (r2.IsMatch(x))
    {
        Console.WriteLine("match:" + r2.Match(x).Value); // output: Live for nothing
    }
(9) Backtracking and non-backtracking
Use "(?>...)" to declare a non-backtracking group. Because the engine is greedy, it sometimes has to backtrack in order to obtain a match. Consider the following example:
    string x = "Live for nothing,die for something";
    Regex r1 = new Regex(@".*thing,");
    if (r1.IsMatch(x))
    {
        Console.WriteLine("match:" + r1.Match(x).Value); // output: Live for nothing,
    }
    Regex r2 = new Regex(@"(?>.*)thing,");
    if (r2.IsMatch(x)) // no match
    {
        Console.WriteLine("match:" + r2.Match(x).Value);
    }
    // In r1, ".*" is greedy and first consumes everything up to the end of the string.
    // The engine then backtracks until "thing" can match at the end of "something",
    // but the "," that must follow fails there, so it keeps backtracking and finally
    // succeeds at "nothing,".
    // In r2, backtracking into ".*" is forbidden, so the whole expression fails to match.
(10) Lookahead and lookbehind (forward and reverse pre-search)
Lookahead declaration format: positive "(?=...)", negative "(?!...)". The declaration itself is not part of the final match result. See the following example:
    string x = "1024 used 2048 free";
    Regex r1 = new Regex(@"\d{4}(?= used)");
    if (r1.Matches(x).Count == 1)
    {
        Console.WriteLine("r1 match:" + r1.Match(x).Value); // output: 1024
    }
    Regex r2 = new Regex(@"\d{4}(?! used)");
    if (r2.Matches(x).Count == 1)
    {
        Console.WriteLine("r2 match:" + r2.Match(x).Value); // output: 2048
    }
    // The positive declaration in r1 requires the four digits to be followed by " used";
    // the negative declaration in r2 requires that the four digits are not followed by " used".
Lookbehind (reverse pre-search) declaration format: positive "(?<=...)", negative "(?<!...)". The declaration itself is not part of the final match result. See the following example:
    string x = "used:1024 free:2048";
    Regex r1 = new Regex(@"(?<=used:)\d{4}");
    if (r1.Matches(x).Count == 1)
    {
        Console.WriteLine("r1 match:" + r1.Match(x).Value); // output: 1024
    }
    Regex r2 = new Regex(@"(?<!used:)\d{4}");
    if (r2.Matches(x).Count == 1)
    {
        Console.WriteLine("r2 match:" + r2.Match(x).Value); // output: 2048
    }
    // The positive lookbehind in r1 requires the four digits to be immediately preceded by "used:";
    // the negative lookbehind in r2 requires that the four digits are not immediately preceded by "used:".
(11) Hexadecimal character ranges
In regular expressions, you can use "\xXX" and "\uXXXX" to denote a single character ("X" stands for a hexadecimal digit), which is handy for writing character ranges:
\xXX a character whose code is in the range 0 to 255, for example: a space can be written as "\x20".
\uXXXX any character can be written as "\u" plus its 4-digit hexadecimal code, for example: Chinese characters can be matched with "[\u4e00-\u9fa5]".
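As a small illustration of the two escapes (the class name HexRangeDemo, the helper IsAllCjk, and the sample strings are assumptions made for this sketch), the following checks a space written as \x20 and a string made up entirely of characters in the [\u4e00-\u9fa5] range:

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexRangeDemo
{
    // True when every character of s falls in the range U+4E00 to U+9FA5.
    public static bool IsAllCjk(string s)
    {
        return Regex.IsMatch(s, @"^[\u4e00-\u9fa5]+$");
    }

    public static void Main()
    {
        // \x20 in the pattern stands for a space character.
        Console.WriteLine(Regex.IsMatch("a b", @"a\x20b")); // True
        // "\u6b63\u5219" is a two-character Chinese string, written with
        // C# escapes so the source file stays pure ASCII.
        Console.WriteLine(IsAllCjk("\u6b63\u5219"));        // True
        Console.WriteLine(IsAllCjk("regex"));               // False
    }
}
```

Note that the pattern is a verbatim string, so the \u escapes inside it are interpreted by the regex engine, while the escapes in the ordinary string literal are interpreted by the C# compiler.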
(12) A more complete match for [0,100]
The following is a more comprehensive example that matches numbers in [0,100]. The special cases to consider include:
* "00" is legal, "00." is legal, "00.00" is legal, "001.100" is legal
* The empty string is illegal, a lone decimal point is illegal, and anything greater than 100 is illegal
* Numeric suffixes, such as "1.07f" meaning a float value (not considered here)
    Regex r = new Regex(@"^\+?0*(?:100(\.0*)?|(\d{0,2}(?=\.\d)|\d{1,2}(?=($|\.$)))(\.\d*)?)$");
    string x = "";
    while (true)
    {
        x = Console.ReadLine();
        // ReadLine returns null at end of input, so stop then as well as on "exit".
        if (x == null || x == "exit")
        {
            break;
        }
        if (r.IsMatch(x))
        {
            Console.WriteLine(x + " succeed!");
        }
        else
        {
            Console.WriteLine(x + " failed!");
        }
    }
(13) Exact matching is sometimes difficult
For some requirements an exact match is hard to achieve, for example dates, URLs, and email addresses. For some of these you would even have to study specialized documents before you could write an accurate and complete expression. In such cases you can only settle for second best and aim for a reasonably accurate match. For a date, for example, you can restrict the pattern to a limited period that fits the actual application; for something like an email address, you can handle only the most common forms.
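To make the "most common form" idea concrete, here is a deliberately loose sketch for email addresses. The pattern, the class name LooseEmailDemo, and the helper LooksLikeEmail are my own assumptions for illustration; the pattern does not attempt to follow the full email address specification and will reject some valid addresses and accept some invalid ones:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LooseEmailDemo
{
    // Common shape only: local part, "@", then a domain with at least one dot.
    private static readonly Regex Loose =
        new Regex(@"^[\w.+-]+@[\w-]+(\.[\w-]+)+$");

    public static bool LooksLikeEmail(string s)
    {
        return Loose.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeEmail("user@example.com"));               // True
        Console.WriteLine(LooksLikeEmail("user.name+tag@mail.example.org")); // True
        Console.WriteLine(LooksLikeEmail("user@localhost"));                 // False: no dot in the domain
        Console.WriteLine(LooksLikeEmail("a@b@c.com"));                      // False: two "@" signs
    }
}
```

Whether rejecting an address like "user@localhost" is acceptable depends on the application, which is exactly the kind of trade-off this section describes.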