Regular expressions are a set of pattern-matching syntax rules. Languages and platforms such as Perl, .NET, and Java each have their own regular expression library. In .NET, the class at the center of this library is Regex.
Put simply, Regex is a class for finding matching strings in text. With Regex, programmers can easily extract the information they need from a piece of data. Here is a simple example to give you a general idea of Regex:
Regex regex = new Regex(@"\d+");
Match m = regex.Match("fox 9212gold");
Console.WriteLine(m.Value);
The result is as expected: regex finds the run of digits in the string "fox 9212gold", and the output is "9212".
Now that you have a basic idea of Regex, here is the good news: Regex can do far more than this, because it is backed by a powerful set of pattern-matching rules. The bad news is that such powerful rules come with a large and complicated vocabulary of special characters and constructs, which makes regular expressions hard to learn. Truly mastering them takes more than a few samples.
Create a Regex object
Regex has several constructors. We will not discuss the default constructor here. Of the two used most often, one takes the regular expression string as its parameter, and the other takes the regular expression string plus a RegexOptions value. For example:
Regex regex1 = new Regex(@"\w+$");
Regex regex2 = new Regex(@"\s+", RegexOptions.IgnoreCase | RegexOptions.Multiline);
RegexOptions adjusts how matching behaves. For example, IgnoreCase makes matching case-insensitive, and Multiline changes the meaning of ^ and $ so that they match at the beginning and end of each line rather than only at the beginning and end of the whole string.
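As a rough illustration of what these two options change, the sketch below counts matches with and without them. The helper CountMatches and the sample text are made up for this illustration and are not part of the article's test class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Hypothetical helper: counts the matches of a pattern under the given options.
    public static int CountMatches(string pattern, RegexOptions options, string input)
    {
        return new Regex(pattern, options).Matches(input).Count;
    }

    public static void Main()
    {
        string text = "one\nTwo\nthree";

        // Without Multiline, $ matches only at the end of the whole string.
        Console.WriteLine(CountMatches(@"\w+$", RegexOptions.None, text));      // 1
        // With Multiline, $ also matches before each newline.
        Console.WriteLine(CountMatches(@"\w+$", RegexOptions.Multiline, text)); // 3
        // IgnoreCase lets the pattern "two" match the text "Two".
        Console.WriteLine(CountMatches("two", RegexOptions.None, text));        // 0
        Console.WriteLine(CountMatches("two", RegexOptions.IgnoreCase, text));  // 1
    }
}
```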
Constructing a Regex object is only the first step. The sections below cover the methods that apply it to strings.
Match string
Regex has two matching methods, Match() and Matches(), which return the first match and all matches respectively. The following method uses Matches to find and display the matching strings.
public static void ShowMatches(string expression, RegexOptions option, string ms)
{
    Regex regex = new Regex(expression, option);
    MatchCollection matches = regex.Matches(ms);
    // Show matches
    Console.WriteLine("////////////////----------------------------------////////////////");
    Console.WriteLine("string: \"{0}\" expression: \"{1}\" match result is:", ms, expression);
    foreach (Match m in matches)
    {
        Console.WriteLine("match string is: \"{0}\", length: {1}", m.Value, m.Value.Length);
    }
    Console.WriteLine("matched count: {0}", matches.Count);
}
The Matches method compares the input string with the regular expression, finds every match, and returns the results as a MatchCollection. You can then get all the results simply by iterating over the collection.
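To make the difference between the two methods concrete, here is a minimal sketch that uses the static Regex.Match and Regex.Matches overloads. The helper names FirstMatch and AllMatches are invented for this example; Match.Success is used to detect the case where nothing matched.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Hypothetical helper: returns the first match, or null when nothing matches.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    // Hypothetical helper: returns every non-overlapping match, in order.
    public static List<string> AllMatches(string pattern, string input)
    {
        List<string> values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            values.Add(m.Value);
        }
        return values;
    }

    public static void Main()
    {
        string input = "fef 12/21 df 33/14";
        Console.WriteLine(FirstMatch(@"\d+", input));                              // 12
        Console.WriteLine(string.Join(",", AllMatches(@"\d+", input).ToArray())); // 12,21,33,14
        Console.WriteLine(FirstMatch(@"\d+", "no digits") == null);               // True
    }
}
```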
Group concept
Suppose you get a string such as "the final score is: 19/24". You would want a regular expression that not only finds the data1/data2 substring but also returns data1 and data2 as separate results; otherwise you would have to parse "19/24" again to get each side's score. Regular expressions address this with the concept of groups: you can put parts of a match into separate groups and then retrieve them by group name or group index. For the example above, we can use @"(\d+)/(\d+)" as the expression. Let's look at the result:
Regex regex = new Regex(@"(\d+)/(\d+)");
MatchCollection matches = regex.Matches(@"the final score is: 19/24");
// Show matches
Console.WriteLine("////////////////----------------------------------////////////////");
foreach (Match m in matches)
{
    // Console.WriteLine("match string is: \"{0}\", length: {1}", m.Value, m.Value.Length);
    foreach (string name in regex.GetGroupNames())
    {
        Console.WriteLine("capture group \"{0}\" value is: \"{1}\"", name, m.Groups[name].Value);
    }
}
Console.WriteLine("matched count: {0}", matches.Count);
Output:
////////////////----------------------------------////////////////
capture group "0" value is: "19/24"
capture group "1" value is: "19"
capture group "2" value is: "24"
matched count: 1
Now it is clear: the Regex object puts the entire match into group 0, and the text captured by each parenthesized group into its own group. By default, groups are named with integers counting up from 1; 0 is reserved for the entire matched string. Since there is a default, you can of course also choose the group names yourself. It is simple: add ?<name> right after the group's opening parenthesis. Change the regular expression to @"(?<score1>\d+)/(?<score2>\d+)" and look at the result again:
////////////////----------------------------------////////////////
capture group "0" value is: "19/24"
capture group "score1" value is: "19"
capture group "score2" value is: "24"
matched count: 1
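Once groups are named, a caller can read them straight from a single Match instead of looping over every group name. The sketch below shows one possible way to do that; the ParseScore helper and its int[] return shape are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScoreDemo
{
    // Hypothetical helper: returns { score1, score2 }, or null when the
    // input contains no "number/number" pair.
    public static int[] ParseScore(string input)
    {
        Match m = Regex.Match(input, @"(?<score1>\d+)/(?<score2>\d+)");
        if (!m.Success)
        {
            return null;
        }
        // Groups can be indexed by name; Groups[0] is the whole match.
        int first = int.Parse(m.Groups["score1"].Value);
        int second = int.Parse(m.Groups["score2"].Value);
        return new int[] { first, second };
    }

    public static void Main()
    {
        int[] score = ParseScore("the final score is: 19/24");
        Console.WriteLine(score[0] + " : " + score[1]); // 19 : 24
    }
}
```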
Try it with names of your own! To make it easier to see everything in later tests, we make a small adjustment to the ShowMatches() method discussed earlier, so that if the expression defines groups we can see all the group information without modifying any code. The adjusted method, ShowMatchesPro(), is as follows:
public static void ShowMatchesPro(string expression, RegexOptions option, string ms)
{
    Regex regex = new Regex(expression, option);
    MatchCollection matches = regex.Matches(ms);
    // Show matches
    Console.WriteLine("////////////////----------------------------------////////////////");
    Console.WriteLine("string: \"{0}\" expression: \"{1}\" match result is:", ms, expression);
    foreach (Match m in matches)
    {
        foreach (string name in regex.GetGroupNames())
        {
            Console.WriteLine("capture group \"{0}\" value is: \"{1}\"", name, m.Groups[name].Value);
        }
    }
    Console.WriteLine("matched count: {0}", matches.Count);
    // Show group names
    Console.WriteLine("group name count {0}", regex.GetGroupNames().Length);
    foreach (string name in regex.GetGroupNames())
    {
        Console.WriteLine("group name: \"{0}\"", name);
    }
}
Replace string
Regex also provides a convenient way to replace what it matches. To make testing easier, we wrap it in a method as well:
public static string ReplaceMatch(string expression, RegexOptions option, string ms, string rep)
{
    Regex regex = new Regex(expression, option);
    string result = regex.Replace(ms, rep);
    Console.WriteLine("////////////////----------------------------------////////////////");
    Console.WriteLine("string: \"{0}\", expression: \"{1}\", replace by: \"{2}\"", ms, expression, rep);
    Console.WriteLine("replace result string is: \"{0}\", length: {1}", result, result.Length);
    return result;
}
Regex.Replace, as used here, takes two strings as input. The first is the input string. The second is the replacement for each match, and it can contain special strings that stand for parts of the match:
Special string    Replacement result
$& (or $0)        The entire matched string
$1, $2, ...       The group with the given index in the match
${name}           The group with the given name in the match
$`                The input text before the match
$'                The input text after the match
$$                A literal '$' character
$_                The entire input string
$+                The last group captured in the match
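The following sketch tries several of these special strings on one small input so that their effects can be compared side by side. The Sub helper and the sample input are assumptions introduced only for this illustration; the call underneath is the static Regex.Replace(input, pattern, replacement) overload.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // Hypothetical helper: replaces every match of pattern with replacement.
    public static string Sub(string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    public static void Main()
    {
        string input = "abc 12/34 xyz";
        string pattern = @"(\d+)/(\d+)";

        Console.WriteLine(Sub(input, pattern, "<$&>"));  // abc <12/34> xyz
        Console.WriteLine(Sub(input, pattern, "$2/$1")); // abc 34/12 xyz
        Console.WriteLine(Sub(input, pattern, "$$"));    // abc $ xyz
        Console.WriteLine(Sub(input, pattern, "[$`]"));  // abc [abc ] xyz
        Console.WriteLine(Sub(input, pattern, "[$']"));  // abc [ xyz] xyz
        Console.WriteLine(Sub(input, pattern, "$+"));    // abc 34 xyz

        // Named groups are referenced with ${name}.
        Console.WriteLine(Sub(input, @"(?<a>\d+)/(?<b>\d+)", "${b}/${a}")); // abc 34/12 xyz
    }
}
```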
That is a lot of strange special strings. Let's run two samples to see what they do!
Sample1:
ReplaceMatch(@"\d+", RegexOptions.None, "fef 12/21 df 33/14 727/1", "<$&>");
Output (every number is wrapped in angle brackets):
////////////////----------------------------------////////////////
string: "fef 12/21 df 33/14 727/1", expression: "\d+", replace by: "<$&>"
replace result string is: "fef <12>/<21> df <33>/<14> <727>/<1>", length: 36
Sample2:
ReplaceMatch(@"(\d+)/(\d+)", RegexOptions.None, "fef 12/21 df 33/14 727/1", "$+");
Output (every data1/data2 match is replaced with data2):
////////////////----------------------------------////////////////
string: "fef 12/21 df 33/14 727/1", expression: "(\d+)/(\d+)", replace by: "$+"
replace result string is: "fef 21 df 14 1", length: 14
How about that? Regex is already quite capable! But your requirements may not be that simple. Say you need to double the amount of money in "I have 200 dollars". What then? None of the ready-made special strings can do that. No problem: Regex has something better. It lets you define the conversion yourself, by passing a MatchEvaluator delegate that computes the replacement for each match.
using System.Text.RegularExpressions;
class RegularExpressions
{
    static string CapText(Match m)
    {
        // Get the matched string.
        string x = m.ToString();
        // Double this value.
        string result = (int.Parse(x) * 2).ToString();
        return result;
    }
    static void Main()
    {
        string text = "I have 200 dollars";
        string result = Regex.Replace(text, @"\d+", new MatchEvaluator(RegularExpressions.CapText));
        System.Console.WriteLine("result = [" + result + "]");
    }
}
Look at the result: result = [I have 400 dollars]. Not bad. My money really has doubled!
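In C# 3.0 and later, the same conversion can also be written with a lambda expression, which the compiler converts to a MatchEvaluator. This is only an alternative sketch of the example above; the DoubleNumbers name is made up here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EvaluatorDemo
{
    // Hypothetical helper: doubles every run of digits found in the text.
    public static string DoubleNumbers(string text)
    {
        // The lambda receives each Match and returns its replacement text.
        return Regex.Replace(text, @"\d+", m => (int.Parse(m.Value) * 2).ToString());
    }

    public static void Main()
    {
        Console.WriteLine(DoubleNumbers("I have 200 dollars")); // I have 400 dollars
    }
}
```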
However, the purpose of this article is to give you a convenient, easy-to-use test class, so we overload the ReplaceMatch method above to accept a custom conversion function as a parameter:
public static string ReplaceMatch(string expression, RegexOptions option, string ms,
    MatchEvaluator evaluator)
{
    Regex regex = new Regex(expression, option);
    string result = regex.Replace(ms, evaluator);
    Console.WriteLine("////////////////----------------------------------////////////////");
    Console.WriteLine("string: \"{0}\", expression: \"{1}\", replace by an evaluator.", ms, expression);
    Console.WriteLine("replace result string is: \"{0}\", length: {1}", result, result.Length);
    return result;
}
Split string
Regex also provides a method, Split, that splits a string at each match. It has several overloads as well, but they are not the focus here; you can read the documentation yourself. We continue with our test method:
public static void SplitMatch(string expression, RegexOptions option, string ms)
{
    Regex regex = new Regex(expression, option);
    string[] result = regex.Split(ms);
    Console.WriteLine("////////////////----------------------------------////////////////");
    Console.WriteLine("string: \"{0}\", expression: \"{1}\", split result is:", ms, expression);
    foreach (string m in result)
    {
        Console.WriteLine("split string is: \"{0}\", length: {1}", m, m.Length);
    }
    Console.WriteLine("split count: {0}", result.Length);
}
The code is simple and needs little explanation. Let's look at the result with a sample:
SplitMatch(@"/", RegexOptions.None, "2004/4/25");
Output:
////////////////----------------------------------////////////////
string: "2004/4/25", expression: "/", split result is:
split string is: "2004", length: 4
split string is: "4", length: 1
split string is: "25", length: 2
split count: 3
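As a small taste of the other Split overloads, the sketch below limits the number of pieces with the instance overload Split(input, count), and also shows that text captured by parentheses in the pattern is included in the result array. The SplitAtMost helper is a name invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits into at most `count` pieces;
    // the last piece keeps the unsplit remainder of the input.
    public static string[] SplitAtMost(string pattern, string input, int count)
    {
        return new Regex(pattern).Split(input, count);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitAtMost("/", "2004/4/25", 2))); // 2004|4/25

        // A capturing group in the pattern keeps the delimiters in the result.
        Console.WriteLine(string.Join("|", Regex.Split("2004/4/25", "(/)")));  // 2004|/|4|/|25
    }
}
```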
The purpose of this article is very simple: introduce several major features of Regex (matching, replacement, and splitting), and provide several simple and convenient test functions. This allows you to test whether your understanding of regular expressions is accurate.
For example, to confirm the roles of ^ and $, you can feed in (input, expression) pairs such as:
("123", "^\d+$") ("123aaa456", "^\d+") ("123aaa456", "123&")
To check the behavior of \d, \s, \w, and \W, you can test as follows:
("123abc gc 456", "\d+") ("123abc gc 456", "\s+")
("123abc gc 456", "\w+") ("123abc gc 456", "\W+")
To compare ?, +, and *, you can use the following data:
("a123 abcd", "a\d?") ("a123 abcd", "a\d+") ("a123 abcd", "a\d*")
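For readers who want to check their expectations before running the test class, here is a minimal sketch that prints every match for some of the pairs above, each wrapped in square brackets. The Show helper is an assumption of this example and is not one of the article's test methods.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: returns all matches, each wrapped in [ ].
    public static string Show(string input, string pattern)
    {
        List<string> values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            values.Add("[" + m.Value + "]");
        }
        return string.Join("", values.ToArray());
    }

    public static void Main()
    {
        // ? allows zero or one digit, + needs at least one, * allows any number.
        Console.WriteLine(Show("a123 abcd", @"a\d?")); // [a1][a]
        Console.WriteLine(Show("a123 abcd", @"a\d+")); // [a123]
        Console.WriteLine(Show("a123 abcd", @"a\d*")); // [a123][a]

        // \w matches word characters, \W matches everything else.
        Console.WriteLine(Show("123abc gc 456", @"\w+")); // [123abc][gc][456]
        Console.WriteLine(Show("123abc gc 456", @"\W+")); // [ ][ ]
    }
}
```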