First, we examine the string pat, which contains the regular expression. The first capture group starts at the first parenthesis, and the expression then matches abra. The second capture group starts at the second parenthesis, before the first group has ended. This means that the first group can match abracad, while the second group matches only cad. Because ? makes cad optional, the first group may match either abra or abracad. The first group then ends, and the + symbol requires the group to match one or more times.
Now let's look at what happens during matching. First, the Regex constructor is called to create an instance of the expression and to specify any options. In this example, because the expression contains comments, the x option is selected, and some white space is used as well. When the x option is enabled, the expression ignores comments and any white space that is not escaped.
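As a minimal sketch of the x option: the expression itself is not reproduced in this section, so the pattern below is a reconstruction from the description above and should be read as an assumption. The class name, the sample input and the FirstMatch helper are hypothetical. In current .NET the x option corresponds to RegexOptions.IgnorePatternWhitespace.

```csharp
using System;
using System.Text.RegularExpressions;

static class IgnoreWhitespaceSketch
{
    // Assumed reconstruction of the commented expression described in the text.
    public const string Verbose = @"
        (          # start of group 1
          abra     # the literal 'abra'
          (cad)?   # group 2: the optional literal 'cad'
        )+         # end of group 1, repeated one or more times
    ";

    // The same expression written without comments or white space.
    public const string Compact = "(abra(cad)?)+";

    // Hypothetical helper: returns the text of the first match.
    public static string FirstMatch(string pattern, RegexOptions options, string input)
    {
        return new Regex(pattern, options).Match(input).Value;
    }

    public static void Main()
    {
        string input = "abracadabra"; // hypothetical sample input
        // Comments and unescaped white space are ignored under this option.
        Console.WriteLine(FirstMatch(Verbose, RegexOptions.IgnorePatternWhitespace, input));
        // The compact form needs no option and matches the same text.
        Console.WriteLine(FirstMatch(Compact, RegexOptions.None, input));
    }
}
```

Both calls print abracadabra, which suggests that the comments and layout in the verbose form do not change what the expression matches.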
Next, retrieve the list of group numbers defined in the expression. You could of course use these numbers explicitly; here they are obtained programmatically. If named groups are used, this approach is also an effective way to build a quick index.
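A small sketch of retrieving group numbers, assuming the reconstructed expression (abra(cad)?)+ and a hypothetical named variant with groups called outer and inner. It uses Regex.GetGroupNumbers, Regex.GetGroupNames and Regex.GroupNumberFromName; the class and method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class GroupNumbersSketch
{
    // Lists every group number in the pattern, including 0 for the whole match.
    public static string Numbers(string pattern)
    {
        int[] numbers = new Regex(pattern).GetGroupNumbers();
        return string.Join(",", Array.ConvertAll(numbers, n => n.ToString()));
    }

    // Builds a quick name-to-number index for a pattern with named groups.
    public static string NameIndex(string pattern)
    {
        var r = new Regex(pattern);
        var parts = new List<string>();
        foreach (string name in r.GetGroupNames())
            parts.Add(name + "=" + r.GroupNumberFromName(name));
        return string.Join(",", parts.ToArray());
    }

    public static void Main()
    {
        // Assumed reconstruction of the expression described in the text.
        Console.WriteLine(Numbers("(abra(cad)?)+"));
        // Hypothetical named variant of the same expression.
        Console.WriteLine(NameIndex("(?<outer>abra(?<inner>cad)?)+"));
    }
}
```

For these two patterns the sketch prints 0,1,2 and 0=0,outer=1,inner=2.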
The next step is to perform the first match. A loop then tests whether the current match succeeded. Inside the loop, we iterate over the list of groups, starting from group 1. Group 0 is skipped in this example because group 0 is the entire matched string; you would use group 0 if you wanted to collect everything that matched as a single string.
Within each group, we iterate over its CaptureCollection. Normally a group has only one capture per match, but group 1 in this example has two: capture 0 and capture 1. If you asked only for the ToString of group 1, you would get just abra, even though the group also matched abracad. The ToString value of a group is the value of the last capture in its CaptureCollection, which is the expected behavior. If you want the match to end after a single occurrence of the group, remove the + symbol from the expression so that the regex engine matches the expression only once.
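The walkthrough above can be sketched as follows. The pattern is a reconstruction from the description and the input abracadabra is a hypothetical sample, so treat both as assumptions; the Describe helper and the output labels are illustrative, not the article's original listing.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CaptureWalkSketch
{
    // Lists each group (from group 1) and every capture it holds, per match.
    public static List<string> Describe(string pattern, string input)
    {
        var lines = new List<string>();
        var r = new Regex(pattern);
        int[] groupNumbers = r.GetGroupNumbers();
        // Loop while the current match is successful.
        for (Match m = r.Match(input); m.Success; m = m.NextMatch())
        {
            foreach (int g in groupNumbers)
            {
                if (g == 0) continue; // group 0 is the whole match, skipped here
                Group group = m.Groups[g];
                // Group.Value is the value of the group's last capture.
                lines.Add("Group" + g + "=[" + group.Value + "]");
                for (int i = 0; i < group.Captures.Count; i++)
                    lines.Add("  Capture" + i + "=[" + group.Captures[i].Value + "]");
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Assumed pattern and hypothetical input.
        foreach (string line in Describe("(abra(cad)?)+", "abracadabra"))
            Console.WriteLine(line);
    }
}
```

For this input, group 1 reports the value abra with two captures, abracad and abra, and group 2 reports cad with a single capture.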
Comparison of procedure-based and expression-based approaches
Generally, people who use regular expressions fall into two categories. The first kind uses regular expressions as little as possible and relies on procedural code for operations that must be repeated. The second kind makes full use of the features and power of the regular expression engine and uses procedural code as little as possible.
For most of us, the best solution is a combination of the two. I hope this article illustrates the role of the Regex class in the .NET languages and the trade-offs it involves between performance and complexity.
Procedure-based approach
In programming, we often need to match part of a string and then process each piece that was matched. Below is an example that matches the words in a string:
using System.Text.RegularExpressions;
string text = "the quick red fox jumped over the lazy brown dog.";
System.Console.WriteLine("text = [" + text + "]");
string result = "";
// Match either a run of word characters or a run of non-word characters
string pattern = @"\w+|\W+";
foreach (Match m in Regex.Matches(text, pattern))
{
    // Get the matched string
    string x = m.ToString();
    // If the first character is lowercase
    if (char.IsLower(x[0]))
        // convert it to uppercase
        x = char.ToUpper(x[0]) + x.Substring(1, x.Length - 1);
    // Collect all of the text
    result += x;
}
System.Console.WriteLine("result = [" + result + "]");
As the preceding example shows, the C# foreach statement visits each match in turn and processes it. In this example, the pieces are collected into a new result string. The output of this example is as follows:
text = [the quick red fox jumped over the lazy brown dog.]
result = [The Quick Red Fox Jumped Over The Lazy Brown Dog.]
Expression-based approach
Another way to achieve the same result as the example above is to use a MatchEvaluator. The new code is as follows:
using System.Text.RegularExpressions;
class Test
{
    static string CapText(Match m)
    {
        // Get the matched string
        string x = m.ToString();
        // If the first character is lowercase
        if (char.IsLower(x[0]))
            // convert it to uppercase
            return char.ToUpper(x[0]) + x.Substring(1, x.Length - 1);
        return x;
    }
    static void Main()
    {
        string text = "the quick red fox jumped over the lazy brown dog.";
        System.Console.WriteLine("text = [" + text + "]");
        string pattern = @"\w+";
        // Replace calls CapText once for every match
        string result = Regex.Replace(text, pattern,
            new MatchEvaluator(Test.CapText));
        System.Console.WriteLine("result = [" + result + "]");
    }
}
It is also worth noting that the pattern here is very simple, because only the words need to be modified; the non-word text does not have to be matched at all.
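As a hedged illustration of why the simpler pattern suffices: Regex.Replace copies the text between matches into the result unchanged, so punctuation and spaces survive without being matched. The sketch below passes a lambda where a MatchEvaluator is expected, which is an alternative spelling rather than the article's own code; the class name, method name and sample sentence are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceSketch
{
    // Capitalizes every word; text between the matches is left as it is.
    public static string Capitalize(string text)
    {
        // The lambda is converted to a MatchEvaluator delegate.
        return Regex.Replace(text, @"\w+",
            m => char.ToUpper(m.Value[0]) + m.Value.Substring(1));
    }

    public static void Main()
    {
        // Hypothetical input with punctuation between the words.
        Console.WriteLine(Capitalize("the quick, red fox... jumped!"));
    }
}
```

For this input the sketch prints The Quick, Red Fox... Jumped! with the commas, dots and spaces carried over untouched.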