Before proceeding to the next step, let's take a quick look at the basic terms of regular expressions.
Capture: The result of a single successful subexpression capture. The Capture class represents the result of one subexpression capture, and the CaptureCollection class represents the set of captures made by a single capturing group.
Group: A regular expression can contain one or more groups. The Group class represents the results from a single capturing group. The GroupCollection class represents the set of groups captured in a single match.
Match: The result of a regular expression match. The Match class represents the result of a single regular expression match. The MatchCollection class represents the set of successful matches found by iteratively applying a regular expression pattern to the input string.
Therefore, the relationships between objects related to regular expressions are:
Regex class --> MatchCollection --> Match objects --> GroupCollection --> Group objects --> CaptureCollection --> Capture objects
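The chain above can be walked directly in code. The following is a minimal sketch, not part of the original article: the class name HierarchyDemo, the helper Describe, and the sample pattern and input are all illustrative assumptions. It visits every Match, then every Group in the match, then every Capture in the group.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this illustration.
public static class HierarchyDemo
{
    // Walks Regex -> MatchCollection -> Match -> Group -> Capture and
    // returns one descriptive line per capture.
    public static List<string> Describe(string input, string pattern)
    {
        var lines = new List<string>();
        MatchCollection matches = new Regex(pattern).Matches(input);
        foreach (Match match in matches)
        {
            for (int g = 0; g < match.Groups.Count; g++)
            {
                Group group = match.Groups[g];
                foreach (Capture capture in group.Captures)
                {
                    lines.Add(string.Format(
                        "match '{0}' group {1} capture '{2}' at {3}",
                        match.Value, g, capture.Value, capture.Index));
                }
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Sample pattern and input are assumptions chosen for the demo.
        foreach (string line in Describe("ab12 cd34", @"([a-z])+(\d+)"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Because group 1 sits inside a + quantifier, it records one capture per letter, which is why a Group exposes a CaptureCollection rather than a single value. Group 0 always represents the entire match.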
Regex class
The Regex class represents the regular expression engine of the .NET Framework. It can be used to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete text substrings; and to add the extracted strings to a collection to generate a report.
You can use the Regex class in two ways: call its static methods, or create a Regex object and call its instance methods. For occasional use the performance difference is small, because the static methods cache recently used patterns, while an instance parses its pattern once when it is constructed. The following table lists some of the main methods of the Regex class:
Method | Description |
IsMatch | Indicates whether the specified regular expression finds a match in the specified input string. Returns true if a match is found; otherwise, returns false. |
Match | Searches the input string for the first match of the regular expression. A Match object is returned. |
Matches | Searches the input string for all matches of the regular expression. A MatchCollection object is returned. |
Replace | Replaces all strings that match a regular expression pattern with the specified replacement string. |
Split | Splits the input string at the positions defined by a regular expression pattern. An array of strings is returned. |
In the following sections, we will use the methods described above.
Use the Regex class for pattern matching
In this section, we will introduce the pattern matching capabilities of the Regex class. Start by creating a new console application and importing the System.Text.RegularExpressions namespace.
using System;
using System.Text.RegularExpressions;
Use the IsMatch() method
The following example checks whether a string is a valid URL.
static void Main(string[] args)
{
    // The string to test is passed as the first command-line argument.
    string source = args[0];
    string pattern = @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";
    bool success = Regex.IsMatch(source, pattern);
    if (success)
    {
        Console.WriteLine("Entered string is a valid URL!");
    }
    else
    {
        Console.WriteLine("Entered string is not a valid URL!");
    }
    Console.ReadLine();
}
The Main() method receives a command-line argument, and the pattern variable defines the URL pattern to match. The static IsMatch method of Regex tests the input against the pattern and returns a bool value that indicates whether a match was found. Finally, a message is written to the console.
You can also call the IsMatch method on a Regex object:
Regex ex = new Regex(pattern);
success = ex.IsMatch(source);
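One point worth illustrating: IsMatch reports success when the pattern matches anywhere in the input, so a sentence that merely contains a URL also passes. The sketch below is an assumption layered on the article's pattern, not the author's code. It reuses one Regex instance for several inputs and adds the ^ and $ anchors so that the entire string must be a URL. The names UrlValidator, ContainsUrl, and IsWholeUrl are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this illustration.
public static class UrlValidator
{
    // The article's pattern: succeeds if a URL occurs anywhere in the input.
    private static readonly Regex Anywhere =
        new Regex(@"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?");

    // Same pattern with ^ and $ added, so the whole input must match.
    private static readonly Regex WholeString =
        new Regex(@"^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?$");

    public static bool ContainsUrl(string input)
    {
        return Anywhere.IsMatch(input);
    }

    public static bool IsWholeUrl(string input)
    {
        return WholeString.IsMatch(input);
    }

    public static void Main()
    {
        string[] inputs = { "https://dev.mjxy.cn/", "visit https://dev.mjxy.cn/ now" };
        foreach (string input in inputs)
        {
            Console.WriteLine("{0} -> contains: {1}, whole: {2}",
                input, ContainsUrl(input), IsWholeUrl(input));
        }
    }
}
```

Keeping the Regex objects in static fields means each pattern is parsed once, however many inputs are checked.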
Use the Match() method
The following example shows how to use the Match method:
static void Main(string[] args)
{
    string source = args[0];
    string pattern = @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";
    // Match() returns only the first match in the input.
    Match match = Regex.Match(source, pattern);
    if (match.Success)
    {
        Console.WriteLine("Entered string is a valid URL!");
        Console.WriteLine("{0} Groups", match.Groups.Count);
        for (int i = 0; i < match.Groups.Count; i++)
        {
            Console.WriteLine("Group {0} Value = {1} Status = {2}",
                i, match.Groups[i].Value, match.Groups[i].Success);
            Console.WriteLine("\t{0} Captures", match.Groups[i].Captures.Count);
            for (int j = 0; j < match.Groups[i].Captures.Count; j++)
            {
                Console.WriteLine("\t\tCapture {0} Value = {1} Found at = {2}",
                    j, match.Groups[i].Captures[j].Value, match.Groups[i].Captures[j].Index);
            }
        }
    }
    else
    {
        Console.WriteLine("Entered string is not a valid URL!");
    }
    Console.ReadLine();
}
The above code uses the Match() method to perform pattern matching. The Match() method returns a Match object that represents the first occurrence of the pattern in the input. The Success property of the Match object tells us whether the match succeeded. The outer for loop traverses all groups of the match (the GroupCollection object). Within it, a nested loop iterates over the captures of each group and outputs each captured value and its index position. The following figure shows a sample run of the above program.
As shown in the figure, there are four groups in total. The first group holds the entire match (https://dev.mjxy.cn followed by the trailing /).
The value of the second group is s. The third group has two captures: dev. and mjxy. (the group's Value property returns the last one). The value of the fourth group is /.
Use the Matches() method
The Matches() method is similar to Match(), but it returns a collection of Match objects (MatchCollection) that contains every match in the input rather than only the first. You can then iterate over all of the matches. The following code demonstrates this:
static void Main(string[] args)
{
    string source = args[0];
    string pattern = @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";
    // Matches() returns every non-overlapping match, not just the first.
    MatchCollection matches = Regex.Matches(source, pattern);
    if (matches.Count > 0)
    {
        Console.WriteLine("{0} Matches", matches.Count);
        foreach (Match match in matches)
        {
            Console.WriteLine("Match Value = {0} Found at = {1}", match.Value, match.Index);
            for (int i = 0; i < match.Groups.Count; i++)
            {
                Console.WriteLine("\tGroup {0} Value = {1} Status = {2}",
                    i, match.Groups[i].Value, match.Groups[i].Success);
                for (int j = 0; j < match.Groups[i].Captures.Count; j++)
                {
                    Console.WriteLine("\t\tCapture {0} Value = {1} Found at = {2}",
                        j, match.Groups[i].Captures[j].Value, match.Groups[i].Captures[j].Index);
                }
            }
        }
    }
    else
    {
        Console.WriteLine("Entered string is not a valid URL!");
    }
    Console.ReadLine();
}
Search for and replace strings
In addition to finding matches, the Regex class can replace them. You can replace every substring that matches the pattern with any content you want. Consider the following code:
static void Main(string[] args)
{
    string source = args[0];
    string pattern = @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";
    // Every substring that matches the pattern is replaced.
    string result = Regex.Replace(source, pattern, "[*** URLs not allowed ***]");
    Console.WriteLine(result);
    Console.ReadLine();
}
In the above code, the regular expression scans the input string for URLs, and the Replace() method of the Regex class substitutes each one. The first parameter is the input string to search, the second parameter is the regular expression pattern, and the third parameter is the replacement string. The result is the input string with each matched URL replaced by [*** URLs not allowed ***].
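To make the parameter order concrete, here is a small sketch with a fixed input instead of a command-line argument. The class UrlMasker, the method Mask, and the sample sentences are assumptions made for illustration. Only Regex.Replace and the pattern come from the article.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this illustration.
public static class UrlMasker
{
    private const string Pattern = @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";

    // Arguments to Regex.Replace: input string, pattern, replacement string.
    public static string Mask(string input)
    {
        return Regex.Replace(input, Pattern, "[*** URLs not allowed ***]");
    }

    public static void Main()
    {
        Console.WriteLine(Mask("Visit https://dev.mjxy.cn/"));
        // prints: Visit [*** URLs not allowed ***]
        Console.WriteLine(Mask("no links here"));
        // prints: no links here
    }
}
```

When nothing matches, Replace returns the input unchanged, so the method is safe to apply to arbitrary text.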
Split string
The Regex class allows you to use a regular expression to split a string. For example, if the date string is 2011-12-31, use - to split the year, month, and day of the date into separate strings. See the following code example:
string strDate = "2011-12-31";
// Split at every occurrence of the pattern "-".
string[] dates = Regex.Split(strDate, "-");
foreach (string s in dates)
{
    Console.WriteLine(s);
}
Console.ReadLine();
The output is 2011, 12, and 31, each on its own line.
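Because the separator is a pattern rather than a fixed string, one call can handle several date formats. The sketch below is an assumption added for illustration: the class DateSplitter, the method SplitDate, and the character class [-/.] are not from the article.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this illustration.
public static class DateSplitter
{
    // Splits at any single '-', '/' or '.' character.
    public static string[] SplitDate(string date)
    {
        return Regex.Split(date, "[-/.]");
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SplitDate("2011-12-31"))); // prints: 2011,12,31
        Console.WriteLine(string.Join(",", SplitDate("2011/12/31"))); // prints: 2011,12,31
        Console.WriteLine(string.Join(",", SplitDate("2011.12.31"))); // prints: 2011,12,31
    }
}
```

For a single fixed separator, String.Split is usually sufficient; Regex.Split is most useful when the separator varies as it does here.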
Regex options
RegexOptions member | Inline character | Description |
None | Not available | Use the default behavior. |
IgnoreCase | i | Use case-insensitive matching. |
Multiline | m | Use multiline mode, where ^ and $ match the beginning and end of each line (instead of the beginning and end of the input string). |
Singleline | s | Use single-line mode, where the period (.) matches every character (instead of every character except \n). |
ExplicitCapture | n | Do not capture unnamed groups. The only valid captures are explicitly named or numbered groups of the form (?<name> subexpression). |
Compiled | Not available | Compile the regular expression to an assembly. |
IgnorePatternWhitespace | x | Exclude unescaped white space from the pattern, and enable comments after a number sign (#). |
RightToLeft | Not available | Change the search direction. The search moves from right to left instead of from left to right. |
ECMAScript | Not available | Enable ECMAScript-compliant behavior for the expression. |
CultureInvariant | Not available | Ignore cultural differences in language. |
To illustrate how to use the RegexOptions enumeration, write the following code in the Main() method and observe the differences caused by the RegexOptions values.
static void Main(string[] args)
{
    string source = args[0];
    // Default behavior: matching is case-sensitive.
    bool success1 = Regex.IsMatch(source, "hello");
    Console.WriteLine("String found? {0}", success1);
    // IgnoreCase makes the same pattern match regardless of case.
    bool success2 = Regex.IsMatch(source, "hello", RegexOptions.IgnoreCase);
    Console.WriteLine("String found? {0}", success2);
    Console.ReadLine();
}
As you can see, the second call to the IsMatch() method passes RegexOptions.IgnoreCase to request case-insensitive matching. For an input such as HELLO, the call without RegexOptions returns false, and the call with RegexOptions.IgnoreCase returns true.
Note:
You can combine RegexOptions values with the | operator:
bool success2 = Regex.IsMatch(source, "hello", RegexOptions.IgnoreCase | RegexOptions.Compiled);
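The inline characters listed for some options can be written inside the pattern itself as (?i), (?m), and so on, instead of passing a RegexOptions value. The following sketch compares the two forms. The class OptionsDemo, its method names, and the sample inputs are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this illustration.
public static class OptionsDemo
{
    // Inline form of RegexOptions.IgnoreCase.
    public static bool InlineIgnoreCase(string input)
    {
        return Regex.IsMatch(input, "(?i)hello");
    }

    // Counts lines that consist of exactly one word character.
    public static int CountSingleCharLines(string input, bool multiline)
    {
        RegexOptions options = multiline ? RegexOptions.Multiline : RegexOptions.None;
        return Regex.Matches(input, @"^\w$", options).Count;
    }

    // Same count, with Multiline requested inline through (?m).
    public static int CountSingleCharLinesInline(string input)
    {
        return Regex.Matches(input, @"(?m)^\w$").Count;
    }

    public static void Main()
    {
        Console.WriteLine(InlineIgnoreCase("HELLO"));               // prints: True
        Console.WriteLine(CountSingleCharLines("a\nb", false));     // prints: 0
        Console.WriteLine(CountSingleCharLines("a\nb", true));      // prints: 2
        Console.WriteLine(CountSingleCharLinesInline("a\nb"));      // prints: 2
    }
}
```

The inline form is handy when the pattern is stored in configuration and the calling code cannot be changed. Options marked "Not available" in the table, such as Compiled, can only be set through RegexOptions.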
Performance and Best Practices
In most cases, the regular expression engine performs pattern matching quickly and efficiently. However, in some cases, the engine may seem slow. In extreme cases, it may even appear to stop responding while it processes a relatively small input over the course of hours or even days.
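One common safeguard against such runaway matching, offered here as a suggestion rather than something the article prescribes, is to give the Regex a match timeout. The Regex constructor accepts a TimeSpan (available since .NET Framework 4.5), and the engine throws RegexMatchTimeoutException when the limit is exceeded. The class SafeMatcher, the method TryIsMatch, and the nested-quantifier pattern below are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this illustration.
public static class SafeMatcher
{
    // Returns true or false for a completed match attempt,
    // or null if the engine gave up after the timeout.
    public static bool? TryIsMatch(string input, string pattern, TimeSpan timeout)
    {
        try
        {
            Regex regex = new Regex(pattern, RegexOptions.None, timeout);
            return regex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }

    public static void Main()
    {
        // Nested quantifiers such as (a+)+ can force heavy backtracking
        // when the input almost matches but fails at the very end.
        string pattern = @"^(a+)+$";
        string input = new string('a', 30) + "!";
        bool? result = TryIsMatch(input, pattern, TimeSpan.FromMilliseconds(500));
        Console.WriteLine(result.HasValue ? result.Value.ToString() : "timed out");
    }
}
```

Whether this particular input actually times out depends on the runtime version and the machine, so the sketch does not assume a specific outcome. The timeout simply bounds the worst case.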