Regular expressions are a common tool for working with strings. In C#, we typically use the Regex class to represent a regular expression. Regular expression engines generally support the following three matching modes: single-line mode (Singleline), multiline mode (Multiline), and ignore case (IgnoreCase).
1. Single-line mode (Singleline)
MSDN definition: Changes the meaning of the dot (.) so that it matches every character (instead of every character except \n).
A typical scenario for single-line mode is extracting information from the source of a web page.
Example:
We used the WebBrowser control to get the following HTML source from http://www.xxx.com/1.htm and stored it in the variable str:
<body>
<div>
Line 1
Line 2
</div>
</body>
We want to extract the div tag and its contents, so we write the following code:
string pattern = @"<div>.*</div>";
Regex regex = new Regex(pattern);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: mismatch!
Error Analysis:
The dot (.) is generally thought to match any single character, and (.*) to match any number of characters. However, the dot does not match a line break. In .NET it is equivalent to [^\n] (so on Windows, where lines end with \r\n, the \r is still matched but the \n is not).
HTML source fetched from a website almost always contains line breaks. This is where single-line mode comes in handy, because it changes the meaning of the dot. Modify the constructor call of the Regex instance, passing RegexOptions.Singleline to enable single-line mode:
string pattern = @"<div>.*</div>";
Regex regex = new Regex(pattern, RegexOptions.Singleline);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
/*
The result is:
<div>
Line 1
Line 2
</div>
*/
Inline modifiers for single-line mode:
We can embed single-line mode directly in the regular expression:
(?s)<div>.*</div>
The (?s) modifier indicates that the expression following it is in single-line mode, so do not put it at the end of the pattern, where it has nothing left to affect. You can also use (?-s) to turn off single-line mode.
Note: Inline modifiers take precedence over the RegexOptions passed to the Regex constructor, so when (?s) is used, the pattern that follows is processed in single-line mode whether or not RegexOptions.Singleline is specified.
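The behavior of the inline modifiers described above can be checked with a minimal sketch. The input string html is a hypothetical stand-in for the page source used earlier, and the code is written as top-level statements (C# 9 or later); in older projects, place the body inside Main.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: a div whose content spans several lines.
string html = "<body>\n<div>\nLine 1\nLine 2\n</div>\n</body>";

// (?s) at the start turns on single-line mode without any RegexOptions.
bool inlineOn = Regex.IsMatch(html, @"(?s)<div>.*</div>");

// (?-s) turns single-line mode off, even though RegexOptions.Singleline
// is passed, because the inline modifier takes precedence.
bool inlineOff = Regex.IsMatch(html, @"(?-s)<div>.*</div>", RegexOptions.Singleline);

// Placed at the end, (?s) has nothing left to affect, so '.' still stops at '\n'.
bool inlineAtEnd = Regex.IsMatch(html, @"<div>.*</div>(?s)");

Console.WriteLine(inlineOn);    // True
Console.WriteLine(inlineOff);   // False
Console.WriteLine(inlineAtEnd); // False
```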
2. Multiline mode (Multiline)
MSDN definition: Changes the meaning of ^ and $ so that they match at the beginning and end, respectively, of any line, and not just at the beginning and end of the entire string.
Example:
Suppose a text file contains one user name per line, and the file is read into the variable str for processing. The contents are as follows:
24 Draw Students
Terrylee
Mo Meet
Dflying Chen
Rainy
(The names are borrowed from senior members of Blog Park (cnblogs). :))
We want to find a user name that starts with an English letter, so we write the following code:
string pattern = @"^[a-zA-Z]+.*";
Regex regex = new Regex(pattern);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: mismatch!
Error Analysis:
(^) anchors the match to the beginning of the string. The first user name does not start with an English letter (in the original data it is written in Chinese characters), so nothing matches. We can use multiline mode to change the meaning of (^) so that it matches at the beginning of each line rather than only at the beginning of the entire string.
The modified code is as follows:
string pattern = @"^[a-zA-Z]+.*";
Regex regex = new Regex(pattern, RegexOptions.Multiline);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: Terrylee
Multiline mode also changes the meaning of ($) so that it matches at the end of each line, not only at the end of the entire string.
Unlike (^) and ($), (\A) and (\Z) are not affected by multiline mode and always match at the beginning and end of the entire string (\Z also matches just before a final newline; \z matches only at the very end).
Inline modifiers for multiline mode: (?m) and (?-m)
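To illustrate the difference between (^) in multiline mode and (\A), here is a small sketch. The variable names holds a hypothetical copy of the user list joined with \n (with Windows \r\n line endings, .* would also capture the trailing \r). It uses Regex.Matches, which the examples above do not use, to list every matching line instead of only the first. The code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical user list, one name per line; the first line does not start with a letter.
string names = "24 Draw Students\nTerrylee\nMo Meet\nDflying Chen\nRainy";

// (?m) makes ^ match at the start of every line, so Matches finds each qualifying line.
MatchCollection perLine = Regex.Matches(names, @"(?m)^[a-zA-Z]+.*");
foreach (Match m in perLine)
    Console.WriteLine(m.Value); // Terrylee, Mo Meet, Dflying Chen, Rainy

// \A ignores multiline mode and matches only at the very start of the string,
// where the first character is a digit, so there is no match.
bool anchoredAtStart = Regex.IsMatch(names, @"\A[a-zA-Z]+.*", RegexOptions.Multiline);
Console.WriteLine(anchoredAtStart); // False
```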
3. Ignore case (IgnoreCase)
MSDN Definition: Specifies a case-insensitive match.
This mode is easy to understand: uppercase and lowercase characters are treated as the same. We reuse the example above to illustrate.
Example:
string pattern = @"^[a-z]+.*";
Regex regex = new Regex(pattern, RegexOptions.Multiline | RegexOptions.IgnoreCase);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: Terrylee
Analysis: Note that the regular expression used this time contains no uppercase letters, yet it still matched a name that begins with a capital letter. This is the effect of ignoring case.
Inline modifiers for ignore case: (?i) and (?-i)
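The inline form can be sketched as follows. The input names is a hypothetical shortened user list, and the two modifiers are combined as (?im), which is an illustration rather than something taken from the examples above. The code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input with mixed-case names, one per line.
string names = "24 Draw Students\nTerrylee\nrainy";

// (?im) enables ignore case and multiline mode together:
// the lowercase class [a-z] still matches the capital 'T'.
string first = Regex.Match(names, @"(?im)^[a-z]+.*").Value;

// (?-i) switches ignore case off again for the rest of the pattern,
// so only a line starting with a lowercase letter can match.
string lowerOnly = Regex.Match(names, @"(?im)^(?-i)[a-z]+.*").Value;

Console.WriteLine(first);     // Terrylee
Console.WriteLine(lowerOnly); // rainy
```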
Summary:
Finally, we use a table to summarize the three modes:
Mode | Definition | Affected expressions | RegexOptions value | Inline modifier
Single-line mode | Changes the meaning of the dot (.) so that it matches every character (instead of every character except \n). | . | Singleline | (?s)
Multiline mode | Changes the meaning of ^ and $ so that they match at the beginning and end of each line, not just at the beginning and end of the entire string. | ^ $ | Multiline | (?m)
Ignore case | Specifies a case-insensitive match. | | IgnoreCase | (?i)