Regular expressions are a common tool for processing strings. In C#, we typically use the Regex class to represent a regular expression. Regular expression engines generally support the following three matching modes: single-line mode (Singleline), multiline mode (Multiline), and ignore case (IgnoreCase).
1. Single-line mode (Singleline)
MSDN definition: Changes the meaning of the dot (.) so that it matches every character (instead of every character except \n).
A typical use case for single-line mode is extracting information from the source of a web page.
Example:
We used the WebBrowser control to obtain the following HTML source from http://www.xxx.com/1.htm and stored it in the variable str:
<body>
<div>
Line 1
Line 2
</div>
</body>
We want to extract the div tag together with its contents, so we write the following code:
string pattern = @"<div>.*</div>";
Regex regex = new Regex(pattern);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: mismatch!
Error Analysis:
The dot (.) is generally thought of as matching any single character, and (.*) as matching any number of characters. However, the dot does not match the newline character \n. In .NET it is equivalent to [^\n]; it still matches the carriage return (\r) of a Windows line break.
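The behaviour of the dot around line-break characters can be checked directly. The following is a minimal sketch, assuming .NET's System.Text.RegularExpressions; the names DotDemo and DotMatches are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotDemo
{
    // Returns true when the input is exactly one character that "." can match.
    // \A and \z pin the match to the whole string. (Hypothetical helper.)
    public static bool DotMatches(string input, RegexOptions options = RegexOptions.None)
    {
        return Regex.IsMatch(input, @"\A.\z", options);
    }

    public static void Main()
    {
        Console.WriteLine(DotMatches("a"));                             // True
        Console.WriteLine(DotMatches("\r"));                            // True: carriage return is matched
        Console.WriteLine(DotMatches("\n"));                            // False: newline is not matched
        Console.WriteLine(DotMatches("\n", RegexOptions.Singleline));   // True: single-line mode lifts the restriction
    }
}
```

Only \n is excluded by default, which is why a pattern such as <div>.*</div> cannot cross a line break without single-line mode.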
HTML source fetched from a website almost always contains line breaks. This is where single-line mode comes in handy, because it changes the meaning of the dot. Modify the constructor call of the Regex instance and pass RegexOptions.Singleline to enable single-line mode:
string pattern = @"<div>.*</div>";
Regex regex = new Regex(pattern, RegexOptions.Singleline);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
/*
The results are:
<div>
Line 1
Line 2
</div>
*/
Inline modifier for single-line mode:
We can also embed single-line mode directly in the regular expression:
(?s)<div>.*</div>
The (?s) modifier turns on single-line mode for the part of the expression that follows it, so do not put it at the end of the expression. Single-line mode can be turned off again with (?-s).
Note: An inline modifier takes precedence over the RegexOptions setting of the Regex class, so after (?s) the expression is evaluated in single-line mode regardless of whether RegexOptions.Singleline is specified.
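The interaction between the inline modifier and the RegexOptions argument can be tried out as follows. This is a sketch under stated assumptions: the Html constant is a stand-in for the page source from the example (using \n line breaks), and InlineSinglelineDemo and Matches are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineSinglelineDemo
{
    // Stand-in for the HTML from the example; assumes "\n" line breaks.
    public const string Html = "<body>\n<div>\nLine 1\nLine 2\n</div>\n</body>";

    // Hypothetical helper: does the pattern match the sample HTML?
    public static bool Matches(string pattern, RegexOptions options = RegexOptions.None)
    {
        return new Regex(pattern, options).IsMatch(Html);
    }

    public static void Main()
    {
        // No option: "." stops at the first line break, so the div is not matched.
        Console.WriteLine(Matches(@"<div>.*</div>"));                                // False
        // (?s) at the start turns single-line mode on for the rest of the pattern.
        Console.WriteLine(Matches(@"(?s)<div>.*</div>"));                            // True
        // (?-s) turns it off, even though RegexOptions.Singleline is passed in.
        Console.WriteLine(Matches(@"(?-s)<div>.*</div>", RegexOptions.Singleline));  // False
        // (?s) at the end applies to nothing that precedes it.
        Console.WriteLine(Matches(@"<div>.*</div>(?s)"));                            // False
    }
}
```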
2. Multi-line mode (Multiline)
MSDN definition: Changes the meaning of ^ and $ so that they match at the beginning and end of any line, not just the beginning and end of the entire string.
Example:
There is a text file in which each line is a username. We read the file into the variable str for processing. Its contents are as follows:
24 Painting students
Terrylee
Mo Meet
Dflying Chen
Rainy
(The names above belong to veteran members of Blog Park.)
We want to find a username that starts with an English letter, so we write the following code:
string pattern = @"^[a-zA-Z]+.*";
Regex regex = new Regex(pattern);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: mismatch!
Error Analysis:
(^) anchors the start of the string, and the first character of str is not an English letter (in the original data it is a Chinese character), so nothing matches. We can use multiline mode to change the meaning of (^) so that it matches at the start of every line, not only at the start of the entire string.
Change the code as follows:
string pattern = @"^[a-zA-Z]+.*";
Regex regex = new Regex(pattern, RegexOptions.Multiline);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: Terrylee
Multiline mode also changes the meaning of ($) so that it matches at the end of every line, not only at the end of the entire string.
Unlike (^) and ($), (\A) and (\z) are not affected by multiline mode; they always match the beginning and end of the entire string.
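The difference between (^) and (\A) under multiline mode can be sketched as follows. The Names constant reuses the sample list from the example and assumes \n line breaks; MultilineDemo and FindAll are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Sample list from the example; assumes "\n" line breaks.
    public const string Names = "24 Painting students\nTerrylee\nMo Meet\nDflying Chen\nRainy";

    // Hypothetical helper: collects the text of every match in Names.
    public static List<string> FindAll(string pattern, RegexOptions options)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(Names, pattern, options))
        {
            result.Add(m.Value);
        }
        return result;
    }

    public static void Main()
    {
        // With Multiline, ^ matches at the start of every line.
        Console.WriteLine(string.Join(", ", FindAll(@"^[a-zA-Z]+.*", RegexOptions.Multiline)));
        // prints: Terrylee, Mo Meet, Dflying Chen, Rainy

        // \A still means "start of the entire string", so nothing matches here.
        Console.WriteLine(FindAll(@"\A[a-zA-Z]+.*", RegexOptions.Multiline).Count);
        // prints: 0
    }
}
```

If the file uses Windows line breaks (\r\n), each matched value would end with \r, because the dot matches \r and ($) in multiline mode matches only before \n. Trimming the value or excluding \r in the pattern avoids this.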
Inline modifiers for multiline mode: (?m) and (?-m)
3. Ignore case (IgnoreCase)
MSDN Definition: Specifies a case-insensitive match.
This mode is easy to understand: it treats uppercase and lowercase characters as the same. We reuse the example above to illustrate.
Example:
string pattern = @"^[a-z]+.*";
Regex regex = new Regex(pattern, RegexOptions.Multiline | RegexOptions.IgnoreCase);
if (regex.IsMatch(str))
    Console.WriteLine(regex.Match(str).Value);
else
    Console.WriteLine("mismatch!");
The result is: Terrylee
Analysis: Note that the regular expression used this time contains no uppercase letters, yet it matches a name that starts with an uppercase letter. This is the effect of ignoring case.
Inline modifiers for ignore case: (?i) and (?-i)
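Because the inline modifiers can be switched on and off inside a pattern, case can be ignored for only part of an expression. The following is a minimal sketch; IgnoreCaseDemo and IsMatch are hypothetical names, and the test strings are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IgnoreCaseDemo
{
    // Hypothetical helper: thin wrapper so the calls below stay short.
    public static bool IsMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // Case-sensitive by default: the leading "T" is not in [a-z].
        Console.WriteLine(IsMatch("Terrylee", @"^[a-z]+$"));                 // False
        // (?i) makes the rest of the pattern case-insensitive.
        Console.WriteLine(IsMatch("Terrylee", @"(?i)^[a-z]+$"));             // True
        // Ignore case for the first letter only, then switch back with (?-i).
        Console.WriteLine(IsMatch("Terrylee", @"^(?i)[a-z](?-i)[a-z]+$"));   // True
        Console.WriteLine(IsMatch("TERRYLEE", @"^(?i)[a-z](?-i)[a-z]+$"));   // False
    }
}
```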
Summary:
Finally, we use a table to summarize the three modes:
Mode | Definition | Affected expression | RegexOptions value | Inline modifier
Single-line mode | Changes the meaning of the dot (.) so that it matches every character (instead of every character except \n). | . | Singleline | (?s)
Multiline mode | Changes the meaning of ^ and $ so that they match at the beginning and end of any line, not just the beginning and end of the entire string. | ^ $ | Multiline | (?m)
Ignore case | Specifies a case-insensitive match. | | IgnoreCase | (?i)