The Regex class represents an immutable (read-only) regular expression. It also provides a variety of static methods that let you use regular expressions without explicitly creating a Regex instance.
Overview of regular expression basics
What is a regular expression?
When writing programs that process strings, you often need to find strings that satisfy certain complex rules. Regular expressions are the tool for describing those rules. In other words, a regular expression is code that records a text rule.
You have probably used the wildcard characters (* and ?) when searching for files in Windows. To find all the Word documents in a directory, you can search for *.doc, where * stands for any string. Like wildcards, regular expressions are a text-matching tool, but they describe what you need far more precisely than wildcards can, at the cost of being more complex.
I. C# regular expression symbols
Character | Description
\ | Escape character: makes a character with a special function literal, or gives a literal character a special function
^ | Matches the start of the input string
$ | Matches the end of the input string
* | Matches the preceding subexpression zero or more times
+ | Matches the preceding subexpression one or more times
? | Matches the preceding subexpression zero or one time
{n} | n is a non-negative integer; matches the preceding subexpression exactly n times
{n,} | n is a non-negative integer; matches the preceding subexpression at least n times
{n,m} | m and n are non-negative integers with n<=m; matches at least n times and at most m times
? | When this character immediately follows another quantifier (*, +, ?, {n}, {n,}, {n,m}), the match becomes lazy: it consumes as little of the searched string as possible
. | Matches any single character except "\n"
(pattern) | Matches pattern and captures the match
(?:pattern) | Matches pattern but does not capture the match
(?=pattern) | Positive lookahead: matches at a position where the text that follows matches pattern, without consuming it
(?!pattern) | Negative lookahead: matches at a position where the text that follows does not match pattern, without consuming it
x|y | Matches x or y. For example, 'z|food' matches "z" or "food"; '(z|f)ood' matches "zood" or "food"
[xyz] | Character set. Matches any one of the characters listed. For example, '[abc]' matches the 'a' in "plain"
[^xyz] | Negated character set. Matches any character not listed. For example, '[^abc]' matches the 'p' in "plain"
[a-z] | Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from 'a' to 'z'
[^a-z] | Matches any character not in the specified range. For example, '[^a-z]' matches any character that is not in 'a' to 'z'
\b | Matches a word boundary, that is, the position between a word character and a non-word character such as a space
\B | Matches a non-word boundary
\d | Matches a digit character; equivalent to [0-9] for ASCII input
\D | Matches a non-digit character; equivalent to [^0-9] for ASCII input
\f | Matches a form feed character
\n | Matches a line feed
\r | Matches a carriage return
\s | Matches any white-space character, including spaces, tabs, form feeds, and so on
\S | Matches any non-white-space character
\t | Matches a tab
\v | Matches a vertical tab; equivalent to \x0b and \cK
\w | Matches any word character, including the underscore; equivalent to '[A-Za-z0-9_]' for ASCII input
\W | Matches any non-word character; equivalent to '[^A-Za-z0-9_]' for ASCII input
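A few of the symbols above are easier to grasp by running them. The following is a minimal sketch, not taken from the original article; the class name SymbolDemo, the helper First, and the sample strings are all made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class SymbolDemo
{
    // Hypothetical helper: returns the text of the first match of
    // pattern in input, or null when nothing matches.
    public static string First(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        // Greedy quantifier: .+ takes as much as it can.
        Console.WriteLine(First("<b>bold</b>", "<.+>"));            // <b>bold</b>
        // Lazy quantifier: .+? takes as little as it can.
        Console.WriteLine(First("<b>bold</b>", "<.+?>"));           // <b>
        // Positive lookahead: digits only when " USD" follows.
        Console.WriteLine(First("price: 42 USD", @"\d+(?= USD)"));  // 42
        // Alternation inside a group.
        Console.WriteLine(First("zood food", "(z|f)ood"));          // zood
        // Negated character set.
        Console.WriteLine(First("plain", "[^abc]"));                // p
    }
}
```

The first two calls differ only by the trailing ? on the quantifier, which is usually the quickest way to see what lazy matching changes.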
Note
Characters such as \, ?, *, ^, $, +, (, ), |, {, and [ already have a special meaning in a regular expression. To use one of them with its literal meaning, escape it with a backslash. For example, to require at least one "\" in the string, write the regular expression as \\+.
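As an illustration of the escaping rule, the sketch below uses the \\+ pattern from the note and Regex.Escape, which escapes metacharacters for you. The class name EscapeDemo, both helper methods, and the sample inputs are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Counts runs of one or more backslashes, using the \\+ pattern.
    public static int CountBackslashRuns(string input)
    {
        return Regex.Matches(input, @"\\+").Count;
    }

    // Treats 'literal' as plain text by escaping it before matching.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }

    static void Main()
    {
        // One single backslash and one double backslash: two runs.
        Console.WriteLine(CountBackslashRuns(@"C:\temp\\file"));   // 2
        // Unescaped, "1+1" means "one or more 1, then 1" and finds nothing here.
        Console.WriteLine(Regex.IsMatch("1+1=2", "1+1"));          // False
        // Escaped, the + is matched literally.
        Console.WriteLine(ContainsLiteral("1+1=2", "1+1"));        // True
    }
}
```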
II. To use the regular expression classes in C#, add the following statement at the beginning of the source file:
using System.Text.RegularExpressions;
III. Commonly used methods of the Regex class
1. Static Match method
The static Match method returns the first contiguous substring of the input that matches the pattern.
The static Match method has two commonly used overloads:
Regex.Match(string input, string pattern);
Regex.Match(string input, string pattern, RegexOptions options);
The first overload takes the input string and the pattern.
The second overload takes the input string, the pattern, and a bitwise OR combination of RegexOptions enumeration values.
Valid values of the RegexOptions enumeration are:
Compiled: the pattern is compiled
CultureInvariant: cultural differences are ignored
ECMAScript: ECMAScript-compliant behavior; this value can only be combined with IgnoreCase, Multiline, and Compiled
ExplicitCapture: only explicitly named or numbered groups are captured
IgnoreCase: matching is case-insensitive
IgnorePatternWhitespace: unescaped white space is removed from the pattern and comments marked with # are enabled
Multiline: multiline mode; changes the meaning of the metacharacters ^ and $ so that they match the beginning and end of each line
None: no options are set
RightToLeft: the input is scanned and matched from right to left, so the static Match method returns the first match found from the right
Singleline: single-line mode; changes the meaning of the metacharacter . so that it also matches line breaks
Note: Singleline and Multiline are not mutually exclusive and can be combined with each other. Singleline cannot be combined with ECMAScript, whereas Multiline can.
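To show how options are combined with bitwise OR, here is a small sketch. The class OptionsDemo, its two methods, and the sample log text are invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class OptionsDemo
{
    // Counts lines that start with "error", in any letter case.
    // IgnoreCase and Multiline are combined with the | operator.
    public static int CountErrorLines(string text)
    {
        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Multiline;
        return Regex.Matches(text, "^error", options).Count;
    }

    // Reports whether "a.b" matches under the given options.
    public static bool DotMatches(string text, RegexOptions options)
    {
        return Regex.IsMatch(text, "a.b", options);
    }

    static void Main()
    {
        string log = "Error: disk\nok\nERROR: net";
        Console.WriteLine(CountErrorLines(log));                         // 2
        // Without Singleline, . does not match the line feed.
        Console.WriteLine(DotMatches("a\nb", RegexOptions.None));        // False
        // With Singleline, . matches the line feed as well.
        Console.WriteLine(DotMatches("a\nb", RegexOptions.Singleline));  // True
    }
}
```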
2. Static Matches method
The overloads of this method mirror those of the static Match method. It returns a MatchCollection that contains every match of the pattern in the input.
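A short sketch of iterating over the returned MatchCollection follows. The class MatchesDemo, the method AllNumbers, and the sample input are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MatchesDemo
{
    // Collects the text of every run of ASCII digits in the input.
    public static List<string> AllNumbers(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, "[0-9]+"))
        {
            result.Add(m.Value);
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", AllNumbers("a1b22c333")));  // 1,22,333
    }
}
```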
3. Static IsMatch method
This method returns a bool, and its overloads mirror those of the static Matches method. It returns true if the input contains a match for the pattern, and false otherwise.
You can think of it this way: IsMatch reports whether the collection returned by Matches is non-empty.
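The relationship between IsMatch and Matches can be checked directly. In this sketch the class IsMatchDemo, its methods, and the five-digit validation rule are hypothetical examples.

```csharp
using System;
using System.Text.RegularExpressions;

static class IsMatchDemo
{
    // Validates that the input consists of exactly five ASCII digits.
    public static bool IsFiveDigits(string s)
    {
        return Regex.IsMatch(s, "^[0-9]{5}$");
    }

    // True when IsMatch agrees with "Matches returned at least one item".
    public static bool AgreesWithMatches(string input, string pattern)
    {
        bool viaIsMatch = Regex.IsMatch(input, pattern);
        bool viaMatches = Regex.Matches(input, pattern).Count > 0;
        return viaIsMatch == viaMatches;
    }

    static void Main()
    {
        Console.WriteLine(IsFiveDigits("12345"));                 // True
        Console.WriteLine(IsFiveDigits("1234a"));                 // False
        Console.WriteLine(AgreesWithMatches("a1b2", "[0-9]"));    // True
    }
}
```

When only a yes or no answer is needed, IsMatch states the intent more directly than counting the results of Matches.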
IV. Examples of the Regex class
1. String replacement
For example, suppose I want to change the name value in the following record to wang:
string line = "addr=1234;name=zhang;phone=6789";
Regex reg = new Regex("name=(.+);");
string modified = reg.Replace(line, "name=wang;");
The modified string is addr=1234;name=wang;phone=6789
2. String matching
For example, suppose I want to extract the name value from that record:
Regex reg = new Regex("name=(.+);");
Match match = reg.Match(line);
string value = match.Groups[1].Value;
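One caveat about the pattern name=(.+); is that .+ is greedy, so it only stops at the right place because the sample record has no semicolon after the last field. The sketch below assumes a variant of the record with a trailing semicolon and shows a pattern based on [^;]+ that does not depend on that. The class FieldDemo and its methods are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class FieldDemo
{
    // Extracts the name value with the greedy pattern from the example.
    public static string GreedyName(string record)
    {
        return Regex.Match(record, "name=(.+);").Groups[1].Value;
    }

    // Extracts the name value, stopping at the first semicolon.
    public static string SafeName(string record)
    {
        return Regex.Match(record, "name=([^;]+);").Groups[1].Value;
    }

    // Replaces the name value without touching the following fields.
    public static string SetName(string record, string newName)
    {
        return Regex.Replace(record, "name=[^;]+;", "name=" + newName + ";");
    }

    static void Main()
    {
        // Assumed variant of the record, ending with a semicolon.
        string record = "addr=1234;name=zhang;phone=6789;";
        Console.WriteLine(GreedyName(record));       // zhang;phone=6789
        Console.WriteLine(SafeName(record));         // zhang
        Console.WriteLine(SetName(record, "wang"));  // addr=1234;name=wang;phone=6789;
    }
}
```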
3. Matching with an optional unit
The text contains "speed=30.3mph" and the speed value needs to be extracted, but the unit may be metric or imperial: mph, km/h, and m/s are all possible.
string line = "lane=1;speed=30.3mph;acceleration=2.5mph/s";
Regex reg = new Regex(@"speed\s*=\s*([\d\.]+)\s*(mph|km/h|m/s)?");
Match match = reg.Match(line);
In the returned result, match.Groups[1].Value contains the numeric value and match.Groups[2].Value contains the unit.
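The speed example can be wrapped into a small helper that also converts the captured text to a number. This is a sketch under assumptions: the class SpeedDemo and the method TryParseSpeed are invented here, and the unit group is treated as optional.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class SpeedDemo
{
    // Tries to read "speed=<number><unit>" from the line.
    // unit is an empty string when no known unit follows the number.
    public static bool TryParseSpeed(string line, out double value, out string unit)
    {
        value = 0;
        unit = "";
        Match m = Regex.Match(line, @"speed\s*=\s*([\d\.]+)\s*(mph|km/h|m/s)?");
        if (!m.Success)
        {
            return false;
        }
        // [\d\.]+ also accepts text such as "1.2.3", so parsing can still fail.
        if (!double.TryParse(m.Groups[1].Value, NumberStyles.Float,
                             CultureInfo.InvariantCulture, out value))
        {
            return false;
        }
        unit = m.Groups[2].Value;
        return true;
    }

    static void Main()
    {
        double v;
        string u;
        if (TryParseSpeed("lane=1;speed=30.3mph;acceleration=2.5mph/s", out v, out u))
        {
            Console.WriteLine(v.ToString(CultureInfo.InvariantCulture) + " " + u);  // 30.3 mph
        }
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps the decimal point independent of the machine's regional settings.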
4. As another example, decoding a GPS GPRMC string needs only:
Regex reg = new Regex(@"^\$GPRMC,[\d\.]*,[AV],(-?[0-9]*\.?[0-9]+),([NS]*),(-?[0-9]*\.?[0-9]+),([EW]*),.*");
With this you can get the latitude and longitude values, which previously required dozens of lines of code.
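To make the GPRMC pattern concrete, the sketch below applies it to a sample sentence. The sentence is an illustrative value rather than data from the original article, and the class GprmcDemo and the method ParsePosition are hypothetical. The captured latitude and longitude are returned as raw text, without any unit conversion.

```csharp
using System;
using System.Text.RegularExpressions;

static class GprmcDemo
{
    static readonly Regex Gprmc = new Regex(
        @"^\$GPRMC,[\d\.]*,[AV],(-?[0-9]*\.?[0-9]+),([NS]*),(-?[0-9]*\.?[0-9]+),([EW]*),.*");

    // Returns { latitude, N/S, longitude, E/W } as raw strings,
    // or null when the sentence does not match the pattern.
    public static string[] ParsePosition(string sentence)
    {
        Match m = Gprmc.Match(sentence);
        if (!m.Success)
        {
            return null;
        }
        return new[]
        {
            m.Groups[1].Value, m.Groups[2].Value,
            m.Groups[3].Value, m.Groups[4].Value
        };
    }

    static void Main()
    {
        // Illustrative sample sentence (an assumption for this sketch).
        string s = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A";
        string[] pos = ParsePosition(s);
        Console.WriteLine(string.Join(" ", pos));  // 4807.038 N 01131.000 E
    }
}
```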
V. Description of the System.Text.RegularExpressions namespace
The namespace described here consists of 8 classes, 1 enumeration, and 1 delegate. They are:
Capture: Contains the result of a single capture;
CaptureCollection: A sequence of Capture objects;
Group: The result of a single capturing group; derives from Capture;
GroupCollection: A collection of capturing groups;
Match: The result of a single expression match; derives from Group;
MatchCollection: A sequence of Match objects;
MatchEvaluator: A delegate used when performing a replacement operation;
Regex: An instance of a compiled expression;
RegexCompilationInfo: Provides information that the compiler uses to compile a regular expression to a standalone assembly;
RegexOptions: Provides enumerated values for setting regular expression options.
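Two of these types are easier to understand with a short example: MatchEvaluator, which computes each replacement, and the difference between a Group and its Captures. In this sketch the class TypesDemo, its methods, and the inputs are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class TypesDemo
{
    // Uses a MatchEvaluator delegate to double every ASCII number.
    public static string DoubleNumbers(string input)
    {
        MatchEvaluator evaluator = m => (int.Parse(m.Value) * 2).ToString();
        return Regex.Replace(input, "[0-9]+", evaluator);
    }

    // A repeated group keeps one Capture per repetition.
    public static int CaptureCount(string input)
    {
        return Regex.Match(input, @"(\w)+").Groups[1].Captures.Count;
    }

    // The Group's own Value is the last of those captures.
    public static string LastCapture(string input)
    {
        return Regex.Match(input, @"(\w)+").Groups[1].Value;
    }

    static void Main()
    {
        Console.WriteLine(DoubleNumbers("a1b22"));  // a2b44
        Console.WriteLine(CaptureCount("abc"));     // 3
        Console.WriteLine(LastCapture("abc"));      // c
    }
}
```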
The Regex class also contains some static methods:
Escape: Escapes the regular expression metacharacters in a string;
IsMatch: Returns a Boolean value indicating whether the expression matches in a string;
Match: Returns a Match instance;
Matches: Returns a collection of Match objects;
Replace: Replaces the text matched by an expression with a replacement string;
Split: Returns an array of strings, split at the positions determined by an expression;
Unescape: Converts the escaped characters in a string back to their unescaped form.
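The static methods can be called without creating a Regex instance. The following sketch exercises Split, Replace, Escape, and Unescape; the class StaticDemo, its methods, and the sample inputs are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class StaticDemo
{
    // Splits on commas or semicolons, ignoring surrounding white space.
    public static string[] SplitFields(string input)
    {
        return Regex.Split(input, @"\s*[,;]\s*");
    }

    // Replaces every hyphen with a slash.
    public static string SlashDate(string date)
    {
        return Regex.Replace(date, "-", "/");
    }

    // Escaping and then unescaping returns the original text.
    public static string RoundTrip(string text)
    {
        return Regex.Unescape(Regex.Escape(text));
    }

    static void Main()
    {
        Console.WriteLine(string.Join("|", SplitFields("a, b;c")));  // a|b|c
        Console.WriteLine(SlashDate("2024-01-02"));                  // 2024/01/02
        Console.WriteLine(Regex.Escape("a.b"));                      // a\.b
        Console.WriteLine(RoundTrip("a.b"));                         // a.b
    }
}
```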