What is a regular expression
Regular expressions are a powerful tool for validating and manipulating strings. A simple way to understand a regular expression is as a special validation string. A common use is checking the format of user input: for example, the pattern "\w{1,}@\w{1,}\.\w{1" checks whether an e-mail address is well formed. Regular expressions are not limited to validation, however; they can be used wherever strings are used.
The basic classes involved
The English term is "regular expression". Reflecting that name and purpose, .NET places the related classes in the System.Text.RegularExpressions namespace.
The namespace contains 8 basic classes: Capture, CaptureCollection, Group, GroupCollection, Match, MatchCollection, Regex, and RegexCompilationInfo, as shown in Figure 1.
Figure 1 The regular expression namespace in the MSDN Library
Capture: The result of a single subexpression capture;
CaptureCollection: A sequence of captured substrings;
Group: The result of a single capturing group;
GroupCollection: A collection of capturing groups;
Match: The result of a single regular expression match;
MatchCollection: The matches found by applying a regular expression to a string iteratively;
Regex: An immutable regular expression;
RegexCompilationInfo: The information needed to compile a regular expression;
Note
This article is an introduction to regular expressions for beginners. Advanced topics such as grouping and its related syntax are not covered here.
Basic knowledge of regular expressions
Regular expressions have their own set of grammar rules, covering character matching, repetition, character positioning, escape matching, and other advanced syntax (character grouping, character substitution, and character decision).
Character matching syntax:
Syntax    Explanation                                              Example
\d        Matches a digit (0-9)                                    '\d' matches 8, does not match 12
\D        Matches a non-digit                                      '\D' matches c, does not match 3
\w        Matches a word character (letter, digit, or underscore)  '\w\w' matches A3, does not match @3
\W        Matches a non-word character                             '\W' matches @, does not match c
\s        Matches a white-space character                          '\d\s\d' matches "3 4", does not match abc
\S        Matches a non-white-space character                      '\S\S\S' matches a#4, does not match "3 d"
.         Matches any character except a line break                '...' matches a$5, does not match line breaks
[...]     Matches any one character listed in the brackets         [b-d] matches b, c, d, does not match e
[^...]    Matches any one character not listed in the brackets     [^b-z] matches a, does not match any character from b to z
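As a rough check of the character classes above, the following sketch runs a few of the examples through Regex.IsMatch. The class name CharClassDemo and the helper FullMatch are made up for illustration. The helper wraps each pattern in \A(?:...)\z so that the whole input has to match, which is an assumption about how the examples are meant to be read.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: true only if the entire input matches the pattern.
    public static bool FullMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }

    public static void Main()
    {
        Console.WriteLine(FullMatch("8", @"\d"));        // True
        Console.WriteLine(FullMatch("12", @"\d"));       // False: two digits, one \d
        Console.WriteLine(FullMatch("A3", @"\w\w"));     // True
        Console.WriteLine(FullMatch("@3", @"\w\w"));     // False: @ is not a word character
        Console.WriteLine(FullMatch("@", @"\W"));        // True
        Console.WriteLine(FullMatch("a#4", @"\S\S\S"));  // True
        Console.WriteLine(FullMatch("3 d", @"\S\S\S"));  // False: contains a space
        Console.WriteLine(FullMatch("\n", "."));         // False: . skips line breaks
        Console.WriteLine(FullMatch("c", "[b-d]"));      // True
        Console.WriteLine(FullMatch("a", "[^b-z]"));     // True
    }
}
```

Without the anchors, IsMatch would report True for "12" as well, because it only looks for a matching substring.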
Repetition syntax:
Syntax    Explanation                              Example
{n}       Matches exactly n times                  \d{3} matches \d\d\d, does not match \d\d or \d\d\d\d
{n,}      Matches n or more times                  \w{2,} matches \w\w, \w\w\w, and longer, does not match \w
{n,m}     Matches at least n and at most m times   \s{1,3} matches \s, \s\s, \s\s\s, does not match \s\s\s\s
?         Matches 0 or 1 times                     5? matches 5 or nothing, does not match any other character
+         Matches one or more times                \s+ matches one or more \s, does not match zero \s
*         Matches zero or more times               \w* matches zero or more \w, does not match non-\w characters
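The quantifiers can be observed directly by looking at what Regex.Match returns. This is a minimal sketch: QuantifierDemo and FirstMatch are hypothetical names, and the patterns are deliberately left unanchored, so each call returns the first (greedy) substring match rather than testing the whole input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: the text of the first match, or null if nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(FirstMatch("12345", @"\d{3}"));       // 123
        Console.WriteLine(FirstMatch("12345", @"\d{2,}"));      // 12345
        Console.WriteLine(FirstMatch("12345", @"\d{1,3}"));     // 123
        Console.WriteLine(FirstMatch("ab", @"\d{3}") == null);  // True
        Console.WriteLine(FirstMatch("55", "5?"));              // 5
        Console.WriteLine(FirstMatch("a  b", @"\s+").Length);   // 2
        Console.WriteLine(FirstMatch("!!", @"\w*").Length);     // 0 (an empty match)
    }
}
```

Note that \d{3} does find a match inside "12345"; to reject inputs that are too long, the pattern has to be anchored at both ends.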
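Character positioning syntax:
Syntax    Explanation
^         The pattern that follows must be at the start of the string (or of a line, in multiline mode)
$         The pattern that precedes must be at the end of the string (or of a line, in multiline mode)
\A        The pattern that follows must be at the start of the string
\z        The pattern that precedes must be at the end of the string
\Z        The pattern that precedes must be at the end of the string, or before a final line break
\b        Matches a word boundary
\B        Matches a non-word boundary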
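The difference between the two end-of-string anchors, and between the two boundary anchors, is easiest to see on a small input. The sketch below is only illustrative; AnchorDemo and its EndsWith helper are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Hypothetical helper: does text end with word?
    // \Z tolerates one final line break, \z does not.
    public static bool EndsWith(string text, string word, bool allowFinalNewline)
    {
        string anchor = allowFinalNewline ? @"\Z" : @"\z";
        return Regex.IsMatch(text, Regex.Escape(word) + anchor);
    }

    public static void Main()
    {
        Console.WriteLine(EndsWith("cat\n", "cat", true));     // True
        Console.WriteLine(EndsWith("cat\n", "cat", false));    // False
        Console.WriteLine(EndsWith("cat", "cat", false));      // True
        Console.WriteLine(Regex.IsMatch("concat", @"\bcat"));  // False: no boundary before "cat"
        Console.WriteLine(Regex.IsMatch("concat", @"\Bcat"));  // True
        Console.WriteLine(Regex.IsMatch("a cat", @"\Acat"));   // False: "cat" is not at the start
        Console.WriteLine(Regex.IsMatch("a cat", @"\bcat"));   // True
    }
}
```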
Escape matching syntax:
Escape syntax             Characters involved (explanation)                            Example
"\" + actual character    \ . * + ? | ( ) { } ^ $                                      \\ matches the character "\"
\n                        Matches a line break
\r                        Matches a carriage return
\t                        Matches a horizontal tab
\v                        Matches a vertical tab
\f                        Matches a form feed (page break)
\nnn                      Matches an ASCII character given in octal
\xnn                      Matches an ASCII character given in hexadecimal
\unnnn                    Matches a Unicode character given as 4 hexadecimal digits
\c + uppercase letter     Matches a Ctrl+letter control character                      \cS matches Ctrl+S
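A few of these escapes can be tried out as follows. This is a sketch under assumptions: EscapeSyntaxDemo and MatchesChar are hypothetical names, and the sample characters (a space for the octal form, the letter A for the hexadecimal forms) were chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSyntaxDemo
{
    // Hypothetical helper: true if the escape sequence matches exactly the one given character.
    public static bool MatchesChar(char c, string escape)
    {
        return Regex.IsMatch(c.ToString(), @"\A" + escape + @"\z");
    }

    public static void Main()
    {
        Console.WriteLine(MatchesChar('\\', @"\\"));       // True: escaped backslash
        Console.WriteLine(MatchesChar('.', @"\."));        // True: escaped dot is a literal dot
        Console.WriteLine(MatchesChar('x', @"\."));        // False
        Console.WriteLine(MatchesChar('\t', @"\t"));       // True: horizontal tab
        Console.WriteLine(MatchesChar(' ', @"\040"));      // True: octal 040 is a space
        Console.WriteLine(MatchesChar('A', @"\x41"));      // True: hexadecimal 41 is A
        Console.WriteLine(MatchesChar('A', @"\u0041"));    // True: Unicode 0041 is A
        Console.WriteLine(MatchesChar('\u0013', @"\cS"));  // True: Ctrl+S is character 0x13
    }
}
```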
Methods for constructing regular expressions
Constructing regular expressions involves the Regex class, which provides the IsMatch(), Replace(), and Split() methods, together with the Match class.
(1) The IsMatch() method;
The IsMatch() method returns a bool: true if the tested string satisfies the regular expression, and false otherwise.
Example 1: Check whether a Chengdu telephone number is valid.
Analysis: A Chengdu telephone number has the form 028********: the fixed area code 028 followed by 8 digits.
Regular expression design: 028\d{8} (explanation: the fixed area code 028, followed by 8 digits \d).
The program code is shown in Figure 2:
Figure 2 "Example 1": a use case for the IsMatch method
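A minimal sketch of how Example 1 might be written is shown here. The class and method names are hypothetical, and the ^ and $ anchors are an addition not present in the pattern above; they are assumed so that the whole input, not just part of it, has to be 028 plus 8 digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChengduPhone
{
    // 028\d{8} from Example 1, with anchors added (an assumption) to test the entire input.
    private static readonly Regex Pattern = new Regex(@"^028\d{8}$");

    public static bool IsValid(string number)
    {
        return Pattern.IsMatch(number);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("02812345678"));   // True
        Console.WriteLine(IsValid("0281234567"));    // False: only 7 digits after 028
        Console.WriteLine(IsValid("01012345678"));   // False: wrong area code
        Console.WriteLine(IsValid("028123456789"));  // False: too long
    }
}
```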
(2) The Replace() method;
The Replace() method replaces the text that matches a regular expression pattern with a replacement string.
Example 2: When publishing an article that contains a public e-mail address, replace the @ with "at" to avoid spam.
Analysis: First locate the e-mail addresses in the article, then perform the replacement.
Regular expression design: the e-mail pattern "\w{1,}@\w{1,}\.".
The program code is shown in Figure 3:
Figure 3 "Example 2": a use case for the Replace method
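One way Example 2 could be sketched is with capturing groups and a replacement pattern. The parentheses and the $1/$2 references are additions made for this illustration, and AddressMasker and MaskAddresses are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AddressMasker
{
    // The e-mail pattern from Example 2, with two groups added around the parts to keep.
    private static readonly Regex Email = new Regex(@"(\w{1,})@(\w{1,}\.)");

    public static string MaskAddresses(string article)
    {
        // $1 and $2 re-insert the captured text on either side of the @.
        return Email.Replace(article, "$1 at $2");
    }

    public static void Main()
    {
        Console.WriteLine(MaskAddresses("Contact: someone@example.com"));
        // Contact: someone at example.com
        Console.WriteLine(MaskAddresses("No address here"));
        // No address here
    }
}
```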
(3) The Split() method;
The Split() method splits a string at the positions matched by a regular expression and returns the pieces in a string array.
Example 3: Read all e-mail addresses from a mass-mailing address list.
Analysis: A mass-mailing list uses ";" as the separator, so the string must be split on ";".
The program code is shown in Figure 4:
Figure 4 "Example 3": a use case for the Split method
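Example 3 might look like the following sketch, which uses the static Regex.Split method. The class name, method name, and sample addresses are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MailingList
{
    public static string[] SplitAddresses(string list)
    {
        // ";" has no special meaning in a pattern, so it can be used as-is.
        return Regex.Split(list, ";");
    }

    public static void Main()
    {
        string[] addresses = SplitAddresses("a@x.com;b@y.com;c@z.com");
        Console.WriteLine(addresses.Length);  // 3
        foreach (string address in addresses)
        {
            Console.WriteLine(address);
        }
    }
}
```

A list that ends with ";" produces an empty final entry, which a real program would probably want to skip.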
Basic methods for building expressions
The Regex constructor has two commonly used overloads: one takes only the pattern, and the other also takes options.
- Basic form: Regex(string pattern);
- Overloaded form: Regex(string pattern, RegexOptions options);
Note: RegexOptions is an enumeration type whose values include IgnoreCase (ignore case), RightToLeft (search from right to left), None (the default), CultureInvariant (ignore culture), Multiline (multiline mode), and Singleline (single-line mode).
Example 4: Build a format check for a valid ISBN.
Analysis: The ISBN format is X-XXXXX-XXX-X.
Regular expression: \d-\d{5}-\d{3}-\d
Construct the regular expression object: Regex isbnRegex = new Regex(expression, options); the options argument may be empty (RegexOptions.None).
The detailed code is shown in Figure 5:
Figure 5 "Example 4": a use case that constructs a validation function
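A possible shape for the validation function in Example 4 is sketched below. The names are hypothetical, RegexOptions.None stands in for the empty options argument, and the ^ and $ anchors are an assumption added so that the entire input must follow the X-XXXXX-XXX-X layout.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IsbnValidator
{
    // Two-argument constructor; RegexOptions.None means no special options.
    private static readonly Regex IsbnRegex =
        new Regex(@"^\d-\d{5}-\d{3}-\d$", RegexOptions.None);

    public static bool IsValid(string isbn)
    {
        return IsbnRegex.IsMatch(isbn);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("7-12345-678-9"));  // True
        Console.WriteLine(IsValid("7-1234-678-9"));   // False: second group has 4 digits
        Console.WriteLine(IsValid("7123456789"));     // False: hyphens missing
    }
}
```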
Writing a validation program
To make it easier to learn regular expressions and to quickly check that an expression is written correctly, the following steps use the IsMatch() method to build a regular expression validator.
- Open VS.NET and, in the New Project dialog, select Windows Application under Visual C# Projects; name the project "Regex_tools";
- Build the interface shown in Figure 6;
Figure 6 Regular expression IsMatch method validator
- Add the namespace declaration using System.Text.RegularExpressions to the declarations at the top of the form;
- Write the following code;
- Compile the program; a simple regular expression validator is generated.
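The core check that such a validator performs can be sketched without the form. This is an assumption about what the tool does, not the original code: RegexTool and Check are hypothetical names, and in the Windows application the two arguments would come from text boxes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTool
{
    // Returns "Match", "No match", or a message if the pattern itself is invalid.
    public static string Check(string pattern, string input)
    {
        try
        {
            return Regex.IsMatch(input, pattern) ? "Match" : "No match";
        }
        catch (ArgumentException ex)
        {
            // Regex throws ArgumentException (or a subclass) for a malformed pattern.
            return "Invalid pattern: " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Check(@"\d+", "abc123"));  // Match
        Console.WriteLine(Check(@"^\d+$", "abc"));   // No match
        Console.WriteLine(Check("(", "abc"));        // Invalid pattern: ...
    }
}
```

Catching the exception matters in a learning tool, since a beginner's first attempts at a pattern are often malformed.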
Comprehensive analysis of C# regular expressions:
Many programming languages and tools now include support for regular expressions, and .NET is no exception. The .NET base class library contains a namespace and a series of classes that bring the power of regular expressions into full play.
Regular expressions are probably among the most troublesome topics for many programmers. If you are not yet familiar with them, it is advisable to start with the basics; see Regular Expression Syntax first.
Now for regular expressions in C#. They are contained in a namespace of the .NET base class library, System.Text.RegularExpressions. The namespace consists of 8 classes, 1 enumeration, and 1 delegate. They are:
Capture: Contains the result of a single capture;
CaptureCollection: A sequence of Capture objects;
Group: The result of a capturing group; it inherits from Capture;
GroupCollection: A collection of capturing groups;
Match: The result of matching an expression; it inherits from Group;
MatchCollection: A sequence of Match objects;
MatchEvaluator: The delegate used when performing a replace operation;
Regex: An instance of a compiled expression;
RegexCompilationInfo: Provides the information that the compiler uses to compile a regular expression into a stand-alone assembly;
RegexOptions: Provides enumeration values for setting regular expression options.
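The MatchEvaluator delegate is the least self-explanatory item in this list, so a small sketch may help. Here every number in a string is doubled during a replace operation; the class and method names and the sample text are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EvaluatorDemo
{
    // Called once per match; the returned string replaces the matched text.
    private static string DoubleNumber(Match m)
    {
        return (int.Parse(m.Value) * 2).ToString();
    }

    public static string DoubleAllNumbers(string input)
    {
        // [0-9]+ keeps the match to ASCII digits that int.Parse accepts.
        return Regex.Replace(input, "[0-9]+", new MatchEvaluator(DoubleNumber));
    }

    public static void Main()
    {
        Console.WriteLine(DoubleAllNumbers("3 apples and 10 pears"));
        // 6 apples and 20 pears
    }
}
```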
The Regex class also contains some static methods:
Escape: Escapes the regular expression metacharacters in a string;
IsMatch: Returns a Boolean value indicating whether the expression finds a match in the string;
Match: Returns a Match instance;
Matches: Returns a collection of Match objects;
Replace: Replaces the text matched by the expression with a replacement string;
Split: Returns an array of strings split at the positions determined by the expression;
Unescape: Converts the escaped characters in a string back to their unescaped form.
Here are some of their uses:
Let us start with a simple matching example that uses the Regex and Match classes:
Match m = Regex.Match("abracadabra", "(a|b|r)+");
We now have an instance of the Match class that can be tested, for example: if (m.Success) {}. To use the matched text, convert it to a string:
MessageBox.Show("Match=" + m.ToString());
This example gives the output Match=abra, which is the matched string.
The Regex class represents a read-only regular expression. It also contains a variety of static methods (described in the examples below) that allow the other regular expression classes to be used without explicitly creating instances of them.
The following code example creates an instance of the Regex class and defines a simple regular expression when the object is initialized. Declare a Regex variable: Regex objAlphaPatt; then create the instance and define its rule: objAlphaPatt = new Regex("[^a-zA-Z]");
The IsMatch method indicates whether the regular expression specified in the Regex constructor finds a match in the input string. It is one of the most commonly used methods when working with C# regular expressions. The following example illustrates its use:
if (!objAlphaPatt.IsMatch("testIsMatchMethod"))
    lblMsg.Text = "Match succeeded";
else
    lblMsg.Text = "Match not successful";
The result of this code is "Match succeeded", because the input contains no non-letter character.
if (!objAlphaPatt.IsMatch("testIsMatchMethod7654298"))
    lblMsg.Text = "Match succeeded";
else
    lblMsg.Text = "Match not successful";
The result of this code is "Match not successful", because the input contains digits.
The Escape method escapes a minimal set of metacharacters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) so that each is treated as the literal character itself rather than as a metacharacter. The Replace method replaces all occurrences of the character pattern defined by the regular expression with the specified replacement string. Consider the following example, which reuses the Regex object defined above:
objAlphaPatt.Replace("This [test] ** replace and Escape", Regex.Escape("()"));
The result is: This\(\)\(\)test\(\)\(\)\(\)\(\)\(\)replace\(\)and\(\)Escape
Without Escape, the result is: This()()test()()()()()replace()and()Escape
Unescape reverses the conversion performed by Escape, but Escape cannot completely reverse Unescape.
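The interaction of Replace, Escape, and Unescape can be checked with a short program. This is a sketch only: EscapeDemo and ReplaceNonLetters are hypothetical names, and the pattern [^a-zA-Z] (any non-letter) is assumed to be the one held by the Regex object used in the text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Matches any single character that is not an ASCII letter.
    private static readonly Regex NonLetter = new Regex("[^a-zA-Z]");

    public static string ReplaceNonLetters(string input, string replacement)
    {
        return NonLetter.Replace(input, replacement);
    }

    public static void Main()
    {
        string input = "a b*c";
        Console.WriteLine(Regex.Escape("()"));                             // \(\)
        Console.WriteLine(ReplaceNonLetters(input, Regex.Escape("()")));   // a\(\)b\(\)c
        Console.WriteLine(ReplaceNonLetters(input, "()"));                 // a()b()c
        Console.WriteLine(Regex.Escape("a.b*c"));                          // a\.b\*c
        Console.WriteLine(Regex.Unescape(@"a\.b\*c"));                     // a.b*c
    }
}
```

The backslashes survive in the second line of output because a replacement string treats only $ specially; backslashes in it are literal.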
The Split method splits the input string into an array of substrings at the positions defined by regular expression matches. For example:
Regex r = new Regex("-"); // Split on hyphens.
string[] s = r.Split("first-second-third");
for (int i = 0; i < s.Length; i++)
{
    Response.Write(s[i] + "<br>");
}
The result of the execution is:
first
second
third
This looks like the Split method of String, except that Regex.Split splits the string at a delimiter determined by a regular expression rather than by a set of characters.
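To illustrate the difference from String.Split, the sketch below uses a delimiter that only a pattern can describe: a comma or semicolon together with any white space around it. The pattern, the names, and the sample input are assumptions chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    public static string[] SplitList(string input)
    {
        // Delimiter: optional white space, then ; or , then optional white space.
        return Regex.Split(input, @"\s*[;,]\s*");
    }

    public static void Main()
    {
        string[] viaRegex = SplitList("a ; b,c ;d");
        Console.WriteLine(string.Join("|", viaRegex));   // a|b|c|d

        // String.Split with the same two characters keeps the surrounding spaces.
        string[] viaString = "a ; b,c ;d".Split(';', ',');
        Console.WriteLine(string.Join("|", viaString));  // a | b|c |d
    }
}
```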
The Match method searches the input string for an occurrence of the regular expression. The Match method of the Regex class returns a Match object, which represents the result of the matching operation. The following example shows the use of the Match method and obtains Group objects through the Match object's Groups property:
string text = @"public string TestMatchObj string s string match";
string pat = @"(\w+)\s+(string)";
// Compile the regular expression.
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
// Match the regular expression pattern against a text string.
Match m = r.Match(text);
int matchCount = 0;
while (m.Success)
{
    Response.Write("Match" + (++matchCount) + "<br>");
    for (int i = 1; i <= 2; i++)
    {
        Group g = m.Groups[i];
        Response.Write("Group" + i + "='" + g + "'" + "<br>");
        CaptureCollection cc = g.Captures;
        for (int j = 0; j < cc.Count; j++)
        {
            Capture c = cc[j];
            Response.Write("Capture" + j + "='" + c + "', Position=" + c.Index + "<br>");
        }
    }
    m = m.NextMatch();
}
The result of running this example is:
Match1
Group1='public'
Capture0='public', Position=0
Group2='string'
Capture0='string', Position=7
Match2
Group1='TestMatchObj'
Capture0='TestMatchObj', Position=14
Group2='string'
Capture0='string', Position=27
Match3
Group1='s'
Capture0='s', Position=34
Group2='string'
Capture0='string', Position=36
The MatchCollection class represents a read-only collection of successful, non-overlapping matches. An instance of MatchCollection is returned by the Regex.Matches method. The following example finds all matches of the Regex in the input string and populates a MatchCollection:
MatchCollection mc;
Regex r = new Regex("match");
mc = r.Matches("matchcollectionregexmatchs");
for (int i = 0; i < mc.Count; i++)
{
    Response.Write(mc[i].Value + " POS:" + mc[i].Index.ToString() + "<br>");
}
The result of running the example is:
match POS:0
match POS:20