In-depth analysis of Regex.Match

Source: Internet
Author: User
Tags cdata

 

While reading code written by a predecessor, I came across a regular expression. Even after several debugging sessions I still did not understand Regex.Match, so I looked up the documentation to learn the basics.

These are the values and the rule I want to match.

See whether you can work out the result.

1. var cdata = "^$=No";

2. var cdata = ".+=yes";

Given these two cdata values, can you tell what the output will be? If you can, you have already mastered this and do not need to read on.

var sv = "wei";
string duiyingsrcvalue = "";
string ptnconfigdata = "^(?<srcdataptn>.*?)=(?<duiyingdata>.*?)$";

// Loop over the two cdata values shown above
foreach (var cdata in new[] { "^$=No", ".+=yes" })
{
    Match mchcdata = Regex.Match(cdata, ptnconfigdata);
    if (mchcdata.Value == "") continue;
    if (Regex.IsMatch(sv, mchcdata.Groups["srcdataptn"].Value))
    {
        duiyingsrcvalue = mchcdata.Groups["duiyingdata"].Value;
        break;
    }
}

The output is duiyingsrcvalue: "yes"
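To see why the result is "yes", it can help to print what each named group captures. The following is a minimal sketch, not the original author's code: the array name rules, the local names rulePattern and ruleValue, and the console output are illustrative additions.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

string sv = "wei";
string ptnconfigdata = "^(?<srcdataptn>.*?)=(?<duiyingdata>.*?)$";
string[] rules = { "^$=No", ".+=yes" };   // hypothetical container for the two cdata values
string duiyingsrcvalue = "";

foreach (string cdata in rules)
{
    Match m = Regex.Match(cdata, ptnconfigdata);
    if (!m.Success) continue;

    string rulePattern = m.Groups["srcdataptn"].Value;  // text before the first '='
    string ruleValue = m.Groups["duiyingdata"].Value;   // text after the first '='
    bool hit = Regex.IsMatch(sv, rulePattern);
    Console.WriteLine($"pattern \"{rulePattern}\" against \"{sv}\": {hit}");

    if (hit)
    {
        duiyingsrcvalue = ruleValue;
        break;
    }
}

Console.WriteLine(duiyingsrcvalue);
```

The first rule's pattern, ^$, matches only an empty string, so "wei" is rejected. The second rule's pattern, .+, matches any non-empty string, so its value is selected.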

 

The Regex class represents an immutable (read-only) regular expression. It also contains static methods that let you use the other regular expression classes without explicitly creating instances of them.

Basic Introduction to Regular Expressions

What is a regular expression?

When writing programs that process strings, you often need to find strings that conform to some complex rule. Regular expressions are a tool for describing such rules. In other words, a regular expression is code that records a text rule.

You have probably used wildcards (* and ?) to search for files in Windows. To find all the Word documents in a directory, you can search for *.doc. Here, * is interpreted as any string. Like wildcards, regular expressions are a tool for text matching, but they can describe what you want far more precisely than wildcards can. The price, of course, is greater complexity.
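As a rough illustration of the comparison, the wildcard *.doc can be approximated by a regular expression. This is only a sketch: the file names are made up, and the pattern is one possible translation rather than an exact equivalent of how Windows interprets wildcards.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

// Rough regex counterpart of the wildcard *.doc:
// ^ and $ anchor the whole name, .* stands for "any string", \. is a literal dot.
string pattern = @"^.*\.doc$";
string[] fileNames = { "report.doc", "notes.txt", "doc.pdf" };  // hypothetical names

foreach (string name in fileNames)
{
    bool isWord = Regex.IsMatch(name, pattern, RegexOptions.IgnoreCase);
    Console.WriteLine($"{name}: {isWord}");
}
```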

I. C# regular expression symbols

 

Character     Description

\             Escape character. Turns a character with a special function into a literal character, or vice versa.
^             Matches the start of the input string.
$             Matches the end of the input string.
*             Matches the preceding subexpression zero or more times.
+             Matches the preceding subexpression one or more times.
?             Matches the preceding subexpression zero or one time.
{n}           n is a non-negative integer. Matches the preceding subexpression exactly n times.
{n,}          n is a non-negative integer. Matches the preceding subexpression at least n times.
{n,m}         m and n are non-negative integers with n <= m. Matches at least n and at most m times.
?             When this character follows another quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy: it matches as little of the string as possible.
.             Matches any single character except "\n".
(pattern)     Matches pattern and captures the match.
(?:pattern)   Matches pattern but does not capture the match.
(?=pattern)   Positive lookahead: matches at a position where the text that follows matches pattern, without consuming that text.
(?!pattern)   Negative lookahead: matches at a position where the text that follows does not match pattern.
x|y           Matches x or y. For example, 'z|food' matches "z" or "food"; '(z|f)ood' matches "zood" or "food".
[xyz]         Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]        Negated character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]         Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from 'a' to 'z'.
[^a-z]        Negated character range. Matches any character not in the specified range. For example, '[^a-z]' matches any character that is not between 'a' and 'z'.
\b            Matches a word boundary, that is, the position between a word and a space.
\B            Matches a non-word boundary.
\d            Matches a digit. For ASCII input, equivalent to [0-9].
\D            Matches a non-digit character. For ASCII input, equivalent to [^0-9].
\f            Matches a form feed.
\n            Matches a newline (line feed).
\r            Matches a carriage return.
\s            Matches any whitespace character, including spaces, tabs, and form feeds.
\S            Matches any non-whitespace character.
\t            Matches a tab.
\v            Matches a vertical tab. Equivalent to \x0b and \cK.
\w            Matches any word character, including the underscore. For ASCII input, equivalent to '[A-Za-z0-9_]'.
\W            Matches any non-word character. For ASCII input, equivalent to '[^A-Za-z0-9_]'.
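A few of the symbols above are easier to understand by running them. The sketch below uses made-up input strings to show greedy versus non-greedy quantifiers, a positive lookahead, and a non-capturing group; it is meant as an illustration, not an exhaustive test of the table.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

string html = "<b>one</b><b>two</b>";

// Greedy: .+ takes as much as possible, so the match runs to the last </b>.
string greedy = Regex.Match(html, "<b>.+</b>").Value;
// Non-greedy: .+? takes as little as possible, so the match stops at the first </b>.
string lazy = Regex.Match(html, "<b>.+?</b>").Value;

// Positive lookahead: "Windows" only where " 10" follows; " 10" itself is not consumed.
Match look = Regex.Match("Windows 7, Windows 10", @"Windows(?= 10)");

// (?:...) groups without capturing, so Groups[1] refers to (ood).
Match grouped = Regex.Match("zood", "(?:z|f)(ood)");

Console.WriteLine(greedy);
Console.WriteLine(lazy);
Console.WriteLine($"{look.Value} at index {look.Index}");
Console.WriteLine(grouped.Groups[1].Value);
```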

 

Note:

In a regular expression, the characters "\", "?", "*", "^", "$", "+", "(", ")", "|", "{", and "[" have special meanings. To use them literally, you must escape them. For example, to require at least one "\" in the string, write the regular expression as \\+.
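In C# source code the escaping interacts with string literal syntax, which is a common source of confusion. The sketch below, using an example path and an example literal of my own choosing, shows the \\+ pattern in a verbatim string and in a regular string, and uses Regex.Escape to escape metacharacters automatically.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

// In a verbatim string (@"..."), the regex \\+ is written exactly as it reads:
// an escaped backslash followed by the + quantifier.
bool hasBackslash = Regex.IsMatch(@"C:\temp", @"\\+");

// In a regular C# string literal, every backslash must be doubled once more.
bool sameThing = Regex.IsMatch("C:\\temp", "\\\\+");

// Regex.Escape escapes the metacharacters when text should be matched literally.
string literal = Regex.Escape("1+1=2?");
bool matchesLiteral = Regex.IsMatch("is 1+1=2? yes", literal);

Console.WriteLine(hasBackslash);
Console.WriteLine(sameThing);
Console.WriteLine(literal);
Console.WriteLine(matchesLiteral);
```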

II. To use the regular expression classes in C#, add the following statement at the beginning of the source file:

using System.Text.RegularExpressions;

 

 

III. Common Regex methods

1. Static Match Method

The static Match method returns the first contiguous substring of the input that matches the pattern.

The two most commonly used overloads of the static Match method are:

Regex.Match(string input, string pattern);
Regex.Match(string input, string pattern, RegexOptions options);

The first overload takes the input and the pattern.

The second overload takes the input, the pattern, and a bitwise OR combination of RegexOptions enumeration values.
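The Match object that comes back carries more than the matched text. The following sketch, with made-up input strings, shows the properties that are typically inspected and what a failed match looks like; a failed match has an empty Value, which is what the check at the top of this article relies on.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

// First overload: input and pattern.
Match first = Regex.Match("order 66 and order 99", @"\d+");
Console.WriteLine($"{first.Success} {first.Value} {first.Index} {first.Length}");

// NextMatch continues the search after the previous match.
Match second = first.NextMatch();

// When nothing matches, Success is false and Value is the empty string.
Match none = Regex.Match("no digits", @"\d+");

// Second overload: input, pattern, and options.
Match ignoreCase = Regex.Match("ORDER 66", "order", RegexOptions.IgnoreCase);
```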

The valid values of the RegexOptions enumeration are:
Compiled: compiles the pattern.
CultureInvariant: ignores cultural differences in the language.
ECMAScript: enables ECMAScript-compliant behavior. This value can only be combined with IgnoreCase, Multiline, and Compiled.
ExplicitCapture: captures only explicitly named groups.
IgnoreCase: makes matching case-insensitive.
IgnorePatternWhitespace: removes unescaped whitespace from the pattern and enables comments marked with #.
Multiline: multiline mode. Changes the meaning of the metacharacters ^ and $ so that they match at the beginning and end of each line.
None: no options set. On its own this value has no effect.
RightToLeft: scans and matches from right to left. In this case, the static Match method returns the first match found from the right.
Singleline: single-line mode. Changes the meaning of the metacharacter . so that it also matches line breaks.

Note: Multiline and Singleline are not mutually exclusive and can be combined, as long as ECMAScript is not set. Singleline cannot be combined with ECMAScript.
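The effect of Multiline, Singleline, and combining options with bitwise OR can be checked with a short experiment. The two-line input string below is an example chosen for this sketch.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

string text = "first line\nSecond LINE";

// Without Multiline, ^ only matches at the very start of the input.
int plain = Regex.Matches(text, @"^\w+").Count;
// With Multiline, ^ also matches right after each '\n'.
int multi = Regex.Matches(text, @"^\w+", RegexOptions.Multiline).Count;

// Options are combined with the bitwise OR operator.
bool found = Regex.IsMatch(text, @"^second line$",
    RegexOptions.IgnoreCase | RegexOptions.Multiline);

// Singleline lets . match '\n', so .+ can span both lines.
string whole = Regex.Match(text, "first.+LINE", RegexOptions.Singleline).Value;
bool withoutSingleline = Regex.IsMatch(text, "first.+LINE");

Console.WriteLine($"{plain} {multi} {found} {withoutSingleline}");
```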

2. Static Matches method

This method has the same overloads as the static Match method. It returns a MatchCollection containing all the matches of the pattern in the input.
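A minimal sketch of iterating over the returned MatchCollection, using an example input of my own:

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

string input = "a1 b22 c333";
MatchCollection numbers = Regex.Matches(input, @"\d+");

// Each element of the collection is a Match with its own Value and Index.
foreach (Match m in numbers)
{
    Console.WriteLine($"{m.Value} at index {m.Index}");
}
```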

3. Static IsMatch method

This method returns a bool and has the same overloads as the static Matches method. It returns true if the input matches the pattern, and false otherwise.
You can think of IsMatch as reporting whether the collection returned by Matches is non-empty.
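That equivalence can be written out directly. This is only a sketch to illustrate the idea, with an example input and pattern; in real code IsMatch is the more direct choice.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

string input = "speed=30.3mph";
string pattern = @"\d+";

bool viaIsMatch = Regex.IsMatch(input, pattern);
// Matches never returns null; with no match it returns an empty collection.
bool viaMatches = Regex.Matches(input, pattern).Count > 0;
int noneCount = Regex.Matches("no digits here", pattern).Count;

Console.WriteLine(viaIsMatch == viaMatches);
```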

IV. Regex examples

1. String replacement

For example, I want to change the NAME value in the following record to WANG.

string line = "ADDR=1234;NAME=ZHANG;PHONE=6789";
Regex reg = new Regex("NAME=(.+);");
string modified = reg.Replace(line, "NAME=WANG;");

The modified string is ADDR=1234;NAME=WANG;PHONE=6789.

2. String matching

For example, I want to extract the NAME value from the record above.

Regex reg = new Regex("NAME=(.+);");
Match match = reg.Match(line);
string value = match.Groups[1].Value;
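One caveat about the pattern used in these two examples: .+ is greedy, so it only stops at the right place because the record has no semicolon after the phone field. The sketch below assumes a variation of the record with a trailing ";" to show the difference, and uses [^;]+ with a named group (the group name "name" is my own choice) as one possible more defensive alternative.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

// A record in the same format, but with a trailing ';' (an assumed variation).
string line = "ADDR=1234;NAME=ZHANG;PHONE=6789;";

// Greedy .+ runs up to the LAST ';', so the capture swallows the PHONE field.
string greedyValue = Regex.Match(line, "NAME=(.+);").Groups[1].Value;

// [^;]+ cannot cross a ';', so it stops at the end of the NAME field.
Regex reg = new Regex("NAME=(?<name>[^;]+);");
string value = reg.Match(line).Groups["name"].Value;
string modified = reg.Replace(line, "NAME=WANG;");

Console.WriteLine(greedyValue);
Console.WriteLine(value);
Console.WriteLine(modified);
```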

3. Matching with optional parts

The text contains "speed=30.3mph" and you need to extract the speed value. However, the unit may be imperial or metric (mph, km/h, or m/s), and there may be spaces before and after it.

string line = "lane=1;speed=30.3mph;acceleration=2.5mph/s";
Regex reg = new Regex(@"speed\s*=\s*([\d\.]+)\s*(mph|km/h|m/s)*");
Match match = reg.Match(line);

In the returned result, match.Groups[1].Value contains the number and match.Groups[2].Value contains the unit.
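To check how the pattern copes with spaces and with a missing unit, it can be run against a few inputs. Only the first sample comes from the text; the other two are made up for this sketch, as is the "(none)" placeholder.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Globalization;
using System.Text.RegularExpressions;

Regex reg = new Regex(@"speed\s*=\s*([\d\.]+)\s*(mph|km/h|m/s)*");
string[] samples =
{
    "lane=1;speed=30.3mph;acceleration=2.5mph/s",
    "speed = 48.8 km/h",   // spaces around '=' and before the unit
    "speed=13.5",          // no unit at all
};

foreach (string sample in samples)
{
    Match match = reg.Match(sample);
    // InvariantCulture so that '.' is always read as the decimal separator.
    double speed = double.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
    // The unit group is optional, so check Success before using it.
    string unit = match.Groups[2].Success ? match.Groups[2].Value : "(none)";
    Console.WriteLine($"{speed.ToString(CultureInfo.InvariantCulture)} {unit}");
}
```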

4. As another example, to decode a GPS GPRMC string, you only need

Regex reg = new Regex(@"^\$GPRMC,[\d\.]*,[A|V],(-?[0-9]*\.?[0-9]+),([NS]*),(-?[0-9]*\.?[0-9]+),([EW]*),.*");

This gives you the latitude and longitude values. Previously, this took dozens of lines of code.
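The sketch below applies such a pattern to one sentence. It assumes the string being decoded is an NMEA $GPRMC sentence; the sample sentence is a commonly quoted illustrative one, not data from the original article.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

Regex reg = new Regex(
    @"^\$GPRMC,[\d\.]*,[A|V],(-?[0-9]*\.?[0-9]+),([NS]*),(-?[0-9]*\.?[0-9]+),([EW]*),.*");

// Illustrative sample sentence (assumed for this sketch).
string sentence = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A";

Match match = reg.Match(sentence);
if (match.Success)
{
    // Groups 1 and 2: latitude and hemisphere; groups 3 and 4: longitude and hemisphere.
    string latitude = match.Groups[1].Value + match.Groups[2].Value;
    string longitude = match.Groups[3].Value + match.Groups[4].Value;
    Console.WriteLine($"{latitude} {longitude}");
}
```

The captured values are still in the raw format of the sentence; converting them to decimal degrees is a separate step that is not shown here.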

V. The System.Text.RegularExpressions namespace

The namespace contains eight classes, one enumeration, and one delegate:

Capture: the result of a single capture;
CaptureCollection: a sequence of Capture objects;
Group: the result of one capturing group; derives from Capture;
GroupCollection: a collection of Group objects;
Match: the result of one match of an expression; derives from Group;
MatchCollection: a sequence of Match objects;
MatchEvaluator: the delegate called to compute the replacement during a Replace operation;
Regex: an instance of a compiled expression;
RegexCompilationInfo: provides information that the compiler uses to compile a regular expression into a standalone assembly;
RegexOptions: the enumeration of options for regular expressions.

The Regex class also provides some static methods:
Escape: escapes the metacharacters in a string so that they are treated as literals;
IsMatch: returns a Boolean value indicating whether the expression matches a string;
Match: returns a Match instance;
Matches: returns a collection of Match objects;
Replace: replaces the text that matches the expression with a replacement string;
Split: returns an array of strings, split at the positions matched by the expression;
Unescape: converts the escaped characters in a string back to their unescaped form.
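A short sketch touching several of these types and static methods. All input strings are examples chosen for illustration.

```csharp
// Top-level statements (C# 9 or later).
using System;
using System.Text.RegularExpressions;

// Split: cut the input wherever the pattern matches.
string[] parts = Regex.Split("a, b;c", @"\s*[,;]\s*");

// Escape and Unescape undo each other for this input.
string escaped = Regex.Escape("a.b");
string unescaped = Regex.Unescape(escaped);

// Match derives from Group, and Group derives from Capture.
bool matchIsGroup = typeof(Match).IsSubclassOf(typeof(Group));
bool groupIsCapture = typeof(Group).IsSubclassOf(typeof(Capture));

// A group that repeats keeps its last capture in Value
// and every capture in its CaptureCollection.
Match m = Regex.Match("ab12", @"(\w)+");
string last = m.Groups[1].Value;
int captureCount = m.Groups[1].Captures.Count;

// MatchEvaluator: a delegate (here a lambda) that computes each replacement.
string title = Regex.Replace("speed limit", @"\b\w", x => x.Value.ToUpperInvariant());

Console.WriteLine(string.Join("|", parts));
Console.WriteLine($"{escaped} {unescaped} {last} {captureCount} {title}");
```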
