C# regular expression syntax rules

Last Update:2018-12-04 Source: Internet

Author: User



Regular expressions usually contain literal text and metacharacters.

Literal text is ordinary text such as "ABCDE", which matches any string that contains "ABCDE".

Metacharacters are more flexible: they describe a general pattern, and the expression matches every string that conforms to that pattern.

C# regular expression syntax 1. Match a single character

[] -- Match one character from a set

Supported types: word characters ([ae]), non-word characters ([!?,;@#$*]), letter ranges ([A-Z]), digit ranges ([0-9])

Eg.

Regular expression: [ae]ffect
Matches: affect, effect

(In this example, "[ae]" is a metacharacter and "ffect" is literal text.)

Note: 1. To match a hyphen in a character class, put the hyphen first in the class.

2. A single regular expression can contain multiple character classes.

Eg. [01][0-9]:[0-5][0-9][ap]m matches times written in a format such as 09:45pm.

^ -- Exclude certain characters (as the first character inside [] it negates the class; outside [] it matches the start of a string)

Eg.

Regular expression: m[^a]t
Matches: met, mit, m&t, ...
Does not match: mat
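
The character-class rules above can be checked with a short C# sketch. It assumes the static Regex.IsMatch method from System.Text.RegularExpressions; the ^ and $ anchors are added here only so that the whole input has to match, and the test strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).

// [ae] accepts exactly one character: 'a' or 'e'.
bool affect = Regex.IsMatch("affect", @"^[ae]ffect$");
bool effect = Regex.IsMatch("effect", @"^[ae]ffect$");
bool offect = Regex.IsMatch("offect", @"^[ae]ffect$");

// Several character classes in one pattern: a time such as 09:45pm.
string timePattern = @"^[01][0-9]:[0-5][0-9][ap]m$";
bool goodTime = Regex.IsMatch("09:45pm", timePattern);
bool badTime = Regex.IsMatch("09:75pm", timePattern);

// [^a] accepts any single character except 'a'.
bool met = Regex.IsMatch("met", @"^m[^a]t$");
bool mat = Regex.IsMatch("mat", @"^m[^a]t$");

// A hyphen placed first in the class is a literal hyphen, not a range.
bool hyphen = Regex.IsMatch("-", @"^[-az]$");
bool letterM = Regex.IsMatch("m", @"^[-az]$");

Console.WriteLine($"{affect} {effect} {offect}"); // True True False
Console.WriteLine($"{goodTime} {badTime}");       // True False
Console.WriteLine($"{met} {mat}");                // True False
Console.WriteLine($"{hyphen} {letterM}");         // True False
```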

C# regular expression syntax 2. Match special characters

Special characters that can be used:

\t -- match a tab

\r -- match a carriage return

\f -- match a form feed (page break)

\n -- match a line feed

Metacharacters that represent character classes:

. -- match any character except \n (or any character at all in single-line mode)

\w -- match any word character (a letter, a digit, or the underscore)

\W -- match any non-word character (any character other than a letter, digit, or underscore)

\s -- match any whitespace character (space, line break, tab, and so on)

\S -- match any non-whitespace character (anything other than a space, line break, tab, and so on)

\d -- match any digit (0 to 9)

\D -- match any non-digit character (any character other than 0 to 9)
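
As a rough illustration of these classes, the sketch below counts matches in a small made-up input string. It assumes Regex.Matches and RegexOptions.Singleline from System.Text.RegularExpressions; the sample text is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).
// 11 characters: i d space 4 2 tab o k _ 1 newline
string input = "id 42\tok_1\n";

// \d+ finds runs of digits: "42" and "1".
MatchCollection digits = Regex.Matches(input, @"\d+");

// \w+ finds runs of word characters; the underscore counts, so "ok_1" is one run.
MatchCollection words = Regex.Matches(input, @"\w+");

// \s finds each whitespace character: the space, the tab, and the line feed.
int whitespace = Regex.Matches(input, @"\s").Count;

// \t finds only the tab.
int tabs = Regex.Matches(input, @"\t").Count;

// . skips the line feed unless single-line mode is on.
int dotDefault = Regex.Matches(input, ".").Count;
int dotSingleline = Regex.Matches(input, ".", RegexOptions.Singleline).Count;

Console.WriteLine($"{digits.Count} {digits[0].Value} {digits[1].Value}"); // 2 42 1
Console.WriteLine($"{words.Count} {words[2].Value}");                     // 3 ok_1
Console.WriteLine($"{whitespace} {tabs}");                                // 3 1
Console.WriteLine($"{dotDefault} {dotSingleline}");                       // 10 11
```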

Character positions in the string:

^ -- match the start of a string (or the start of a line in multiline mode).

$ -- match the end of a string, the position before a final "\n" at the end of the string, or the end of a line in multiline mode.

\A -- match the start of a string (ignores multiline mode).

\Z -- match the end of the string or the position before a final "\n" at the end of the string (ignores multiline mode).

\z -- match the very end of the string.

\G -- match the position where the current search starts, that is, where the previous match ended.

\b -- match a word boundary.

\B -- match a position that is not a word boundary.

Note:

1. The period (.) is particularly useful. It stands for any single character.

Eg.

Regular expression: 01.17.84
Matches: 01/17/84, 01-17-84, 01.17.84
Does not match: 011784 (each period must consume exactly one character)

2. You can use \b to match a word boundary.

Eg.

Regular expression: \blet\b
Matches: let
Does not match: letter, Hamlet

3. \A and \z are very useful for making sure that a string contains the expression and nothing else.

Eg. To check that a text control contains the word "sophia" with no additional characters, line breaks, or spaces:

\Asophia\z

(\Asophia\Z would still accept one trailing line break.)

4. The period (.) has a special meaning. To match a literal period, put a backslash before it: \.
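
The notes above can be tried out with the following sketch. It assumes Regex.IsMatch and Regex.Matches from System.Text.RegularExpressions; the sample strings, including the one used for \G, are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).

// \b requires a word boundary on both sides of "let".
bool let = Regex.IsMatch("let it be", @"\blet\b");
bool letter = Regex.IsMatch("letter", @"\blet\b");
bool hamlet = Regex.IsMatch("Hamlet", @"\blet\b");

// \A ... \z accepts the word only when nothing else is present.
bool exact = Regex.IsMatch("sophia", @"\Asophia\z");
bool newlineLowerZ = Regex.IsMatch("sophia\n", @"\Asophia\z");
bool newlineUpperZ = Regex.IsMatch("sophia\n", @"\Asophia\Z");

// An unescaped period matches any one character; \. matches only a period.
bool slashDate = Regex.IsMatch("01/17/84", @"^01.17.84$");
bool noSeparator = Regex.IsMatch("011784", @"^01.17.84$");
bool escapedOnSlash = Regex.IsMatch("01/17/84", @"^01\.17\.84$");
bool escapedOnDot = Regex.IsMatch("01.17.84", @"^01\.17\.84$");

// \G forces each match to start where the previous one ended,
// so matching stops at the 'x' and "44" is never reached.
int contiguousPairs = Regex.Matches("112233x44", @"\G\d\d").Count;

Console.WriteLine($"{let} {letter} {hamlet}");                 // True False False
Console.WriteLine($"{exact} {newlineLowerZ} {newlineUpperZ}"); // True False True
Console.WriteLine($"{slashDate} {noSeparator} {escapedOnSlash} {escapedOnDot}"); // True False False True
Console.WriteLine(contiguousPairs);                            // 3
```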

C# regular expression syntax 3. Match one of several character sequences

| -- Match either alternative

Eg.

Regular expression: col(o|ou)r
Matches: color, colour

Note: \b(bill|ted) and \bbill|ted are different.

The latter also matches "malted", because the \b metacharacter applies only to "bill".
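
The difference between the grouped and ungrouped alternation can be seen in a short sketch, assuming Regex.IsMatch from System.Text.RegularExpressions:

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).

// (o|ou) accepts either spelling.
bool color = Regex.IsMatch("color", @"^col(o|ou)r$");
bool colour = Regex.IsMatch("colour", @"^col(o|ou)r$");

// With parentheses, \b applies to both alternatives,
// so "ted" inside "malted" is rejected.
bool grouped = Regex.IsMatch("malted", @"\b(bill|ted)");

// Without parentheses the pattern means (\bbill) or (ted),
// so "ted" matches anywhere, including inside "malted".
bool ungrouped = Regex.IsMatch("malted", @"\bbill|ted");

Console.WriteLine($"{color} {colour}");      // True True
Console.WriteLine($"{grouped} {ungrouped}"); // False True
```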

C# regular expression syntax 4. Match with quantifiers

* -- match zero or more times

+ -- match one or more times

? -- match zero times or one time

{n} -- match exactly n times

{n,} -- match at least n times

{n,m} -- match at least n times, but not more than m times

Eg.

Regular expression: brothers?
Matches: brother, brothers

Eg.

Regular expression: \bp\d{3,5}\b
Matches: a word that starts with p and is followed by 3 to 5 digits

Note: You can also combine quantifiers with () to apply the quantifier to an entire character sequence.

Eg.

Regular expression: (the )?school is beautiful.
Matches: school is beautiful., the school is beautiful.
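
A minimal sketch of the quantifiers above, assuming Regex.IsMatch from System.Text.RegularExpressions. The ^ and $ anchors and the test strings are additions for illustration, and the final period of the last pattern is escaped here so that it matches only a literal period.

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).

// s? makes the final "s" optional.
bool brother = Regex.IsMatch("brother", @"^brothers?$");
bool brothers = Regex.IsMatch("brothers", @"^brothers?$");
bool brotherss = Regex.IsMatch("brotherss", @"^brothers?$");

// \d{3,5} needs between three and five digits after the p.
string code = @"\bp\d{3,5}\b";
bool twoDigits = Regex.IsMatch("p12", code);
bool threeDigits = Regex.IsMatch("p123", code);
bool sixDigits = Regex.IsMatch("p123456", code);

// {n} and {n,} on a single character.
bool exactlyThree = Regex.IsMatch("aaa", @"^a{3}$");
bool atLeastThree = Regex.IsMatch("aa", @"^a{3,}$");

// (the )? applies the quantifier to the whole group.
string sentence = @"^(the )?school is beautiful\.$";
bool withoutThe = Regex.IsMatch("school is beautiful.", sentence);
bool withThe = Regex.IsMatch("the school is beautiful.", sentence);

Console.WriteLine($"{brother} {brothers} {brotherss}");     // True True False
Console.WriteLine($"{twoDigits} {threeDigits} {sixDigits}"); // False True False
Console.WriteLine($"{exactlyThree} {atLeastThree}");        // True False
Console.WriteLine($"{withoutThe} {withThe}");               // True True
```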

C# regular expression syntax 5. Regular expressions and greediness

Quantifiers are greedy by default: they match as many characters as possible.

For example, the quantifier * matches zero or more characters. Suppose you want to match every HTML tag in a string. You might try the following regular expression:

<.*>

Given the string: a <i>quantifier</i> can be <big>greedy</big>

the expression <.*> produces a single match: <i>quantifier</i> can be <big>greedy</big>

To solve this problem, follow the quantifier with the special non-greedy character "?". The expression then becomes:

<.*?>

This matches <i>, </i>, <big>, and </big> separately, as intended.

? forces a quantifier to match as few characters as possible. It can follow any of these quantifiers:

*? -- non-greedy version of *

+? -- non-greedy version of +

?? -- non-greedy version of ?

{n}? -- non-greedy version of {n}

{n,}? -- non-greedy version of {n,}

{n,m}? -- non-greedy version of {n,m}
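
The greedy and non-greedy behaviour described in this section can be reproduced with a small sketch. It assumes Regex.Match and Regex.Matches from System.Text.RegularExpressions; the digit string at the end is an extra illustration that is not in the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).
string html = "a <i>quantifier</i> can be <big>greedy</big>";

// Greedy: .* runs to the last '>' in the string, giving one long match.
string greedy = Regex.Match(html, "<.*>").Value;

// Non-greedy: .*? stops at the first '>' it can, giving one match per tag.
var tags = new List<string>();
foreach (Match m in Regex.Matches(html, "<.*?>"))
{
    tags.Add(m.Value);
}
string lazy = string.Join(", ", tags);

// The same rule applies to counted quantifiers.
string greedyDigits = Regex.Match("12345", @"\d{2,4}").Value;
string lazyDigits = Regex.Match("12345", @"\d{2,4}?").Value;

Console.WriteLine(greedy);                         // <i>quantifier</i> can be <big>greedy</big>
Console.WriteLine(lazy);                           // <i>, </i>, <big>, </big>
Console.WriteLine($"{greedyDigits} {lazyDigits}"); // 1234 12
```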

6. Capture groups and backreferences

A capture group works like a variable in a regular expression. It captures the text matched by part of the pattern, and that text can then be referenced later in the expression, or in a replacement, by number or by name.

() -- capture the matched text

\number -- reference a group by number

Eg.

Regular expression: (\w)(\w)\2\1
Matches: abba

Note: 1. Backreferences are very effective for matching HTML tags. For example, <(\w+)></\1> matches paired tags such as <table></table>.

2. By default, parentheses capture the characters they enclose. You can disable this default behavior with the n option (described in section 7), or by adding ?: inside the opening parenthesis. Eg. with (?:sophia) or (?n:sophia), "sophia" is not captured.

(?<capture group name>) \k<capture group name> -- reference a group by name

Eg.

Regular expression: (?<sophia>\w)abc\k<sophia>
Matches: xabcx

Note: In a replacement pattern, capture groups are written slightly differently. Reference a group by number with $1, $2, and so on, and reference a group by name with ${sophia}.
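
The capture-group features above can be sketched as follows. The example assumes Regex.IsMatch, Regex.Match, and Regex.Replace from System.Text.RegularExpressions; the input strings and the bracketed replacement text are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).

// \2 and \1 repeat whatever the second and first groups captured.
bool abba = Regex.IsMatch("abba", @"^(\w)(\w)\2\1$");
bool abab = Regex.IsMatch("abab", @"^(\w)(\w)\2\1$");

// The closing tag must repeat the name captured from the opening tag.
Match tag = Regex.Match("<table></table>", @"<(\w+)></\1>");
string tagName = tag.Groups[1].Value;
bool mismatched = Regex.IsMatch("<table></div>", @"<(\w+)></\1>");

// (?: ) groups without capturing, so only group 0 (the whole match) exists.
int capturingGroups = Regex.Match("sophia", @"(sophia)").Groups.Count;
int nonCapturingGroups = Regex.Match("sophia", @"(?:sophia)").Groups.Count;

// A named group is referenced with \k<name> inside the pattern.
Match named = Regex.Match("xabcx", @"(?<sophia>\w)abc\k<sophia>");
string namedValue = named.Groups["sophia"].Value;

// In a replacement string, ${name} (or $1, $2, ...) inserts the captured text.
string replaced = Regex.Replace("xabcx", @"(?<sophia>\w)abc\k<sophia>", "[${sophia}]");

Console.WriteLine($"{abba} {abab}");                           // True False
Console.WriteLine($"{tagName} {mismatched}");                  // table False
Console.WriteLine($"{capturingGroups} {nonCapturingGroups}");  // 2 1
Console.WriteLine($"{namedValue} {replaced}");                 // x [x]
```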

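7. Set regular expression options

Eg.

string str = "<h4>sophia</h4>";

// Requires: using System.Text.RegularExpressions;
Regex objRegex = new Regex(@"<h(\d)>(.*?)</h\1>");

// $1 is the heading level and $2 is the heading text, so this writes <font size=4>sophia</font>
Response.Write(objRegex.Replace(str, "<font size=$1>$2</font>"));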

i -- perform case-insensitive matching (RegexOptions.IgnoreCase in .NET)

m -- use multiline mode (RegexOptions.Multiline in .NET)

n -- capture only groups that are explicitly named or numbered (RegexOptions.ExplicitCapture in .NET)

c -- compile the regular expression, which makes matching faster but startup slower (RegexOptions.Compiled in .NET)

s -- use single-line mode (RegexOptions.Singleline in .NET)

x -- ignore unescaped whitespace in the pattern and allow comments (RegexOptions.IgnorePatternWhitespace in .NET)

r -- search from right to left (RegexOptions.RightToLeft in .NET)

- -- a minus sign disables the options that follow it.

Eg. (?im-s:sophia) matches "sophia" case-insensitively in multiline mode, with single-line mode disabled.

In .NET, only i, m, n, s, and x can be written inline in a pattern like this. Compiled and RightToLeft have no inline form and must be passed as RegexOptions values.

Note: 1. m affects how the start metacharacter (^) and the end metacharacter ($) are interpreted. By default, ^ and $ match only the beginning and end of the entire string, even if the string contains multiple lines of text. If m is enabled, they match the beginning and end of each line of text.

2. s affects how the period (.) is interpreted. Normally a period matches any character except a line break. In single-line mode, a period also matches a line break.
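
The effect of these options can be observed with a short sketch. It assumes the RegexOptions enumeration and the static Regex methods from System.Text.RegularExpressions; all input strings are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Uses top-level statements (C# 9 or later).
string text = "first\nsecond";

// m: without Multiline, ^ matches only at the start of the whole string.
int defaultStarts = Regex.Matches(text, @"^\w+").Count;
int multilineStarts = Regex.Matches(text, @"^\w+", RegexOptions.Multiline).Count;

// s: without Singleline, the period does not match the line break.
bool dotDefault = Regex.IsMatch(text, "first.second");
bool dotSingleline = Regex.IsMatch(text, "first.second", RegexOptions.Singleline);

// i: the same option can be passed as a flag or written inline.
bool flagIgnoreCase = Regex.IsMatch("SOPHIA", "sophia", RegexOptions.IgnoreCase);
bool inlineIgnoreCase = Regex.IsMatch("SOPHIA", "(?i:sophia)");

// n: unnamed parentheses stop capturing, leaving group 0 and the named group "b".
int explicitGroups = Regex.Match("ab", @"(a)(?<b>b)", RegexOptions.ExplicitCapture).Groups.Count;

// x: whitespace in the pattern is ignored and # starts a comment.
bool verbose = Regex.IsMatch("p123", @"p \d{3}  # the letter p, then three digits",
                             RegexOptions.IgnorePatternWhitespace);

// r: the search starts at the end of the input, so the last candidate is found first.
string lastPair = Regex.Match("a1 b2 c3", @"\w\d", RegexOptions.RightToLeft).Value;

Console.WriteLine($"{defaultStarts} {multilineStarts}");     // 1 2
Console.WriteLine($"{dotDefault} {dotSingleline}");          // False True
Console.WriteLine($"{flagIgnoreCase} {inlineIgnoreCase}");   // True True
Console.WriteLine($"{explicitGroups} {verbose} {lastPair}"); // 2 True c3
```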

