Regular Expression Basics


The following content comes from Microsoft's official website.

A regular expression describes one or more strings to match when searching a body of text. The expression serves as a template for matching a character pattern against the string being searched. A regular expression consists of ordinary characters (for example, the letters a through z) and special characters, known as metacharacters.

1. Special characters

The following table contains a list of single-character metacharacters and their behavior in regular expressions.

Note: To match one of these special characters literally, you must first escape it, that is, precede it with a backslash (\). For example, to search for a literal "+" character, use the expression "\+".
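The escaping rule in the note can be sketched as follows. This is a minimal illustration that assumes a modern JavaScript engine (for example Node.js) rather than classic JScript; the variable names and the escapeForRegex helper are hypothetical and are not part of the original reference.

```javascript
// Without the backslash, "+" is a quantifier; with it, "+" is a literal character.
const literalPlus = /\+/;   // matches one "+" character
const oneOrMoreA = /a+/;    // "+" here means "one or more a"

const foundPlus = literalPlus.test("1+1");     // true: the string contains "+"
const noPlus = literalPlus.test("11");         // false: there is no "+" to match
const runOfA = "aaa+".match(oneOrMoreA)[0];    // "aaa"

// Hypothetical helper: escape every single-character metacharacter listed in
// this section so that arbitrary text is matched literally.
function escapeForRegex(text) {
  return text.replace(/[*+?^$.[\]{}()|\/\\]/g, "\\$&");
}

const escaped = escapeForRegex("1+1=2?");                      // 1\+1=2\?
const matchesLiterally = new RegExp(escaped).test("1+1=2?");   // true
```

Escaping through a helper like this is one common way to build a pattern from user input; it is shown only as an assumption about how the note might be applied.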

Metacharacters

Action

Example

*

Matches the preceding character or subexpression zero or more times.

Equivalent to {0,}.

zo* matches "z" and "zoo".

+

Matches the preceding character or subexpression one or more times.

Equivalent to {1,}.

zo+ matches "zo" and "zoo", but does not match "z".

?

Matches the preceding character or subexpression zero or one time.

Equivalent to {0,1}.

When ? immediately follows any other quantifier (*, +, ?, {n}, {n,}, or {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible.

zo? matches "z" and "zo", but does not match "zoo".

o+? matches only a single "o" in "oooo", whereas o+ matches all of the "o" characters.

do(es)? matches "do" or "does".

^

Matches the position at the start of the searched string. If the m (multiline search) flag is set, ^ also matches the position following \n or \r.

When ^ is the first character in a bracket expression, it negates the character set.

^\d{3} matches three digits at the start of the searched string.

[^abc] matches any character except a, b, and c.

$

Matches the position at the end of the searched string. If the m (multiline search) flag is set, $ also matches the position before \n or \r.

\d{3}$ matches three digits at the end of the searched string.

.

Matches any single character except the newline character \n. To match any character including \n, use a pattern such as [\s\S].

a.c matches "abc", "a1c", and "a-c".

[]

Marks the start and end of a bracket expression.

[1-4] matches "1", "2", "3", or "4". [^aAeEiIoOuU] matches any non-vowel character.

{}

Marks the start and end of a quantifier expression.

a{2,3} matches "aa" and "aaa".

()

Marks the start and end of a subexpression. A subexpression can be saved for later use.

A(\d) matches "A0" through "A9" and saves the digit for later use.

|

Indicates a choice between two or more items.

z|food matches "z" or "food". (z|f)ood matches "zood" or "food".

/

Marks the start or end of a literal regular expression pattern in JScript. After the second "/", single-character flags can be added to specify search behavior.

/abc/gi is a JScript literal regular expression that matches "abc". The g (global) flag finds all occurrences of the pattern. The i (ignore case) flag makes the search case-insensitive.

\

Marks the next character as a special character, a literal, a backreference, or an octal escape.

\n matches a newline character. \( matches "(". \\ matches "\".
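Several rows of the table above can be checked directly. The following is a minimal sketch, assuming a modern JavaScript engine rather than classic JScript; the sample strings and variable names are illustrative only.

```javascript
// Greedy vs. non-greedy: "+" takes as much as it can, "+?" as little as it can.
const greedy = "oooo".match(/o+/)[0];   // "oooo"
const lazy = "oooo".match(/o+?/)[0];    // "o"

// Anchors: without the m flag, ^ matches only at the start of the whole string.
const text = "123 first\n456 second";
const startOnly = text.match(/^\d{3}/g);    // ["123"]
const everyLine = text.match(/^\d{3}/gm);   // ["123", "456"]

// Alternation inside a group: (z|f)ood matches "zood" or "food".
const alternation = /^(z|f)ood$/;
const altResults = ["zood", "food", "good"].map(s => alternation.test(s));
// [true, true, false]

// Parentheses save the part of the match they enclose.
const captured = /A(\d)/.exec("A7");   // captured[0] is "A7", captured[1] is "7"

// Flags: g finds every occurrence, i ignores case.
const allAbc = "abc ABC Abc".match(/abc/gi);   // ["abc", "ABC", "Abc"]
```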

Most special characters lose their special meaning when they appear inside a bracket expression and represent ordinary characters. For more information, see "Characters in Bracket Expressions" in the list of matching characters.

2. Metacharacters

The following table contains a list of multi-character metacharacters and their behavior in regular expressions.

Metacharacters

Action

Example

\b

Matches a word boundary, that is, the position between a word and a space.

er\b matches the "er" in "never", but does not match the "er" in "verb".

\B

Matches a non-word boundary.

er\B matches the "er" in "verb", but does not match the "er" in "never".

\d

Matches a digit character.

Equivalent to [0-9].

In the searched string "12 345", \d{2} matches "12" and "34". \d matches "1", "2", "3", "4", and "5".

\D

Matches a non-digit character.

Equivalent to [^0-9].

\D+ matches "abc" and " def" in "abc123 def".

\w

Matches any of the following characters: A-Z, a-z, 0-9, and underscore.

Equivalent to [A-Za-z0-9_].

In the searched string "The quick brown fox...", \w+ matches "The", "quick", "brown", and "fox".

\W

Matches any character except A-Z, a-z, 0-9, and underscore.

Equivalent to [^A-Za-z0-9_].

In the searched string "The quick brown fox...", \W+ matches "..." and all of the spaces.

[xyz]

A character set. Matches any one of the specified characters.

[abc] matches the "a" in "plain".

[^xyz]

A negative character set. Matches any character that is not specified.

[^abc] matches "p", "l", "i", and "n" in "plain".

[a-z]

A range of characters. Matches any character in the specified range.

[a-z] matches any lowercase letter in the range "a" through "z".

[^a-z]

A negative range of characters. Matches any character that is not in the specified range.

[^a-z] matches any character that is not in the range "a" through "z".

{n}

Matches exactly n times. n is a non-negative integer.

o{2} does not match the "o" in "Bob", but matches the two "o" characters in "food".

{n,}

Matches at least n times. n is a non-negative integer.

* is equivalent to {0,}.

+ is equivalent to {1,}.

o{2,} does not match the "o" in "Bob", but matches all of the "o" characters in "foooood".

{n,m}

Matches at least n times and at most m times. n and m are non-negative integers, where n <= m. No space is allowed between the comma and the numbers.

? is equivalent to {0,1}.

In the searched string "1234567", \d{1,3} matches "123", "456", and "7".

(pattern)

Matches pattern and saves the match. The saved match can be retrieved from the elements of the array returned by the exec method. To match the parenthesis characters ( ), use "\(" or "\)".

(Chapter|Section) [1-9] matches "Chapter 5" and saves "Chapter" for later use.

(?:pattern)

Matches pattern but does not save the match; that is, the match is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|).

industr(?:y|ies) is equivalent to industry|industries.

(?=pattern)

Positive lookahead. After a match is found, the search for the next match starts before the matched text. The match is not saved for later use.

^(?=.*\d).{4,8}$ applies the following restriction to a password: it must be 4 to 8 characters long and must contain at least one digit.

Within the pattern, .*\d finds any number of characters followed by a digit. For the searched string "abc3qr", this matches "abc3".

Starting before (rather than after) that match, .{4,8} matches a string of 4 to 8 characters. This matches "abc3qr".

^ and $ specify the positions at the start and end of the searched string. This prevents a match when the searched string contains any characters outside the matched characters.

(?!pattern)

Negative lookahead. Matches a search string that does not match pattern. After a match is found, the search for the next match starts before the matched text. The match is not saved for later use.

\b(?!th)\w+\b matches words that do not start with "th".

Within the pattern, \b matches a word boundary. For the searched string " quick ", this matches the first space. (?!th) matches a string that is not "th". This matches "qu".

Starting before that match, \w+ matches a word. This matches "quick".

\cx

Matches the control character indicated by x. The value of x must be in the range A-Z or a-z. If it is not, c is assumed to be a literal "c" character.

\cM matches Ctrl+M, that is, a carriage return character.

\xn

Matches n, where n is a hexadecimal escape value. A hexadecimal escape value must be exactly two digits long. This allows ASCII codes to be used in regular expressions.

\x41 matches "A". \x041 is equivalent to "\x04" followed by "1" (because n must be exactly two digits).

\num

Matches num, where num is a positive integer. This is a backreference to a saved match.

(.)\1 matches two consecutive identical characters.

\n

Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.

(\d)\1 matches two consecutive identical digits.

\nm

Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds, \nm matches the octal escape value nm when n and m are octal digits (0-7).

\11 matches a tab character.

\nml

Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).

\011 matches a tab character.

\un

Matches n, where n is a Unicode character expressed as four hexadecimal digits.

\u00A9 matches the copyright symbol (©).
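The lookahead, grouping, and backreference rows above can be exercised with a short sketch. It assumes a modern JavaScript engine rather than classic JScript, and the sample strings and variable names are illustrative only.

```javascript
// Positive lookahead: 4 to 8 characters, at least one digit.
const password = /^(?=.*\d).{4,8}$/;
const passwordResults = ["abc3qr", "abcdef", "ab3", "abcdefgh9"].map(s => password.test(s));
// [true, false, false, false]: no digit, too short, and too long are rejected

// Negative lookahead: words that do not start with "th".
const notTh = "the quick thin fox".match(/\b(?!th)\w+\b/g);   // ["quick", "fox"]

// Backreference: \1 repeats whatever group 1 captured.
const doubled = "bookkeeper".match(/(.)\1/g);   // ["oo", "kk", "ee"]

// Capturing group: exec returns the saved match as array element 1.
const chapter = /(Chapter|Section) [1-9]/.exec("See Chapter 5");
// chapter[0] is "Chapter 5", chapter[1] is "Chapter"

// Non-capturing group: the alternation is grouped but nothing is saved.
const industry = /^industr(?:y|ies)$/.exec("industries");
// industry.length is 1: only the whole match, no saved subexpression

// {n,m} is greedy: each match takes up to three digits.
const chunks = "1234567".match(/\d{1,3}/g);   // ["123", "456", "7"]

// Word boundary: er\b needs "er" at the end of a word.
const boundary = [/er\b/.test("never"), /er\b/.test("verb")];   // [true, false]
```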

3. Non-printable characters

The following table contains escape sequences that indicate non-printable characters.

Character

Match

Equivalent

\f

Form feed.

\x0c and \cL

\n

Newline (line feed).

\x0a and \cJ

\r

Carriage return.

\x0d and \cM

\s

Any white-space character. This includes space, tab, and form feed.

[ \f\n\r\t\v]

\S

Any non-white-space character.

[^ \f\n\r\t\v]

\t

Tab character.

\x09 and \cI

\v

Vertical tab.

\x0b and \cK
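The equivalences in this table can be verified with a small sketch. It assumes a modern JavaScript engine rather than classic JScript; the table layout in the code and the variable names are illustrative only.

```javascript
// Each row: the character itself, then its escape, hexadecimal, and control forms.
const rows = [
  ["\f", /\f/, /\x0c/, /\cL/],
  ["\n", /\n/, /\x0a/, /\cJ/],
  ["\r", /\r/, /\x0d/, /\cM/],
  ["\t", /\t/, /\x09/, /\cI/],
  ["\v", /\v/, /\x0b/, /\cK/],
];

// True when every form in every row matches that row's character.
const allEquivalent = rows.every(([ch, ...forms]) => forms.every(re => re.test(ch)));

// \s matches a space and each character above; \S matches none of them.
// (Modern engines also treat some additional Unicode white-space characters as \s.)
const whitespace = " \f\n\r\t\v";
const allWhitespace = /^\s+$/.test(whitespace);    // true
const anyNonWhitespace = /\S/.test(whitespace);    // false
```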

4. Priority order

A regular expression is evaluated much like an arithmetic expression; that is, it is evaluated from left to right and follows an order of precedence.

The following table lists the regular expression operators in order of precedence, from highest to lowest.

Operator

Description

\

Escape

(), (?:), (?=), []

Parentheses and brackets

*, +, ?, {n}, {n,}, {n,m}

Quantifiers

^, $, \any metacharacter

Anchors and sequences

|

Alternation

Characters have higher precedence than the alternation operator. This allows "m|food" to match "m" or "food".
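The effect of precedence can be seen by comparing patterns with and without grouping. This is a minimal sketch, assuming a modern JavaScript engine rather than classic JScript; the variable names are illustrative only.

```javascript
// Alternation has the lowest precedence, so m|food means "m" or "food".
const loose = /^(?:m|food)$/;
const looseResults = ["m", "food", "mood"].map(s => loose.test(s));      // [true, true, false]

// Parentheses have higher precedence and limit the alternation to one character.
const grouped = /^(m|f)ood$/;
const groupedResults = ["m", "food", "mood"].map(s => grouped.test(s));  // [false, true, true]

// Quantifiers bind more tightly than sequences: in ab+, the + applies only to "b".
const quantifierOnB = "abab".match(/ab+/)[0];       // "ab"
const quantifierOnAb = "abab".match(/(ab)+/)[0];    // "abab"
```

The ^(?:...)$ wrapper in the first pattern is only there to test whole strings; it does not change how m|food itself is parsed.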
