C# Regular Expressions (2): Common Special Characters (Metacharacters and Quantifiers)

Source: Internet
Author: User
Quantifiers and metacharacters in regular expressions:

A metacharacter is a special character that stands for a specific character or a class of characters. A quantifier specifies how many times the preceding character may appear.

 

| Special character | Meaning | Example |
|---|---|---|
| \ | Escape character, similar to the escape character in a string. | $ is a special character, so to match a literal $ you must write \$. |
| ^ | Matches the start of the string. | ^a matches "arwen" but not "barwen". |
| $ | Matches the end of the string. | en$ matches "arwen" but not "arwenb". |
| * | Matches the preceding character zero or more times. | a*rwen means the letter a appears zero or more times. It matches "rwen" and "aaarwen". |
| + | Matches the preceding character one or more times. | a+rwen means the letter a appears one or more times, never zero. It matches "arwen" and "aarwen" but not "rwen". |
| ? | Matches the preceding character zero or one time. | a?rwen means the letter a appears zero times or once. It matches "arwen" and "rwen". As a whole-string match it does not match "aarwen", although an unanchored search still finds "arwen" inside it. |
| {n} | Matches the preceding character exactly n times, where n is a non-negative integer. | ar{2}wen matches "arrwen" but not "arwen" or "arrrwen". |
| {n,m} | Matches the preceding character at least n times and at most m times. {n,} means at least n times with no upper limit. | ar{1,2}wen matches "arwen" and "arrwen" but not "awen" or "arrrwen". |
| . | The dot matches any single character except a line break (\n). | arw.n matches "arwen" and "arwin" but not "arween" or "arwn". |
| \w | Matches a letter, digit, underscore, or Chinese character. | |
| \d | Matches a digit. | |
| \s | Matches any whitespace character. | Carriage return \r and line feed \n also count as whitespace. |
| \b | Matches the start or end of a word. | Do not confuse this with ^ and $, which mark the start and end of the whole string. A word is a run of characters delimited by, for example, spaces, whereas the string is the entire input. |
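
The anchor and quantifier rows can be checked with a short program. This is a minimal sketch, assuming C# 9 or later (top-level statements). FullMatch is a hypothetical helper, not part of the article or of .NET. It wraps a pattern in ^(?:...)$ so that the whole input has to match, which is the sense in which the table says a string "does not match".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true only when the pattern matches the entire input.
bool FullMatch(string input, string pattern) =>
    Regex.IsMatch(input, "^(?:" + pattern + ")$");

// Escape: \$ matches a literal dollar sign.
Console.WriteLine(Regex.IsMatch("cost: $5", @"\$"));   // True

// Anchors: ^ is the start of the string, $ is the end.
Console.WriteLine(Regex.IsMatch("arwen", "^a"));       // True
Console.WriteLine(Regex.IsMatch("barwen", "^a"));      // False
Console.WriteLine(Regex.IsMatch("arwen", "en$"));      // True
Console.WriteLine(Regex.IsMatch("arwenb", "en$"));     // False

// Quantifiers, checked against the whole string.
Console.WriteLine(FullMatch("aaarwen", "a*rwen"));     // True
Console.WriteLine(FullMatch("rwen", "a+rwen"));        // False
Console.WriteLine(FullMatch("aarwen", "a?rwen"));      // False
Console.WriteLine(FullMatch("arrwen", "ar{2}wen"));    // True
Console.WriteLine(FullMatch("arrrwen", "ar{1,2}wen")); // False
Console.WriteLine(FullMatch("arwin", "arw.n"));        // True
Console.WriteLine(FullMatch("arween", "arw.n"));       // False

// Without anchors, a?rwen still finds "arwen" inside "aarwen".
Console.WriteLine(Regex.IsMatch("aarwen", "a?rwen"));  // True
```

The last line shows why anchoring matters: Regex.IsMatch searches for the pattern anywhere in the input unless ^ and $ pin it down.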
     
Note:

Special characters are case-sensitive. All of the escape sequences above are lowercase.

\\ matches a literal backslash character.

Some of them have uppercase counterparts with the opposite meaning:

\W matches any character that is not a letter, digit, underscore, or Chinese character.

\S matches any character that is not whitespace.

\D matches any character that is not a digit.

\B matches any position that is not the start or end of a word.
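
The lowercase classes and their uppercase opposites can be compared side by side. The following is a small sketch, assuming C# 9 or later (top-level statements) and the default .NET regex options. The sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "room 101\r\n";

// \d matches digits; \D matches everything that is not a digit.
string digits = Regex.Match(text, @"\d+").Value;
string digitsOnly = Regex.Replace(text, @"\D", "");
Console.WriteLine(digits);      // 101
Console.WriteLine(digitsOnly);  // 101

// \s counts the space, the carriage return and the line feed.
int whitespaceCount = Regex.Matches(text, @"\s").Count;
Console.WriteLine(whitespaceCount);  // 3

// \S+ is the first run of non-whitespace characters.
string firstToken = Regex.Match(text, @"\S+").Value;
Console.WriteLine(firstToken);  // room

// \w covers letters, digits, underscores and Chinese characters.
string word = Regex.Match("中文_abc1 x", @"\w+").Value;
Console.WriteLine(word);        // 中文_abc1

// \b requires a word boundary; \B requires that there is none.
bool wholeWord = Regex.IsMatch("the cat sat", @"\bcat\b");
bool insideWord = Regex.IsMatch("concatenate", @"\bcat\b");
bool notAtBoundary = Regex.IsMatch("concatenate", @"\Bcat\B");
Console.WriteLine(wholeWord);      // True
Console.WriteLine(insideWord);     // False
Console.WriteLine(notAtBoundary);  // True
```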

 

Brackets and parentheses

Square brackets [] match any one of the characters listed inside them.

For example, [abcd]+ matches a run of one or more characters, each of which is a, b, c, or d.

 

Parentheses () combined with | mean "either of the alternatives". They work much like [], except that [] chooses among single characters while () chooses among whole strings.

(abc|def)+ means that either of the strings abc or def appears one or more times.
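
The difference between choosing a character and choosing a string shows up when both patterns run on the same input. This is an illustrative sketch, assuming C# 9 or later (top-level statements). The input strings are invented for the demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

// [abcd]+ takes the longest run made of the letters a, b, c and d.
string letters = Regex.Match("--cabbad--", "[abcd]+").Value;
Console.WriteLine(letters);  // cabbad

// (abc|def)+ takes repetitions of the whole strings abc or def.
string groups = Regex.Match("--abcdefabc--", "(abc|def)+").Value;
Console.WriteLine(groups);   // abcdefabc

// "abd" consists only of allowed letters, but it is neither abc nor def.
bool asLetters = Regex.IsMatch("abd", "^[abcd]+$");
bool asStrings = Regex.IsMatch("abd", "^(abc|def)+$");
Console.WriteLine(asLetters);  // True
Console.WriteLine(asStrings);  // False
```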

 

Then there is ^, which normally marks the start of the string but takes on a negating meaning when it is the first character inside [].

So ^a matches an a at the start of the string, while [^a] matches any single character that is not a.

 

Also, inside [] a hyphen - denotes a range of values: [0-9] matches any digit from 0 to 9, and [a-z] matches any one lowercase letter.
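
Negation and ranges inside brackets can be tried out the same way. This is a minimal sketch, assuming C# 9 or later (top-level statements) and the default case-sensitive matching. The sample inputs are made up.

```csharp
using System;
using System.Text.RegularExpressions;

// Outside brackets, ^ anchors the match to the start of the string.
bool startsWithA = Regex.IsMatch("arwen", "^a");
Console.WriteLine(startsWithA);  // True

// Inside brackets, a leading ^ negates the set: first character that is not a.
string firstNonA = Regex.Match("aaab", "[^a]").Value;
Console.WriteLine(firstNonA);    // b

// Ranges: digits, lowercase letters, uppercase letters.
string number = Regex.Match("order 2024", "[0-9]+").Value;
string lower = Regex.Match("ABCdefGHI", "[a-z]+").Value;
string upper = Regex.Match("ABCdefGHI", "[A-Z]+").Value;
Console.WriteLine(number);  // 2024
Console.WriteLine(lower);   // def
Console.WriteLine(upper);   // ABC
```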

 

 

That is a lot of rules. How do we use them? Using them is actually easy. The hard part is combining them so that they express exactly the condition you want.

For example, extract the site address from a string that contains a URL.

using System.Text.RegularExpressions;

string str = "ahttpp://www.baidu.com/s?tn=sitehao12"; // source string to extract from

// The pattern expresses the condition using the rules above.
// A plain w is the letter w. Written as \w it would mean something different: a letter, digit, underscore, or Chinese character.
// w{3} matches three consecutive letters w, and \. matches a literal dot.
// Without the @ prefix the dot would have to be written as \\. so it is best to put @ in front of pattern strings and avoid escaping twice.
// .* matches any character except a line break zero or more times, \. is a literal dot, and com matches the characters com.
string pattern = @"w{3}\..*\.com";

string needStr = Regex.Match(str, pattern).Value; // the expected result: www.baidu.com
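
The remark about the @ prefix can be verified directly: a verbatim string and a regular string with doubled backslashes hold the same pattern text. The sketch below assumes C# 9 or later (top-level statements). The names verbatim, escaped and source are illustrative, and the sample input is a cleaned-up version of the one in the example above.

```csharp
using System;
using System.Text.RegularExpressions;

// Verbatim string: backslashes are taken literally.
string verbatim = @"w{3}\..*\.com";

// Regular string: each backslash has to be doubled.
string escaped = "w{3}\\..*\\.com";

// Both literals produce exactly the same pattern text.
Console.WriteLine(verbatim == escaped);  // True

string source = "ahttpp://www.baidu.com/s?tn=sitehao12";
string fromVerbatim = Regex.Match(source, verbatim).Value;
string fromEscaped = Regex.Match(source, escaped).Value;
Console.WriteLine(fromVerbatim);  // www.baidu.com
Console.WriteLine(fromEscaped);   // www.baidu.com
```

Note that .* is greedy, so it extends to the last ".com" in the input. That is harmless here because the sample contains only one.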
