Metacharacters and quantifiers in regular expressions:
 
 
 
 
 
A metacharacter is a special character that stands for a particular character or a class of characters. A quantifier specifies how many times the preceding character may appear.
 
 
 
 
 
 
 
  
   
   | Special character | Meaning | Example |
   | --- | --- | --- |
   | \ | Escape character, similar to the escape character in a string. | $ is a special character, so matching a literal $ requires \$. |
   | ^ | Matches the start of the string. | ^a matches arwen but does not match barwen. |
   | $ | Matches the end of the string. | en$ matches arwen but does not match arwenb. |
   | * | Matches the preceding character zero or more times. | a*rwen allows the letter a to appear zero or more times. It matches rwen and aaarwen. |
   | + | Matches the preceding character one or more times. | a+rwen requires the letter a to appear at least once. It matches arwen and aarwen, but not rwen. |
   | ? | Matches the preceding character zero or one time. | a?rwen allows the letter a to appear zero times or once. It matches arwen and rwen, but not aarwen. |
   | {n} | Matches the preceding character exactly n times, where n is an integer. | ar{2}wen matches arrwen, but not arwen or arrrwen. |
   | {n,m} | Matches the preceding character at least n times and at most m times. {n,} means at least n times with no upper limit. | ar{1,2}wen matches arwen and arrwen, but not awen or arrrwen. |
   | . | Dot: matches any single character except a line break. To match a line break, use \n. | arw.n matches arwen and arwin, but not arween or arwn. |
   | \w | Matches a letter, digit, underscore, or Chinese character. |  |
   | \d | Matches a digit. |  |
   | \s | Matches any whitespace character. | Carriage return \r and line feed \n also count as whitespace. |
   | \b | Matches the start or end of a word. | Not to be confused with ^ and $, which match the start and end of the string. A word is a run of word characters (for example, text separated by spaces), whereas the string is the whole input. |

Note: these special characters are case-sensitive, and all of the ones above are lowercase. The quantifier examples describe matching the entire string; a search would still find a match inside a longer string such as aarwen.
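The examples in the table can be checked with a few lines of C#. This is only a minimal sketch: the class name QuantifierDemo and the helper FullMatch are hypothetical names chosen for illustration, and the helper wraps each pattern in ^(?:...)$ because the table's examples are read here as whole-string matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: true only when the ENTIRE input matches the pattern.
    // The pattern is wrapped in ^(?:...)$ so that partial matches do not count.
    public static bool FullMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        // Anchors are tested with a plain search, not a whole-string match.
        Console.WriteLine(Regex.IsMatch("arwen", "^a"));    // True
        Console.WriteLine(Regex.IsMatch("barwen", "^a"));   // False
        Console.WriteLine(Regex.IsMatch("arwenb", "en$"));  // False

        // Quantifiers, checked against the whole string.
        Console.WriteLine(FullMatch("rwen", "a*rwen"));        // True
        Console.WriteLine(FullMatch("rwen", "a+rwen"));        // False
        Console.WriteLine(FullMatch("aarwen", "a?rwen"));      // False
        Console.WriteLine(FullMatch("arrwen", "ar{2}wen"));    // True
        Console.WriteLine(FullMatch("arrrwen", "ar{1,2}wen")); // False
        Console.WriteLine(FullMatch("arwin", "arw.n"));        // True
        Console.WriteLine(FullMatch("arween", "arw.n"));       // False
    }
}
```

Note that {1,2} is written without a space; with a space inside the braces the engine would not treat it as a quantifier.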
 
  
 
 
 
 
 
 
 
 
\\ matches a literal backslash character.
 
The character classes above are all lowercase. The corresponding uppercase forms have the opposite meaning:
 
 
 
\W matches any character that is not a letter, digit, underscore, or Chinese character.
 
 
 
\S matches any character that is not whitespace.
 
 
 
\D matches any character that is not a digit.
 
 
 
\B matches any position that is not the start or end of a word.
 
 
 
 
 
Brackets and parentheses
 
Square brackets [] match any one of the characters inside them.
 
 
 
For example, [abcd]+ matches one or more characters, each of which is a, b, c, or d.
 
 
 
 
 
 
 
Parentheses () combined with | match either of the alternatives. This is similar to [], except that [] chooses among single characters, whereas () with | chooses among whole strings.
 
 
 
(abc|def)+ matches one or more repetitions, each of which is either the string abc or the string def.
 
 
 
 
 
 
 
Another special case is ^. On its own it marks the start of the string, but as the first character inside [] it takes the opposite meaning and negates the set.
 
 
 
So ^a matches an a at the start of the string, and [^a] matches any single character that is not a.
 
 
 
 
 
 
 
Also, inside [], a hyphen (-) denotes a range: [0-9] matches any digit from 0 to 9, and [a-z] matches any one lowercase letter.
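The bracket rules above can be tried out as follows. This is a sketch under assumptions: the class name BracketDemo, the helper FirstMatch, and all sample strings are invented for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Hypothetical helper: returns the text of the first match,
    // or an empty string when nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // [abcd]+ : a run of characters, each one of a, b, c, d.
        Console.WriteLine(FirstMatch("xxabcadz", "[abcd]+"));       // abcad

        // (abc|def)+ : a run of whole strings, each abc or def.
        Console.WriteLine(FirstMatch("xabcdefabcx", "(abc|def)+")); // abcdefabc

        // [^a] : any single character that is not a.
        Console.WriteLine(Regex.Replace("banana", "[^a]", "-"));    // -a-a-a

        // Ranges: [0-9] for digits, [a-z] for lowercase letters.
        Console.WriteLine(FirstMatch("order 2024 shipped", "[0-9]+")); // 2024
        Console.WriteLine(FirstMatch("ABC def", "[a-z]+"));            // def
    }
}
```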
 
 
 
 
 
 
 
 
 
 
 
That is a lot of rules. How do we use them? Using them is actually very easy. The hard part is combining the rules above to express exactly the condition you want.
 
 
 
For example, extract the URL from a string.
 
 
 
using System.Text.RegularExpressions;

string str = "ahttpp://www.baidu.com/s?tn=sitehao12"; // Source string to extract from

string pattern = @"w{3}\..*\.com"; // The condition, expressed with the rules above. Here w is the literal letter w. Written as \w it would mean something different: a letter, digit, underscore, or Chinese character.

// w{3} matches three consecutive w letters, and \. matches a literal dot. Without the @ prefix, \. would have to be written as \\. so it is best to put @ in front of pattern strings and not think about escaping twice.

// .* matches any character except a line break, zero or more times; \. matches a literal dot; com matches the characters com.

string needStr = Regex.Match(str, pattern).Value; // Yields the expected result: www.baidu.com
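For readers who want to run the extraction end to end, here is one possible complete program. It is a sketch rather than the only way to do it: the class name UrlExtractDemo, the constant names, and the Extract helper are hypothetical. It also shows that the verbatim (@) pattern and the doubly escaped pattern are the same string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlExtractDemo
{
    // Verbatim string: backslashes reach the regex engine unchanged.
    public const string VerbatimPattern = @"w{3}\..*\.com";

    // Regular string: every backslash has to be escaped once more.
    public const string EscapedPattern = "w{3}\\..*\\.com";

    // Hypothetical helper: returns the first match, or "" if there is none.
    public static string Extract(string input)
    {
        return Regex.Match(input, VerbatimPattern).Value;
    }

    public static void Main()
    {
        // Both spellings produce the same pattern text.
        Console.WriteLine(VerbatimPattern == EscapedPattern); // True

        // www, a dot, anything, then .com
        Console.WriteLine(Extract("ahttpp://www.baidu.com/s?tn=sitehao12")); // www.baidu.com
    }
}
```

Because .* is greedy, an input that contains a second .com later on would extend the match up to the last one, so this pattern suits simple inputs like the one above.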