A regular expression, also called a rule expression, is usually used to search for and replace text that conforms to a certain pattern (rule).

Regular expressions have three notable characteristics: 1. They are highly flexible, logical, and functional. 2. They can achieve complex string manipulation quickly and in a very concise way. 3. They are difficult for beginners.

1. Metacharacters:

Common metacharacters
Symbol  Description
.       Matches any character except a line feed
\w      Matches a letter, digit, underscore, or Chinese character
\s      Matches any whitespace character
\d      Matches a digit
\b      Matches the start or end of a word
^       Matches the start of the string
$       Matches the end of the string

Example: \ba\w*\b matches a word that starts with the letter a. It reads as follows: first the beginning of a word (\b), then the letter a, then any number of letters or digits (\w*), and finally the end of the word (\b).
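
The following is a minimal sketch of that example using Python's built-in re module. The sample sentence and the variable names are made up for illustration; they are not part of the original example.

```python
import re

# Hypothetical sample sentence, chosen only for illustration.
text = "an apple a day keeps all ants away"

# \b is a word boundary, "a" is the literal letter,
# \w* is any run (possibly empty) of word characters.
words_starting_with_a = re.findall(r"\ba\w*\b", text)

print(words_starting_with_a)
# ['an', 'apple', 'a', 'all', 'ants', 'away']
```

Note that "day" is not matched: it contains an a, but not directly after a word boundary.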

\d+ matches one or more consecutive digits. Here, + is a metacharacter similar to *. The difference is that * matches any number of times (possibly 0 times), while + matches one or more times.
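
The difference between * and + is easiest to see on an empty string. The sketch below uses Python's re module; the sample text and variable names are assumptions made for illustration.

```python
import re

# Hypothetical sample text.
text = "room 42, floor 7"

# \d+ finds each run of one or more digits.
digit_runs = re.findall(r"\d+", text)
print(digit_runs)  # ['42', '7']

# \d* accepts zero digits, so it matches the empty string;
# \d+ requires at least one digit, so it does not.
star_matches_empty = re.fullmatch(r"\d*", "") is not None
plus_matches_empty = re.fullmatch(r"\d+", "") is not None
print(star_matches_empty, plus_matches_empty)  # True False
```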

\b\w{6}\b matches words of exactly 6 characters. {5,12} indicates that the number of repetitions must be at least 5 and at most 12; otherwise there is no match.
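
As a rough illustration in Python, the first pattern can be tried on a sample sentence, and the {5,12} bound can be checked against digit strings of different lengths. The sample data and names here are hypothetical.

```python
import re

# Hypothetical sample sentence.
text = "banana is yellow, kiwi is green"

# Words of exactly six word characters.
six_letter_words = re.findall(r"\b\w{6}\b", text)
print(six_letter_words)  # ['banana', 'yellow']

# {5,12}: at least 5 and at most 12 repetitions.
# Test strings of 4, 5, 12, and 13 digits against \d{5,12}.
length_ok = {
    n: re.fullmatch(r"\d{5,12}", "1" * n) is not None
    for n in (4, 5, 12, 13)
}
print(length_ok)  # {4: False, 5: True, 12: True, 13: False}
```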

2. Escape characters:

If you want to find the metacharacters themselves, for example . or *, you run into a problem: you cannot write them directly, because they will be interpreted with their special meanings. In this case, you must use \ to cancel the special meaning of these characters. So you should write \. and \*. Likewise, to find \ itself, you need to write \\.

For example, baidu\.com matches baidu.com, and C:\\Windows matches C:\Windows.
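
A small Python sketch of the same idea follows. Raw strings (r"...") are used so that Python itself does not reinterpret the backslashes; the test inputs such as "baiduXcom" are invented for illustration. re.escape is a helper from Python's standard library that adds the backslashes automatically.

```python
import re

# An escaped dot matches only a literal dot.
escaped_hits_dot = re.search(r"baidu\.com", "visit baidu.com") is not None
escaped_hits_x = re.search(r"baidu\.com", "baiduXcom") is not None

# An unescaped dot matches any character, so it also accepts "X".
unescaped_hits_x = re.search(r"baidu.com", "baiduXcom") is not None

print(escaped_hits_dot, escaped_hits_x, unescaped_hits_x)  # True False True

# \\ in the pattern matches one literal backslash in the text.
path_match = re.search(r"C:\\Windows", r"C:\Windows\System32")
print(path_match.group())  # C:\Windows

# re.escape builds a pattern that matches the given text literally.
literal = "1+1=2 (really?)"
escaped_pattern = re.escape(literal)
matches_itself = re.fullmatch(escaped_pattern, literal) is not None
```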

3. Quantifiers (repetition):

You have already seen the matching forms *, +, {2}, and {5,12}. The following are all the quantifiers in regular expressions (codes that specify a repetition count, such as * and {5,12}):

Common quantifiers
Code    Description
*       Repeats zero or more times
+       Repeats one or more times
?       Repeats zero or one time
{n}     Repeats exactly n times
{n,}    Repeats n or more times
{n,m}   Repeats n to m times

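The quantifiers listed above can be checked one by one in Python. This is only a sketch: the helper function full and the sample strings are hypothetical, introduced here to keep the checks short.

```python
import re

def full(pattern, s):
    """Hypothetical helper: True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# ? : zero or one "u"
optional_u = [full(r"colou?r", s) for s in ("color", "colour", "colouur")]
print(optional_u)      # [True, True, False]

# {n} : exactly three "a"
exactly_three = [full(r"a{3}", "a" * n) for n in (2, 3, 4)]
print(exactly_three)   # [False, True, False]

# {n,} : two or more "a"
at_least_two = [full(r"a{2,}", "a" * n) for n in (1, 2, 5)]
print(at_least_two)    # [False, True, True]

# {n,m} : between two and four "a"
two_to_four = [full(r"a{2,4}", "a" * n) for n in (1, 2, 4, 5)]
print(two_to_four)     # [False, True, True, False]
```
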
4. Character classes:

Searching for digits, letters or digits, and whitespace is simple, because metacharacters already exist for these character sets. But what should you do if you want to match a character set that has no predefined metacharacter (such as the vowels a, e, i, o, u)?

You just need to list them in square brackets. For example, [aeiou] matches any English vowel, and [.?!] matches a punctuation mark (. or ? or !).

We can also easily specify a character range. For example, [0-9] means exactly the same as \d: a digit. Similarly, [a-z0-9A-Z_] is fully equivalent to \w (if only English is considered).
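
Here is a brief Python sketch of these character classes. The sample strings are invented. One assumption worth stating: in Python 3, \d and \w also match non-English digits and letters by default, so the re.ASCII flag is used below to get the "English only" behaviour described above.

```python
import re

# [aeiou] matches any single English vowel.
vowels = re.findall(r"[aeiou]", "regular expression")
print(vowels)       # ['e', 'u', 'a', 'e', 'e', 'i', 'o']

# [.?!] matches one of the three punctuation marks; inside brackets
# the dot and question mark are literal and need no backslash.
punctuation = re.findall(r"[.?!]", "Wait... really? Yes!")
print(punctuation)  # ['.', '.', '.', '?', '!']

# [0-9] and \d find the same digits in English text.
sample = "a1b22"
same_digits = re.findall(r"[0-9]", sample) == re.findall(r"\d", sample)

# [a-z0-9A-Z_] and \w (restricted to ASCII) find the same words.
sample_words = "snake_case 42"
class_words = re.findall(r"[a-z0-9A-Z_]+", sample_words)
ascii_words = re.findall(r"\w+", sample_words, flags=re.ASCII)
print(class_words, ascii_words)  # ['snake_case', '42'] ['snake_case', '42']
```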

The following is a more complex expression: \(?0\d{2}[) -]?\d{8}.

This expression can match phone numbers in several formats, such as (010)88886666, 022-22334455, or 02912345678. Let's analyze it. First comes an escaped \(, which can appear 0 times or once (?); then a 0 followed by two digits (\d{2}); then one of ), -, or a space, which appears once or not at all (?); and finally eight digits (\d{8}).
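
The phone number expression can be tried out in Python as sketched below. The variable names and the extra test strings ("010)12345678" and "12345678") are assumptions added for illustration.

```python
import re

# Optional "(", then 0 and two digits, then an optional ")", "-" or space,
# then exactly eight digits.
phone = re.compile(r"\(?0\d{2}[) -]?\d{8}")

samples = ["(010)88886666", "022-22334455", "02912345678"]
results = [phone.fullmatch(s) is not None for s in samples]
print(results)  # [True, True, True]

# The pattern is permissive: the brackets are not required to be paired,
# so a string with only a closing bracket is accepted as well.
unpaired_ok = phone.fullmatch("010)12345678") is not None
print(unpaired_ok)  # True

# A number without the leading 0 and area code is rejected.
no_area_code = phone.fullmatch("12345678") is not None
print(no_area_code)  # False
```

Because each optional part is checked independently, this expression does not verify that an opening bracket has a matching closing bracket.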

