1. \b: matches the start or end of a word. Words may be separated by a space, punctuation, or a line break, but \b does not match any of those characters; it matches only the position between a word character and a non-word character.
Example: \bhi\b finds every standalone word "hi" in the text and skips words such as "him" and "history".
1.1 ^: matches the start of the string (or the start of each line when multiline mode is enabled).
1.2 $: matches the end of the string (or the end of each line in multiline mode). Like \b, both ^ and $ match a position rather than a character.
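The position-matching behavior of \b, ^, and $ can be checked with a minimal sketch using Python's re module. The sample sentence and variable names are assumptions made for illustration only.

```python
import re

text = "hi him history, say hi"

# \b matches a position, so only the standalone words "hi" are found.
standalone = re.findall(r"\bhi\b", text)

# Without \b the pattern also matches inside "him" and "history".
unanchored = re.findall(r"hi", text)

# ^ and $ tie the match to the start and the end of the string.
starts_with_hi = re.search(r"^hi", text) is not None
ends_with_hi = re.search(r"hi$", text) is not None

print(standalone, len(unanchored), starts_with_hi, ends_with_hi)
```

Comparing the two findall results shows that \b consumes no characters: it only restricts where a match may begin and end.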
Repetition:
2. *: the preceding item may repeat any number of times, including zero. The combination .* therefore matches any number of characters other than line breaks.
Example: \bhi\b.*\bLucy\b matches the word hi, then any number of characters (none of them a line break), and finally the separate word Lucy.
2.1 +: like *, but the preceding item must occur at least once; * also accepts zero occurrences.
2.2 {n}: the preceding item repeats exactly n times.
2.3 {n,m}: the preceding item repeats at least n and at most m times, where n <= m.
2.4 ?: the preceding item occurs zero times or once.
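The quantifiers above can be compared side by side in a short Python sketch. The test strings (including "colour") are assumptions chosen only to show each quantifier accepting or rejecting an input.

```python
import re

# .* between two whole words: any run of characters that are not line breaks.
same_line = re.search(r"\bhi\b.*\bLucy\b", "hi, this is Lucy")
across_lines = re.search(r"\bhi\b.*\bLucy\b", "hi\nLucy")

# * accepts zero repetitions, + requires at least one.
star = re.fullmatch(r"ab*", "a")
plus = re.fullmatch(r"ab+", "a")

# {n} is an exact count, {n,m} is a range, ? means zero or one.
exact = re.fullmatch(r"a{3}", "aaa")
ranged = [bool(re.fullmatch(r"a{2,3}", s)) for s in ("a", "aa", "aaa", "aaaa")]
optional = [bool(re.fullmatch(r"colou?r", s)) for s in ("color", "colour")]

print(bool(same_line), bool(across_lines), bool(star), bool(plus))
print(bool(exact), ranged, optional)
```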
3. .: matches any single character except a line break.
4. \d: matches any digit (0, 1, 2, ..., 9).
Example: 0\d{2}-\d{7} matches a 0, followed by two digits, a hyphen "-", and then seven digits, for example 025-8224110.
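A minimal Python sketch of the phone-number example. It assumes the sample number 025-8224110, which has two digits after the leading 0, so the pattern is written with \d{2}; the surrounding sentence is made up for illustration.

```python
import re

# 0, two digits, a hyphen, then seven digits.
phone = re.compile(r"0\d{2}-\d{7}")

found = phone.search("Call 025-8224110 today")
matched = found.group() if found else None

# Without the leading 0 the pattern does not match.
missing_zero = phone.search("25-8224110")

print(matched, missing_zero)
```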
5. \s: matches any whitespace character, including spaces, tabs, line breaks, and Chinese full-width spaces.
6. \w: matches letters, digits, and underscores.
Example 1: \ba\w*\b matches a word boundary, then the letter "a", then any number of word characters (so no spaces or other whitespace), and finally another word boundary. In other words, it matches every word that starts with "a".
Example 2: \b\w{6}\b matches a word of exactly 6 characters.
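The \s and \w examples can be tried out with the sketch below. The sample sentences are assumptions; the full-width space is written as the escape \u3000 so that it is visible in the source.

```python
import re

text = "an apple and a banana ripen slowly"

# Words that start with "a": boundary, "a", word characters, boundary.
a_words = re.findall(r"\ba\w*\b", text)

# Words of exactly six word characters.
six_letter = re.findall(r"\b\w{6}\b", text)

# \s+ splits on any run of whitespace; for Python str patterns this
# includes the full-width space U+3000 as well as tabs.
parts = re.split(r"\s+", "one\u3000two\tthree")

print(a_words, six_letter, parts)
```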
7. []: matches any one of the characters listed inside the square brackets.
Example: [abc]\w{4}\b matches one of a, b, or c, followed by four word characters and then a word boundary.
Negation
8. The uppercase forms of the metacharacters, such as \W, \D, and \B, match the opposite of what their lowercase forms match.
Example: \D matches any character that is not a digit, for example each character in abced.
8.1 [^x]: matches any character other than x.
8.2 [^xyz]: matches any character other than x, y, or z.
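Character classes and their negated forms can be illustrated together in a brief Python sketch. All input strings here are assumptions picked to keep the results short.

```python
import re

# [abc] matches exactly one of the listed characters.
class_hits = re.findall(r"[abc]\w{4}\b", "apple berry cocoa delta")

# \D is the negation of \d: every character that is not a digit.
non_digits = re.findall(r"\D", "a1b2")

# \W is the negation of \w: everything that is not a word character.
non_word = re.findall(r"\W", "hi, you!")

# [^xyz] matches any single character other than x, y, or z.
not_xyz = re.findall(r"[^xyz]", "xaybzc")

print(class_hits, non_digits, non_word, not_xyz)
```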
9. Alternation
"|": the "|" symbol expresses a logical OR between patterns. Combined with parentheses "()", it applies the OR to just one part of a larger expression.
10. Grouping
"()": enclose a subexpression in parentheses so that repetition and alternation apply to it as a whole.
Example: \b(\w+\b\s+)\1+\b: \1 refers back to the text captured by the first pair of parentheses, so the pattern matches a word that is immediately repeated, such as the "go go " in "go go home".
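Alternation, grouping, and the \1 back-reference can be sketched in Python as follows. The sketch reads the example's pattern as \b(\w+\b\s+)\1+\b; the cat/dog pattern, the sample sentences, and the simpler variant at the end are assumptions added for illustration and do not come from the text.

```python
import re

# Alternation inside a group: "cat" or "dog", optionally followed by "s".
pets = re.compile(r"\b(cat|dog)s?\b")
pet_hits = [m.group() for m in pets.finditer("cats and dog, but not catalog")]

# Group 1 captures a word plus its trailing whitespace, and \1+ requires
# that same captured text to appear again at least once.
repeated = re.compile(r"\b(\w+\b\s+)\1+\b")
m = repeated.search("let us go go home")
repeated_text = m.group()
captured = m.group(1)

# Hypothetical variant: capture only the word itself, so a repeated
# word at the very end of the input is matched too.
variant = re.compile(r"\b(\w+)\s+\1\b")
variant_text = variant.search("go go").group()

print(pet_hits, repr(repeated_text), repr(captured), repr(variant_text))
```

Because the group in the first back-reference pattern includes the trailing whitespace, that pattern only matches when another word follows the repetition; the variant avoids that restriction.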
That is already a solid foundation. Next, we look at the more advanced features of regular expressions.
Assertions:
(?=exp): a lookahead assertion, placed after an expression. It checks that the text following the current position matches exp, but exp itself is not included in the match.
Example: \b\w*(?=ing\b) gets the part before "ing" of every word that ends in ing.
(?<=exp): a lookbehind assertion, placed at the start of an expression. It checks that the text before the current position matches exp, and again exp itself is not included in the match.
Example: (?<=\bre)\w*\b gets the part after "re" of every word that begins with re.
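Both assertions can be exercised with a short Python sketch. The sample sentences are assumptions. Note that Python's re module only accepts lookbehind expressions of fixed width, which \bre satisfies.

```python
import re

# Lookahead: the word must end in "ing", but "ing" is not part of the match.
stems = re.findall(r"\b\w*(?=ing\b)", "singing and dancing")

# Lookbehind: the word must begin with "re", but "re" is not part of the match.
tails = re.findall(r"(?<=\bre)\w*\b", "rewrite and reread, not core")

print(stems, tails)
```

In both cases the asserted text is checked but left out of the result, which is why the returned strings contain neither "ing" at the end nor "re" at the start.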
Comments:
(?#comment): text written in this form is a comment inside the regular expression and is ignored when matching.
Example: 2[0-4]\d(?#200-249)
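A small Python sketch of the inline comment syntax, assuming the example pattern is meant to accept the numbers 200 through 249. Python's re module supports the (?#...) form; the test values are chosen for illustration.

```python
import re

# The (?#...) part is ignored by the engine and only documents the pattern.
pattern = re.compile(r"2[0-4]\d(?#200-249)")

results = {s: bool(pattern.fullmatch(s)) for s in ("200", "249", "250", "199")}
print(results)
```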
Lazy matching
*: greedy; matches as many characters as possible.
*?: lazy; matches as few characters as possible.
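The difference between the greedy and lazy forms shows up when both are run against the same input. This is a minimal Python sketch; the input string aabab is an assumption chosen so that the two results differ.

```python
import re

text = "aabab"

# Greedy: .* takes as much as it can while still ending on a "b".
greedy = re.search(r"a.*b", text).group()

# Lazy: .*? takes as little as it can before the first possible "b".
lazy = re.search(r"a.*?b", text).group()

print(greedy, lazy)
```

Both searches start at the first "a"; only the amount consumed by the quantifier differs.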