Use metacharacters
Match digits
\d | [0-9]
\D | [^0-9]
Match letters and digits
\w | [0-9a-zA-Z_] (note: the underscore is included)
\W | [^0-9a-zA-Z_]
Match whitespace characters
\s | Any whitespace character [\f\n\r\t\v]
\S | Any non-whitespace character [^\f\n\r\t\v]
[\b] | Backspace character. This is a special case: \b means backspace only inside a character set.
To match a character by its hexadecimal value, use the \x prefix followed by the hex digits: \x0A matches \n.
To match a character by its octal value, use the \0 prefix followed by the octal digits: \011 matches a tab.
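The character classes above can be tried out with a short sketch. It assumes Python's standard re module (the notes do not name an engine), and the sample strings are made up for illustration.

```python
import re

# \w matches letters, digits, and the underscore; \W matches everything else.
word_chars = re.findall(r"\w", "a_1-b!")
non_word_chars = re.findall(r"\W", "a_1-b!")

# \s matches whitespace such as space, tab, and newline.
pieces = re.split(r"\s+", "one\ttwo\nthree four")

# \x0A is the hexadecimal escape for the newline character.
has_newline = re.search(r"\x0A", "line1\nline2") is not None

# Inside a character set, \b means backspace (\x08), not a word boundary.
has_backspace = re.search(r"[\b]", "abc\x08def") is not None

print(word_chars, non_word_chars, pieces, has_newline, has_backspace)
```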
Example:
- Match an email address: [\w\.]+@[\w\.]+\.\w+ works, but it is incomplete and not the best pattern.
- The first character of the address must be a word character (letter, digit, or underscore), not a dot, so a better pattern is:
\w+[\w.]*@[\w.]+\.\w+
Note: inside a character set [], metacharacters such as ., +, and * are treated as literal characters. Therefore [\w.] and [\w\.] are equivalent.
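A minimal sketch comparing the two email patterns, assuming Python's re module. The address is a made-up example with a leading dot to show the difference.

```python
import re

loose = re.compile(r"[\w.]+@[\w.]+\.\w+")
strict = re.compile(r"\w+[\w.]*@[\w.]+\.\w+")

text = ".ben@example.com"  # hypothetical input starting with a dot

# The loose pattern accepts the leading dot as part of the address.
loose_match = loose.search(text).group()

# The stricter pattern requires a word character first, so the dot is skipped.
strict_match = strict.search(text).group()

# Inside a character set, an escaped and an unescaped dot behave the same.
same_result = re.findall(r"[\w.]+", text) == re.findall(r"[\w\.]+", text)

print(loose_match, strict_match, same_result)
```

Neither pattern is a full email validator; both only approximate the shape of an address.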
Repetition matching
Match one or more: +
Match zero or more: *
Match zero or one: ?
Match a specific number of repetitions: {n} (exactly n), {n,m} (from n to m, inclusive), {n,} (at least n), {,m} (at most m; not every engine supports this form)
Example:
- Match a web address: https?://[\w./]+ or http[s]?://[\w./]+
- Match blank lines. A line break is \r\n on Windows and \n on Linux/Unix, so a blank line shows up as \r\n\r\n on Windows and \n\n on Linux/Unix. A single pattern that covers both is [\r]?\n[\r]?\n
- Check whether a date is in the correct format (this checks the format only; the date values themselves must be validated separately):
\d{1,2}[-\/]\d{1,2}[-\/]\d{2,4}
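The three quantifier examples can be checked with a short sketch. It assumes Python's re module, and the URLs, text, and dates are invented samples.

```python
import re

# s? makes the "s" optional, so both http and https addresses match.
url_pattern = re.compile(r"https?://[\w./]+")
urls = url_pattern.findall("See http://example.com/a and https://example.org/b.html now")

# \r? makes the carriage return optional, covering both line-ending styles.
blank_line = re.compile(r"\r?\n\r?\n")
windows_parts = blank_line.split("first\r\n\r\nsecond")
unix_parts = blank_line.split("first\n\nsecond")

# Interval quantifiers: 1-2 digits, separator, 1-2 digits, separator, 2-4 digits.
date_format = re.compile(r"\d{1,2}[-/]\d{1,2}[-/]\d{2,4}")
ok_short = date_format.fullmatch("8-17-96") is not None
ok_long = date_format.fullmatch("08/17/1996") is not None
iso_style = date_format.fullmatch("1996-08-17") is not None
nonsense = date_format.fullmatch("99/99/9999") is not None  # format only, not a real date

print(urls, windows_parts, unix_parts, ok_short, ok_long, iso_style, nonsense)
```

The last check shows why the values still need separate validation: 99/99/9999 has the right shape but is not a real date.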
Prevent over-matching: + and * are both greedy and match as much as possible. To get the lazy version of a greedy metacharacter, append ? to it.
+?, *?, and {n,}? are the lazy versions of the corresponding greedy metacharacters.
- Match the content of the <B></B> tags in HTML
Sample text: <B>AK</B> and <B>HI</B>
Pattern 1: <[Bb]>.*</[Bb]> over-matches: it returns the whole sample text as one match.
Pattern 2: <[Bb]>.*?</[Bb]> matches each tag pair separately.
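The difference between the greedy and lazy quantifier can be seen in a small sketch, assuming Python's re module and the sample text from the example above.

```python
import re

text = "<B>AK</B> and <B>HI</B>"

# Greedy: .* runs to the end of the text, then backs up to the LAST </B>.
greedy = re.findall(r"<[Bb]>.*</[Bb]>", text)

# Lazy: .*? stops at the FIRST </B> it can reach.
lazy = re.findall(r"<[Bb]>.*?</[Bb]>", text)

print(greedy)  # one match spanning both tag pairs
print(lazy)    # one match per tag pair
```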
Position matching
Metacharacter | Description
\b | Matches a word boundary. \b is zero-width: it matches a position, not a character. Sample text: His cap and cape from capsized. \bcap\b matches only the word cap; \bcap matches any word starting with cap; cap\b matches any word ending with cap.
\B | Matches a position that is not a word boundary. Sample text: color - coded pass-key. \B-\B matches a hyphen with no word boundary on either side, that is, the hyphen surrounded by spaces in "color - coded" but not the one in "pass-key".
^ | Start of the string
$ | End of the string
(?m) | Multiline mode. When enabled, ^ also matches the position right after each line break in addition to the start of the string, and $ also matches the position right before each line break in addition to the end of the string. (?m) must appear at the very beginning of the pattern.
Examples:
- Check whether a file is an XML file: ^\s*<\?xml.*\?>
- Check that nothing follows the closing </html> tag: </[Hh][Tt][Mm][Ll]>\s*$
- Match all comment lines in JavaScript: (?m)^\s*//.*$
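The position metacharacters can be sketched as follows, assuming Python's re module. The sample sentences and the JavaScript snippet are invented for illustration.

```python
import re

text = "his cap and cape from capsized recap"

whole_word = re.findall(r"\bcap\b", text)      # only the standalone word
starts_with = re.findall(r"\bcap\w*", text)    # words beginning with "cap"
ends_with = re.findall(r"\w*cap\b", text)      # words ending with "cap"

# \B-\B: a hyphen with no word boundary on either side (spaces around it).
hyphen_positions = [m.start() for m in re.finditer(r"\B-\B", "color - coded pass-key")]

# (?m) lets ^ and $ work per line, so each comment line is found.
js = "var a = 1;\n// first comment\n  // second comment\nvar b = 2;"
comments = re.findall(r"(?m)^\s*//.*$", js)
without_flag = re.findall(r"^\s*//.*$", js)    # ^ only matches at the string start

# Anchors for the XML declaration and the closing html tag.
is_xml = re.search(r"^\s*<\?xml.*\?>", '  <?xml version="1.0"?>\n<root/>') is not None
html_ends_cleanly = re.search(r"</[Hh][Tt][Mm][Ll]>\s*$", "<html></html>\n") is not None

print(whole_word, starts_with, ends_with, hyphen_positions, comments)
```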
Use a subexpression
Metacharacters and literal characters are the basic building blocks of regular expressions. A subexpression groups part of a pattern into a single unit, and subexpressions can be nested.
Example
- HTML uses &nbsp; (non-breaking space) for a space that does not allow a line break. To match two or more consecutive &nbsp; entities, the pattern &nbsp;{2,} is not correct, because {2,} applies only to the semicolon; the whole entity needs to be repeated. Parentheses () enclose it so that it is treated as a single element. Such a parenthesized group is a subexpression: (&nbsp;){2,}
- IP address format (the dots must be escaped, since an unescaped . matches any character):
Pattern 1: \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
Pattern 2: (\d{1,3}\.){3}\d{1,3}
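A short sketch of how grouping changes what a quantifier repeats, assuming Python's re module. The input strings are made-up samples.

```python
import re

# With parentheses, {2,} repeats the whole entity.
grouped = re.search(r"(&nbsp;){2,}", "a&nbsp;&nbsp;&nbsp;b").group()

# Without parentheses, {2,} repeats only the semicolon.
ungrouped_on_entities = re.search(r"&nbsp;{2,}", "a&nbsp;&nbsp;&nbsp;b")
ungrouped_on_semicolons = re.search(r"&nbsp;{2,}", "x&nbsp;;;").group()

# The grouped IP pattern checks the format only, not the numeric range.
ip_format = re.compile(r"(\d{1,3}\.){3}\d{1,3}")
looks_like_ip = ip_format.fullmatch("192.168.0.1") is not None
out_of_range_passes = ip_format.fullmatch("999.999.999.999") is not None

print(grouped, ungrouped_on_entities, ungrouped_on_semicolons)
print(looks_like_ip, out_of_range_passes)
```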
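- Extract the year from a user record
Record format:
ID: 042
SEX: M
DOB: 1996-08-17
Status: Active
Pattern: (19|20)\d{2}
The parentheses limit the alternation to 19 or 20; without them, 19|20\d{2} would mean "19" or "20 followed by two digits".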
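The effect of the parentheses around the alternation can be checked with a small sketch, assuming Python's re module and the record shown above.

```python
import re

record = "ID: 042\nSEX: M\nDOB: 1996-08-17\nStatus: Active"

# finditer + group() returns the full match; findall would return only
# the captured group ("19") because the pattern contains parentheses.
years = [m.group() for m in re.finditer(r"(19|20)\d{2}", record)]

# Without the group, | splits the whole pattern: "19" OR "20\d{2}".
without_group = re.findall(r"19|20\d{2}", record)

print(years)          # full four-digit year
print(without_group)  # only the "19" prefix is matched
```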
- Use nested subexpressions to match not just the IP address format but only valid IP addresses. When constructing a regular expression, you must be clear about what to match and what not to match:
Each part of a valid IP address is one of: any one- or two-digit number; any three-digit number starting with 1; any three-digit number starting with 2 whose second digit is 0 to 4; any three-digit number starting with 25 whose third digit is 0 to 5.
Pattern: (((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))\.){3}((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))
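The nested pattern can be sketched as a small validator, assuming Python's re module. The helper name is_valid_ip and the test addresses are hypothetical, and fullmatch is used here so that the whole string must be an address.

```python
import re

# One part of the address: 0-99, 100-199, 200-249, or 250-255.
octet = r"((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))"

# Three "octet." groups followed by a final octet.
ip_pattern = re.compile("(" + octet + r"\.){3}" + octet)


def is_valid_ip(candidate: str) -> bool:
    """Return True if the whole string is a dotted address with parts 0-255."""
    return ip_pattern.fullmatch(candidate) is not None


results = {ip: is_valid_ip(ip) for ip in
           ["192.168.0.1", "255.255.255.255", "256.1.1.1", "1.2.3", "12.34.56.789"]}

# Unanchored search stops as soon as the first alternative (\d{1,2}) succeeds,
# so only part of the last octet is consumed.
partial = ip_pattern.search("1.2.3.255").group()

print(results)
print(partial)
```

Because the alternation tries \d{1,2} first, an unanchored search can stop inside the last part of the address; anchoring the pattern (fullmatch here, or ^ and $) avoids that.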
Basic knowledge about regular expressions 01