Precedence of regular expression elements (highest first):
1. Pattern units
2. Repetition: ? * + {}
3. Boundary anchors: ^ $ \b \B
4. Alternation: |
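The effect of this ordering can be seen in a short PHP sketch. The sample patterns and subject strings are made up for illustration; only preg_match() itself comes from PHP.

```php
<?php
// Alternation has the lowest precedence: /^ab|cd$/ reads as (^ab) OR (cd$).
$looseHit = preg_match('/^ab|cd$/', 'abXYZ');

// Grouping has the highest precedence, so the anchors now apply to both alternatives.
$groupedHit = preg_match('/^(ab|cd)$/', 'abXYZ');

// Repetition binds tighter than the sequence around it: /ab+/ repeats only "b".
$lastCharOnly = preg_match('/^ab+$/', 'abbb');
$noGroup      = preg_match('/^ab+$/', 'abab');

// With a group, the whole unit "ab" is repeated.
$wholeGroup = preg_match('/^(ab)+$/', 'abab');
```

Here only $groupedHit and $noGroup come back as 0; the other three calls find a match.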
Pattern modifiers:
Pattern modifiers are written outside the pattern, after the closing delimiter.
i: Letters in the pattern match both uppercase and lowercase letters.
m: The string is treated as multiple lines.
s: The string is treated as a single line, and a newline is matched like any ordinary character.
x: Whitespace in the pattern is ignored.
A: Matching is forced to start only at the beginning of the target string.
D: A dollar sign in the pattern matches only at the very end of the target string.
U: Quantifiers become ungreedy and match the shortest possible string.
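As a rough illustration of where modifiers go, the sketch below puts them after the closing delimiter and combines two of them. The subject string is an invented sample.

```php
<?php
$subject = "First line\nsecond LINE";

// No modifier: case-sensitive, and $ only matches at the end of the string.
$plain = preg_match('/line$/', $subject);

// i: the upper-case "LINE" at the end of the string now matches.
$caseless = preg_match('/line$/i', $subject);

// Modifiers can be combined: with i and m, both line endings match.
$count = preg_match_all('/line$/im', $subject, $found);
```

With both modifiers set, $found[0] holds "line" and "LINE".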
Pattern modifiers in PHP regular expressions
The following lists the modifiers that may be used in PCRE patterns. The internal PCRE name of each modifier is given in parentheses. Modifier letters are case-sensitive.
i (PCRE_CASELESS)
If this modifier is set, letters in the pattern match both uppercase and lowercase letters.
m (PCRE_MULTILINE)
By default, PCRE treats the target string as a single "line" of characters, even if it contains line breaks. The line-start metacharacter (^) matches only at the start of the string, and the line-end metacharacter ($) matches only at the end of the string, or before a final line break (unless the D modifier is set). This is the same as Perl.
When this modifier is set, "line start" and "line end" match not only at the start and end of the entire string, but also immediately after and immediately before any line break inside it. This is equivalent to the /m modifier of Perl. If the target string contains no "\n" characters, or the pattern contains no ^ or $, setting this modifier has no effect.
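A minimal sketch of the difference, using an invented three-line subject:

```php
<?php
$subject = "one\ntwo\nthree";

// Without m, ^ matches only at the very start of the string.
preg_match_all('/^\w+/', $subject, $single);

// With m, ^ also matches right after each line break.
preg_match_all('/^\w+/m', $subject, $multi);

// Without ^ or $ in the pattern, m changes nothing.
$sameWithoutAnchors = preg_match_all('/\w+/', $subject) === preg_match_all('/\w+/m', $subject);
```

Only the first word is found without m, while all three words are found with it.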
s (PCRE_DOTALL)
If this modifier is set, the dot metacharacter (.) in the pattern matches all characters, including line breaks. Without it, line breaks are excluded. This is equivalent to the /s modifier of Perl. A negated character class such as [^a] always matches a line break, regardless of whether this modifier is set.
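The three cases described above can be checked with a small sketch (the subject string is an invented sample):

```php
<?php
$subject = "a\nb";

// By default the dot does not match a line break.
$dotDefault = preg_match('/a.b/', $subject);

// With s, the dot matches the line break as well.
$dotAll = preg_match('/a.b/s', $subject);

// A negated class matches the line break with or without s.
$negatedClass = preg_match('/a[^x]b/', $subject);
```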
x (PCRE_EXTENDED)
If this modifier is set, whitespace characters in the pattern are ignored, except when they are escaped or inside a character class. Everything from an unescaped # outside a character class up to and including the next line break is also ignored. This is equivalent to the /x modifier of Perl and makes it possible to add comments to complicated patterns. Note, however, that this applies only to data characters. Whitespace may never appear inside a special character sequence in a pattern, for example inside the sequence (?( that introduces a conditional subpattern.
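A short sketch of a commented pattern, assuming a simple YYYY-MM-DD date format chosen purely for illustration:

```php
<?php
// Whitespace and "# ..." comments inside the pattern are ignored under x.
$datePattern = '/
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
/x';
preg_match($datePattern, '2024-01-31', $parts);

// An unescaped space is dropped, so this pattern is really "ab".
$spaceIgnored = preg_match('/a b/x', 'ab');
$spaceMissing = preg_match('/a b/x', 'a b');

// A space inside a character class is kept.
$spaceInClass = preg_match('/a[ ]b/x', 'a b');
```

Note that a comment inside the pattern must not contain the delimiter character, since that would end the pattern early.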
e
If this modifier is set, preg_replace() performs the normal substitution of back-references in the replacement string, evaluates the result as PHP code, and uses the result of that evaluation to replace the matched text.
This modifier is used only by preg_replace(); other PCRE functions ignore it.
Note: This modifier is unavailable in PHP 3, and it was removed in PHP 7.0.0 (deprecated since PHP 5.5.0).
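Since current PHP versions reject the e modifier, a callback is the usual substitute. The sketch below uses preg_replace_callback() rather than the e modifier itself; the input string and the doubling rule are invented for illustration.

```php
<?php
$input = 'price: 4 apples, 10 pears';

// The callback receives the match array and returns the replacement text,
// which is what the evaluated code did under the e modifier.
$doubled = preg_replace_callback(
    '/\d+/',
    function (array $m) {
        return (string) ((int) $m[0] * 2);
    },
    $input
);
```

Unlike evaluated replacement strings, the callback never executes text taken from the subject as code.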
A (PCRE_ANCHORED)
If this modifier is set, the pattern is forced to be "anchored", that is, it can match only at the beginning of the target string. The same effect can be achieved with suitable constructs in the pattern itself (which is the only way to do it in Perl).
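A minimal sketch with invented subject strings, including the in-pattern equivalent using ^:

```php
<?php
// Unanchored: the digits are found anywhere in the string.
$anywhere = preg_match('/\d+/', 'abc123');

// A: the match must begin at the start of the string.
$anchoredMiss = preg_match('/\d+/A', 'abc123');
$anchoredHit  = preg_match('/\d+/A', '123abc');

// The same effect written into the pattern itself.
$caretMiss = preg_match('/^\d+/', 'abc123');
```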
D (PCRE_DOLLAR_ENDONLY)
If this modifier is set, a dollar sign in the pattern matches only at the very end of the target string. Without this option, if the last character is a line break, the dollar sign also matches immediately before it (but not before any other line break). This option is ignored if the m modifier is set. Perl has no equivalent modifier.
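The trailing line break case can be sketched as follows (sample strings are invented):

```php
<?php
// Default: $ also matches just before a final line break.
$defaultDollar = preg_match('/abc$/', "abc\n");

// D: $ matches only at the very end, so the trailing "\n" prevents a match.
$endOnly = preg_match('/abc$/D', "abc\n");
$endOnlyClean = preg_match('/abc$/D', 'abc');

// With m set as well, D is ignored.
$withMultiline = preg_match('/abc$/Dm', "abc\n");
```

This matters when validating input, where a trailing line break should usually not be accepted silently.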
S
When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up matching. If this modifier is set, this extra analysis is performed. At present, the analysis is useful only for non-anchored patterns that do not have a single fixed starting character.
U (PCRE_UNGREEDY)
This modifier inverts the greediness of quantifiers: they are not greedy by default, but become greedy when followed by "?". This is not compatible with Perl. The option can also be enabled inside the pattern with (?U).
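A small sketch of the inversion, using an invented HTML-like subject:

```php
<?php
$subject = '<b>bold</b>';

// Default: .+ is greedy and runs to the last ">".
preg_match('/<.+>/', $subject, $greedy);

// U: the same quantifier now stops at the first ">".
preg_match('/<.+>/U', $subject, $lazy);

// Under U, a trailing "?" makes the quantifier greedy again.
preg_match('/<.+?>/U', $subject, $greedyAgain);

// The option can also be switched on inside the pattern.
preg_match('/(?U)<.+>/', $subject, $inline);
```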
X (PCRE_EXTRA)
This modifier turns on additional PCRE functionality that is not compatible with Perl. Any backslash in the pattern that is followed by a letter with no special meaning causes an error, which reserves these combinations for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as the letter itself. No other features are currently controlled by this modifier.
u (PCRE_UTF8)
This modifier turns on additional PCRE functionality that is not compatible with Perl. The pattern string is treated as UTF-8. This modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on win32.
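The practical difference shows up with multi-byte characters. The sketch below uses the two-byte UTF-8 encoding of "é", written as byte escapes so that it does not depend on the encoding of the source file.

```php
<?php
$eAcute = "\xC3\xA9"; // "é" encoded as UTF-8 (two bytes)

// Without u, the dot matches a single byte, so one dot cannot cover the string.
$asBytes = preg_match('/^.$/', $eAcute);

// With u, the dot matches one whole UTF-8 character.
$asChars = preg_match('/^.$/u', $eAcute);
```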
About back-references
Back-references are tied to subpatterns.
/()/: parentheses delimit a subpattern. Matched subpatterns are numbered automatically, and \1 to \9 refer back to them; a multi-digit reference such as \99 can confuse the interpreter. However, you can name a subpattern with /(?<word>)/ (alternative form: /(?'word')/) and refer back to it with \k<word> (remember the k), which avoids depending on the numbers. (Strictly speaking, numbered references go up to 99, though that many hardly seems necessary.)
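A sketch of both styles, finding a doubled word. The group name "word" and the sample sentence are illustrative choices, and the named syntax is assumed to be (?<word>...) with \k<word>, as PCRE supports.

```php
<?php
$subject = 'this is is a test';

// Numbered back-reference: \1 must repeat whatever group 1 captured.
preg_match('/\b(\w+) \1\b/', $subject, $numbered);

// Named subpattern, referenced with \k<word>.
preg_match('/\b(?<word>\w+) \k<word>\b/', $subject, $named);

// In the replacement string, the captured group is referred to by number.
$fixed = preg_replace('/\b(?<word>\w+) \k<word>\b/', '$1', $subject);
```

A named group is still numbered, so $named contains the capture under both the key 'word' and the key 1.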
/(?:t1|t2|)/: the "?:" tells the interpreter not to assign an automatic number to this subpattern, so numbering skips it and continues with the next capturing subpattern.
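The effect on numbering can be seen by comparing a capturing and a non-capturing group; the titles and the name in the sample are invented.

```php
<?php
$subject = 'Ms. Lovelace';

// Capturing group: the title takes number 1, the name number 2.
preg_match('/(Mr|Ms)\. (\w+)/', $subject, $capturing);

// Non-capturing group: the title gets no number, so the name becomes number 1.
preg_match('/(?:Mr|Ms)\. (\w+)/', $subject, $nonCapturing);
```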
/\b\w+(?=ing\b)/: this lookahead matches the content that comes before the expression inside it. Applied to "I'm singing while you are dancing", the result is "sing" and "danc", because the trailing \b requires "ing" to end a word and the lookahead itself consumes nothing.
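The stated result can be reproduced with the sketch below. It assumes the pattern was meant to include \w+ before the lookahead, since without it there is nothing to capture in front of "ing".

```php
<?php
$subject = "I'm singing while you are dancing";

// \w+ backtracks until what follows is "ing" at the end of a word;
// the lookahead checks that text without including it in the match.
preg_match_all('/\b\w+(?=ing\b)/', $subject, $stems);
```

$stems[0] then holds "sing" and "danc".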
/(?<=ing)/: this lookbehind matches the content that comes after "ing".
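A minimal lookbehind sketch. The sentence and the extra space and \w+ in the pattern are illustrative additions, so that there is a word to return after "ing".

```php
<?php
$subject = 'singing loudly and dancing wildly';

// The lookbehind requires "ing " immediately before the match,
// but that text is not part of the result.
preg_match_all('/(?<=ing )\w+/', $subject, $following);
```

$following[0] then holds "loudly" and "wildly".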