Today I ran into a regular-expression matching problem that led me to the concept of capturing groups. The manual is rather brief on the subject. While searching on Baidu I came across some special uses of capturing groups in C# and Java, but found nothing relevant for PHP. I tried them and found that they also work in PHP, so I have summarized them here to share. I also hope that experts and careful readers will point out any mistakes in my understanding.
What is a capturing group
Capturing group syntax:
Character | Description | Example
(pattern) | Matches pattern and captures the result; the group number is assigned automatically. | (abc)+d matches abcd or abcabcd
(?<name>pattern) or (?'name'pattern) | Matches pattern and captures the result, using name as the group name. |
\num | A backreference to a capturing group, where num is a positive integer. | (\w)(\w)\2\1 matches abba
\k<name> or \k'name' | A backreference to a named capturing group, where name is the group name. | (?<group>\w)abc\k<group> matches xabcx
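The backreference rows in the table can be tried directly in PHP. The following is only a minimal sketch: the test strings and the group name ch are made up for illustration.

```php
<?php
// Numbered backreference: \2\1 must repeat groups 2 and 1 in reverse order.
$numbered = '/(\w)(\w)\2\1/';
var_dump(preg_match($numbered, 'abba')); // int(1)
var_dump(preg_match($numbered, 'abab')); // int(0)

// Named backreference: \k<ch> must repeat whatever the group "ch" captured.
$named = '/(?<ch>\w)abc\k<ch>/';
preg_match($named, 'xabcx', $m);
echo $m['ch'], "\n"; // x
var_dump(preg_match($named, 'xabcy')); // int(0)
```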
Let's first look at PHP's regular-expression matching function:
int preg_match ( string $pattern , string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0 ]]] )
The first two parameters are the commonly used ones: $pattern is the regular-expression pattern, and $subject is the string to match against.
array &$matches is an array; the & means it is passed by reference, so the match results are written into it.
int $flags: if the PREG_OFFSET_CAPTURE flag is passed, the string offset (relative to the subject string) is also returned for each match.
int $offset specifies the position (in bytes) in the subject string at which the search starts.
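To make the last two parameters more concrete, here is a small sketch. The subject string and patterns are chosen only for illustration; PREG_OFFSET_CAPTURE is the standard flag constant.

```php
<?php
$subject = 'a=4b=98c=56';

// PREG_OFFSET_CAPTURE: each entry becomes [matched text, byte offset in $subject].
preg_match('/b=(\d+)/', $subject, $m, PREG_OFFSET_CAPTURE);
print_r($m[0]); // 'b=98' at offset 3
print_r($m[1]); // '98' at offset 5

// $offset = 3: the search starts at byte 3 (the "b"), so "=4" is skipped.
preg_match('/=(\d+)/', $subject, $m2, 0, 3);
echo $m2[1], "\n"; // 98
```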
Now let's look at what $match contains:
$mode = '/a=(\d+)b=(\d+)c=(\d+)/';
$str = '**a=4b=98c=56**';
$res = preg_match($mode, $str, $match);
var_dump($match);
The results are as follows:
array (size=4)
  0 => string 'a=4b=98c=56' (length=11)
  1 => string '4' (length=1)
  2 => string '98' (length=2)
  3 => string '56' (length=2)
Now we know what a capturing group is: it is the part of a regular expression enclosed in (), and each pair of () is one capturing group.
PHP numbers the groups starting from 1. Numbering starts at 1 because PHP assigns index 0 to the complete matched string.
If there are several pairs of parentheses, or nested parentheses, the groups are numbered in the order in which their left parentheses appear, as shown in the figure:
When matching with the pattern in the figure, capturing groups 1, 2, and 3 are shown in red, green, and blue, respectively.
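The numbering rule for nested parentheses can also be checked in code. The date pattern below is a hypothetical example, not the pattern from the figure.

```php
<?php
// Left parentheses in order of appearance:
// group 1 = year-month, group 2 = year, group 3 = month, group 4 = day.
$pattern = '/((\d{4})-(\d{2}))-(\d{2})/';
preg_match($pattern, 'date: 2024-05-17', $m);
print_r($m);
// [0] => 2024-05-17, [1] => 2024-05, [2] => 2024, [3] => 05, [4] => 17
```

The outer group gets number 1 even though it closes after groups 2 and 3, because only the position of the left parenthesis counts.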
Ignoring and naming capturing groups
We can also stop PHP from numbering a group: add ?: before the pattern inside that group's parentheses.
$mode = '/a=(\d+)b=(?:\d+)c=(\d+)/';
In this way, the result of the match becomes:
array (size=3)
  0 => string 'a=4b=98c=56' (length=11)
  1 => string '4' (length=1)
  2 => string '56' (length=2)
Of course, we can also give a group a unique name inside its parentheses.
Named subgroups accept the (?<name>), (?'name'), and (?P<name>) syntaxes. Earlier versions accept only the (?P<name>) syntax.
For example: $mode = '/a=(\d+)b=(?P<sec>\d+)c=(\d+)/';
The match result is then:
array (size=5)
  0 => string 'a=4b=98c=56' (length=11)
  1 => string '4' (length=1)
  'sec' => string '98' (length=2)
  2 => string '98' (length=2)
  3 => string '56' (length=2)
The indexed entries are preserved, and an associative entry is added whose key is the capturing group name.
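As a small sketch of how a named group and a non-capturing group behave together, the pattern below is a variation on the examples above (here the third group is the non-capturing one, which is my own choice for illustration):

```php
<?php
$mode = '/a=(\d+)b=(?<sec>\d+)c=(?:\d+)/';
preg_match($mode, '**a=4b=98c=56**', $match);

echo $match['sec'], "\n"; // 98, accessed by name
echo $match[2], "\n";     // 98, the same group accessed by number
echo count($match), "\n"; // 4 entries: 0, 1, 'sec', 2 (the (?:...) group adds none)
```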
Backreferences to capturing groups
When doing a regular-expression replacement with the preg_replace() function, we can also use \n or $n to refer to the nth capturing group.
$mode = '/a=(\d+)b=(\d+)c=(\d+)/';
$str = '**a=4b=98c=56**';
$rp = '\1/$2/\3/';
echo preg_replace($mode, $rp, $str); // **4/98/56/**
Here \1 refers to capturing group 1 (4), $2 to capturing group 2 (98), and \3 to capturing group 3 (56).
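One detail worth knowing when using $n: if the reference is immediately followed by a literal digit, PHP also accepts the ${n} form to mark where the group number ends. A minimal sketch, with a made-up pattern and replacement:

```php
<?php
$mode = '/a=(\d+)/';
$str = 'a=4';

// Goal: put a literal "0" right after group 1.
// '${1}0' is group 1 followed by "0"; a bare '$10' would instead be read as group 10.
$result = preg_replace($mode, 'a=${1}0', $str);
echo $result, "\n"; // a=40
```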
Using non-capturing groups:
Non-capturing group syntax:
Character | Description | Example
(?:pattern) | Matches pattern but does not capture the match result. | 'industr(?:y|ies)' matches 'industry' or 'industries'.
(?=pattern) | Zero-width positive lookahead; does not capture the match result. | 'Windows(?=95|98|NT|2000)' matches "Windows" in "Windows2000" but not "Windows" in "Windows3.1".
(?!pattern) | Zero-width negative lookahead; does not capture the match result. | 'Windows(?!95|98|NT|2000)' matches "Windows" in "Windows3.1" but not "Windows" in "Windows2000".
(?<=pattern) | Zero-width positive lookbehind; does not capture the match result. | '(?<=Office|Word|Excel)2000' matches "2000" in "Office2000" but not "2000" in "Windows2000".
(?<!pattern) | Zero-width negative lookbehind; does not capture the match result. | '(?<!Office|Word|Excel)2000' matches "2000" in "Windows2000" but not "2000" in "Office2000".
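The lookahead and lookbehind rows of the table can be verified with preg_match(). This is just a quick sketch using the table's own example patterns:

```php
<?php
// Lookahead: "Windows" only when followed by one of the listed versions.
$ahead = '/Windows(?=95|98|NT|2000)/';
var_dump(preg_match($ahead, 'Windows2000', $m)); // int(1)
echo $m[0], "\n"; // Windows (the version is checked but not part of the match)
var_dump(preg_match($ahead, 'Windows3.1'));      // int(0)

// Lookbehind: "2000" only when preceded by one of the listed products.
$behind = '/(?<=Office|Word|Excel)2000/';
var_dump(preg_match($behind, 'Office2000'));  // int(1)
var_dump(preg_match($behind, 'Windows2000')); // int(0)
```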
Why are they called non-capturing groups? Because they look like capturing groups, being enclosed in () in the pattern, but PHP does not group them when matching: they affect only what is matched, not what is output.
\d(?=xxx) matches "a digit that is followed by xxx".
Note the format: to express "followed by", the lookahead is placed after the pattern it qualifies.
For example:
$pattern = '/\d(?=abc)/';
$str = "ab36abc8eg";
$res = preg_match($pattern, $str, $match);
var_dump($match); // 6
This matches 6, because it is the only digit that is followed by abc.
(?<=xxx)\d matches "a digit that is preceded by xxx".
Note the format: to express "preceded by", the lookbehind is placed before the pattern it qualifies.
For example:
$pattern = '/(?<=abc)\d/';
$str = "ab36abc8eg";
$res = preg_match($pattern, $str, $match);
var_dump($match); // 8
This matches 8, because it is the only digit that is preceded by abc.
The opposites of (?=xxx) and (?<=xxx) are (?!xxx) and (?<!xxx): they use the negation operator "!" in place of "=".
They mean that what follows or precedes is not the string xxx. They work the same way, so I won't walk through them in detail here.
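For readers who want to try the negative forms anyway, here is a minimal sketch that reuses the sample string from the examples above. The use of preg_match_all() to collect every match is my own choice for illustration.

```php
<?php
$str = 'ab36abc8eg';

// Negative lookahead: digits NOT followed by "abc".
preg_match_all('/\d(?!abc)/', $str, $m);
$notFollowed = implode(',', $m[0]);
echo $notFollowed, "\n"; // 3,8

// Negative lookbehind: digits NOT preceded by "abc".
preg_match_all('/(?<!abc)\d/', $str, $m);
$notPreceded = implode(',', $m[0]);
echo $notPreceded, "\n"; // 3,6
```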
If you found this post helpful, feel free to recommend it or follow me. If you have any questions, leave a comment below for discussion. Thank you.