An example follows.
Suppose your text contains nested parentheses that are correctly paired. The nesting can be arbitrarily deep, and you want to capture the whole outermost group.
Forgive the spoiler; the standard answer is this:
    <?php
    $string = "Some text (a (b (c) d) e) more text";
    if (preg_match("/\(([^()]+|(?R))*\)/", $string, $matches)) {
        echo "<pre>";
        print_r($matches);
        echo "</pre>";
    }
    ?>

Its output is:

    Array
    (
        [0] => (a (b (c) d) e)
        [1] =>  e
    )
As we can see, the text we need has been captured in $matches[0].
Principle
Now let's look at how it works.
The key point in the regular expression above is (?R). (?R) recursively stands for the entire regular expression in which it appears. Conceptually, at each level of recursion the regex engine (PCRE, which PHP uses) replaces (?R) with "\(([^()]+|(?R))*\)".
Thus, expanded by hand to three levels, the regular expression of the above example becomes:

    "/\(([^()]+|\(([^()]+|\(([^()]+)*\))*\))*\)/"
However, the above code only handles parentheses nested up to 3 levels deep. For nesting of unknown depth, you have to use this regular expression:

    "/\(([^()]+|(?R))*\)/"
It not only matches nesting of any depth but also simplifies the syntax of the regular expression. Powerful and concise.
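The difference between the hand-expanded pattern and the recursive one can be checked with a short sketch. The test string `$deep` and the variable names are made up for illustration; it has four levels of nesting, one more than the expanded pattern allows for.

```php
<?php
// Pattern expanded by hand to three levels of nesting.
$fixedDepth = '/\(([^()]+|\(([^()]+|\(([^()]+)*\))*\))*\)/';
// Recursive pattern: (?R) re-enters the whole expression.
$recursive = '/\(([^()]+|(?R))*\)/';

// Hypothetical test string with four levels of nesting.
$deep = 'x (1 (2 (3 (4) 3) 2) 1) y';

preg_match($fixedDepth, $deep, $m1);
preg_match($recursive, $deep, $m2);

echo $m1[0], "\n"; // (2 (3 (4) 3) 2)
echo $m2[0], "\n"; // (1 (2 (3 (4) 3) 2) 1)
```

The expanded pattern cannot match from the outermost parenthesis, so it settles for the largest group that is at most three levels deep. The recursive pattern captures the whole outer group.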
Now take a closer look at how "/\(([^()]+|(?R))*\)/" matches "(a (b (c) d) e)":
1. The section "(c)" is matched by "\(([^()]+)*\)". Note that (c) is itself a miniature of the whole recursive structure, small as it is, so the entire regular expression applies to it.
In other words, in the next step (c) can be matched by (?R).
2. The matching process for (b (c) d) is:
   1. "\(" matches "(";
   2. "[^()]+" matches "b " (the letter and the space after it);
   3. (?R) matches "(c)";
   4. "[^()]+" matches " d" (the space and the letter);
   5. "\)" matches ")".
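The steps above can be mimicked by a small hand-written function. This is only a sketch: `matchGroup` is a hypothetical helper, not part of PHP or PCRE, and it scans deterministically from a fixed start position rather than backtracking the way the real engine does. It prints which branch of the pattern consumes which piece of text.

```php
<?php
// Hypothetical helper mirroring \(([^()]+|(?R))*\) anchored at $pos.
// Returns the offset just past the matched group, or null if no
// balanced group starts at $pos.
function matchGroup(string $s, int $pos, int $depth = 0): ?int
{
    if (($s[$pos] ?? '') !== '(') {
        return null;                              // "\(" must match "("
    }
    $indent = str_repeat('  ', $depth);
    $len = strlen($s);
    $i = $pos + 1;
    while ($i < $len && $s[$i] !== ')') {
        if ($s[$i] === '(') {
            $end = matchGroup($s, $i, $depth + 1); // the (?R) branch
            if ($end === null) {
                return null;
            }
            echo $indent, '(?R) matched "', substr($s, $i, $end - $i), "\"\n";
            $i = $end;
        } else {
            $run = strcspn($s, '()', $i);          // the [^()]+ branch
            echo $indent, '[^()]+ matched "', substr($s, $i, $run), "\"\n";
            $i += $run;
        }
    }
    return $i < $len ? $i + 1 : null;             // "\)" must match ")"
}

$end = matchGroup('(b (c) d)', 0);
```

The lines printed with extra indentation come from the nested call, which plays the role of (?R) working on "(c)".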
Given the matching process above, it is not difficult to understand why the second element of the array, $matches[1], is " e" (with its leading space). The substring " e" is captured in the last iteration of the group. When a capturing group repeats, only its last captured result is saved to the array.
Rex Note: You can try this behaviour yourself: match the string abc123xyz890 with the regular expression ([a-z]+[0-9]+)+ and look at what is captured. Note that the result does not conflict with the leftmost-longest principle.
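The experiment suggested in the note can be run as follows (a minimal sketch; the variable name `$m` is arbitrary):

```php
<?php
// The group ([a-z]+[0-9]+) repeats twice over this subject:
// first on "abc123", then on "xyz890".
preg_match('/([a-z]+[0-9]+)+/', 'abc123xyz890', $m);

print_r($m);
// Array
// (
//     [0] => abc123xyz890
//     [1] => xyz890
// )
```

The whole match in $m[0] is still the longest one starting at the leftmost position, while $m[1] keeps only what the group captured in its final iteration.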
If we only need $matches[0], we can do this:

    <?php
    $string = "Some text (a (b (c) d) e) more text";
    if (preg_match("/\((?:[^()]+|(?R))*\)/", $string, $matches)) {
        echo "<pre>";
        print_r($matches);
        echo "</pre>";
    }
    ?>

It matches the same text, but the array now holds a single element:

    Array
    (
        [0] => (a (b (c) d) e)
    )
The change is that the capturing parentheses () have been replaced by the non-capturing group (?:).
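One situation where only the whole match matters is pulling every top-level group out of a longer text. The sketch below assumes a made-up input string `$text` and uses preg_match_all with the non-capturing pattern, so each result row holds just the matched groups.

```php
<?php
// Hypothetical input containing several separate parenthesised groups.
$text = 'f(a (b) c) + g(d) + h((e))';

// Non-capturing variant: nothing but the whole match is recorded.
preg_match_all('/\((?:[^()]+|(?R))*\)/', $text, $all);

print_r($all[0]);
// Array
// (
//     [0] => (a (b) c)
//     [1] => (d)
//     [2] => ((e))
// )
```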
It can be improved further:

    <?php
    $string = "Some text (a (b (c) d) e) more text";
    if (preg_match("/\((?:(?>[^()]+)|(?R))*\)/", $string, $matches)) {
        echo "<pre>";
        print_r($matches);
        echo "</pre>";
    }
    ?>
Here we use the so-called once-only subpattern, (?>...). (Rex Note: Yu Sheng's Chinese translation of "Mastering Regular Expressions", 3rd edition, calls this an "atomic group", literally a "solidified group". Refer to the book.) The PHP manual also recommends using this construct wherever possible, because it can speed up the regular expression.
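What "once-only" means is easiest to see on a pattern much simpler than the parenthesis matcher. The following sketch uses two toy patterns chosen only for illustration; the variable names are arbitrary.

```php
<?php
// Without an atomic group, a+ first takes "aaa", then gives one "a"
// back so that the following "ab" can match.
$plain = preg_match('/a+ab/', 'aaab');

// Inside (?>...), a+ keeps every "a" it has taken, so the "a" that
// follows the group never finds anything to match.
$atomic = preg_match('/(?>a+)ab/', 'aaab');

var_dump($plain);  // int(1)
var_dump($atomic); // int(0)
```

In the parenthesis pattern this loss of backtracking costs nothing: whatever follows a run matched by [^()]+ is a parenthesis or the end of the text, so handing characters back could never help. The atomic group simply stops the engine from trying.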