Subgroups (subpatterns) are delimited by parentheses, and they can be nested. Marking part of a pattern as a subgroup does two things:
1. It localizes a set of alternatives. For example, the pattern cat(aract|erpillar|) matches "cat", "cataract", or "caterpillar". Without the parentheses, it would match "cataract", "erpillar", or the empty string.
2. It sets the subgroup up as a capturing subgroup (as defined above). When the entire pattern matches, the portion of the subject string that matched the subgroup is passed back to the caller via the ovector argument of pcre_exec(). Opening parentheses are counted from left to right, starting at 1, and that count is the index under which each capturing subgroup's match is returned.
For example, if the string "the red king" is matched against the pattern ((red|white) (king|queen)), the result is the array ("red king", "red king", "red", "king"). Element 0 is the text matched by the entire pattern. The next three elements are the matches of the three subgroups, with indexes 1, 2, and 3.
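The two roles of parentheses can be checked with a short sketch. It uses Python's standard re module as a stand-in for PCRE, on the assumption that plain grouping and capture numbering behave the same way in both; the variable names are illustrative only.

```python
import re

# Role 1: parentheses localize the alternatives.
grouped = re.compile(r"cat(aract|erpillar|)")
# Without parentheses the alternation splits the whole pattern.
ungrouped = re.compile(r"cataract|erpillar|")

words = ["cat", "cataract", "caterpillar"]
grouped_hits = [bool(grouped.fullmatch(w)) for w in words]
ungrouped_hits = [bool(ungrouped.fullmatch(w)) for w in words]

# Role 2: opening parentheses are numbered left to right, starting at 1.
m = re.search(r"((red|white) (king|queen))", "the red king")
captures = [m.group(0), *m.groups()]  # element 0 is the whole match

print(grouped_hits)
print(ungrouped_hits)
print(captures)
```

The list built in captures mirrors the array described above: the whole match first, then subgroups 1 to 3.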
The fact that plain parentheses perform both functions is not always helpful. Often a subgroup is needed for grouping but not for capturing. If the opening parenthesis is immediately followed by "?:", the subgroup does not capture anything and is not counted when numbering subsequent capturing subgroups. For example, if the string "the white queen" is matched against the pattern ((?:red|white) (king|queen)), the result is the array ("white queen", "white queen", "queen"): only the outer subgroup and (king|queen) are captured, with indexes 1 and 2. The maximum number of capturing subgroups is 99, and the maximum number of all subgroups (both capturing and non-capturing) is 200.
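A minimal sketch of the non-capturing form, again using Python's re module as an assumed stand-in for PCRE (the "?:" syntax is the same in both):

```python
import re

pattern = re.compile(r"((?:red|white) (king|queen))")
m = pattern.search("the white queen")

# (?:red|white) groups the alternatives but takes no capture number,
# so only two capturing subgroups remain.
result = [m.group(0), *m.groups()]
group_count = pattern.groups

print(result)
print(group_count)
```

Because the inner alternation is not counted, (king|queen) becomes subgroup 2 rather than 3.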
As a convenient shorthand, if option settings are needed at the start of a non-capturing subgroup, the option letters may appear between the "?" and the ":". For example:
(?i:saturday|sunday)
(?:(?i)saturday|sunday)
The two patterns above match exactly the same set of strings. Because alternative branches are tried from left to right, and options are not reset until the end of the subgroup is reached, an option set in one branch also affects the branches after it. Both patterns therefore match "SUNDAY" as well as "Saturday".
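The shorthand form can be tried with the sketch below. It assumes Python 3.6 or later, whose re module accepts the same (?i:...) spelling. Only the first pattern is shown, because Python treats an inline (?i) in the middle of a pattern differently from PCRE.

```python
import re

# The i option applies to every branch inside the subgroup.
pattern = re.compile(r"(?i:saturday|sunday)")

subjects = ["SUNDAY", "Saturday", "Monday"]
matches = {s: bool(pattern.fullmatch(s)) for s in subjects}

print(matches)
```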
Since PHP 4.3.3, a subgroup can be named using the syntax (?P<name>pattern). The subgroup then appears in the match results both under its name and under its numeric index. PHP 5.2.2 added two alternative naming syntaxes: (?<name>pattern) and (?'name'pattern).
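Named subgroups can be sketched as follows. The example uses Python's re module as a stand-in: it accepts the (?P<name>pattern) spelling but not the two alternative syntaxes. The subject string and the group names day and month are made up for illustration.

```python
import re

m = re.search(r"(?P<day>\d{1,2}) (?P<month>[A-Za-z]+)", "due 14 March")

by_name = m.groupdict()   # lookup by name
by_index = m.groups()     # lookup by position, starting at subgroup 1

# A named subgroup is still numbered like any other capturing subgroup.
same_value = m.group("day") == m.group(1)

print(by_name, by_index, same_value)
```

In PHP a single matches array holds both the named and the numeric keys. Python exposes them through separate accessors, but the numbering rule is the same.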
Sometimes a regular expression needs several alternative subgroups of which only one can match. Normally each of them gets its own back-reference number. To let such subgroups share one number, the (?| syntax allows duplicate numbers. Consider the following regular expression matched against the string Sunday:
(?:(Sat)ur|(Sun))day
Here Sun is stored in back reference 2, while back reference 1 is empty. If the subject were Saturday instead, Sat would be stored in back reference 1 and back reference 2 would not exist. Changing the pattern to use (?| fixes the problem:
(?|(Sat)ur|(Sun))day
With this pattern, both Sun and Sat are stored in back reference 1.
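The numbering problem that (?| addresses can be reproduced with the sketch below. Python's standard re module does not implement (?|, so only the first pattern is run. The day_prefix helper is a hypothetical workaround that picks whichever subgroup took part in the match, which is the value (?| would place in back reference 1. Python also reports an unmatched subgroup as None rather than omitting it.

```python
import re

# Each alternative has its own capture number: (Sat) is 1, (Sun) is 2.
pattern = re.compile(r"(?:(Sat)ur|(Sun))day")

sunday = pattern.fullmatch("Sunday").groups()
saturday = pattern.fullmatch("Saturday").groups()


def day_prefix(text):
    """Hypothetical helper: return the one subgroup that matched, or None."""
    m = pattern.fullmatch(text)
    if m is None:
        return None
    return next(g for g in m.groups() if g is not None)


print(sunday, saturday)
print(day_prefix("Sunday"), day_prefix("Saturday"))
```

With a PCRE engine the helper is unnecessary, since the (?| form makes both alternatives write to the same number.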