Capturing and non-capturing groups in PHP regular expressions

Source: Internet
Author: User
This article introduces capturing and non-capturing groups in PHP regular expressions.


Today I ran into a regular expression matching problem, which led me to the concept of capturing groups. The manual covers them only briefly, and a Baidu search turned up articles on the special uses of capturing groups in C# and Java, but nothing relevant when the search keyword was PHP. I tried them myself and found that they work in PHP as well, so I am summarizing them here. I hope knowledgeable readers will point out any problems in my understanding.

What is a capture group?

Capture group Syntax:

Each entry below gives the syntax, a description, and an example.

(pattern)
    Description: Matches pattern and captures the result. The group number is assigned automatically.
    Example: (abc)+d matches abcd or abcabcd.

(?<name>pattern) or (?'name'pattern)
    Description: Matches pattern and captures the result in a group named name.

\num
    Description: Backreference to a capture group. num is a positive integer.
    Example: (\w)(\w)\2\1 matches abba.

\k<name> or \k'name'
    Description: Backreference to a named capture group. name is the name of the capture group.
    Example: (?<name>\w)abc\k<name> matches xabcx.
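As a rough illustration of the backreference entries above, the sketch below runs a numbered and a named backreference through preg_match(). The non-matching test strings and the group name ch are made up for this example.

```php
<?php
// Numbered backreferences: \2 and \1 must repeat what groups 2 and 1 captured.
$numbered = '/(\w)(\w)\2\1/';
var_dump(preg_match($numbered, 'abba'));   // int(1)
var_dump(preg_match($numbered, 'abab'));   // int(0)

// Named backreference: "ch" is a hypothetical group name chosen for this sketch.
$named = '/(?<ch>\w)abc\k<ch>/';
var_dump(preg_match($named, 'xabcx', $m)); // int(1)
var_dump($m['ch']);                        // string(1) "x"
var_dump(preg_match($named, 'xabcy'));     // int(0)
```

The backreference matches the text the group actually captured, not the group's pattern, which is why abab and xabcy fail.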

Let's take a look at PHP's regular expression matching function.

int preg_match ( string $pattern , string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0 ]]] )

The first two parameters are the commonly used ones. $pattern is the regular expression pattern, and $subject is the string to be searched.

array &$matches is an array. The & means it is passed by reference, so the match results are written into it.

int $flags: if PREG_OFFSET_CAPTURE is passed, every returned match is accompanied by its string offset (relative to the subject string).

int $offset specifies the position in the subject string (in bytes) at which the search starts.
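A minimal sketch of the last two parameters, assuming the flag meant above is PREG_OFFSET_CAPTURE, which is the flag the PHP manual documents for this behaviour. The subject string is shortened from the article's examples, and the variable names are illustrative.

```php
<?php
$subject = 'a=4b=98c=56';

// With PREG_OFFSET_CAPTURE each entry becomes [matched text, byte offset].
preg_match('/b=(\d+)/', $subject, $withOffsets, PREG_OFFSET_CAPTURE);
var_dump($withOffsets[0]); // holds 'b=98' and offset 3
var_dump($withOffsets[1]); // holds '98' and offset 5

// The fifth argument starts the search at byte 3, so the leading "4" is skipped.
preg_match('/\d+/', $subject, $fromOffset, 0, 3);
var_dump($fromOffset[0]); // '98'
```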

Let's take a look at the values in $match:

$mode = '/a=(\d+)b=(\d+)c=(\d+)/';
$str = '**a=4b=98c=56**';
$res = preg_match($mode, $str, $match);
var_dump($match);

The result is as follows:

array (size=4)
  0 => string 'a=4b=98c=56' (length=11)
  1 => string '4' (length=1)
  2 => string '98' (length=2)
  3 => string '56' (length=2)

Now we know what a capturing group is: it is a part of the regular expression enclosed in (). Each pair of parentheses is one capturing group.

PHP numbers the groups starting from 1, because index 0 is reserved for the complete matched string.

With multiple or nested parentheses, groups are numbered in the order in which their opening parentheses appear.

In the pattern shown in the figure, capture groups 1, 2, and 3 are marked in red, green, and blue.
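The numbering rule for nested parentheses can be sketched as follows. The date pattern and subject are hypothetical and chosen only to make the nesting visible.

```php
<?php
// Opening parentheses, left to right:
// 1 = year-month, 2 = year, 3 = month, 4 = day.
$pattern = '/((\d{4})-(\d{2}))-(\d{2})/';
preg_match($pattern, '2024-06-15', $match);
print_r($match);
// [0] => 2024-06-15, [1] => 2024-06, [2] => 2024, [3] => 06, [4] => 15
```

Group 1 is the outer pair because its opening parenthesis comes first, even though it closes after groups 2 and 3.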

Ignore and name a capture group

We can also stop PHP from numbering a group by placing ?: at the start of that group's pattern:

$mode = '/a=(\d+)b=(?:\d+)c=(\d+)/';

In this way, the matching result will become:

array (size=3)
  0 => string 'a=4b=98c=56' (length=11)
  1 => string '4' (length=1)
  2 => string '56' (length=2)

Of course, we can also give a group its own name inside the parentheses.

Named subpatterns accept the (?<name>), (?'name'), and (?P<name>) syntaxes. Earlier versions accepted only the (?P<name>) syntax.

For example: $mode = '/a=(\d+)b=(?P<sec>\d+)c=(\d+)/';

The result is as follows:

array (size=5)
  0 => string 'a=4b=98c=56' (length=11)
  1 => string '4' (length=1)
  'sec' => string '98' (length=2)
  2 => string '98' (length=2)
  3 => string '56' (length=2)

An associative entry is added while the numeric index is kept. Its key is the name of the capture group.
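As a small sketch of how the named entry can be used, the code below reads the sec group by name and by number, and checks that two of the naming syntaxes give the same result. The variable names $angle and $python are illustrative.

```php
<?php
$str = '**a=4b=98c=56**';

// (?<sec>...) and (?P<sec>...) are two spellings of the same named group.
preg_match('/a=(\d+)b=(?<sec>\d+)c=(\d+)/', $str, $angle);
preg_match('/a=(\d+)b=(?P<sec>\d+)c=(\d+)/', $str, $python);

var_dump($angle === $python);       // bool(true)
var_dump($angle['sec'], $angle[2]); // both are '98'
var_dump(count($angle));            // int(5)
```

Reading $angle['sec'] instead of $angle[2] keeps the calling code working if more groups are later added in front of it.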

Backreferences to capture groups

When we use the preg_replace() function for regular expression replacement, we can also use \n or $n to refer to the nth capture group.

$mode = '/a=(\d+)b=(\d+)c=(\d+)/';
$str = '**a=4b=98c=56**';
$rp = '\1/$2/\3/';
echo preg_replace($mode, $rp, $str); // **4/98/56/**

\1 refers to capture group 1 (4), $2 to capture group 2 (98), and \3 to capture group 3 (56).
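One case the \n and $n forms do not handle well is a reference followed directly by a literal digit. The PHP manual documents the ${n} form for that situation. The sketch below reuses the article's subject string; the replacement strings are made up for illustration.

```php
<?php
$str = '**a=4b=98c=56**';

// '${1}0' means "group 1, then a literal 0"; '$10' would be read as group 10.
$appended = preg_replace('/a=(\d+)/', 'a=${1}0', $str);
echo $appended, "\n"; // **a=40b=98c=56**

// The three reference styles can be mixed in one replacement string.
$mixed = preg_replace('/a=(\d+)b=(\d+)c=(\d+)/', '\1/$2/${3}', $str);
echo $mixed, "\n"; // **4/98/56**
```

The replacement strings are single-quoted so that PHP does not try to interpolate $2 or ${1} as variables.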

Non-capturing group usage:

Non-capturing group Syntax:

Each entry below gives the syntax, a description, and an example.

(?:pattern)
    Description: Matches pattern but does not capture the result.
    Example: industr(?:y|ies) matches "industry" or "industries".

(?=pattern)
    Description: Zero-width positive lookahead. The result is not captured.
    Example: Windows(?=95|98|NT|2000) matches "Windows" in "Windows2000" but not "Windows" in "Windows3.1".

(?!pattern)
    Description: Zero-width negative lookahead. The result is not captured.
    Example: Windows(?!95|98|NT|2000) matches "Windows" in "Windows3.1" but not "Windows" in "Windows2000".

(?<=pattern)
    Description: Zero-width positive lookbehind. The result is not captured.
    Example: (?<=Office|Word|Excel)2000 matches "2000" in "Office2000" but not "2000" in "Windows2000".

(?<!pattern)
    Description: Zero-width negative lookbehind. The result is not captured.
    Example: (?<!Office|Word|Excel)2000 matches "2000" in "Windows2000" but not "2000" in "Office2000".

Why are they called non-capturing groups? They look like capture groups, since they are also enclosed in () in the pattern, but PHP does not number them during matching. They only affect what matches and are not output as part of the result.
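To make the point concrete, here is a minimal sketch using the industr and Windows patterns from the syntax list above. The subject string 'heavy industries' is made up for this example.

```php
<?php
// (?:...) groups the alternatives but adds no entry to the result array.
preg_match('/industr(?:y|ies)/', 'heavy industries', $plain);
var_dump($plain); // only index 0: 'industries'

// A lookahead is checked but is not part of the matched text.
preg_match('/Windows(?=95|98|NT|2000)/', 'Windows2000', $ahead);
var_dump($ahead); // only index 0: 'Windows'

// The negative lookahead accepts exactly the opposite subjects.
var_dump(preg_match('/Windows(?!95|98|NT|2000)/', 'Windows2000')); // int(0)
var_dump(preg_match('/Windows(?!95|98|NT|2000)/', 'Windows3.1'));  // int(1)
```

In both result arrays there is no index 1, which is the practical meaning of "non-capturing".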

\d(?=xxx) matches "a digit that is followed by xxx".

Note: for this meaning, the lookahead is placed after the part of the pattern being matched.

For example:

$pattern = '/\d(?=abc)/';
$str = "ab36abc8eg";
$res = preg_match($pattern, $str, $match);
var_dump($match); // 6

It matches 6, because 6 is the only digit that is followed by abc.

(?<=xxx)\d matches "a digit that is preceded by xxx".

Note: for this meaning, the lookbehind is placed before the part of the pattern being matched.

For example:

$pattern = '/(?<=abc)\d/';
$str = "ab36abc8eg";
$res = preg_match($pattern, $str, $match);
var_dump($match); // 8

It matches 8, because 8 is the only digit that is preceded by abc.

The counterparts of (?=xxx) and (?<=xxx) are (?!xxx) and (?<!xxx).

They require that the text after or before the match is not xxx, so we will not give an example here.
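For readers who want to try the negative forms anyway, the following sketch applies them to the same subject string used above. It uses preg_match_all(), which the article does not otherwise cover, to list every match rather than only the first.

```php
<?php
$str = 'ab36abc8eg';

// Digits NOT followed by "abc": 6 is excluded.
preg_match_all('/\d(?!abc)/', $str, $notFollowed);
var_dump($notFollowed[0]); // '3' and '8'

// Digits NOT preceded by "abc": 8 is excluded.
preg_match_all('/(?<!abc)\d/', $str, $notPreceded);
var_dump($notPreceded[0]); // '3' and '6'
```

Each negative form returns exactly the digits that the corresponding positive form rejects.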

If you think this blog is helpful to you, you can recommend or follow me. If you have any questions, leave a message below to discuss them. Thank you.
