Usage of the preg_match_all function in PHP

Source: Internet
Author: User
Tags apache error log html tags

int preg_match_all (string pattern, string subject, array matches [, int flags])


Searches subject for all matches of the regular expression given in pattern and stores them in matches, in the order specified by flags.

After the first match is found, the next search starts from the end of the previous match.
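
As a small illustration of this scanning rule, the sketch below (the pattern and subject are made up for this example) shows that matches never overlap. The string "12345" contains the two-digit runs "12", "23", "34" and "45", but only "12" and "34" are reported, because each search resumes where the previous match ended.

```php
<?php
// Hypothetical input chosen to make the non-overlapping behaviour visible.
$count = preg_match_all('/\d\d/', '12345', $matches);

echo $count, "\n";                      // number of matches found
echo implode(', ', $matches[0]), "\n";  // the matched strings, in order
```

The trailing "5" is left over because no second digit follows it.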

flags can be a combination of the following flags (note that it is meaningless to combine PREG_PATTERN_ORDER with PREG_SET_ORDER):

PREG_PATTERN_ORDER

Orders the results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of the strings matched by the first parenthesized subpattern, and so on. ($matches[0][0] is the first full pattern match and $matches[0][1] is the second; $matches[1][0] is the text captured by the first subpattern in the first match, and $matches[1][1] is the text it captured in the second match.)

The code is as follows:

<?php
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=left>this is a test</div>",
    $out, PREG_PATTERN_ORDER);
print $out[0][0] . ", " . $out[0][1] . "\n";
print $out[1][0] . ", " . $out[1][1] . "\n";
?>

This example will output:

<b>example: </b>, <div align=left>this is a test</div>
example: , this is a test

Therefore, $out[0] contains the strings that matched the entire pattern, and $out[1] contains the strings enclosed by each pair of HTML tags.

 
PREG_SET_ORDER


Orders the results so that $matches[0] is an array holding the first set of matches, $matches[1] is an array holding the second set of matches, and so on. ($matches[0][0] is the full match of the first set, and $matches[0][1] is the string captured by the first subpattern within that first match.)

The code is as follows:

<?php
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=left>this is a test</div>",
    $out, PREG_SET_ORDER);
print $out[0][0] . ", " . $out[0][1] . "\n";
print $out[1][0] . ", " . $out[1][1] . "\n";
?>

 


This example will output:

 
<b>example: </b>, example: 
<div align=left>this is a test</div>, this is a test

 
In this example, $matches[0] is the first set of matches: $matches[0][0] holds the text that matched the entire pattern, $matches[0][1] holds the text that matched the first subpattern, and so on. Similarly, $matches[1] is the second set of matches, and so on.
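
To make the difference between the two orderings concrete, here is a minimal sketch that runs the same pattern with both flags. The subject string, the pattern and the variable names $byPattern and $bySet are assumptions made up for this illustration, and the ... operator requires PHP 5.6 or later.

```php
<?php
$subject = 'a=1 b=22';          // hypothetical input
$pattern = '/(\w+)=(\d+)/';     // group 1: key, group 2: value

preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);

// $byPattern has one array per group:
//   [['a=1', 'b=22'], ['a', 'b'], ['1', '22']]
// $bySet has one array per match:
//   [['a=1', 'a', '1'], ['b=22', 'b', '22']]

// PREG_SET_ORDER is convenient when each match is processed as a unit.
foreach ($bySet as $set) {
    list($full, $key, $value) = $set;
    echo "$key => $value (from \"$full\")\n";
}

// For this input, transposing one result yields the other.
var_dump(array_map(null, ...$byPattern) === $bySet);
```

The final comparison holds here because every group takes part in every match. With optional groups the two layouts can be padded differently, so it should not be relied on in general.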

 
PREG_OFFSET_CAPTURE

 
If this flag is set, the string offset of each match is returned along with the match itself. Note that this changes the structure of the returned array: every element becomes an array whose first item is the matched string and whose second item is its offset in subject. This flag is available from PHP 4.3.0.
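
A short sketch of the resulting structure follows; the subject string is a made-up example. Each entry of $matches[0] is a two-element array instead of a plain string.

```php
<?php
$subject = 'ab 12 cd 345';   // hypothetical input
preg_match_all('/\d+/', $subject, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as $pair) {
    // $pair[0] is the matched text, $pair[1] its byte offset in $subject
    list($text, $offset) = $pair;
    echo "$text starts at offset $offset\n";
}
```

The offsets are counted in bytes from the start of the subject, so "12" is reported at offset 3 and "345" at offset 9.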

 
If no flag is provided, it is assumed to be PREG_PATTERN_ORDER.

 
Returns the number of full pattern matches (which may be zero), or FALSE if an error occurred.
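
Because both 0 and FALSE evaluate as false in a plain if test, the two cases are best told apart with a strict comparison. The sketch below shows one way to do that; describeMatchCount() is a hypothetical helper written for this illustration, not part of PHP.

```php
<?php
// describeMatchCount() is a hypothetical helper, not a PHP built-in.
function describeMatchCount($pattern, $subject)
{
    // @ hides the warning PHP emits for an invalid pattern; the return
    // value is still checked explicitly below.
    $count = @preg_match_all($pattern, $subject, $matches);

    if ($count === false) {
        // preg_last_error() returns the error code of the last PCRE call.
        return 'error (preg_last_error() = ' . preg_last_error() . ')';
    }
    if ($count === 0) {
        return 'no matches';
    }
    return $count . ' match(es)';
}

echo describeMatchCount('/\d+/', 'room 12, floor 3'), "\n";
echo describeMatchCount('/\d+/', 'no digits here'), "\n";
```

The first call reports two matches and the second reports none; the error branch is reached only when the call itself fails.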

 
Example 1. Retrieve all phone numbers from a text

The code is as follows:

<?php
preg_match_all("/\(?  (\d{3})?  \)?  (?(1)  [\-\s] ) \d{3}-\d{4}/x",
    "Call 555-1212 or 1-800-555-1212", $phones);
?>
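
The example above does not print anything. As a sketch of how the result could be inspected, the following repeats the call (with the pattern written out in full) and prints each full match. The x modifier makes PCRE ignore the whitespace inside the pattern.

```php
<?php
preg_match_all("/\(?  (\d{3})?  \)?  (?(1)  [\-\s] ) \d{3}-\d{4}/x",
    "Call 555-1212 or 1-800-555-1212", $phones);

// $phones[0] holds the full matches; $phones[1] holds the optional
// three-digit area code captured by the first subpattern.
foreach ($phones[0] as $number) {
    echo $number, "\n";
}
```

This prints 555-1212 and 800-555-1212. The leading "1-" of the second number is not part of the match because the pattern has no element that accepts it.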

Example 2. Search for matched HTML tags (greedy)

The code is as follows:

<?php
// \\2 is an example of a back reference. It tells PCRE to match
// the content of the second set of parentheses in the regular expression
// itself, which is ([\w]+) in this example. The extra backslash is needed
// because the string is enclosed in double quotes.
$html = "<b>bold text</b><a href=howdy.html>click me</a>";

preg_match_all("/(<([\w]+)[^>]*>)(.*)(<\/\\2>)/", $html, $matches);

for ($i = 0; $i < count($matches[0]); $i++) {
    echo "matched: " . $matches[0][$i] . "\n";
    echo "part 1: " . $matches[1][$i] . "\n";
    echo "part 2: " . $matches[3][$i] . "\n";
    echo "part 3: " . $matches[4][$i] . "\n\n";
}
?>

This example will output:

 
matched: <b>bold text</b>
part 1: <b>
part 2: bold text
part 3: </b>

matched: <a href=howdy.html>click me</a>
part 1: <a href=howdy.html>
part 2: click me
part 3: </a>

The following examples use the related function preg_match, which stops after the first match.

Example 1. Search for "php" in the text

The code is as follows:

<?php
// The "i" after the pattern delimiter indicates a case-insensitive search
if (preg_match("/php/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
?>

Example 2. Search for the word "web"

The code is as follows:

<?php
/* The \b in the pattern indicates a word boundary, so only the separate word "web" is matched,
 * and not part of a longer word such as "webbing" or "cobweb" */
if (preg_match("/\bweb\b/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}

if (preg_match("/\bweb\b/i", "PHP is the website scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
?>

Example 3. Retrieve the domain name from the URL

The code is as follows:

<?php
// Obtain the host name from the URL
preg_match("/^(http:\/\/)?([^\/]+)/i",
    "http://www.php.net/index.html", $matches);
$host = $matches[2];

// Obtain the last two segments of the host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>

Output:

domain name is: php.net


Solution to Apache restarts caused by preg_match_all

When a call such as preg_match_all("/ni(.*?)wo/", $html, $matches); is used to analyze and match a long string $html (more than 100,000 bytes, typically the source code of a collected web page), the Apache server crashes and restarts automatically.

The following error message is displayed in the Apache error log:

[Thu Apr 11 18:31:31 2013] [notice] Parent: child process exited with status 128 -- Restarting.
[Thu Apr 11 18:31:31 2013] [notice] Apache/2.2.9 (Win32) PHP/5.2.17 configured -- resuming normal operations
[Thu Apr 11 18:31:31 2013] [notice] Server built: Jun 13 2008 04:04:59
[Thu Apr 11 18:31:31 2013] [notice] Parent: Created child process 2964
[Thu Apr 11 18:31:31 2013] [notice] Disabled use of AcceptEx() WinSock2 API
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Child process is running
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Acquired the start mutex.
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Starting 350 worker threads.
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Listening on port 80.

The likely cause is that the recursive matching done by PCRE overflows the thread stack, which is comparatively small by default on Windows. So how can ThreadStackSize be increased on the Windows platform? In the Apache configuration file httpd.conf, enable the line "Include conf/extra/httpd-mpm.conf" (remove the leading # comment marker), then set "ThreadStackSize 8400000" (about 8 MB) in the mpm_winnt_module section of the httpd-mpm.conf file.

The code is as follows:
<IfModule mpm_winnt_module>
ThreadStackSize 8400000
ThreadsPerChild 200
MaxRequestsPerChild 10000
Win32DisableAcceptEx
</IfModule>

 

Note that a 32-bit Apache process can use at most 2 GB of memory space. Therefore, the value of ThreadStackSize multiplied by ThreadsPerChild (8 MB * 200) must not exceed 2 GB. Otherwise, Apache cannot start, and the error log shows the following:
[Thu Apr 11 20:02:45 2013] [crit] (OS 8)Not enough storage is available to process this command.  : Child 4832: _beginthreadex failed. Unable to create all worker threads. Created 212 of the 220 threads requested with the ThreadsPerChild configuration directive.
From this message it is easy to see that on my server, with the thread stack size set to 8 MB, the maximum number of threads that can be configured is 212.
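
The arithmetic behind this limit can be checked with a few lines of PHP. This is only a back-of-the-envelope sketch: it assumes the 2 GB figure means 2 * 1024^3 bytes and treats thread stacks as the only consumer of address space, which is not the case in a real process.

```php
<?php
$threadStackSize = 8400000;                  // bytes, as in the ThreadStackSize directive
$addressSpace    = 2 * 1024 * 1024 * 1024;   // assumed 2 GB limit of a 32-bit process

// Upper bound on threads if the address space held nothing but thread stacks.
$upperBound = (int) floor($addressSpace / $threadStackSize);
echo "Theoretical upper bound: $upperBound threads\n";

// Stack reservation for the thread counts mentioned in the text and the log.
foreach (array(200, 212, 220) as $threads) {
    $reserved = $threadStackSize * $threads;
    printf("%d threads reserve %.2f GB of stack\n",
        $threads, $reserved / (1024 * 1024 * 1024));
}
```

The theoretical bound works out to 255 threads. The observed limit of 212 is lower, presumably because the rest of the process (code, heap, loaded modules) also needs address space.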
