Matching URLs in web page content with a regular expression
I want to match all the URLs on this site: http://www.425sf.com/
PHP code
$url = "http://www.425sf.com/"; // collection address
$content = file_get_contents($url);
$patten = "^((https|http|ftp|rtsp|mms)?://)?(([0-9a-z_!~*'().&=+$%-]+:)?[0-9a-z_!~*'().&=+$%-]+@)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-z_!~*'()-]+\.)*([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\.[a-z]{2,6})(:[0-9]{1,4})?((/?)|(/[0-9a-z_!~*'().;?:@&=+$,%#-]+)+/?)$";
preg_match_all($patten, $content, $matches);
I took the regular expression above from here:
http://topic.csdn.net/u/20070307/14/87e6b878-800e-4a88-830e-7d0eeeaba891.html
I tried it in a regular expression test tool and it matched fairly accurately.
However, it does not seem to return any matches in PHP.
------ Solution --------------------
PHP code
$html = <
http://www.baidu.com [1] => http://hi.baidu.com?info=aaa )*/
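The code in this reply is cut off in this copy, so here is a minimal sketch of the same idea rather than the original poster's code. Two things stop the pattern in the question from working with preg_match_all: PCRE patterns in PHP need delimiters around them, and the ^ ... $ anchors require the whole page to be a single URL. The sample string and the simplified pattern below are assumptions made for illustration; in the question, $content would come from file_get_contents($url).

```php
<?php
// Hypothetical sample content standing in for the fetched page.
$content = '<a href="http://www.baidu.com">Baidu</a> '
         . '<a href="http://hi.baidu.com?info=aaa">Space</a>';

// Simplified, assumed pattern: a scheme, then everything up to the next
// whitespace, quote, or angle bracket. '#' is the delimiter, 'i' ignores case.
// There are no ^ or $ anchors, so URLs are found anywhere in the text.
$pattern = '#(?:https?|ftp)://[^\s"\'<>]+#i';

$count = preg_match_all($pattern, $content, $matches);
if ($count === false) {
    die("regex error\n");
}

// $matches[0] holds every full match, in order of appearance.
print_r($matches[0]);
```

This pattern is far looser than the one in the question; if stricter validation is needed, the original expression could be wrapped in delimiters and have its anchors removed instead.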
------ Solution --------------------
The reply above is the correct answer:
preg_match_all
(PHP 4, PHP 5)
preg_match_all - Perform a global regular expression match
Description
int preg_match_all ( string $pattern, string $subject, array $matches [, int $flags ] )
Searches subject for all matches to the regular expression given in pattern and puts them in matches in the order specified by flags.
After the first match is found, each subsequent search continues from the end of the previous match.
flags can be a combination of the following flags (note that it does not make sense to combine PREG_PATTERN_ORDER with PREG_SET_ORDER):
PREG_PATTERN_ORDER
Orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.
<?php
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=left>this is a test</div>",
    $out, PREG_PATTERN_ORDER);
print $out[0][0] . ", " . $out[0][1] . "\n";
print $out[1][0] . ", " . $out[1][1] . "\n";
?>
This example will output:
<b>example: </b>, <div align=left>this is a test</div>
example: , this is a test
Therefore, $out[0] contains the strings that matched the full pattern, and $out[1] contains the strings enclosed by the HTML tags.
PREG_SET_ORDER
Orders results so that $matches[0] is an array of the first set of matches, $matches[1] is an array of the second set of matches, and so on.
<?php
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=left>this is a test</div>",
    $out, PREG_SET_ORDER);
print $out[0][0] . ", " . $out[0][1] . "\n";
print $out[1][0] . ", " . $out[1][1] . "\n";
?>
This example will output:
<b>example: </b>, example: 
<div align=left>this is a test</div>, this is a test
In this case, $matches[0] is the first set of matches: $matches[0][0] holds the text that matched the full pattern, $matches[0][1] holds the text that matched the first subpattern, and so on. Similarly, $matches[1] is the second set of matches, and so on.
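To see the two orderings side by side, here is a small sketch that runs the same pattern over the same subject with each flag. The subject "a1 b2" and the two-group pattern are made up for illustration and do not come from the manual.

```php
<?php
// Assumed sample input: two matches, each with two subpatterns.
$subject = 'a1 b2';
$pattern = '/([a-z])(\d)/';

// Grouped by subpattern: index 0 is all full matches, 1 is all first groups, ...
preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);

// Grouped by match: each entry holds one full match followed by its groups.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);

print_r($byPattern); // [["a1", "b2"], ["a", "b"], ["1", "2"]]
print_r($bySet);     // [["a1", "a", "1"], ["b2", "b", "2"]]
```

PREG_SET_ORDER tends to be more convenient when looping over matches one at a time, while PREG_PATTERN_ORDER suits cases where only one column (for example all full matches) is needed.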
PREG_OFFSET_CAPTURE
If this flag is passed, the string offset of each match is returned along with it. Note that this changes the shape of the returned array: every element becomes an array whose first item is the matched string and whose second item is its offset in subject. This flag is available since PHP 4.3.0.
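A short sketch of what the returned array looks like with this flag. The subject string and pattern are illustrative assumptions.

```php
<?php
// Assumed sample input: two runs of digits at byte offsets 2 and 6.
$subject = 'ab12cd345';

preg_match_all('/\d+/', $subject, $matches, PREG_OFFSET_CAPTURE);

// Each match is now a [string, offset] pair instead of a plain string.
foreach ($matches[0] as $pair) {
    list($text, $offset) = $pair;
    echo "$text at offset $offset\n";
}
```

The offsets are counted in bytes from the start of the subject, so they line up with functions such as substr().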
If no flag is provided, it is assumed to be PREG_PATTERN_ORDER.
Returns the number of full pattern matches (which may be zero), or FALSE if an error occurred.
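Because zero and FALSE both evaluate as false in a loose comparison, it helps to check the return value with ===. The helper below is a hypothetical sketch (countMatches is not a PHP function) and assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: returns the match count, or throws on a regex error.
function countMatches(string $pattern, string $subject): int
{
    // '@' silences the warning PHP raises for a malformed pattern.
    $n = @preg_match_all($pattern, $subject, $matches);
    if ($n === false) {
        throw new RuntimeException("preg_match_all failed for pattern: $pattern");
    }
    return $n;
}

echo countMatches('/\d+/', 'Call 555-1212'), "\n";  // two runs of digits
echo countMatches('/\d+/', 'no digits here'), "\n"; // zero is a valid result, not an error
```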
Example #1 Getting all phone numbers out of some text
<?php
preg_match_all("/\(? (\d{3})? \)? (?(1) [\-\s] ) \d{3}-\d{4}/x",
    "Call 555-1212 or 1-800-555-1212", $phones);
?>
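Example #1 does not print anything, so here is a sketch that runs the same pattern and shows what ends up in $phones. The only change is that the pattern is written in single quotes, which is a stylistic choice here and not part of the manual example.

```php
<?php
// Same pattern as Example #1. The x modifier makes PCRE ignore the
// whitespace inside the pattern; (?(1) ...) requires a separator only
// when the optional area code in group 1 was matched.
$pattern = '/\(? (\d{3})? \)? (?(1) [\-\s] ) \d{3}-\d{4}/x';

preg_match_all($pattern, "Call 555-1212 or 1-800-555-1212", $phones);

// $phones[0] lists the full matches; $phones[1] lists the area codes.
print_r($phones[0]);
print_r($phones[1]);
```

Note that the leading "1-" of the second number is not part of the match, since the pattern has no provision for a country prefix.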
Example #2 Find matching HTML tags (greedy)
<?php
// The \\2 is an example of backreferencing. This tells PCRE that
// it must match the second set of parentheses in the regular expression
// itself, which would be the ([\w]+) in this case. The extra backslash is
// required because the string is in double quotes.
$html = "<b>bold text</b><a href=howdy.html>click me</a>";