Once a page has been crawled, you can filter it with regular expressions to extract the content you want. How to write those regular expressions is not covered here; what follows are several methods commonly used in PHP to fetch the content of a web page.
1.file_get_contents
<?php
$url = "http://www.jb51.net";
$contents = file_get_contents($url);
// If Chinese text comes out garbled, convert the encoding with the following line
// $contents = iconv("gb2312", "utf-8", $contents);
echo $contents;
?>
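The encoding step above can be wrapped in a small helper so that the converted string is the one that gets printed. This is only a sketch: the function name to_utf8 is hypothetical, and it assumes the page really is encoded in GB2312 and that the iconv extension is available.

```php
<?php
// Hypothetical helper (not part of PHP): convert fetched content to UTF-8.
// Assumes the source encoding is known in advance; GB2312 is only a default.
function to_utf8($contents, $sourceEncoding = "GB2312")
{
    // iconv() returns false when the input cannot be converted.
    $converted = @iconv($sourceEncoding, "UTF-8", $contents);

    // Fall back to the original bytes rather than returning false.
    return $converted === false ? $contents : $converted;
}
```

Typical use would be echo to_utf8(file_get_contents($url)); for a GB2312 page. Pages in other encodings need a different second argument.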
2.curl
<?php
$url = "http://www.jb51.net";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
// For pages that require user authentication, add the following two lines
// (US_NAME and US_PWD must be defined with the user name and password)
// curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
// curl_setopt($ch, CURLOPT_USERPWD, US_NAME . ":" . US_PWD);
$contents = curl_exec($ch);
curl_close($ch);
echo $contents;
?>
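curl_exec returns false when the request fails, so it can be useful to check the result before printing it. The sketch below shows one possible way to do that; the function name fetch_with_curl and the extra CURLOPT_TIMEOUT setting are assumptions added for illustration, not part of the original method.

```php
<?php
// Hypothetical helper: fetch $url with curl.
// Returns the response body as a string, or null if the request failed.
function fetch_with_curl($url, $timeout = 5)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    // Assumption: also cap the total transfer time, not only the connect phase.
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout * 2);

    $contents = curl_exec($ch);
    if ($contents === false) {
        // curl_error() describes the most recent failure on this handle.
        error_log("curl error: " . curl_error($ch));
        return null;
    }

    // The handle is released when $ch goes out of scope.
    return $contents;
}
```

A caller can then write $page = fetch_with_curl("http://www.jb51.net"); and test for null before processing the page.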
3.fopen->fread->fclose
<?php
$handle = fopen("http://www.jb51.net", "rb");
$contents = "";
do {
    $data = fread($handle, 1024);
    if (strlen($data) == 0) {
        break;
    }
    $contents .= $data;
} while (true);
fclose($handle);
echo $contents;
?>
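The same read loop can be written as a function that also handles the case where fopen fails. This is a minimal sketch under stated assumptions: the name read_all is hypothetical, and the source may be a URL (with allow_url_fopen on) or a local path.

```php
<?php
// Hypothetical helper: read everything from a URL or local path in chunks.
// Returns the contents as a string, or null if the source cannot be opened.
function read_all($source, $chunkSize = 1024)
{
    $handle = @fopen($source, "rb");
    if ($handle === false) {
        return null;
    }

    $contents = "";
    while (!feof($handle)) {
        $data = fread($handle, $chunkSize);
        // Stop on a read error or when nothing more is returned.
        if ($data === false || $data === "") {
            break;
        }
        $contents .= $data;
    }

    fclose($handle);
    return $contents;
}
```

For example, $page = read_all("http://www.jb51.net"); gives the same result as the loop above when the request succeeds, and null when the page cannot be opened.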
Note:
1. To use file_get_contents and fopen on remote URLs, allow_url_fopen must be enabled. Method: edit php.ini and set allow_url_fopen = On. When allow_url_fopen is off, neither fopen nor file_get_contents can open remote files.
2. To use curl, the hosting environment must have the curl extension enabled. Method: on Windows, edit php.ini and remove the semicolon in front of extension=php_curl.dll, and copy ssleay32.dll and libeay32.dll to C:\WINDOWS\system32; on Linux, install the curl extension.
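Before choosing a method, a script can check at run time whether these two requirements are met. The sketch below is one way to do it; the function name remote_fetch_support is hypothetical, while ini_get, filter_var and extension_loaded are standard PHP functions.

```php
<?php
// Hypothetical helper: report which remote-fetch methods this PHP setup allows.
function remote_fetch_support()
{
    return array(
        // Required by file_get_contents() and fopen() for remote URLs.
        "allow_url_fopen" => filter_var(ini_get("allow_url_fopen"), FILTER_VALIDATE_BOOLEAN),
        // Required by the curl_* functions.
        "curl" => extension_loaded("curl"),
    );
}

foreach (remote_fetch_support() as $name => $enabled) {
    echo $name, ": ", $enabled ? "enabled" : "disabled", PHP_EOL;
}
```

If allow_url_fopen is reported as disabled but curl is enabled, the curl method is the one to use, and vice versa.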