<?php
function get_url_content($Url, $Method = 'c') {
    // The target encoding comes from the global $Charset; if it is not set, the output defaults to UTF-8.
    global $Charset;
    $Urlarr = parse_url($Url);
    // Return false if no host could be parsed from the URL.
    if (!isset($Urlarr['host'])) {
        return false;
    }
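    // Build the outgoing headers from the current request's headers,
    // pointing Host and Referer at the target site.
    $Header = array();
    $hasReferer = false;
    $incoming = function_exists('getallheaders') ? (getallheaders() ?: array()) : array();
    foreach ($incoming as $key => $val) {
        if (strcasecmp($key, 'Host') === 0) {
            $val = $Urlarr['host'];
        } elseif (strcasecmp($key, 'Referer') === 0) {
            $val = 'http://' . $Urlarr['host'];
            $hasReferer = true;
        }
        $Header[] = "$key: $val";
    }
    // Fake a Referer if the incoming request did not carry one.
    if (!$hasReferer) {
        $Header[] = "Referer: http://{$Urlarr['host']}";
    }
    // After this fix-up, both the Referer and the Host point at the target site.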
    // Choose which mechanism fetches the page: 'f' for file_get_contents, 'c' for cURL.
    $file_contents = false;
    if ($Method === 'f' && function_exists('file_get_contents')) {
        $opts = array(
            'http' => array(
                'method' => 'GET',
                'header' => $Header,
            ),
        );
        $cxContext = stream_context_create($opts);
        $file_contents = @file_get_contents($Url, false, $cxContext);
    } elseif ($Method === 'c' && function_exists('curl_init')) {
        $Ch = curl_init();
        $Timeout = 5;
        curl_setopt($Ch, CURLOPT_HTTPHEADER, $Header);
        curl_setopt($Ch, CURLOPT_URL, $Url);
        curl_setopt($Ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($Ch, CURLOPT_CONNECTTIMEOUT, $Timeout);
        $file_contents = curl_exec($Ch);
        curl_close($Ch);
    }
    // Stop here if the fetch failed or no usable method was available.
    if ($file_contents === false) {
        return false;
    }
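    // So that relative stylesheet and image links still resolve, give the page a base URL.
    // The search and replacement strings were lost from this listing; inserting a <base> tag
    // after <head> is an assumed reconstruction.
    $file_contents = str_replace('<head>', '<head><base href="' . htmlspecialchars($Url) . '">', $file_contents);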
    // Detect the charset the page declares; if it declares none, assume GBK.
    if (!preg_match('/charset=["\']?([\w-]+)/i', $file_contents, $lang)) {
        $lang[1] = 'GBK';
    }
    // Convert to $Charset, or to UTF-8 when $Charset is not set.
    if (function_exists('mb_convert_encoding')) {
        $file_contents = mb_convert_encoding($file_contents, empty($Charset) ? 'UTF-8' : $Charset, $lang[1]);
    }
    // Release the local variables that are no longer needed.
    unset($Url, $lang, $Timeout, $Urlarr, $Charset);
    return $file_contents;
}
// Test: fetch the start page with the file_get_contents method.
header('Content-Type: text/html; charset=utf-8');
// http://www.xtzj.com/read-htm-tid-347550.html (this one cannot be collected)
$file = get_url_content('http://www.hao123.com', 'f');
// Keep only the links. The allowed-tags argument was lost from this listing; '<a>' is assumed.
$file = strip_tags($file, '<a>');
preg_match_all('/href=["\']?(https?:[^"\'<>\s]+)/i', $file, $link);
unset($link[0]);
$link = $link[1];
// Simulate collecting data: change the index yourself (0-151 were valid when this was written).
// This call uses the default cURL method.
$x = 10;
$file = get_url_content($link[$x]);
echo $file;
?>
All of the explanations are written in the comments.
If anything is unclear, reply to the thread. I just wanted to share a little knowledge about collecting (scraping) pages.
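The charset step is the part that most often goes wrong, so here is a minimal standalone sketch of it that can be tried without fetching anything. The function name `convert_to_charset` and its `$target` and `$fallback` parameters are hypothetical and not part of the listing above. It assumes the mbstring extension is installed and, like the listing, falls back to GBK when the page declares no charset.

```php
<?php
// Hypothetical helper for illustration; not part of the original listing.
function convert_to_charset($html, $target = 'UTF-8', $fallback = 'GBK')
{
    // Read the declared charset, e.g. <meta charset="gbk"> or "...; charset=gb2312".
    $source = preg_match('/charset=["\']?([\w-]+)/i', $html, $m) ? $m[1] : $fallback;

    // Without mbstring there is nothing to convert with, so return the input unchanged.
    if (!function_exists('mb_convert_encoding')) {
        return $html;
    }

    try {
        $converted = @mb_convert_encoding($html, $target, $source);
    } catch (\ValueError $e) {
        // PHP 8 throws when the declared charset name is not known to mbstring.
        $converted = false;
    }

    // Fall back to the raw input if the conversion could not be performed.
    return ($converted === false || $converted === null) ? $html : $converted;
}

// Usage: build a GBK-encoded page from a UTF-8 literal, then convert it back.
$gbkPage = mb_convert_encoding('<meta charset="gbk">中文', 'GBK', 'UTF-8');
echo convert_to_charset($gbkPage), "\n";
```

Returning the input unchanged when the declared charset is unknown is a design choice of this sketch: a page with a misspelled charset still comes back as raw bytes instead of aborting the whole collection run.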
Original address: http://bbs.phpchina.com/viewthread.php?tid=99263