PHP curl vs. file_get_contents: code to get a page's title, and the efficiency and stability of the two

Source: Internet
Author: User

Both PHP's curl functions and file_get_contents can fetch a file from a remote server and save it locally, but their performance is not at all on the same level. Below are usage examples for each, followed by a short rundown of the differences between them.

Recommended method: fetching with curl

<?php
$c = curl_init();
$url = 'http://www.jb51.net/';
curl_setopt($c, CURLOPT_URL, $url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
$data = curl_exec($c);
curl_close($c);
// If the page does not mention utf-8, treat it as GBK and convert it
$pos = strpos($data, 'utf-8');
if ($pos === false) { $data = iconv("GBK", "UTF-8", $data); }
preg_match("/<title>(.*)<\/title>/i", $data, $title);
echo $title[1];
?>

Using file_get_contents

<?php
$content = file_get_contents("http://www.jb51.net/");
// If the page does not mention utf-8, treat it as GBK and convert it
$pos = strpos($content, 'utf-8');
if ($pos === false) { $content = iconv("GBK", "UTF-8", $content); }
$postb = strpos($content, '<title>') + 7; // 7 = strlen('<title>')
$poste = strpos($content, '</title>');
$length = $poste - $postb;
echo substr($content, $postb, $length);
?>
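
Both snippets finish with the same two steps once the page has been downloaded: guess the encoding, then cut out the title. As a rough sketch, those steps could be pulled into one helper that works on a string, so it can be tried without a network connection. The name extract_title is hypothetical and not part of the original code; the GBK fallback is the same heuristic used above, not a reliable charset detector.

```php
<?php
// Hypothetical helper (not from the original article): pull the <title>
// text out of an HTML string that has already been downloaded.
function extract_title(string $html): ?string
{
    // Same heuristic as the snippets above: if the page never mentions
    // "utf-8", assume it is GBK-encoded and convert it.
    if (stripos($html, 'utf-8') === false) {
        $converted = @iconv('GBK', 'UTF-8', $html);
        if ($converted !== false) {
            $html = $converted;
        }
    }
    // Non-greedy match; the "s" flag lets the title span several lines.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m) !== 1) {
        return null; // no title found
    }
    return trim($m[1]);
}

echo extract_title('<html><head><title> Example page </title></head></html>'), "\n";
```

Returning null when no title is present avoids the undefined-index notice that a bare $title[1] would raise on a page without a title tag.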

Now look at how file_get_contents performs compared with curl.

1) Every time fopen/file_get_contents requests data from a remote URL it performs a new DNS query; DNS information is not cached. curl, however, caches DNS information automatically, so requests for pages or images under the same domain name need only one DNS query. This greatly reduces the number of DNS queries, so curl performs much better than fopen/file_get_contents.
2) For HTTP requests, fopen/file_get_contents goes through http_fopen_wrapper, which does not use keep-alive, whereas curl can. This makes curl more efficient when many requests are made over the same connection. (Setting the appropriate header should be enough.)
3) fopen/file_get_contents depend on the allow_url_fopen option in php.ini. If that option is turned off, they cannot open remote URLs. curl is not affected by this setting.
4) curl can simulate many kinds of requests, such as POSTing data or submitting a form, and users can customize the request as needed. fopen/file_get_contents send a plain GET by default; anything else has to be set up by hand through a stream context.
5) fopen/file_get_contents reportedly cannot download binary files correctly.
6) fopen/file_get_contents reportedly do not handle SSL requests properly.
7) curl can run several requests in parallel (the curl_multi_* functions).
8) With file_get_contents, if the network has a problem, stalled processes tend to pile up.
9) If you need to make repeated requests for many pages in a row, file_get_contents will run into errors, and the content it returns may also be wrong. So for any kind of scraping work it is bound to cause problems; use curl for scraping. If you are not convinced, let's run a test.
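
Point 3 can be checked on a given server before choosing a method. The sketch below assumes nothing beyond standard PHP: it reports whether the curl extension is loaded and whether allow_url_fopen is switched on. The function name available_fetchers is hypothetical.

```php
<?php
// Hypothetical helper (not from the original article): report which of
// the two fetch methods this PHP installation can actually use.
function available_fetchers(): array
{
    $fetchers = [];
    if (function_exists('curl_init')) {
        $fetchers[] = 'curl';
    }
    // ini_get() returns a string such as "1", "On" or ""; normalise it to a bool.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $fetchers[] = 'file_get_contents';
    }
    return $fetchers;
}

print_r(available_fetchers());
```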

The PHP source code for the curl and file_get_contents performance comparison is as follows:

1829.php

<?php
/**
 * Get the location of an IP through the Taobao IP interface (curl version)
 * @param string $ip
 * @return string|false
 */
function getCityCurl($ip)
{
    $url = "http://ip.taobao.com/service/getIpInfo.php?ip=" . $ip;
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $file_contents = curl_exec($ch);
    curl_close($ch);
    $ipinfo = json_decode($file_contents);
    if ($ipinfo->code == '1') {
        return false;
    }
    $city = $ipinfo->data->region . $ipinfo->data->city;
    return $city;
}

// Same lookup, using file_get_contents
function getCity($ip)
{
    $url = "http://ip.taobao.com/service/getIpInfo.php?ip=" . $ip;
    $ipinfo = json_decode(file_get_contents($url));
    if ($ipinfo->code == '1') {
        return false;
    }
    $city = $ipinfo->data->region . $ipinfo->data->city;
    return $city;
}

// For file_get_contents
$startTime = explode(' ', microtime());
$startTime = $startTime[0] + $startTime[1];
for ($i = 1; $i <= 10; $i++) {
    echo getCity("121.207.247.202") . "<br>";
}
$endTime = explode(' ', microtime());
$endTime = $endTime[0] + $endTime[1];
$totalTime = $endTime - $startTime;
echo 'file_get_contents: ' . number_format($totalTime, 10, '.', "") . " seconds<br>";

// For curl
$startTime2 = explode(' ', microtime());
$startTime2 = $startTime2[0] + $startTime2[1];
for ($i = 1; $i <= 10; $i++) {
    echo getCityCurl('121.207.247.202') . "<br>";
}
$endTime2 = explode(' ', microtime());
$endTime2 = $endTime2[0] + $endTime2[1];
$totalTime2 = $endTime2 - $startTime2;
echo "curl: " . number_format($totalTime2, 10, '.', "") . " seconds";
?>

Test results

file_get_contents speed: 4.2404510975 seconds
curl speed: 2.8205530643 seconds
In this run curl took roughly a third less time than file_get_contents, and most importantly it puts less load on the server.
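
The timing pattern in the benchmark (explode microtime() and add the two parts) can be written more compactly with microtime(true), which returns the time as a float. The following is only a sketch of that idea: time_calls is a hypothetical helper, and the usleep() workload is a stand-in so the example runs offline; in a real test the closure would call one of the fetch functions.

```php
<?php
// Hypothetical helper (not from the original article): run $fn $runs
// times and return the total elapsed wall-clock time in seconds.
function time_calls(callable $fn, int $runs = 10): float
{
    $start = microtime(true); // float seconds, no explode() needed
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Stand-in workload so the sketch runs offline; swap in a real fetch.
$elapsed = time_calls(function () { usleep(1000); }, 5);
echo number_format($elapsed, 10, '.', ''), " seconds\n";
```

Passing the work as a closure keeps the measurement code identical for both methods, so any difference in the totals comes from the fetch itself.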

PS: efficiency and stability problems with the PHP functions file_get_contents and curl

I was used to the quick and convenient file_get_contents function for fetching content from other sites, but kept running into failed fetches. Even with a timeout set as in the manual's example, most of the time it did not help:

$config['context'] = stream_context_create(array('http' => array('method' => 'GET', 'timeout' => 5)));
The 'timeout' => 5 setting is unreliable and often has no effect. When that happens, look at the server's connection pool and you will find a pile of errors like the following, which is a real headache:

file_get_contents(http://***): failed to open stream ...
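
When file_get_contents fails it returns false and raises a warning like the one above. A minimal sketch of a wrapper that turns that into a checkable return value is shown here; get_contents_or_null is a hypothetical name, and the 'timeout' option in the http context is a read timeout in seconds, so it does not by itself cap the total duration of a request.

```php
<?php
// Hypothetical wrapper (not from the original article): fetch $url and
// return null instead of emitting a warning when the fetch fails.
function get_contents_or_null(string $url, float $timeout = 5.0): ?string
{
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => $timeout],
    ]);
    // "@" suppresses the warning; failure is signalled by the return value.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Example call; any URL or local path can be passed.
$page = get_contents_or_null('http://www.jb51.net/');
echo $page === null ? "fetch failed\n" : strlen($page) . " bytes\n";
```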

With no other choice, I installed the curl library and wrote a replacement function:

function curl_get_contents($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);               // set the URL to access
    //curl_setopt($ch, CURLOPT_HEADER, 1);             // whether to include header information
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);              // set the timeout
    curl_setopt($ch, CURLOPT_USERAGENT, _USERAGENT_);  // user agent (constant defined elsewhere)
    curl_setopt($ch, CURLOPT_REFERER, _REFERER_);      // set the Referer (constant defined elsewhere)
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);       // follow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);       // return the result
    $r = curl_exec($ch);
    curl_close($ch);
    return $r;
}

After that, apart from genuine network problems, there were no more failures.
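
To tell a genuine network problem apart from other failures, the replacement function can also report why a request failed. The variant below is a sketch under assumptions: curl_fetch and its result keys are hypothetical, and it needs the curl extension. It relies only on curl_error() and curl_getinfo(), which are part of that extension.

```php
<?php
// Hypothetical variant (not from the original article): return the error
// message and HTTP status alongside the body instead of a bare false.
function curl_fetch(string $url, int $timeout = 5): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);      // whole-request timeout
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body as a string
    $body = curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return [
        'ok'    => $body !== false,
        'body'  => $body === false ? null : $body,
        'error' => $body === false ? curl_error($ch) : null,
        'code'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
    ];
}
```

A caller can then log $result['error'] on failure and, for example, retry only on timeouts.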

Here is a test someone else ran comparing curl and file_get_contents:

Seconds file_get_contents needed to fetch google.com:

2.31319094
2.30374217
2.21512604
3.30553889
2.30124092

Seconds curl needed:

0.68719101
0.64675593
0.64326
0.81983113
0.63956594
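
To put a number on the difference, the two lists can be averaged. The short script below only copies the timings above and computes their means and the ratio between them; the helper name mean is an assumption for illustration.

```php
<?php
// Timings copied from the two lists above, in seconds.
$fileGetContents = [2.31319094, 2.30374217, 2.21512604, 3.30553889, 2.30124092];
$curl            = [0.68719101, 0.64675593, 0.64326, 0.81983113, 0.63956594];

// Arithmetic mean of a list of numbers.
function mean(array $xs): float
{
    return array_sum($xs) / count($xs);
}

$ratio = mean($fileGetContents) / mean($curl);
printf("file_get_contents: %.3f s, curl: %.3f s, ratio: %.2f\n",
    mean($fileGetContents), mean($curl), $ratio);
```

On these five samples file_get_contents averages about 2.49 seconds against about 0.69 seconds for curl, so it is roughly 3.6 times slower.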

That's a big gap, isn't it? In my experience the two tools differ not only in speed but also greatly in stability. If you need stable network data fetching, I recommend the curl_get_contents function above: it is not only stable and fast, it can also pass itself off as a browser to the target site!
