Use PHP-curl

Source: Internet
Author: User
Tags: how to use curl

SELF: http://lelong.iteye.com/blog/538645

This article describes the php_curl library and how to better use php_curl.

Introduction

When writing PHP scripts, you may run into the following problem: how do I get content from other sites? There are several solutions. The simplest is PHP's fopen() function, but fopen() offers few options for controlling the request. For example, if you want to build a "web crawler", you may need to define the client description the crawler reports (for example IE or Firefox) and to fetch content with different request methods, such as POST and GET. These requirements cannot easily be met with fopen() alone.

To solve the problems raised above, we can use the PHP extension library curl. This extension is usually included in the default installation package. With it you can fetch content from other sites, and do other things as well.

Note: The code in this article requires the php_curl extension library. Check the output of phpinfo(): if it reports curl support as enabled, the curl library is available.
1. Enable curl library support for PHP on Windows:
Open php.ini and remove the ; sign before extension=php_curl.dll.

2. Enable curl library support for PHP on Linux:
Add --with-curl after ./configure when compiling PHP.
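
As an alternative to reading the phpinfo() output, the check can also be done from code. The following is a minimal sketch that uses the standard extension_loaded() and curl_version() functions; the helper name curl_support_summary() is made up for this illustration and is not part of the original article.

```php
<?php
// Hypothetical helper (name chosen for this example): reports whether the
// curl extension is loaded and, if so, which libcurl version it is built on.
function curl_support_summary(): string
{
    if (!extension_loaded('curl')) {
        return 'curl extension is not loaded';
    }
    // curl_version() returns an associative array describing the library
    $info = curl_version();
    return 'curl extension is loaded, libcurl version ' . $info['version'];
}

echo curl_support_summary(), PHP_EOL;
```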

In this article, we look at how to use the curl library and its functions. We start with the most basic usage.

Basic usage:

Step 1: Use the curl_init() function to create a new curl session. The code is as follows:

<?php
// Create a new curl resource
$ch = curl_init();
?>

We have now created a curl session. To fetch the content of a URL, the next step is to pass that URL to the curl_setopt() function. The code is as follows:

<?php
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.com/");
?>

With the previous step done, the preparation is complete. Curl can now fetch the content of the URL and print it out. The code:

<?php
// Grab URL and pass it to the browser
curl_exec($ch);
?>

Finally, close the current curl session:

<?php
// Close curl resource, and free up system resources
curl_close($ch);
?>

Let's take a look at the complete example code:

<?php

// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.nl/");
// Grab URL and pass it to the browser
curl_exec($ch);
// Close curl resource, and free up system resources
curl_close($ch);
?>
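
The same steps can also be wrapped in a function that sets all options in one call. This is only a sketch: curl_setopt_array() is a standard PHP function that the original article does not use, and the function name print_page() is invented for the example.

```php
<?php
// Hypothetical wrapper (not from the original article): fetches $url and
// lets curl print the response directly. Returns false if the transfer fails.
function print_page(string $url): bool
{
    $ch = curl_init();
    // Set several options at once instead of calling curl_setopt() repeatedly
    $ok = curl_setopt_array($ch, [
        CURLOPT_URL    => $url,
        CURLOPT_HEADER => false, // print the body only, not the response headers
    ]);
    if ($ok) {
        // Without CURLOPT_RETURNTRANSFER, curl_exec() prints the content
        // and returns true on success or false on failure
        $ok = curl_exec($ch) !== false;
    }
    // The handle is released when $ch goes out of scope at the end of the function
    return $ok;
}
```

Calling print_page("http://www.google.nl/") should behave like the example above, with the difference that the return value tells you whether the transfer succeeded.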

We have just fetched the content of another site and output it directly to the browser. Is there a way to process the information and control what is output? No problem at all. If you want to obtain the content without printing it, use the CURLOPT_RETURNTRANSFER parameter of the curl_setopt() function and set it to a non-zero value or true. The complete code:

<?php

// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.nl/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Grab URL, and return output
$output = curl_exec($ch);
// Close curl resource, and free up system resources
curl_close($ch);
// Replace 'Google' with 'phpit'
$output = str_replace('Google', 'phpit', $output);
// Print output
echo $output;
?>
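
When CURLOPT_RETURNTRANSFER is set, curl_exec() returns false if the transfer fails, so it can be worth checking the result before working with it. A minimal sketch of that check follows; the helper name fetch_body() and its $error parameter are assumptions made for this illustration, while curl_error() is a standard function.

```php
<?php
// Hypothetical helper (name and signature are illustrative): returns the
// response body as a string, or null when the transfer fails. On failure,
// $error receives the message reported by curl.
function fetch_body(string $url, ?string &$error = null): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $body = curl_exec($ch);
    if ($body === false) {
        $error = curl_error($ch); // human-readable description of the failure
        return null;
    }

    $error = null;
    return $body;
}
```

With this helper, the str_replace() step would only run when fetch_body() returns a string.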


In the two examples above, you may have noticed that setting different parameters of the curl_setopt() function produces different results. This is exactly what makes curl powerful. Let's take a look at what these parameters mean.

Curl-related options:

If you have looked up the curl_setopt() function in the PHP manual, you will have noticed its long list of parameters, which cannot all be described here. For more information, see the PHP manual; here we introduce only some commonly used parameters.

The first interesting parameter is CURLOPT_FOLLOWLOCATION. When you set this parameter to true, curl follows any redirect it receives. For example, suppose you request a PHP page that contains the redirect code <?php header("Location: http://new_url"); ... ?>. Curl then retrieves the content from http://new_url instead of returning the redirect response. The complete code is as follows:

<?php

// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.com/");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// Grab URL, and print
curl_exec($ch);
?>

If Google sends a redirect, the previous example follows it and fetches the content from the redirected URL. Two options related to this parameter are CURLOPT_MAXREDIRS and CURLOPT_AUTOREFERER.
The CURLOPT_MAXREDIRS option lets you define the maximum number of redirects to follow. Once this limit is exceeded, curl stops fetching content. If CURLOPT_AUTOREFERER is set to true, curl automatically adds the Referer header on each redirect it follows. It may not be very important, but it is very useful in some cases.
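
The three redirect-related options can be combined as in the following sketch. The function name redirect_options() and the limit of 5 redirects are arbitrary choices for this illustration, not values from the original article.

```php
<?php
// Hypothetical helper (illustrative name): builds the redirect-related
// options so they can be applied in one curl_setopt_array() call.
function redirect_options(int $maxRedirects): array
{
    return [
        CURLOPT_FOLLOWLOCATION => true,          // follow Location headers
        CURLOPT_MAXREDIRS      => $maxRedirects, // stop after this many redirects
        CURLOPT_AUTOREFERER    => true,          // set the Referer header on each redirect
    ];
}

$ch = curl_init("http://www.google.com/");
curl_setopt_array($ch, redirect_options(5));
// A later curl_exec($ch) would follow at most 5 redirects before giving up.
```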

The next parameter is CURLOPT_POST. This is a very useful feature, because it lets you make a POST request rather than a GET request, which means you can submit forms on other pages without filling them in by hand. The following example shows what I mean:

<?php
// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://projects/phpit/content/using%20curl%20php/demos/handle_form.php");
// Do a POST
$data = array('name' => 'Dennis', 'surname' => 'pallett');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// Grab URL, and print
curl_exec($ch);
?>

And the handle_form.php file:

<?php
echo '<pre>';
print_r($_POST);
echo '</pre>';
?>
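
One detail worth knowing about CURLOPT_POSTFIELDS: when it is given an array, curl sends the data as multipart/form-data, and when it is given a URL-encoded string, the data is sent as application/x-www-form-urlencoded, like an ordinary HTML form. The sketch below shows the string variant using the standard http_build_query() function, which the original article does not use; the field values are the ones from the example above.

```php
<?php
// Same form fields as in the POST example
$data = array('name' => 'Dennis', 'surname' => 'pallett');

// Encode the fields as a URL-encoded string (explicit '&' separator)
$encoded = http_build_query($data, '', '&');

$ch = curl_init();
curl_setopt($ch, CURLOPT_POST, true);
// Passing a string instead of an array makes curl send the body as
// application/x-www-form-urlencoded
curl_setopt($ch, CURLOPT_POSTFIELDS, $encoded);
```

Either form fills $_POST in a script such as handle_form.php; the string form is closer to what a browser sends for a plain form.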

As you can see, this makes it really easy to submit forms, which is a great way to test all your forms without filling them in by hand every time.
The CURLOPT_CONNECTTIMEOUT parameter sets how long, in seconds, curl waits while trying to connect. This is a very important option. If you set this period too short, the curl request may fail.
However, if you set it too long, the PHP script may stall or die. A related option is CURLOPT_TIMEOUT, which sets the maximum time, in seconds, that the whole curl operation may take. If you set this value too small, downloaded pages may be incomplete, because they take some time to download.
The last option is CURLOPT_USERAGENT, which lets you customize the client name sent with the request, for example a web spider or IE 6.0. The sample code is as follows:

<?php
// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.useragent.org/");
curl_setopt($ch, CURLOPT_USERAGENT, 'My M web spider/100');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// Grab URL, and print
curl_exec($ch);
?>
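
The two timeout options described above can be sketched as follows. The function name fetch_with_timeouts() and the default values of 5 and 30 seconds are assumptions chosen for illustration; suitable values depend on the sites you request.

```php
<?php
// Hypothetical helper (illustrative name and defaults): fetches $url with a
// connection timeout and an overall timeout, and reports whether the
// transfer failed because a time limit was hit.
function fetch_with_timeouts(string $url, int $connectSeconds = 5, int $totalSeconds = 30): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $connectSeconds, // limit for the connection phase
        CURLOPT_TIMEOUT        => $totalSeconds,   // limit for the whole transfer
    ]);

    $body = curl_exec($ch);

    return [
        'body'      => $body === false ? null : $body,
        // CURLE_OPERATION_TIMEDOUT is the curl error number for a timeout
        'timed_out' => curl_errno($ch) === CURLE_OPERATION_TIMEDOUT,
    ];
}
```

Checking 'timed_out' lets a script tell a slow site apart from other failures, for example to retry with a longer limit.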


Now that we have covered the most interesting parameters, let's look at the curl_getinfo() function and what it can do for us.

Obtain the page information:

The curl_getinfo() function returns various information about the received page. You can select a single piece of information with the optional second parameter; without it, the function returns an array containing all values, as shown in the following example:

<?php
// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FILETIME, true);
// Grab URL
$output = curl_exec($ch);
// Print info
echo '<pre>';
print_r(curl_getinfo($ch));
echo '</pre>';
?>


Most of the returned information concerns the request itself, such as the time the request took and the header information returned, along with some page information, such as the size of the page content and its last modification time.
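
When only a few values are needed, they can be requested one at a time through the second parameter. The sketch below uses three standard CURLINFO_* constants; the helper name transfer_summary() and the choice of fields are assumptions for this illustration.

```php
<?php
// Hypothetical helper (illustrative name): collects a few selected values
// from a curl handle instead of the full curl_getinfo() array.
function transfer_summary($ch): array
{
    return [
        'http_code'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),    // response status code
        'total_time'   => curl_getinfo($ch, CURLINFO_TOTAL_TIME),   // seconds the transfer took
        'content_type' => curl_getinfo($ch, CURLINFO_CONTENT_TYPE), // Content-Type header, if any
    ];
}

// On a handle that has not performed a request yet, the status code is 0
$ch = curl_init();
print_r(transfer_summary($ch));
```

In practice the function would be called after curl_exec(), at which point the values describe the completed request.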

That covers the curl_getinfo() function. Now let's look at some practical uses.

Actual use:

The first practical use of the curl library is checking whether the page at a given URL exists. We can inspect the response code returned for the request: a code of 404 means the page does not exist. Let's look at an example:

<?php
// Create a new curl resource
$ch = curl_init();
// Set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.com/does/not/exist");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Grab URL
$output = curl_exec($ch);
// Get response code
$response_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
// Not found?
if ($response_code == '404') {
    echo 'Page doesn\'t exist';
} else {
    echo $output;
}
?>
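
If only the existence of a page matters, the body does not have to be downloaded at all. The following sketch uses the standard CURLOPT_NOBODY option, which makes curl send a HEAD request. The function names url_status() and page_exists(), the 10-second timeout, and the rule that status codes from 200 to 399 count as "exists" are assumptions for this illustration; some servers also answer HEAD requests differently from GET requests.

```php
<?php
// Hypothetical helper (illustrative name): returns the HTTP status code for
// $url, or 0 when no response was received at all.
function url_status(string $url): int
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_NOBODY         => true, // HEAD request: headers only, no body
        CURLOPT_RETURNTRANSFER => true, // do not print anything
        CURLOPT_FOLLOWLOCATION => true, // report the status of the final page
        CURLOPT_TIMEOUT        => 10,   // assumed limit, adjust as needed
    ]);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Assumption for this sketch: any 2xx or 3xx status means the page exists
function page_exists(string $url): bool
{
    $code = url_status($url);
    return $code >= 200 && $code < 400;
}
```

A call such as page_exists("http://www.google.com/does/not/exist") would be expected to return false if the server answers with 404.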


Another use is an automatic checker that verifies whether each requested page exists.
We can also use the curl library to write a web spider similar to Google's, or any other kind of web spider. This article is not about writing a web spider, so we do not go into the details here. However, in the future, phpit will introduce how to use curl to build a web spider.

Conclusion:

In this article, I have demonstrated how to use the curl library in PHP and most of its options.

For the most basic task, when you only want to fetch a web page, you may not need the curl library. However, if you want to do anything more advanced, you will probably want to use it.

In the near future, I will show you how to build your own web spider, similar to Google's web spider, so stay tuned to phpit.
