Using PHP cURL to simulate a login and fetch data: an example

Source: Internet
Author: User
cURL is a powerful library that PHP exposes through its curl extension. With it you can fetch web pages and capture their content easily and efficiently, and by saving and replaying cookies you can simulate logging in to a site. The extension provides a rich set of functions, all documented in the PHP manual. This article uses Open Source China (OSChina) as an example to show how to use cURL.
cURL is generally more efficient than file_get_contents() for fetching web pages, and it can run several transfers concurrently through the curl_multi_* functions. Note that the curl extension must be enabled before you can use it.
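
Before running the code in this article, it can help to confirm that the curl extension is actually loaded. The following is a minimal sketch; the helper name curl_available() is an assumption made for illustration and does not appear in the original article.

```php
<?php
// Hypothetical helper (not from the original article): reports whether
// the curl extension is loaded in the current PHP runtime.
function curl_available(): bool
{
    return extension_loaded('curl') && function_exists('curl_init');
}

if (!curl_available()) {
    // On many setups this means enabling the curl extension in php.ini.
    echo "The curl extension is not enabled.\n";
}
```

If the check fails, the functions used in the rest of the article (curl_init(), curl_setopt(), and so on) will be undefined.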

Code walkthrough

First, look at the code for the login step:

// Simulate login
function login_post($url, $cookie, $post) {
    $curl = curl_init(); // Initialize a cURL handle
    curl_setopt($curl, CURLOPT_URL, $url); // URL the login form is submitted to
    curl_setopt($curl, CURLOPT_HEADER, 0); // Do not include response headers in the output
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 0); // 0: print the response directly instead of returning it
    curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie); // Save received cookies to the given file
    curl_setopt($curl, CURLOPT_POST, 1); // Submit as a POST request
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post)); // Form data to submit
    curl_exec($curl); // Execute the request
    curl_close($curl); // Close the cURL handle and release its resources
}

The function login_post() first calls curl_init() to create a handle, then uses curl_setopt() to set the relevant options: the URL to submit to, the file in which cookies are saved, the POST data (user name, password, and so on), and whether the response is returned or printed. curl_exec() then performs the request, and curl_close() releases the handle. Note that PHP's built-in http_build_query() converts an array into a URL-encoded query string.
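
To make the role of http_build_query() concrete, here is a small sketch of what it produces for a form array. The credentials are placeholders chosen for illustration, not values from the article.

```php
<?php
// Placeholder form data, for illustration only.
$post = array(
    'email'     => 'user@example.com',
    'pwd'       => 'secret',
    'goto_page' => '/my',
);

// Keys and values are URL-encoded and joined with "&".
$body = http_build_query($post);
echo $body, "\n"; // email=user%40example.com&pwd=secret&goto_page=%2Fmy
```

This string is what login_post() passes to CURLOPT_POSTFIELDS as the request body.
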
Next, assuming the login succeeded, we fetch a page that is only available after logging in.

// Get data after a successful login
function get_content($url, $cookie) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return the response as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // Read cookies from the saved file
    $rs = curl_exec($ch); // Execute the request and fetch the page content
    curl_close($ch);
    return $rs;
}

The function get_content() likewise initializes cURL, sets the relevant options, executes the request, and releases the handle. Here CURLOPT_RETURNTRANSFER is set to 1 so that curl_exec() returns the response as a string instead of printing it, and CURLOPT_COOKIEFILE points at the cookie file saved during login so that those cookies are sent with the request. Finally, the function returns the content of the page.
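
get_content() returns whatever curl_exec() gives back, which is false when the transfer fails. The sketch below shows one possible way to detect failures. The function name get_content_checked(), the timeout values, and the convention of returning null on failure are all assumptions made for illustration, not part of the original article.

```php
<?php
// Hypothetical variant of get_content(); the name, the timeouts, and the
// null-on-failure convention are assumptions made for this sketch.
function get_content_checked($url, $cookie, &$error = null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // seconds allowed for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);       // seconds allowed for the whole transfer

    $rs = curl_exec($ch);
    if ($rs === false) {
        // Transport-level failure: DNS, refused connection, timeout, ...
        $error = curl_error($ch);
        curl_close($ch);
        return null;
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($status >= 400) {
        // The server answered, but with an error status.
        $error = "HTTP status $status";
        return null;
    }

    $error = null;
    return $rs;
}
```

A caller could then write `$content = get_content_checked($url2, $cookie, $err);` and inspect `$err` whenever `$content` is null.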

Our ultimate goal is to obtain information that is only available after a successful login. Let's take the mobile version of Open Source China as an example and see how to fetch that information once logged in.

// Data to POST
$post = array(
    'email' => 'OSChina account',
    'pwd' => 'OSChina password',
    'goto_page' => '/my',
    'error_page' => '/login',
    'save_login' => '1',
    'submit' => 'login Now'
);

// Login URL
$url = "http://m.jb51.net/action/user/login";
// Path of the file in which cookies are saved
$cookie = dirname(__FILE__) . '/cookie_jb51.txt';
// URL of the page to fetch after login
$url2 = "http://m.jb51.net/my";
// Simulate login
login_post($url, $cookie, $post);
// Fetch the page that requires login
$content = get_content($url2, $cookie);
// Delete the cookie file
@unlink($cookie);
// Match the wanted part of the page: capture the text that precedes a closing </td>
$preg = "/(.*)<\/td>/i";
preg_match_all($preg, $content, $arr);
$str = $arr[1][0];
// Output the result
echo $str;

Running the code above outputs the markup captured from the page, which here is the avatar image of the logged-in user.
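
To show how preg_match_all() fills $arr and why the result is read from $arr[1][0], here is a self-contained sketch that runs against a sample string instead of a live page. The HTML fragment, the class name 'avatar', and the exact pattern are hypothetical; the real markup of the target page may differ.

```php
<?php
// Hypothetical page fragment; the real markup of the target page may differ.
$content = "<table><tr><td class='avatar'><img src='/img/user.png'></td></tr></table>";

// Capture everything between the opening and closing tags of the cell.
// The class name 'avatar' is an assumption made for this sketch.
$preg = "/<td class='avatar'>(.*?)<\/td>/i";
$count = preg_match_all($preg, $content, $arr);

// $arr[0] holds the full matches, $arr[1] the first capture group.
// Guard against an empty result before indexing into it.
$str = ($count > 0) ? $arr[1][0] : '';
echo $str, "\n"; // <img src='/img/user.png'>
```

Checking the match count first avoids an undefined-offset notice when the page does not contain the expected markup, for example because the login failed.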

Usage summary:
1. Initialize cURL with curl_init().
2. Use curl_setopt() to set the target URL and other options.
3. Call curl_exec() to perform the request.
4. Close the handle with curl_close() when the request is done.
5. Output the data.
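
The five steps above can be sketched as a single function. This version uses curl_setopt_array() to set all options in one call, which is a stylistic alternative to the repeated curl_setopt() calls used in the article. The helper name fetch_page() and the timeout value are assumptions made for illustration.

```php
<?php
// Hypothetical helper name; the option values are illustrative.
function fetch_page($url)
{
    // 1. Initialize cURL.
    $ch = curl_init();

    // 2. Set the target URL and the other options in one call.
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => false,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ));

    // 3. Execute the request.
    $body = curl_exec($ch);

    // 4. Close the handle (this call has no effect since PHP 8.0).
    curl_close($ch);

    // 5. Hand the data back for output; false signals a failed transfer.
    return $body;
}

// Usage (performs a real network request, so it is left commented out):
// echo fetch_page('http://example.com/');
```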
