Using cURL in PHP to simulate a login and fetch data: examples

Source: Internet
Author: User
Tags curl preg

PHP's cURL extension is efficient at fetching web pages and can run several transfers concurrently through the curl_multi_* functions, whereas file_get_contents() is less efficient for this kind of work. The cURL extension must be enabled before any of the curl_* functions can be used.
Code practice

First, let's look at the logon code:

The code is as follows:

// Simulate login
function login_post($url, $cookie, $post) {
    $curl = curl_init(); // initialize a cURL handle
    curl_setopt($curl, CURLOPT_URL, $url); // URL the login form is submitted to
    curl_setopt($curl, CURLOPT_HEADER, 0); // do not include response headers in the output
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 0); // 0: print the response directly instead of returning it
    curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie); // file in which the received cookies are saved
    curl_setopt($curl, CURLOPT_POST, 1); // submit with the POST method
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post)); // form data to submit
    curl_exec($curl); // execute the request
    curl_close($curl); // close the cURL handle and release system resources
}

The login_post() function first creates a handle with curl_init(), then uses curl_setopt() to set the relevant options: the URL to submit to, the file in which cookies are saved, the POST data (username, password, and other fields), and whether the response is returned. curl_exec() then performs the request, and curl_close() finally releases the handle. Note that PHP's http_build_query() converts an array into a URL-encoded query string.
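To make the http_build_query() step concrete, here is a minimal sketch of what it produces for a small form array. The field values are made up for illustration and are not taken from any real account.

```php
<?php
// Hypothetical form fields; the values are placeholders.
$post = array(
    'email'      => 'user@example.com',
    'pwd'        => 'p@ss word',
    'save_login' => '1',
);

// Builds "key=value" pairs joined by "&", URL-encoding each key and value.
$body = http_build_query($post);
echo $body, "\n";
// prints: email=user%40example.com&pwd=p%40ss+word&save_login=1

// parse_str() reverses the encoding, which is a quick way to check the result.
parse_str($body, $decoded);
```

Passing the encoded string to CURLOPT_POSTFIELDS sends the form as application/x-www-form-urlencoded, which is what an ordinary HTML login form submits.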

Once the login succeeds, the next step is to fetch a page that is only available after logging in.

The code is as follows:

// Fetch data after a successful login
function get_content($url, $cookie) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // read the cookies saved at login
    $rs = curl_exec($ch); // execute the request and capture the page content
    curl_close($ch);
    return $rs;
}

Like login_post(), get_content() initializes a cURL handle, sets the relevant options, executes the request, and releases the handle. Here CURLOPT_RETURNTRANSFER is set to 1 so that curl_exec() returns the response as a string instead of printing it, and CURLOPT_COOKIEFILE makes cURL send the cookies that were saved during login. The function finally returns the page content.
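One thing get_content() does not show is what happens when the request fails: curl_exec() returns false in that case. The following is a rough sketch of a variant that reports the failure. The name get_content_checked, the returned array layout, and the 5-second connect timeout are assumptions made for this illustration, not part of the original code.

```php
<?php
// Hypothetical variant of get_content() that also reports failures.
function get_content_checked($url, $cookie) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed value, tune as needed

    $rs = curl_exec($ch); // string on success, false on failure
    $error = ($rs === false) ? curl_error($ch) : null;
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if no response was received
    curl_close($ch);

    return array('body' => $rs, 'status' => $status, 'error' => $error);
}

// Usage: a closed local port is used here only to trigger the failure path.
$result = get_content_checked('http://127.0.0.1:1/', sys_get_temp_dir() . '/no_cookie.txt');
if ($result['body'] === false) {
    echo "Request failed: ", $result['error'], "\n";
}
```

Checking the HTTP status code as well is useful because a login page that rejects the session usually still returns a body, so a non-false result alone does not prove the login worked.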

The ultimate goal is to obtain information that is only available after a successful login. As an example, we will log in to the mobile version of OSChina (Open Source China) and see how to capture information from a page behind the login.

The code is as follows:
// Set the POST data
$post = array(
    'email' => 'oschina account',
    'pwd' => 'oschina password',
    'goto_page' => '/my',
    'error_page' => '/login',
    'save_login' => '1',
    'submit' => 'Login now'
);

// Login URL
$url = "http://www.111cn.net";
// Path of the file in which cookies are stored
$cookie = dirname(__FILE__) . '/cookie_oschina.txt';
// URL of the page to fetch after login
$url2 = "http://m.oschina.net/my";
// Simulate the login
login_post($url, $cookie, $post);
// Fetch the page that is available after login
$content = get_content($url2, $cookie);
// Delete the cookie file
@unlink($cookie);
// Match the wanted part of the page
$preg = "/<td class='portrait'>(.*)<\/td>/i";
preg_match_all($preg, $content, $arr);
$str = $arr[1][0];
// Output the content
echo $str;

After running the code above, the script outputs the contents of the portrait cell, so the logged-in user's profile picture is displayed.
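The matching step can be tried on its own without any network access. The sketch below runs the same kind of pattern against a hypothetical HTML fragment that stands in for the page returned after login. It uses a non-greedy (.*?) rather than the (.*) shown above, and it checks that a match exists before reading $arr[1][0]; both are choices made for this illustration.

```php
<?php
// Hypothetical HTML fragment standing in for the page fetched after login.
$content = "<table><tr><td class='portrait'><img src='/img/user.png' alt='avatar'></td>"
         . "<td class='name'>demo</td></tr></table>";

// Non-greedy (.*?) stops at the first closing </td>; a greedy (.*) would
// run on to the last </td> found on the same line.
$preg = "/<td class='portrait'>(.*?)<\/td>/i";

$str = null;
if (preg_match_all($preg, $content, $arr) > 0) {
    $str = $arr[1][0]; // first capture group of the first match
    echo $str, "\n";   // prints: <img src='/img/user.png' alt='avatar'>
} else {
    echo "portrait cell not found\n";
}
```

Guarding the array access matters in practice: if the login failed, the page will not contain the portrait cell and $arr[1] will be empty.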

Usage summary

1. Initialize a cURL handle with curl_init;

2. Use curl_setopt to set the target URL and other options;

3. Execute the request with curl_exec;

4. Close the handle with curl_close after execution;

5. Output the data.
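The five steps above can be sketched as a single compact function. This version uses curl_setopt_array() to set all options in one call, which is an alternative to repeated curl_setopt() calls. The function name fetch_page and the timeout values are assumptions chosen for this example.

```php
<?php
// Hypothetical helper that walks through the five steps for a GET request.
function fetch_page($url, $cookieFile) {
    $ch = curl_init();                      // 1. initialize
    curl_setopt_array($ch, array(           // 2. set the URL and other options
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_CONNECTTIMEOUT => 5,        // assumed values
        CURLOPT_TIMEOUT        => 15,
    ));
    $body = curl_exec($ch);                 // 3. execute
    curl_close($ch);                        // 4. close
    return $body;                           // 5. hand the data back (false on failure)
}

// Usage: the closed local port only demonstrates the failure value.
$page = fetch_page('http://127.0.0.1:1/', sys_get_temp_dir() . '/no_cookie.txt');
var_dump($page === false);
```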
