cURL is a powerful library, and PHP's cURL extension makes it simple and efficient to crawl web pages and capture content, or to set cookies and simulate logging in to a site. The extension provides a rich set of functions; developers can find the details in the PHP manual. This article uses a simulated login to Open Source China (OSChina) as an example of how to use cURL.
PHP's cURL functions are relatively efficient at fetching pages and support concurrent transfers through the curl_multi_* functions (often loosely described as multithreading), while file_get_contents() tends to be slightly slower. Note that using cURL requires the curl extension to be enabled.
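As a rough illustration of the concurrency mentioned above, the following sketch fetches several URLs in parallel with the curl_multi_* functions. The helper name fetch_all() and the 10-second timeout are assumptions made for this example, not part of the original article.

```php
<?php
// Hypothetical helper: fetches several URLs concurrently and returns
// the response bodies, keyed like the input array.
function fetch_all(array $urls): array
{
    if ($urls === []) {
        return []; // nothing to do
    }
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body as string
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed limit, in seconds
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }
    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh); // wait for activity instead of busy-looping
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $results;
}
```

Calling fetch_all(array('a' => $urlA, 'b' => $urlB)) would return an array with the same keys, each holding the body of the corresponding response.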
Code in Practice
First, let's look at the code for the login part:
// Simulated login
function login_post($url, $cookie, $post) {
    $curl = curl_init(); // Initialize a cURL session
    curl_setopt($curl, CURLOPT_URL, $url); // URL the login form is submitted to
    curl_setopt($curl, CURLOPT_HEADER, 0); // Do not include response headers in the output
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 0); // 0: print the response directly instead of returning it
    curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie); // Save received cookies to the specified file
    curl_setopt($curl, CURLOPT_POST, 1); // Submit with the POST method
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post)); // Form data to submit
    curl_exec($curl); // Execute the request
    curl_close($curl); // Close the cURL session and free system resources
}
The login_post() function first initializes a session with curl_init(), then uses curl_setopt() to set the relevant options: the URL to submit to, the file in which cookies are saved, the POST data (username, password, and so on), and whether the response is returned. curl_exec() then executes the request, and finally curl_close() releases the resources. Note that PHP's http_build_query() converts an array into a URL-encoded query string.
Next, once the login succeeds, we want to fetch a page that is only available after logging in.
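To make the role of http_build_query() concrete, here is a minimal sketch of the encoding step on its own. The credentials are placeholders invented for this example.

```php
<?php
// Placeholder credentials, not taken from the article.
$post = [
    'email' => 'user@example.com',
    'pwd'   => 'p@ss word',
];
// Pass the separator explicitly so the result does not depend on php.ini.
$body = http_build_query($post, '', '&');
echo $body, "\n"; // prints: email=user%40example.com&pwd=p%40ss+word
```

Reserved characters such as @ are percent-encoded and spaces become +, which is the form an application/x-www-form-urlencoded POST body expects.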
// Get data after a successful login
function get_content($url, $cookie) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return the response as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // Read the cookies saved at login
    $rs = curl_exec($ch); // Execute the request and fetch the page content
    curl_close($ch);
    return $rs;
}
The get_content() function likewise initializes cURL, sets the relevant options, executes the request, and frees the resources. Here CURLOPT_RETURNTRANSFER is set to 1 so that curl_exec() returns the response as a string instead of printing it, while CURLOPT_COOKIEFILE reads the cookie information saved at login. The function finally returns the page content.
Our ultimate goal is to get information that is only available after a successful login. Next, we use the mobile version of Open Source China (OSChina) as an example to show how to capture that information.
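Note that curl_exec() returns false when the transfer fails, so the caller may want to check the result. The following is a sketch of one possible variant that reports failures. The function name get_content_checked(), the 10-second timeout, and the choice to treat HTTP status codes of 400 and above as errors are all assumptions made for illustration.

```php
<?php
// Hypothetical variant of get_content() that throws on failure
// instead of silently returning false.
function get_content_checked(string $url, string $cookie): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => false,
        CURLOPT_RETURNTRANSFER => true,    // return the body as a string
        CURLOPT_COOKIEFILE     => $cookie, // send the cookies saved at login
        CURLOPT_TIMEOUT        => 10,      // assumed limit, in seconds
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure (DNS, connection, unsupported protocol, ...)
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Unexpected HTTP status: $status");
    }
    // On PHP 8+ the handle is freed when $ch goes out of scope;
    // on older versions, call curl_close($ch) before returning.
    return $body;
}
```

Wrapping the call in try/catch lets the script stop before attempting to parse a page that was never fetched.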
// Set the POST data
$post = array(
    'email' => 'oschina account',
    'pwd' => 'oschina password',
    'goto_page' => '/my',
    'error_page' => '/login',
    'save_login' => '1',
    'submit' => 'now login'
);
// Login URL
$url = "http://m.jb51.net/action/user/login";
// Path of the cookie file
$cookie = dirname(__FILE__) . '/cookie_jb51.txt';
// URL of the page to fetch after login
$url2 = "http://m.jb51.net/my";
// Simulate the login
login_post($url, $cookie, $post);
// Fetch the page that requires login
$content = get_content($url2, $cookie);
// Delete the cookie file
@unlink($cookie);
// Match the page information
$preg = "/<td class='portrait'>(.*)<\/td>/i";
preg_match_all($preg, $content, $arr);
$str = isset($arr[1][0]) ? $arr[1][0] : ''; // Empty string if nothing matched
// Output
echo $str;
After running the code above, the output is the avatar image of the logged-in user.
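The matching step can be tried without logging in by running it against a sample string. The sketch below uses a hypothetical helper, extract_portrait(), and markup invented for illustration; the real page may differ. It uses a lazy (.*?) so the match stops at the first closing </td>, whereas a greedy (.*) would run on to the last </td> on the same line.

```php
<?php
// Hypothetical helper: returns the inner HTML of the first
// <td class='portrait'> cell, or null when the cell is absent.
function extract_portrait(string $html): ?string
{
    // Accept single or double quotes around the class name; the s flag
    // lets the lazy group span line breaks.
    $pattern = '/<td\s+class=([\'"])portrait\1\s*>(.*?)<\/td>/is';
    if (preg_match($pattern, $html, $m) === 1) {
        return trim($m[2]);
    }
    return null;
}

// Sample markup invented for illustration.
$sample = "<tr><td class='portrait'><img src='/avatar.png'></td><td>name</td></tr>";
echo extract_portrait($sample), "\n"; // prints: <img src='/avatar.png'>
```

Returning null for a missing cell makes it easy to tell a failed login (no portrait on the page) from an empty one.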
Usage Summary:
1. Initialize cURL with curl_init();
2. Use curl_setopt() to set the target URL and other options;
3. Execute the request with curl_exec();
4. Close the session with curl_close() after execution;
5. Output the data.