With PHP's cURL library you can fetch web pages easily and efficiently. You run a script, parse the page it retrieved, and extract the data you need programmatically. Whether you want to pull part of the data from a link, import an XML file into a database, or simply get a page's content, cURL is a powerful PHP library for the job.
Commonly used functions of PHP's cURL library are as follows:
- curl_close: close a cURL session
- curl_copy_handle: copy a cURL handle along with all of its options
- curl_errno: return the number of the last error in the current session
- curl_error: return a string describing the last error in the current session
- curl_exec: execute a cURL session
- curl_getinfo: get information about a specific transfer on a cURL handle
- curl_init: initialize a cURL session
- curl_multi_add_handle: add an individual cURL handle to a cURL multi (batch) handle
- curl_multi_close: close a cURL multi handle
- curl_multi_exec: run the sub-connections of a cURL multi handle
- curl_multi_getcontent: return the content of a cURL handle if CURLOPT_RETURNTRANSFER is set
- curl_multi_info_read: get information about the current transfers
- curl_multi_init: initialize a cURL multi handle
- curl_multi_remove_handle: remove a handle from a cURL multi handle
- curl_multi_select: wait for activity on any of the connections of a cURL multi handle
- curl_setopt_array: set multiple options for a cURL session from an array
- curl_setopt: set an option for a cURL session
- curl_version: get cURL version information
- curl_init() initializes a cURL session. Its only parameter is optional and specifies a URL.
- curl_exec() executes a cURL session. Its only parameter is the handle returned by curl_init().
- curl_close() closes a cURL session. Its only parameter is the handle returned by curl_init().
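The curl_multi_* functions in the list above work together to run several transfers at once. The following is a minimal sketch of that pattern, not a definitive implementation: the function name fetch_all() and the 10-second timeout are assumptions made for illustration, and a failed transfer is simply reported as false.

```php
<?php
// Hypothetical helper: fetch several URLs concurrently with the curl_multi_* API.
// Returns an array keyed like $urls; each value is the body, or false on failure.
function fetch_all(array $urls, int $timeout = 10): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,   // needed for curl_multi_getcontent()
            CURLOPT_TIMEOUT        => $timeout,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none of them is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            // Wait for activity on any connection, for at most one second.
            curl_multi_select($multi, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the per-transfer result codes (CURLE_OK means success).
    $failed = [];
    while (($info = curl_multi_info_read($multi)) !== false) {
        foreach ($handles as $key => $ch) {
            if ($info['handle'] === $ch && $info['result'] !== CURLE_OK) {
                $failed[$key] = true;
            }
        }
    }

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = isset($failed[$key]) ? false : curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

A call such as fetch_all(['home' => 'http://www.cmx8.cn']) would return an array with the same keys as its input.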
Basic example
<?php
// Initialize a cURL handle
$curl = curl_init();
// Set the URL to fetch
curl_setopt($curl, CURLOPT_URL, 'http://www.cmx8.cn');
// Include the response headers in the output
curl_setopt($curl, CURLOPT_HEADER, 1);
// Return the result as a string instead of printing it to the screen
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Run cURL and request the page
$data = curl_exec($curl);
// Close the cURL session
curl_close($curl);
// Display the data obtained
var_dump($data);
?>
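Because the basic example enables CURLOPT_HEADER, the string returned by curl_exec() contains the response headers followed by the body. The sketch below shows one way to check for errors with curl_errno() and curl_error() and to separate the two parts using curl_getinfo(). The names split_response() and fetch_page() are hypothetical, and the header parsing assumes a single header block (no redirects).

```php
<?php
// Hypothetical helper: split a raw response fetched with CURLOPT_HEADER enabled.
// $headerSize is the value reported by curl_getinfo($ch, CURLINFO_HEADER_SIZE).
function split_response(string $raw, int $headerSize): array
{
    $headerBlock = substr($raw, 0, $headerSize);
    $body = (string) substr($raw, $headerSize);

    $headers = [];
    foreach (preg_split('/\r\n|\n/', trim($headerBlock)) as $line) {
        // The status line has no colon and is skipped here.
        if (strpos($line, ':') !== false) {
            [$name, $value] = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
    }

    return ['headers' => $headers, 'body' => $body];
}

// Hypothetical wrapper around the basic example, with error handling added.
function fetch_page(string $url): array
{
    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10, // assumed value
    ]);

    $raw = curl_exec($curl);
    if ($raw === false) {
        $message = sprintf('cURL error %d: %s', curl_errno($curl), curl_error($curl));
        curl_close($curl);
        throw new RuntimeException($message);
    }

    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    $headerSize = curl_getinfo($curl, CURLINFO_HEADER_SIZE);
    curl_close($curl);

    return ['status' => $status] + split_response($raw, $headerSize);
}
```

Calling fetch_page('http://www.cmx8.cn') would then give an array with the keys status, headers, and body.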
POST Data
The target script accepts two form fields: the phone number and the text message content.
<?php
$phoneNumber = '13812345678';
$message = 'This message was generated by curl and php';
// Build the URL-encoded request body
$curlPost = 'pNUMBER=' . urlencode($phoneNumber) . '&MESSAGE=' . urlencode($message) . '&SUBMIT=Send';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.lxvoip.com/sendSMS.php');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Send the request as an HTTP POST with the body built above
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
// curl_exec() requires the handle as its argument
$data = curl_exec($ch);
curl_close($ch);
?>
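The POST body above is assembled by hand with urlencode(). As an alternative sketch, PHP's built-in http_build_query() can produce the same string from an array. The helper names build_post_body() and post_form() are hypothetical. Note that passing a string to CURLOPT_POSTFIELDS sends application/x-www-form-urlencoded data, whereas passing an array sends multipart/form-data.

```php
<?php
// Hypothetical helper: URL-encode form fields into a POST body string.
function build_post_body(array $fields): string
{
    // The explicit '&' separator avoids depending on the arg_separator.output ini setting.
    return http_build_query($fields, '', '&');
}

// Hypothetical helper: POST the fields to $url and return the response body.
function post_form(string $url, array $fields): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => build_post_body($fields),
    ]);

    $data = curl_exec($ch);
    if ($data === false) {
        $message = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('POST failed: ' . $message);
    }
    curl_close($ch);

    return $data;
}

$body = build_post_body([
    'pNUMBER' => '13812345678',
    'MESSAGE' => 'This message was generated by curl and php',
    'SUBMIT'  => 'Send',
]);
// $body is now:
// pNUMBER=13812345678&MESSAGE=This+message+was+generated+by+curl+and+php&SUBMIT=Send
```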
Use Proxy Server
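<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.cmx8.cn');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Tunnel the request through an HTTP proxy
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
// Proxy address in host:port form
curl_setopt($ch, CURLOPT_PROXY, 'proxy.lxvoip.com:1080');
// Proxy credentials in user:password form
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'user:password');
// curl_exec() requires the handle as its argument
$data = curl_exec($ch);
curl_close($ch);
?>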
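The proxy settings can also be collected into one array and applied with curl_setopt_array(). The sketch below assumes a hypothetical helper named proxy_options(). Its $socks5 switch is an addition not present in the example above. It is included only because port 1080 is commonly used by SOCKS proxies, and it selects the proxy type through CURLOPT_PROXYTYPE.

```php
<?php
// Hypothetical helper: build proxy-related options for curl_setopt_array().
function proxy_options(
    string $proxy,
    ?string $user = null,
    ?string $password = null,
    bool $socks5 = false
): array {
    $options = [
        CURLOPT_PROXY     => $proxy, // host:port
        CURLOPT_PROXYTYPE => $socks5 ? CURLPROXY_SOCKS5 : CURLPROXY_HTTP,
    ];

    if (!$socks5) {
        // Tunnel through the HTTP proxy, as in the example above.
        $options[CURLOPT_HTTPPROXYTUNNEL] = true;
    }

    if ($user !== null) {
        $options[CURLOPT_PROXYUSERPWD] = $user . ':' . (string) $password;
    }

    return $options;
}

// Combine the proxy options with the ordinary request options.
$options = [
    CURLOPT_URL            => 'http://www.cmx8.cn',
    CURLOPT_RETURNTRANSFER => true,
] + proxy_options('proxy.lxvoip.com:1080', 'user', 'password');

$ch = curl_init();
curl_setopt_array($ch, $options);
// $data = curl_exec($ch); // left commented out so the sketch performs no network request
curl_close($ch);
```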
Simulated Logon
Simulate logging in to a Discuz forum.
<?php
/**
 * Simulated login to a Discuz forum with cURL.
 * Logging in to forums that have verification codes enabled is not yet implemented.
 */
!extension_loaded('curl') && die('The curl extension is not loaded.');

$discuz_url = 'http://www.lxvoip.com';                     // forum address
$login_url  = $discuz_url . '/logging.php?action=login';   // login page address
$get_url    = $discuz_url . '/my.php?item=threads';        // "my posts" page

$post_fields = array();
// The following two items do not need to be modified
$post_fields['loginfield']  = 'username';
$post_fields['loginsubmit'] = 'true';
// User name and password, which must be set
$post_fields['username'] = 'lxvoip';
$post_fields['password'] = '000000';
// Security question
$post_fields['questionid'] = 0;
$post_fields['answer']     = '';
// @todo verification code
$post_fields['seccodeverify'] = '';

// Fetch the login page to obtain the form's FORMHASH
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$contents = curl_exec($ch);
curl_close($ch);
preg_match('/<input\s*type="hidden"\s*name="formhash"\s*value="(.*?)"\s*\/>/i', $contents, $matches);
if (!empty($matches)) {
    $formhash = $matches[1];
    // Assumption: the hash is submitted as the "formhash" field, matching the input name above
    $post_fields['formhash'] = $formhash;
} else {
    die('Could not find the formhash.');
}

// POST the data and store the cookies returned by the server
$cookie_file = dirname(__FILE__) . '/cookie.txt';
// $cookie_file = tempnam('/tmp', 'cookie');
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_exec($ch);
curl_close($ch);

// Use the cookies obtained above to fetch a page that requires login
$ch = curl_init($get_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
// Return the page as a string so that var_dump() below shows its content
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
$contents = curl_exec($ch);
curl_close($ch);
var_dump($contents);
?>
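The formhash step in the login script depends on the exact attribute order and spacing of the hidden input. The following sketch isolates that step in a hypothetical function, extract_formhash(), with a slightly more tolerant pattern. It still assumes that the name attribute appears before the value attribute and that both use double quotes, so treat it as an illustration rather than a robust HTML parser.

```php
<?php
// Hypothetical helper: pull the value of the hidden "formhash" input out of a page.
// Returns null when no such input is found.
function extract_formhash(string $html): ?string
{
    // Allows any other attributes before, between, and after name and value.
    $pattern = '/<input\b[^>]*\bname="formhash"[^>]*\bvalue="([^"]*)"/i';

    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1];
    }

    return null;
}

// Sample markup made up for illustration; a real login page will differ.
$sample = '<form><input type="hidden" name="formhash" value="a1b2c3d4" /></form>';
$hash = extract_formhash($sample);   // 'a1b2c3d4'
$none = extract_formhash('<form></form>');   // null
```

Returning null instead of calling die() lets the caller decide how to handle a login page whose layout has changed.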