This article is provided by script 100. I hope this article will help you learn php and continue to pay attention to script 100!
Original source (reprinted from): http://www.jb100.net/html/content-22-821-1.html
Five common examples of PHP curl

I mainly use curl in PHP to fetch data. Other methods work too, such as fsockopen and file_get_contents, but they can only fetch pages that are directly accessible. Pages protected by access control, or pages that are only available after logging in, are much harder to fetch that way.

1. Fetch a page without access control
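The simplest case can be sketched as follows. This is a minimal illustration, not the original author's listing: the function names `basic_fetch_options` and `fetch_page`, the 10-second timeout, and the example URL are all assumptions made for the example.

```php
<?php
// Hypothetical helper: build the curl options for a plain GET request.
function basic_fetch_options(string $url, int $timeout = 10): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_HEADER         => false,  // do not include response headers in the body
        CURLOPT_TIMEOUT        => $timeout,
    ];
}

// Hypothetical helper: fetch a page and return its body, or null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, basic_fetch_options($url));
    $body = curl_exec($ch);
    if ($body === false) {
        error_log('curl error: ' . curl_error($ch));
        return null;
    }
    return $body; // the handle is released when $ch goes out of scope
}

// Usage (the URL is a placeholder):
// $html = fetch_page('http://localhost/mytest/phpinfo.php');
```

Setting CURLOPT_RETURNTRANSFER is a design choice here: without it, curl_exec prints the response directly and returns true on success, which is harder to post-process.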
2. Fetch through a proxy

Why use a proxy for crawling? Take Google as an example. If you fetch Google's data too frequently within a short period, it will block your IP address and you will no longer be able to fetch anything. When that happens, you can switch to a proxy and fetch again.
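A rough sketch of that idea, assuming a list of proxies you are allowed to use: try each proxy in turn until one request succeeds. The helper names, the timeout, and the proxy addresses in the usage comment are hypothetical; only the curl options themselves (CURLOPT_PROXY, CURLOPT_PROXYUSERPWD) are real.

```php
<?php
// Hypothetical helper: curl options for one request through one proxy.
function proxy_options(string $url, string $proxy, ?string $proxyAuth = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_PROXY          => $proxy,   // "host:port"
        CURLOPT_TIMEOUT        => 10,
    ];
    if ($proxyAuth !== null) {
        $options[CURLOPT_PROXYUSERPWD] = $proxyAuth; // "user:password"
    }
    return $options;
}

// Hypothetical helper: try each proxy until one returns a response.
function fetch_via_proxies(string $url, array $proxies): ?string
{
    foreach ($proxies as $proxy) {
        $ch = curl_init();
        curl_setopt_array($ch, proxy_options($url, $proxy));
        $body = curl_exec($ch);
        if ($body !== false) {
            return $body;
        }
        error_log("proxy $proxy failed: " . curl_error($ch));
    }
    return null; // every proxy failed, or the list was empty
}

// Usage (placeholder addresses from a documentation-only range):
// $html = fetch_via_proxies('http://www.google.com/', ['192.0.2.10:8080', '192.0.2.11:8080']);
```

Note that a transport-level success does not mean the site served real content; a blocked IP may still receive an HTTP error page, so checking the status code with curl_getinfo may also be worthwhile.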
3. Fetch data after a POST

Submitting data deserves a section of its own: curl is often used for data interaction, so this part matters.

    <?php
    $ch = curl_init();
    /* Note: the data to submit cannot be a two-dimensional (or deeper) array.
     * This works: array('name' => serialize(array('tank', 'zhang')), 'sex' => 1, 'birth' => '20101010')
     * This fails: array('name' => array('tank', 'zhang'), 'sex' => 1, 'birth' => '20101010')
     */
    $data = array('name' => 'test', 'sex' => 1, 'birth' => '20101010');
    curl_setopt($ch, CURLOPT_URL, 'http://localhost/mytest/curl/upload.php');
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    curl_exec($ch);
    ?>

In upload.php, call print_r($_POST);. Fetching upload.php with curl then prints:

    Array ( [name] => test [sex] => 1 [birth] => 20101010 )

4. Fetch pages with access control

I previously wrote an article on three methods of page access control; take a look if you are interested. If you use the method described above on such a page, the following error is reported:

    You are not authorized to view this page
    You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.

In this case, use CURLOPT_USERPWD to authenticate.
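A minimal sketch of authenticating with CURLOPT_USERPWD might look like this. The helper names, the URL, and the credentials are placeholders, and using CURLAUTH_ANY (letting curl pick whichever scheme the server offers) is an assumption, since the source does not say which authentication scheme the protected page uses.

```php
<?php
// Hypothetical helper: curl options for a page behind HTTP authentication.
function http_auth_options(string $url, string $user, string $password): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPAUTH       => CURLAUTH_ANY,            // assumption: let curl negotiate the scheme
        CURLOPT_USERPWD        => $user . ':' . $password, // "username:password"
    ];
}

// Hypothetical helper: return the body only when the server answers 200.
function fetch_protected_page(string $url, string $user, string $password): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, http_auth_options($url, $user, $password));
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($body === false || $status !== 200) {
        error_log("request failed, HTTP status $status: " . curl_error($ch));
        return null;
    }
    return $body;
}

// Usage (placeholder URL and credentials):
// $html = fetch_protected_page('http://localhost/protected/', 'username', 'password');
```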
5. Simulate logging in to Sina

The data we need to fetch may only be available after logging in. In that case, use curl to simulate the login.
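The general pattern can be sketched as below: POST the login form while saving cookies to a file, then request the protected page with those cookies. This is an illustration only. The helper names, the login URL, and the form field names are hypothetical; a real site's login form will use its own URL and fields and may require extra tokens.

```php
<?php
// Hypothetical helper: options for submitting the login form and saving cookies.
function login_options(string $loginUrl, array $fields, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // URL-encoded form body
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,          // follow redirects after login
        CURLOPT_COOKIEFILE     => $cookieFile,   // enable the cookie engine
        CURLOPT_COOKIEJAR      => $cookieFile,   // written when the handle is released
    ];
}

// Hypothetical helper: log in, then fetch a page in the same session.
function fetch_after_login(string $loginUrl, array $fields, string $pageUrl): ?string
{
    $cookieFile = tempnam(sys_get_temp_dir(), 'cookie');
    $ch = curl_init();
    curl_setopt_array($ch, login_options($loginUrl, $fields, $cookieFile));
    if (curl_exec($ch) === false) {
        error_log('login failed: ' . curl_error($ch));
        return null;
    }
    // Reuse the same handle: the session cookies are still held in memory.
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage (placeholder URLs and field names):
// $html = fetch_after_login('http://localhost/login.php',
//     ['username' => 'tank', 'password' => 'secret'], 'http://localhost/inbox.php');
```

Reusing one handle for both requests is a deliberate choice in this sketch: the cookies stay in memory between requests, and the cookie file is only written once the handle is released.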
Open the cookie file under /tmp and take a look:

    # Netscape HTTP Cookie File
    # http://curl.haxx.se/rfc/cookie_spec.html
    # This file was generated by libcurl! Edit at your own risk.

    mail.sina.com.cn FALSE / FALSE 0 SINAMAIL-WEBFACE-SESSID failed
    #HttpOnly_.sina.com.cn TRUE / FALSE 0 SUE es%failed%26ev%3Dv0%26es2%failed
    TRUE / FALSE 0 SUP cv%3D1%26bt%3D1286900433%26et%3D1286986833%26lt%3D1%26uid%loud%26user%3D%25E5%25BC%25A0%25E6%2598%25A02001%26ag%3D2%26name%3Dzhangying20015%2540sina.com%26nick%3D%25E5%25BC%25A0%25E6%2598%25A02001%26sex%3D1%26ps%3D0%26email%%2540sina.com%26dob%3D1982-07-18
    #HttpOnly_.sina.com.cn TRUE / FALSE 0 SID BihcallomxMx-QZxzGrOlcSQx%2F0B%2F0cmr.nyQ%2F0B%failed%40fr5ciZiGG5i
    #HttpOnly_.sina.com.cn TRUE / FALSE 0 SPRIAL restart
    #HttpOnly_.sina.com.cn TRUE / FALSE 0 SINA_USER %D5%C5%D2001