Five common examples of using curl in PHP

1. Fetching a file with no access control

    <?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://localhost/mytest/phpinfo.php");
    curl_setopt($ch, CURLOPT_HEADER, false);
    // Return the response as a string; if this line is commented out, curl_exec() prints it directly
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);
    ?>
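The example above does not check whether the request succeeded. As a minimal sketch (the helper name fetch_page and the 5-second timeouts are assumptions for illustration, not part of the original example), the return value of curl_exec() can be compared against false and the HTTP status read with curl_getinfo():

```php
<?php
// Hypothetical helper: fetch a URL and report the body, HTTP status, and any error.
function fetch_page($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed value
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);        // assumed value

    $body   = curl_exec($ch);                        // false on failure
    $error  = curl_error($ch);                       // empty string when nothing went wrong
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if no response was received
    curl_close($ch);

    return array(
        'body'   => ($body === false) ? null : $body,
        'status' => $status,
        'error'  => $error,
    );
}

$page = fetch_page("http://localhost/mytest/phpinfo.php");
if ($page['body'] === null) {
    echo "Request failed: " . $page['error'] . "\n";
} else {
    echo "HTTP " . $page['status'] . ", " . strlen($page['body']) . " bytes\n";
}
```

Returning the error text alongside the body keeps the caller from confusing an empty page with a failed request.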

2. Fetching through a proxy

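Why fetch through a proxy? Take Google as an example: if you scrape Google's data too frequently within a short period, your requests stop succeeding, because Google restricts your IP address. When that happens, you can switch to a proxy and fetch again.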

    <?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.php.cn");
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
    // The proxy address must be passed as a string
    curl_setopt($ch, CURLOPT_PROXY, '125.21.23.6:8080');
    // If the proxy requires a password, add this line:
    //curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'user:password');
    $result = curl_exec($ch);
    curl_close($ch);
    ?>
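Switching proxies when one stops working can be automated. The following is a rough sketch only: the helper name fetch_via_proxies, the second proxy address, and the timeout values are assumptions made for illustration. It tries each proxy in turn and returns the first response that curl_exec() delivers:

```php
<?php
// Hypothetical proxy list; replace with proxies you are allowed to use.
$proxies = array('125.21.23.6:8080', '10.0.0.2:3128');

// Try each proxy in order; return the first response body, or false if all fail.
function fetch_via_proxies($url, array $proxies)
{
    foreach ($proxies as $proxy) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, false);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed value
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);       // assumed value
        $result = curl_exec($ch);
        curl_close($ch);

        if ($result !== false) {
            return $result; // this proxy delivered a response
        }
    }
    return false; // every proxy failed (or the list was empty)
}
```

A call such as fetch_via_proxies("http://www.php.cn", $proxies) would then replace the single-proxy request. Note that this sketch only detects transport failures; a site that blocks an IP address may still return an error page, so checking the HTTP status code would probably be needed as well.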

3. Posting data, then fetching the response

Submitting data deserves a separate mention: curl is very often used to exchange data with a server, so this case is important.

    <?php
    $ch = curl_init();
    /* Note: the data to submit cannot be a two-dimensional (or deeper) array.
     * This works:          array('name'=>serialize(array('tank','zhang')),'sex'=>1,'birth'=>'20101010')
     * This raises an error: array('name'=>array('tank','zhang'),'sex'=>1,'birth'=>'20101010') */
    $data = array('name'=>'test', 'sex'=>1, 'birth'=>'20101010');
    curl_setopt($ch, CURLOPT_URL, 'http://localhost/mytest/curl/upload.php');
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    curl_exec($ch);
    curl_close($ch);
    ?>

If upload.php contains print_r($_POST);, curl fetches the content that upload.php outputs:

    Array ( [name] => test [sex] => 1 [birth] => 20101010 )
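The restriction on nested arrays applies when an array is handed to CURLOPT_POSTFIELDS directly. One possible workaround, sketched here under the assumption that the receiving script accepts a URL-encoded body, is to flatten the data with http_build_query() first. Passing a string makes curl send application/x-www-form-urlencoded, whereas passing an array sends multipart/form-data:

```php
<?php
// Nested data that cannot be passed to CURLOPT_POSTFIELDS as an array.
$data = array('name' => array('tank', 'zhang'), 'sex' => 1, 'birth' => '20101010');

// Flatten into a URL-encoded string with bracketed keys (name[0], name[1], ...);
// the brackets themselves are percent-encoded as %5B and %5D.
$body = http_build_query($data);
echo $body . "\n";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://localhost/mytest/curl/upload.php');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed value
$response = curl_exec($ch); // false if the local server is not reachable
curl_close($ch);
```

On the receiving side, PHP parses the bracketed keys back into an array, so $_POST['name'] in upload.php would hold both values.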

4. Fetching pages protected by access control

I previously wrote an article on three methods of page access control; have a look if you are interested.

Fetching such a page with the methods shown above produces the following error:

You are not authorized to view this page

You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.

In this case, we need CURLOPT_USERPWD to authenticate.

    <?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://phpcn");
    /* CURLOPT_USERPWD is mainly used to get past page access control,
     * for example the protection we usually set up with htpasswd.
     * Pass the credentials as 'username:password'. */
    curl_setopt($ch, CURLOPT_USERPWD, '231144:2091XTAjmd=');
    curl_setopt($ch, CURLOPT_HTTPGET, 1);
    curl_setopt($ch, CURLOPT_REFERER, "http://club-china");
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $result = curl_exec($ch);
    curl_close($ch);
    ?>
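To make clearer what CURLOPT_USERPWD does: with the default authentication method (HTTP Basic), curl sends the credentials in an Authorization header as the Base64 encoding of "username:password". The sketch below builds that header by hand; the helper name basic_auth_header and the credentials are placeholders invented for illustration:

```php
<?php
// Build the request header that HTTP Basic authentication uses for a user/password pair.
function basic_auth_header($user, $password)
{
    return 'Authorization: Basic ' . base64_encode($user . ':' . $password);
}

// 'user' and 'password' are placeholder credentials.
echo basic_auth_header('user', 'password') . "\n";
// prints: Authorization: Basic dXNlcjpwYXNzd29yZA==
```

Because Base64 is an encoding and not encryption, credentials sent this way over plain http:// can be read by anyone who sees the traffic.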

5. Simulating a login to sina

The data we want to fetch may only be available after logging in. In that case we need to simulate a login with curl.

    <?php
    // Returns 1 if the login succeeded, 0 otherwise
    function checklogin($user, $password)
    {
        if (empty($user) || empty($password)) {
            return 0;
        }
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_REFERER, "http://mail.sina.com.cn/index.html");
        curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, USERAGENT);
        curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIEJAR);  // cookies are written here on curl_close()
        curl_setopt($ch, CURLOPT_TIMEOUT, TIMEOUT);
        curl_setopt($ch, CURLOPT_URL, "http://mail.sina.com.cn/cgi-bin/login.cgi");
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, "&logintype=uid&u=" . urlencode($user) . "&psw=" . urlencode($password));
        $contents = curl_exec($ch);
        curl_close($ch);
        // A successful login redirects to /cgi/index.php?check_time=...
        if (!preg_match("/Location: (.*)\\/cgi\\/index\\.php\\?check_time=(.*)\n/", $contents, $matches)) {
            return 0;
        } else {
            return 1;
        }
    }

    define("USERAGENT", $_SERVER['HTTP_USER_AGENT']);
    define("COOKIEJAR", tempnam("/tmp", "cookie"));
    define("TIMEOUT", 500);

    echo checklogin("zhangying215", "xtaj227");
    ?>

Open the cookie file under /tmp and take a look:

    # Netscape HTTP Cookie File
    # http://curl.haxx.se/rfc/cookie_spec.html
    # This file was generated by libcurl! Edit at your own risk.

    mail.sina.com.cn    FALSE    /    FALSE    0    SINAMAIL-WEBFACE-SESSID    65223c4bd8900284ed463d2a3e1ac182
    #HttpOnly_.sina.com.cn    TRUE    /    FALSE    0    SUE    es%3D8d96db0820c6c79922ad57d422f575e8%26ev%3Dv0%26es2%3Dcddfb8400dc5ca95902367ddcd7f57dd
    .sina.com.cn    TRUE    /    FALSE    0    SUP    cv%3D1%26bt%3D1286900433%26et%3D1286986833%26lt%3D1%26uid%3D1445632344%26user%3D%25E5%25BC%25A0%25E6%2598%25A02001%26ag%3D2%26name%3Dzhangying20015%2540sina.com%26nick%3D%25E5%25BC%25A0%25E6%2598%25A02001%26sex%3D1%26ps%3D0%26email%3Dzhangying20015%2540sina.com%26dob%3D1982-07-18
    #HttpOnly_.sina.com.cn    TRUE    /    FALSE    0    SID    BihcallomxMx-QZxzGrOlcSQx%2F0B%2F0cmr.NyQ%2F0B%2FcmGGalmarlmcHrcGlSmrmxmfxal_CBZ%2F_afugCmmGirBYHm0Bc%40fr5ciZiGG5i
    #HttpOnly_.sina.com.cn    TRUE    /    FALSE    0    SPRIAL    bfb4102951fd5892a3fd5b42d442cd26
    #HttpOnly_.sina.com.cn    TRUE    /    FALSE    0    SINA_USER    %D5%C5%D2001
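The cookie file written through CURLOPT_COOKIEJAR uses the Netscape format: one cookie per line with seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry, name, value), and libcurl prefixes HttpOnly cookies with "#HttpOnly_". As a minimal sketch for inspecting such a file from PHP (the function name parse_cookie_jar and the returned array layout are assumptions for illustration):

```php
<?php
// Parse the contents of a Netscape-format cookie file into an array of cookies.
function parse_cookie_jar($contents)
{
    $cookies = array();
    foreach (preg_split('/\r\n|\n|\r/', $contents) as $line) {
        $httpOnly = false;
        if (strpos($line, '#HttpOnly_') === 0) {
            // HttpOnly cookies carry this prefix in front of the domain.
            $httpOnly = true;
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // skip blank lines and comments
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a cookie line
        }
        $cookies[] = array(
            'domain'   => $fields[0],
            'path'     => $fields[2],
            'secure'   => $fields[3] === 'TRUE',
            'expires'  => (int) $fields[4], // 0 means a session cookie
            'name'     => $fields[5],
            'value'    => $fields[6],
            'httponly' => $httpOnly,
        );
    }
    return $cookies;
}

// Two lines taken from the cookie file shown above, with tabs restored.
$sample = "# Netscape HTTP Cookie File\n"
    . "mail.sina.com.cn\tFALSE\t/\tFALSE\t0\tSINAMAIL-WEBFACE-SESSID\t65223c4bd8900284ed463d2a3e1ac182\n"
    . "#HttpOnly_.sina.com.cn\tTRUE\t/\tFALSE\t0\tSPRIAL\tbfb4102951fd5892a3fd5b42d442cd26\n";

foreach (parse_cookie_jar($sample) as $cookie) {
    echo $cookie['name'] . ' @ ' . $cookie['domain'] . ($cookie['httponly'] ? ' (HttpOnly)' : '') . "\n";
}
```

To send the saved cookies with later requests, the same file can be passed to CURLOPT_COOKIEFILE on a new handle; parsing it by hand is only needed for inspection.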