This article shows how to get all the links on a web page with PHP. The example downloads the page with cURL, converts it to UTF-8 if necessary, and then parses it with the simple_html_dom library to list each link's URL and text. The complete example is as follows:
<?php
/**
 * Download a page with cURL and return its HTML converted to UTF-8.
 * Returns an empty string if the request fails.
 */
function getHtml($url, $charset = 'utf-8')
{
    $curl = curl_init();
    // Optionally spoof the client IP:
    // curl_setopt($curl, CURLOPT_HTTPHEADER, array('X-FORWARDED-FOR: 192.168.168.1', 'CLIENT-IP: 192.168.168.1'));
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_REFERER, "");
    // Reuse the visitor's browser user agent, or fall back to a fixed one
    $user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.43 Safari/537.31';
    curl_setopt($curl, CURLOPT_USERAGENT, $user_agent);
    // Uncomment to include the HTTP response headers in the output:
    // curl_setopt($curl, CURLOPT_HEADER, 1);
    // Uncomment if the page body is not needed:
    // curl_setopt($curl, CURLOPT_NOBODY, 1);
    // Return the response as a string instead of printing it directly
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $html = curl_exec($curl);
    // $info = curl_getinfo($curl);
    // var_dump($info);
    if ($html === false) {
        // echo "cURL Error: " . curl_error($curl);
        curl_close($curl);
        return '';
    }
    curl_close($curl);
    if ($charset != 'utf-8') {
        $html = iconv($charset, "UTF-8", $html);
    }
    return $html;
}

header("Content-type: text/html; charset=utf-8");
// simple_html_dom needs the mbstring extension (extension=php_mbstring.dll on Windows)
include('simple_html_dom.php');
// $url = 'http://www.baidu.com/s?wd=kaka';
$url = 'http://www.163.com/';
$str_html = getHtml($url, 'gbk');
$html = str_get_html($str_html);
if (!$html) {
    // str_get_html() returns false when the input is empty
    exit('Could not load or parse ' . $url);
}
$links = $html->find('a');
foreach ($links as $link) {
    $txt = trim($link->plaintext);
    echo $link->href . '[' . $txt . ']<br/>';
}
$html = null;
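The example above depends on the third-party simple_html_dom.php file. As a possible alternative, the following is a minimal sketch of the same link-listing step using PHP's bundled DOM extension. The function name extractLinks and the sample markup are assumptions made for illustration and are not part of the original code. It also assumes the HTML has already been converted to UTF-8, for instance by getHtml().

```php
<?php
// Hypothetical helper (not from the original article): collect href/text
// pairs from an HTML string using the bundled DOM extension.
function extractLinks(string $html): array
{
    $links = [];
    if (trim($html) === '') {
        return $links; // nothing to parse, e.g. the download failed
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid markup, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Common workaround: the XML prolog hints that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if (!$anchor->hasAttribute('href')) {
            continue; // skip anchors that have no link target
        }
        $links[] = [
            'href' => $anchor->getAttribute('href'),
            'text' => trim($anchor->textContent),
        ];
    }
    return $links;
}

// Usage with an inline sample instead of a live download.
$sample = '<p><a href="/news">News</a> <a name="top">Top</a> '
        . '<a href="http://www.163.com/"> 163 </a></p>';
foreach (extractLinks($sample) as $link) {
    echo $link['href'] . ' [' . $link['text'] . "]\n";
}
```

In this sketch the anchor without an href attribute is skipped, so only the two real links are printed. Relative URLs such as /news are returned unchanged; resolving them against the page address is left out here.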