Retrieving all links on a specified URL page with PHP

Source: Internet
Author: User
Summary: This article describes how to obtain all links on a specified URL page in PHP. From: http://www.uphtm.com/php/253.html

This is a common task for developers: we often need to crawl links from other websites. Today a friend put together some PHP code that retrieves all the links on a specified URL page. Let's take a look.

The following code obtains all links on the specified URL page, that is, the href attribute of all a tags:

    // Fetch the HTML of the page
    $html = file_get_contents('http://www.111cn.net');
    $dom = new DOMDocument();
    // @ suppresses the warnings that malformed HTML would otherwise raise
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    // Select every a element inside the body
    $hrefs = $xpath->evaluate('/html/body//a');
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        echo $url . '<br />';
    }
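
The same DOM approach can be wrapped in a function and tried on an inline HTML string, which avoids a network request while experimenting. The sketch below is only an illustration: the helper name `extractHrefs` and the sample markup are assumptions, not part of the original code, and it uses `libxml_use_internal_errors()` in place of the `@` operator to keep parser warnings quiet.

```php
<?php
// Hypothetical helper (not from the original article): returns the href
// value of every a element found in an HTML string.
function extractHrefs(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    // Select only a elements that actually carry an href attribute.
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }
    return $hrefs;
}

// Inline sample markup, so the sketch runs without network access.
$sample = '<html><body><a href="http://example.com/">A</a>'
        . '<a href="/about">B</a><a name="top">no href</a></body></html>';
print_r(extractHrefs($sample));
```

Returning an array rather than echoing inside the loop makes it easier to filter or count the links afterwards.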

This code gets the href attribute of every a tag, but an href value is not necessarily a link to another page. We can filter the results and keep only the addresses that start with http:

    // Fetch the HTML of the page
    $html = file_get_contents('http://www.111cn.net');
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate('/html/body//a');
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        // Keep only the links starting with http
        if (substr($url, 0, 4) == 'http')
            echo $url . '<br />';
    }
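
A prefix test such as `substr($url, 0, 4) == 'http'` also accepts a relative path that merely begins with those letters (for example `httpfoo`). One possible stricter variant, sketched here as an assumption rather than as the original author's method, checks the URL scheme with `parse_url()`. The helper name `isHttpLink` and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: true only for absolute http:// or https:// URLs.
function isHttpLink(string $url): bool
{
    $scheme = parse_url(trim($url), PHP_URL_SCHEME);
    // parse_url() yields null when no scheme is present and false for
    // seriously malformed input; both cases are rejected here.
    if (!is_string($scheme)) {
        return false;
    }
    return in_array(strtolower($scheme), ['http', 'https'], true);
}

// Sample href values of the kind a real page might contain.
$candidates = [
    'http://example.com/',
    'https://example.com/a',
    'httpfoo',
    '/about',
    'mailto:someone@example.com',
    '#top',
];
$links = array_values(array_filter($candidates, 'isHttpLink'));
print_r($links);
```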

The following code uses fopen() to read a specified web page, finds all the links in it, and numbers each one as it is printed. It is useful wherever web page content needs to be collected. Opening a URL with fopen() requires the allow_url_fopen setting to be enabled. The example reads the Baidu homepage and lists every link on it, converting relative paths to full addresses. According to the original author, the code has been tested and works:

    <?php
    if (empty($url)) $url = "http://www.baidu.com/"; // URL of the page to collect links from
    $site = substr($url, 0, strpos($url, "/", 8));   // scheme and host of the site
    $base = substr($url, 0, strrpos($url, "/") + 1); // directory of the file
    $fp = fopen($url, "r"); // open the URL page
    $contents = "";
    while (!feof($fp)) $contents .= fread($fp, 1024);
    $pattern = "|href=['\"]?([^ '\"]+)['\"]|U";
    preg_match_all($pattern, $contents, $regArr, PREG_SET_ORDER); // use a regular expression to match every href=
    for ($i = 0; $i < count($regArr); $i++) {
        // No "://" means a relative path (strpos() replaces eregi(), which was removed in PHP 7)
        if (strpos($regArr[$i][1], "://") === false) {
            if (substr($regArr[$i][1], 0, 1) == "/") // relative to the root directory of the site
                echo "link" . ($i + 1) . ":" . $site . $regArr[$i][1] . "<br />"; // root directory
            else
                echo "link" . ($i + 1) . ":" . $base . $regArr[$i][1] . "<br />"; // current directory
        } else
            echo "link" . ($i + 1) . ":" . $regArr[$i][1] . "<br />"; // already an absolute URL
    }
    fclose($fp);
    ?>
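
The three cases handled above (absolute URL, path relative to the site root, path relative to the current directory) can be isolated in a small function and exercised on a fixed HTML string, with the return value of `preg_match_all()` serving as the link count. This is a minimal sketch under stated assumptions: `resolveLink`, the sample page URL, and the sample markup are hypothetical, the page URL is assumed to contain a slash after the host name, and cases such as `../` segments or protocol-relative `//host/path` links are deliberately not handled.

```php
<?php
// Hypothetical helper mirroring the three cases in the listing above.
function resolveLink(string $link, string $pageUrl): string
{
    if (strpos($link, '://') !== false) {
        return $link; // already an absolute URL
    }
    // Scheme and host, e.g. "http://www.example.com"
    $site = substr($pageUrl, 0, strpos($pageUrl, '/', 8));
    // Directory of the page, including the trailing slash
    $base = substr($pageUrl, 0, strrpos($pageUrl, '/') + 1);
    return substr($link, 0, 1) === '/' ? $site . $link : $base . $link;
}

// Assumed sample input, so the sketch runs without network access.
$pageUrl = 'http://www.example.com/docs/index.html';
$html = '<a href="/home">Home</a> <a href="next.html">Next</a> '
      . '<a href="http://other.example/x">X</a>';

// preg_match_all() returns the number of full matches, i.e. the link count.
$count = preg_match_all('/href=["\']([^"\']+)["\']/i', $html, $matches);

foreach ($matches[1] as $i => $link) {
    echo 'link', $i + 1, ': ', resolveLink($link, $pageUrl), PHP_EOL;
}
echo 'total: ', $count, PHP_EOL;
```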


This article has shown how to get all links on a specified URL page in PHP. We hope it is helpful to readers interested in PHP tutorials.
