This article describes how to obtain all links on a specified URL's page in PHP. From: http://www.uphtm.com/php/253.html
Crawling links from other websites is actually a common task for developers. A friend recently put together PHP code that retrieves all the links on a specified URL's page; let's take a look.
The following code obtains all links on the specified URL's page, that is, the href attribute of every a tag:
// Obtain the HTML code of the page
$html = file_get_contents('http://www.111cn.net');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every a element under body
$hrefs = $xpath->evaluate('/html/body//a');
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    echo $url . "<br>";
}
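The same extraction can be done without XPath at all. The sketch below uses DOMDocument::getElementsByTagName() on an inline HTML string (an invented sample, so it runs offline without fetching any site):

```php
<?php
// Invented sample markup standing in for a downloaded page
$html = '<html><body>'
      . '<a href="http://example.com/a">A</a>'
      . '<a href="/relative">B</a>'
      . '</body></html>';

$dom = new DOMDocument();
@$dom->loadHTML($html);

// getElementsByTagName('a') walks every a tag, like the XPath query above
$links = array();
foreach ($dom->getElementsByTagName('a') as $a) {
    $links[] = $a->getAttribute('href');
}

print_r($links);
```

Both approaches tolerate the messy HTML found on real pages, because loadHTML() repairs the markup before the tags are walked.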
This code collects the href attribute of every a tag, but an href value is not necessarily an absolute link. We can filter the results and keep only addresses that start with http:
// Obtain the HTML code of the page
$html = file_get_contents('http://www.111cn.net');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every a element under body
$hrefs = $xpath->evaluate('/html/body//a');
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');

    // Keep only links starting with http
    if (substr($url, 0, 4) == 'http')
        echo $url . "<br>";
}
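A hedged alternative to the substr() prefix check: parse_url() reads the URL scheme directly, which also keeps https:// links and cleanly rejects mailto: and javascript: values. The candidate values below are invented samples:

```php
<?php
// Invented sample href values, as might come out of the loop above
$candidates = array(
    'http://example.com',
    'https://example.com',
    '/about',
    'mailto:someone@example.com',
);

$kept = array();
foreach ($candidates as $url) {
    // parse_url() returns the scheme, or null for relative paths
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if ($scheme === 'http' || $scheme === 'https') {
        $kept[] = $url;
    }
}

print_r($kept);
```

The substr() check in the article would drop https:// links unless the prefix length is widened; checking the parsed scheme avoids that pitfall.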
The next example uses the fopen() function to read a specified webpage, find all the links in it, and number them; it also lets you count how many links there are. This approach suits situations where webpage content needs to be collected. The example reads the Baidu homepage and lists every link found there. The code has been tested and works:
<?php
if (empty($url)) $url = "http://www.baidu.com/"; // URL of the page to collect
$site = substr($url, 0, strpos($url, "/", 8));   // site root, e.g. http://www.baidu.com
$base = substr($url, 0, strrpos($url, "/") + 1); // directory of the current page
$fp = fopen($url, "r");                          // open the URL
$contents = "";
while (!feof($fp)) $contents .= fread($fp, 1024);
$pattern = "|href=['\"]?([^'\"]+)['\"]|U";
preg_match_all($pattern, $contents, $regArr, PREG_SET_ORDER); // match every href=
for ($i = 0; $i < count($regArr); $i++) {
    // eregi() was removed in PHP 7; strpos() detects "://" instead
    if (strpos($regArr[$i][1], "://") === false) {  // relative path
        if (substr($regArr[$i][1], 0, 1) == "/") {  // starts from the site root
            echo "link " . ($i + 1) . ": " . $site . $regArr[$i][1] . "<br>"; // root directory
        } else {
            echo "link " . ($i + 1) . ": " . $base . $regArr[$i][1] . "<br>"; // current directory
        }
    } else {
        echo "link " . ($i + 1) . ": " . $regArr[$i][1] . "<br>"; // absolute link
    }
}
fclose($fp);
?>
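As for counting the links, preg_match_all() already returns the number of matches, so the count comes for free. A minimal offline sketch, run against an invented HTML snippet rather than a fetched page:

```php
<?php
// Invented sample standing in for the downloaded page contents
$contents = '<a href="/a">A</a> <a href="http://example.com/b">B</a>';

// Same ungreedy href pattern as in the listing above
$pattern = "|href=['\"]?([^'\"]+)['\"]|U";

// preg_match_all() returns how many full matches it found
$count = preg_match_all($pattern, $contents, $regArr, PREG_SET_ORDER);

echo "Found " . $count . " links\n";
foreach ($regArr as $match) {
    echo $match[1] . "\n"; // capture group 1 holds the href value
}
```

With PREG_SET_ORDER each element of $regArr is one match set, so $regArr[$i][1] is the href value of the i-th link, exactly as the listing above indexes it.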
The above shows how to obtain all links on a specified URL's page in PHP. I hope it is helpful to readers interested in the PHP tutorial.