I use curl to fetch a page and parse the links out of its link tags. Some are complete, such as http://...; some are relative, such as /about or ../about; and some are just # or javascript:. How can I turn every matched link into a complete URL (adding the domain name to the relative ones) and drop the anchor and JavaScript links?
Reply content:
Just write a method to work it out. For example, if you request http://example.com/qa/list.php, the host address is http://example.com and the directory address is http://example.com/qa/
If the address starts with http(s)://, it is already the complete address
If the address starts with /, such as /aboutus, the complete address is the host address + the address, that is, http://example.com/aboutus
If the address starts with anything else, such as ../aboutus, the complete address is the directory address + the address, that is, http://example.com/qa/../aboutus
If the ../ looks untidy, you can clean it up: each ../ moves up one parent directory, which gives http://example.com/aboutus
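The rules above can be sketched roughly as follows. This is only an illustration, not a tested library: the function names resolve_link and normalize_path are made up for this example. Treating mailto:, tel: and data: links the same way as javascript:, stripping a trailing #fragment, and handling //host/... and ?query-only links are my own assumptions on top of the rules listed above.

```php
<?php
/**
 * Collapse "." and ".." segments in an absolute path.
 * (Hypothetical helper, named for this example only.)
 */
function normalize_path(string $path): string
{
    $out = [];
    foreach (explode('/', $path) as $seg) {
        if ($seg === '..') {
            array_pop($out);               // go up one directory (no-op at the root)
        } elseif ($seg !== '.' && $seg !== '') {
            $out[] = $seg;
        }
    }
    $result = '/' . implode('/', $out);
    // Keep a trailing slash when the input pointed at a directory.
    if ($out && preg_match('#/(\.\.?)?$#', $path)) {
        $result .= '/';
    }
    return $result;
}

/**
 * Resolve a link found on $pageUrl into a complete URL.
 * Returns null for links to discard (anchors, javascript: and similar).
 */
function resolve_link(string $link, string $pageUrl): ?string
{
    $link = trim($link);

    // Discard empty links, pure anchors and pseudo-links (assumed list of schemes).
    if ($link === '' || $link[0] === '#'
        || preg_match('/^(javascript|mailto|tel|data):/i', $link)) {
        return null;
    }

    // Drop a trailing "#fragment" (assumption: fragments are not wanted).
    $link = explode('#', $link, 2)[0];

    // Rule 1: starts with http(s):// -> already complete.
    if (preg_match('#^https?://#i', $link)) {
        return $link;
    }

    $parts = parse_url($pageUrl);
    $host  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Protocol-relative link such as //cdn.example.com/x (extra case, not in the rules above).
    if (strncmp($link, '//', 2) === 0) {
        return $parts['scheme'] . ':' . $link;
    }

    // Split off the query string so that "../" inside it is left alone.
    [$linkPath, $query] = array_pad(explode('?', $link, 2), 2, null);
    $query = ($query === null) ? '' : '?' . $query;

    $pagePath = $parts['path'] ?? '/';

    // Query-only link such as ?page=2: keep the current page's path.
    if ($linkPath === '') {
        return $host . $pagePath . $query;
    }

    // Rule 2: starts with "/" -> host address + link.
    if ($linkPath[0] === '/') {
        return $host . normalize_path($linkPath) . $query;
    }

    // Rule 3: anything else -> directory address + link, then tidy up "../".
    $dir = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
    return $host . normalize_path($dir . $linkPath) . $query;
}
```

To use it, call resolve_link($href, 'http://example.com/qa/list.php') for each href you matched and skip the ones that come back as null. With that page URL, ../aboutus should come back as http://example.com/aboutus.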
/**
 * Return the complete URL of the current request.
 *
 * @return string
 */
function current_url()
{
    // is_https() is not a PHP built-in; it is assumed to be defined elsewhere.
    $host = $_SERVER['HTTP_HOST'];
    $uri = $_SERVER['REQUEST_URI'];
    return (is_https() ? 'https://' : 'http://') . $host . $uri;
}
Okay, I misread the question...
For more on handling relative paths, see an article I wrote earlier: http://blog.icewingcc.com/php-conv-addr-re-ab-2.html.