This article presents a URL completion function, which can also be called FormatUrl.
I wrote it while developing a content collection (scraping) program. When collecting articles, the links in a page are often "relative paths" or "absolute root paths" rather than "absolute full paths", so the URLs cannot be collected as they stand.
The function therefore reformats the HTML code, rewriting every hyperlink as an absolute full path, so that the correct URLs can be collected directly.
Background: path types
Relative path: begins with "../" or "./", or has nothing in front of it (for example, xxx.html).
Absolute root path: /path/xxx.html
Absolute full path: http://www.xxx.com/path/xxx.html
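The three categories above can be told apart with a few string checks. The following is only a minimal sketch; the helper name classifyPath is hypothetical and is not part of FormatUrl.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the article):
// name the kind of path a link uses, following the three categories above.
function classifyPath(string $href): string
{
    // "scheme://..." means the link already carries its own host.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $href)) {
        return 'absolute full path';
    }
    // A leading slash means the path is counted from the site root.
    if ($href !== '' && $href[0] === '/') {
        return 'absolute root path';
    }
    // Everything else ("../", "./" or no prefix) is relative to the current page.
    return 'relative path';
}

echo classifyPath('../img/logo.gif'), "\n";                  // relative path
echo classifyPath('/path/xxx.html'), "\n";                   // absolute root path
echo classifyPath('http://www.xxx.com/path/xxx.html'), "\n"; // absolute full path
```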
Usage examples:
<?php
$surl = "http://www.jb51.net/";
$gethtm = ' home solution ';
echo FormatUrl($gethtm, $surl);
?>
Output: Home Solutions
---------Demo Instance------------
Original page (source paths): http://www.newnew.cn/newnewindex.aspx
Demo output: http://www.maifp.com/aaa/test.php
The following is the function code
<?php
// Complete every src/href URL found in the HTML $l1, using the page URL $l2.
function FormatUrl($l1, $l2) {
    // Match <img ... src=...> and <a ... href=...> tags, double- or single-quoted.
    if (preg_match_all("/(<img[^>]+src=\"([^\"]+)\"[^>]*>)|(<a[^>]+href=\"([^\"]+)\"[^>]*>)|(<img[^>]+src='([^']+)'[^>]*>)|(<a[^>]+href='([^']+)'[^>]*>)/i", $l1, $regs)) {
        foreach ($regs[0] as $num => $url) {
            $l1 = str_replace($url, Liiiil($url, $l2), $l1);
        }
    }
    return $l1;
}
// Rewrite the URL inside a single tag $l1 as an absolute full path, using the page URL $l2.
function Liiiil($l1, $l2) {
    $I2 = ""; // attribute value as written in the tag, quotes included
    if (preg_match('/(.*)(href|src)=(.+?)( |\/>|>).*/i', $l1, $regs)) { $I2 = $regs[3]; }
    if (strlen($I2) > 0) {
        $I1 = str_replace(chr(34), "", $I2); // strip double quotes
        $I1 = str_replace(chr(39), "", $I1); // strip single quotes
    } else { return $l1; }
    $url_parsed = parse_url($l2);
    $scheme = isset($url_parsed["scheme"]) ? $url_parsed["scheme"] . "://" : "";
    $host = isset($url_parsed["host"]) ? $url_parsed["host"] : "";
    $l3 = $scheme . $host; // for example "http://www.jb51.net"
    if (strlen($l3) == 0) { return $l1; }
    // Directory of the current page without a trailing slash ("" for the site root).
    $path = isset($url_parsed["path"]) ? $url_parsed["path"] : "/";
    $path = substr($path, 0, (int) strrpos($path, "/"));
    $pos = strpos($I1, "#");
    if ($pos > 0) $I1 = substr($I1, 0, $pos); // drop the fragment
    // Determine the type of URL
    if ($I1 === "" || $I1[0] == "#" || substr($I1, 0, 2) == "//" || preg_match('/^(https?|ftp):(\/\/|\\\\)/i', $I1)) { return $l1; } // already a full URL (or a bare fragment): skip
    elseif ($I1[0] == "/") { $I1 = $l3 . $I1; } // absolute root path
    elseif (substr($I1, 0, 3) == "../") { // relative path into a parent directory
        while (substr($I1, 0, 3) == "../") {
            $I1 = substr($I1, 3);
            if (strlen($path) > 0) {
                $path = substr($path, 0, (int) strrpos($path, "/")); // go up one level
            }
        }
        $I1 = $l3 . $path . "/" . $I1;
    }
    elseif (substr($I1, 0, 2) == "./") { // relative path in the current directory
        $I1 = $l3 . $path . substr($I1, 1);
    }
    elseif (strtolower(substr($I1, 0, 7)) == "mailto:" || strtolower(substr($I1, 0, 11)) == "javascript:") {
        return $l1;
    } else { // relative path with no prefix
        $I1 = $l3 . $path . "/" . $I1;
    }
    return str_replace($I2, "\"$I1\"", $l1);
}
?>
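The "../" and "./" handling can also be written as a standalone helper that resolves a single link with a stack of path segments. This is only a sketch of an alternative approach, not the code above: the name resolveUrl is hypothetical, and it assumes the base URL is a plain http/https/ftp address without user credentials.

```php
<?php
// Hypothetical helper (an assumption for illustration): resolve one link
// against the URL of the page it was found on.
function resolveUrl(string $base, string $rel): string
{
    // Links with their own scheme (http:, mailto:, javascript:) or "//host" stay as they are.
    if ($rel === '' || preg_match('#^([a-z][a-z0-9+.\-]*:|//)#i', $rel)) {
        return $rel;
    }
    $parts = parse_url($base);
    if (!isset($parts['scheme'], $parts['host'])) {
        return $rel; // the base is not a full URL, so nothing can be completed
    }
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $basePath = $parts['path'] ?? '/';

    // Split off "?query" and "#fragment" so only the path part is normalised.
    $cut = strcspn($rel, '?#');
    $suffix = (string) substr($rel, $cut);
    $rel = (string) substr($rel, 0, $cut);

    if ($rel === '') {
        $full = $basePath;                       // e.g. "#top": same page
    } elseif ($rel[0] === '/') {
        $full = $rel;                            // absolute root path
    } else {
        // Relative path: start from the directory of the base page.
        $full = substr($basePath, 0, (int) strrpos($basePath, '/')) . '/' . $rel;
    }

    // Collapse "." and ".." segments with a stack.
    $stack = [];
    foreach (explode('/', $full) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;
        }
        if ($segment === '..') {
            array_pop($stack);
        } else {
            $stack[] = $segment;
        }
    }
    $path = '/' . implode('/', $stack);
    if ($path !== '/' && substr($full, -1) === '/') {
        $path .= '/';                            // keep a trailing slash
    }
    return $root . $path . $suffix;
}

echo resolveUrl('http://www.maifp.com/aaa/test.php', '../img/a.gif'), "\n"; // http://www.maifp.com/img/a.gif
echo resolveUrl('http://www.maifp.com/aaa/test.php', './b.html'), "\n";     // http://www.maifp.com/aaa/b.html
```

Unlike FormatUrl, this sketch keeps the "#fragment" part of the link and works on one URL at a time, so the HTML still has to be scanned for src and href attributes separately.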