The content to be extracted is as follows:
<a href="http://baidu.com">http://baidu.com</a> This is the first A tag,
<a href="http://blog.baidu.com">Growth footprint-focus on Internet development</a> This is the second A tag.
http://www.111cn.net this is the first URL to be extracted,
http://blog.baidu.com this is the second URL to be extracted.
, which is an IMG tag
For example, Weibo automatically turns the URLs in a post into hyperlinks. The goal here is the same: extract the plain-text URLs (the content marked in red), wrap them in A tags, and so convert them into real hyperlinks, while leaving the URLs that already sit inside A and IMG tags untouched. I searched the internet for a long time and did not find a workable solution. Most approaches simply extract every URL, so the addresses inside A and IMG tags are extracted and replaced as well, which does not meet the requirement. I could not find a way to make a regular expression skip the tags during extraction, so I took an indirect route: first replace all A and IMG tags with uniform placeholders, then extract the remaining URLs and replace them with hyperlinks, and finally restore the placeholders to the original A and IMG tags.
The code is as follows:
function linkAdd($content) {
    // Extract all A tags and replace each one with the placeholder <{link}>
    preg_match_all('/<a.*?href=".*?".*?>.*?<\/a>/i', $content, $linkList);
    $linkList = $linkList[0];
    $str = preg_replace('/<a.*?href=".*?".*?>.*?<\/a>/i', '<{link}>', $content);

    // Extract all remaining IMG tags and replace each one with the placeholder <{img}>
    // (matched against $str so that the list lines up with the placeholders)
    preg_match_all('/<img[^>]+>/im', $str, $imgList);
    $imgList = $imgList[0];
    $str = preg_replace('/<img[^>]+>/im', '<{img}>', $str);

    // Wrap the remaining plain-text URLs in A tags
    $str = preg_replace('!((f|ht)tp://)[-a-zA-Z0-9@:%_/+.~#?&=]+!', '<a href="\0" target="_blank">\0</a>', $str);

    // Restore the <{link}> placeholders to the original A tags, in order.
    // A callback is used so that "$1" or "\1" inside a saved tag is not
    // interpreted as a backreference.
    $str = preg_replace_callback('/<\{link\}>/', function () use (&$linkList) {
        return array_shift($linkList);
    }, $str);

    // Restore the <{img}> placeholders to the original IMG tags, in order
    $str = preg_replace_callback('/<\{img\}>/', function () use (&$imgList) {
        return array_shift($imgList);
    }, $str);

    return $str;
}

$content = '<a href="http://baidu.com">http://baidu.com</a> This is the first A tag,
<a href="http://blog.baidu.com">Growth footprint-focus on Internet development</a> This is the second A tag.
http://www.111cn.net this is the first URL to be extracted,
http://blog.baidu.com this is the second URL to be extracted.
, which is an IMG tag';

echo linkAdd($content);
The returned content is:
<a href="http://baidu.com">http://baidu.com</a> This is the first A tag, <a href="http://blog.baidu.com">Growth footprint-focus on Internet development</a> This is the second A tag. <a href="http://www.111cn.net" target="_blank">http://www.111cn.net</a> this is the first URL to be extracted, <a href="http://blog.baidu.com" target="_blank">http://blog.baidu.com</a> this is the second URL to be extracted.
, which is an IMG tag
This is exactly the result we want.
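The placeholder approach makes several passes over the text and relies on the strings <{link}> and <{img}> never appearing in the input. As a possible alternative, here is a minimal sketch that does the same job in one pass: a single pattern matches either a whole A element, an IMG tag, or a plain-text URL, and a callback returns tags unchanged and wraps only the URLs. The function name linkAddSinglePass() and the sample string are hypothetical, not from the original code, and the URL character class is simply the one used in linkAdd().

```php
<?php
// Hypothetical helper (not part of the original article): link plain-text
// URLs in one pass while leaving existing A elements and IMG tags alone.
function linkAddSinglePass(string $content): string
{
    // Alternatives are tried left to right at each position:
    //   1. a complete A element   -> matched, returned untouched
    //   2. an IMG tag             -> matched, returned untouched
    //   3. a plain-text URL       -> captured in group 1, wrapped in an A tag
    $pattern = '!<a\b[^>]*>.*?</a>|<img\b[^>]*>|((?:f|ht)tp://[-a-zA-Z0-9@:%_/+.~#?&=]+)!is';

    return preg_replace_callback($pattern, function (array $m): string {
        // Group 1 is only present when the URL alternative matched.
        if (!isset($m[1]) || $m[1] === '') {
            return $m[0];
        }
        return '<a href="' . $m[1] . '" target="_blank">' . $m[1] . '</a>';
    }, $content);
}

// Illustrative input only.
$sample = '<a href="http://baidu.com">http://baidu.com</a> and http://www.111cn.net';
echo linkAddSinglePass($sample), "\n";
// prints: <a href="http://baidu.com">http://baidu.com</a> and <a href="http://www.111cn.net" target="_blank">http://www.111cn.net</a>
```

Because the tags are consumed by the same match that finds the URLs, no placeholder or restore step is needed. Like linkAdd(), this sketch only recognizes http:// and ftp:// addresses.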
Example 2:
The code is as follows:
/**
 * PHP version, modified from the Silva code.
 * Converts URL addresses into complete A tag link code.
 */
/** =============================================
 NAME        : replace_URLtolink()
 VERSION     : 1.0
 AUTHOR      : J de Silva
 DESCRIPTION : returns STRING; handles converting URLs into clickable links off a string.
 TYPE        : functions
 ============================================= */
function replace_URLtolink($text) {
    // Build the patterns
    $scheme   = '(https?:\/\/|ftps?:\/\/)?';
    $www      = '([\w]+\.)';
    $ip       = '(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})';
    $name     = '([\w0-9]+)';
    $tld      = '(\w{2,4})';
    $port     = '(:[0-9]+)?';
    $the_rest = '(\/?([\w#!:.?+=&%@!\-\/]+))?';
    $pattern  = $scheme . '(' . $ip . $port . '|' . $www . $name . $tld . $port . ')' . $the_rest;
    $pattern  = '/' . $pattern . '/is';

    // Grab anything that looks like a URL and replace it in a single pass.
    // (Replacing each match with str_replace() would re-link URLs that occur
    // more than once or that are substrings of other URLs.)
    return preg_replace_callback($pattern, function ($m) {
        $url = $m[0];

        // Prepend http:// only when the URL does not already carry a scheme
        if (preg_match('/^(https?|ftps?):\/\//i', $url)) {
            $fullurl = $url;
        } else {
            $fullurl = 'http://' . $url;
        }

        return '<a href="' . $fullurl . '">' . $url . '</a>';
    }, $text);
}
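Deciding whether a matched URL already carries a scheme is easy to get wrong with strpos(): the function expects the haystack first and the needle second, and it returns 0 (which is falsy) for a match at the very start of the string and false for no match. The following is a minimal sketch of a strict check; the helper name ensure_scheme() and the list of schemes are assumptions for illustration, not part of the Silva code.

```php
<?php
// Hypothetical helper, not part of the original code: return the URL
// unchanged if it starts with a known scheme, otherwise prepend http://.
function ensure_scheme(string $url): string
{
    foreach (['http://', 'https://', 'ftp://', 'ftps://'] as $scheme) {
        // stripos($haystack, $needle) is the case-insensitive strpos().
        // Compare with === 0: a plain truthiness test would treat a match
        // at position 0 the same as "not found".
        if (stripos($url, $scheme) === 0) {
            return $url;
        }
    }
    return 'http://' . $url;
}

echo ensure_scheme('www.111cn.net'), "\n";         // prints: http://www.111cn.net
echo ensure_scheme('http://blog.baidu.com'), "\n"; // prints: http://blog.baidu.com
```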
Both Example 1 and Example 2 have been tested. Choose whichever method suits your needs.