Is it feasible to obtain the first-level domain name of arbitrary URLs with a regular expression?
This post was last edited by changjay at 2010-09-07 23:25:30
There are many kinds of URLs, such as abc.abc.com, abc.com/abc, www.abc.com.cn, abc.com.tw, www.abc.co.uk, www.abc.com.jp/abc.php/id?abc (the part after id? can go on for a long time).
There are many other cases as well, and I am hoping for something that handles all of them.
How can I use a PHP regular expression to obtain the first-level domain name of every URL, so that the results are abc.com, abc.com.cn, abc.co.uk?
The situation is complicated. The code below gives a rough result, but when the domain name itself contains com, net, org, gov, cc, biz, info, cn, or co, the regular expression returns the wrong result.
For example, www.cool.com comes out as www.co.
I hope the regular expression experts can help me fix the code and turn it into a universal pattern for first-level domain names.
$url = $row["url"];
preg_match("#[\w-]+\.(com|net|org|gov|cc|biz|info|cn|co)(\.(cn|hk|uk))*#", $url, $match);
echo $match[0];
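The failure described above can be reproduced in isolation. The following is a minimal sketch, assuming PHP 7.1 or later; the constant `ORIGINAL_PATTERN` and the helper `originalMatch` are hypothetical names introduced only for this illustration. It shows that the alternative `co` matches the first two letters of `cool`, because nothing in the pattern requires the suffix to end at a word boundary.

```php
<?php
// The asker's original pattern, unchanged.
const ORIGINAL_PATTERN = '#[\w-]+\.(com|net|org|gov|cc|biz|info|cn|co)(\.(cn|hk|uk))*#';

// Hypothetical helper: returns the matched text, or null when nothing matches.
function originalMatch(string $url): ?string
{
    return preg_match(ORIGINAL_PATTERN, $url, $m) === 1 ? $m[0] : null;
}

echo originalMatch('www.cool.com'), "\n"; // www.co  (wrong: "co" matched inside "cool")
echo originalMatch('www.abc.com'), "\n";  // abc.com (correct)
```

The match starts at the leftmost position that can succeed, so `www.` followed by `co` wins before the engine ever reaches `cool.com`.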
------ Solution --------------------
$s = <<<TEXT
abc.abc.com
abc.com/abc
www.abc.com.cn
abc.com.tw
www.abc.co.uk
www.abc.com.jp/abc.php/id?abc
www.cool.com
TEXT;
// preg_split() replaces split(), which was removed in PHP 7.
foreach (preg_split('/[\r\n]+/', $s) as $url) {
    // \b forces the first suffix to end at a word boundary, so "co" no longer matches inside "cool".
    preg_match('#[\w-]+\.(com|net|org|gov|cc|biz|info|cn|co)\b(\.(cn|hk|uk|jp|tw))*#', $url, $match);
    echo "$url\n" . $match[0] . "\n";
}
abc.abc.com
abc.com
abc.com/abc
abc.com
www.abc.com.cn
abc.com.cn
abc.com.tw
abc.com.tw
www.abc.co.uk
abc.co.uk
www.abc.com.jp/abc.php/id?abc
abc.com.jp
www.cool.com
cool.com
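For reuse, the pattern from the solution could be wrapped in a small function. This is only a sketch, assuming PHP 7.1 or later; the name `extractDomain`, the scheme stripping, and the case-insensitive flag are additions for illustration and are not part of the answer above. The suffix alternatives are only those listed in this thread, so a suffix outside those lists is either missed or cut short.

```php
<?php
// Hypothetical helper built around the pattern from the solution above.
// The suffix lists are the ones from this thread, not a complete registry.
function extractDomain(string $url): ?string
{
    $pattern = '#[\w-]+\.(com|net|org|gov|cc|biz|info|cn|co)\b(\.(cn|hk|uk|jp|tw))*#i';

    // Drop an optional scheme, then cut at the first "/", "?" or "#"
    // so that only the host part is searched.
    $host = preg_replace('#^[a-z][a-z0-9+.-]*://#i', '', trim($url));
    $host = preg_split('~[/?#]~', $host, 2)[0];

    // Return the match in lower case, or null when no known suffix is found.
    return preg_match($pattern, $host, $m) === 1 ? strtolower($m[0]) : null;
}

echo extractDomain('http://www.abc.com.jp/abc.php/id?abc'), "\n"; // abc.com.jp
echo extractDomain('WWW.Cool.COM'), "\n";                         // cool.com
var_dump(extractDomain('localhost'));                             // NULL
```

Searching only the host part keeps a path segment such as `/download.com` from being mistaken for a domain, and returning null lets the caller decide what to do with URLs the fixed lists do not cover.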