Hyperlink spam is common in online comments. We can use a regular expression to extract the links.
We treat as a hyperlink any text of the form www.[at least one letter or digit].[at least one letter or digit]
In practice, links of the form http://www.[at least one letter or digit].[at least one letter or digit] also occur, but here we only look for the part that starts with www.
Because a pattern that starts at www also matches inside the longer http:// form, the pattern above is sufficient for both cases.
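To illustrate why one pattern covers both forms, here is a minimal sketch. The class name WwwPatternSketch and the helper findLinks are hypothetical names chosen for this illustration, and the sample sentences are made up; only the www.93zt.com link comes from the article. The dots are escaped so that they match a literal "." rather than any character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WwwPatternSketch {
    // www, a literal dot, one or more letters/digits, a literal dot, one or more letters/digits.
    static final Pattern WWW_LINK = Pattern.compile("www\\.[a-zA-Z0-9]+\\.[a-zA-Z0-9]+");

    // Hypothetical helper: collects every substring of text that matches WWW_LINK.
    static List<String> findLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher m = WWW_LINK.matcher(text);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        // Bare www form.
        System.out.println(findLinks("visit www.93zt.com today"));
        // http:// form: the same pattern still finds the www part.
        System.out.println(findLinks("visit http://www.93zt.com today"));
    }
}
```

Both calls find the same link, www.93zt.com, because find() scans for the pattern anywhere in the text and the http:// prefix is simply skipped.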
The code is as follows:
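import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only wraps main so the example compiles.
public class LinkFinder {
    public static void main(String[] args) {
        String htmlStr = "support moderator Xie la http://www.93zt.com welcome to see ";
        // Escape the dots so they match a literal '.'; the range is A-Z, not A-z.
        String patternString = "www\\.[a-zA-Z0-9]+\\.[a-zA-Z0-9]+";
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(htmlStr);
        while (matcher.find()) {
            int start = matcher.start(); // index of the first matched character
            int end = matcher.end();     // index just past the last matched character
            String match = htmlStr.substring(start, end);
            System.out.println(start);
            System.out.println(end);
            System.out.println(match);
        }
    }
}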
Running result:
32
44
www.93zt.com
debug-single:
BUILD SUCCESSFUL (total time: 2 seconds)
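If the whole link, including any http:// prefix, is wanted rather than only the www part, one possible variation is to make the scheme optional in the pattern. This is a sketch under assumptions, not part of the original code: the class name FullUrlSketch, the helper firstLink, and the choice to accept both http and https are all introduced here for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FullUrlSketch {
    // Assumption: an optional "http://" or "https://" prefix is kept together with the www part.
    static final Pattern LINK =
            Pattern.compile("(?:https?://)?www\\.[a-zA-Z0-9]+\\.[a-zA-Z0-9]+");

    // Hypothetical helper: returns the first link found in text, or null if there is none.
    static String firstLink(String text) {
        Matcher m = LINK.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Same sample comment as in the article; the match now includes the scheme.
        System.out.println(firstLink("support moderator Xie la http://www.93zt.com welcome to see "));
        // Without a scheme, only the www part is returned.
        System.out.println(firstLink("see www.93zt.com"));
    }
}
```

The non-capturing group (?:...)? keeps the prefix optional, so links written with or without a scheme are handled by the same pattern.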