“((http|https|ftp):(\/\/|\\\\)((\w)+[.]){1，}(net|com|cn|org|cc|tv|[0-9]{1，3})(((\/[\~]*|\\[\~]*)(\w)+)|[.](\w)+)*(((([?](\w)+){1}[=]*))*((\w)+){1}([\&](\w)+[\=](\w)+)*)*)”
(excluding the outer Chinese quotation marks). Analysis: for a string to be judged a URL, it must satisfy the following conditions.
Condition one: A typical URL begins with http://, https://, or ftp://. As a regular expression, this part becomes (http|https|ftp):(\/\/|\\\\).
Condition two: http:// must be followed by word characters (usually www) and then the character "."; this combination must occur one or more times. Last comes the domain suffix (net, com, cn, and so on, or a segment of a numeric IP address). As a regular expression, this becomes ((\w)+[.]){1,}(net|com|cn|org|cc|tv|[0-9]{1,3}).
Condition three: After the complete domain, one or more levels of directories may follow, possibly containing the "~" symbol. As a regular expression, this becomes (((\/[\~]*|\\[\~]*)(\w)+)|[.](\w)+)*.
Condition four: The end of the link may also carry parameters, such as the 230.aspx&e=9690 mentioned earlier, or what?page=2&action=display, and so on. As a regular expression, this becomes (((([?](\w)+){1}[=]*))*((\w)+){1}([\&](\w)+[\=](\w)+)*)*.
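The four conditions can be assembled piece by piece. The following is a minimal Python sketch, not the code of the person who wrote the answer: the names SCHEME, HOST, PATH, QUERY, and is_url are made up for illustration, the fragments are written with backslash escapes (\w, \/), and the example.com addresses are placeholders. Note that in Python the \\\\ alternative matches two literal backslashes.

```python
import re

# One fragment per condition described above (names are illustrative only).
SCHEME = r"(http|https|ftp):(\/\/|\\\\)"                       # condition one
HOST = r"((\w)+[.]){1,}(net|com|cn|org|cc|tv|[0-9]{1,3})"      # condition two
PATH = r"(((\/[\~]*|\\[\~]*)(\w)+)|[.](\w)+)*"                 # condition three
QUERY = r"(((([?](\w)+){1}[=]*))*((\w)+){1}([\&](\w)+[\=](\w)+)*)*"  # condition four

# The full pattern is the four fragments concatenated inside one outer group.
URL_RE = re.compile("(" + SCHEME + HOST + PATH + QUERY + ")")


def is_url(text: str) -> bool:
    """Return True if the whole string matches the assembled pattern."""
    return URL_RE.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "http://www.example.com",
        "ftp://files.example.org/~user/readme.txt",
        "http://www.example.com/list.aspx?page=2&action=display",
        "not a url",
    ]
    for sample in samples:
        print(sample, is_url(sample))
```

Because the suffix list is fixed (net, com, cn, org, cc, tv, or digits), addresses with any other top-level domain are rejected; that is a property of the pattern as posted, not something this sketch adds.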
The above is part of someone else's answer that I found on Baidu Zhidao. Obviously, anyone who copies it directly will find that it does not work. The regular expression itself is not wrong, but the two half-width commas inside it have been changed to full-width commas. This is bound to waste other people's effort and leave them convinced that the code is simply wrong and unusable. It seems some people are only willing to post code after slipping a small mistake into it. The correct version should be:
((http|https|ftp):(\/\/|\\\\)((\w)+[.]){1,}(net|com|cn|org|cc|tv|[0-9]{1,3})(((\/[\~]*|\\[\~]*)(\w)+)|[.](\w)+)*(((([?](\w)+){1}[=]*))*((\w)+){1}([\&](\w)+[\=](\w)+)*)*)
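To see why the comma width matters, here is a small Python sketch under some assumptions: the helper name matches and the example.com URL are illustrative, and the behavior shown is that of Python's re module (other engines may reject the pattern outright instead). With a full-width comma (U+FF0C), {1，} is no longer a valid quantifier, so Python treats it as literal text and the pattern stops matching ordinary URLs.

```python
import re

# Pattern with ASCII (half-width) commas in the two quantifiers.
GOOD = (
    r"((http|https|ftp):(\/\/|\\\\)((\w)+[.]){1,}"
    r"(net|com|cn|org|cc|tv|[0-9]{1,3})"
    r"(((\/[\~]*|\\[\~]*)(\w)+)|[.](\w)+)*"
    r"(((([?](\w)+){1}[=]*))*((\w)+){1}([\&](\w)+[\=](\w)+)*)*)"
)

# Same pattern with both commas swapped for full-width commas (U+FF0C).
BAD = GOOD.replace("{1,}", "{1\uff0c}").replace("{1,3}", "{1\uff0c3}")


def matches(pattern: str, text: str) -> bool:
    """Full-match test; a pattern that fails to compile counts as no match."""
    try:
        return re.fullmatch(pattern, text) is not None
    except re.error:
        return False


if __name__ == "__main__":
    url = "http://www.example.com/news/230.aspx"  # placeholder URL
    print("half-width commas:", matches(GOOD, url))
    print("full-width commas:", matches(BAD, url))
```

The two patterns look nearly identical on screen, which is why the problem is easy to miss when copying from a web page.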