Matching URLs with a Regular Expression

Source: Internet
Author: User

I recently worked on some URL-matching tasks without knowing much about regular expressions. I searched the internet for expressions that others had written, but none of them worked for my case, so I wrote my own and am posting it here for reference.

(This is for an ASP.NET project; the task is to block certain URLs entered in a text box.)

First, the regular expression:

string check = @"((http|ftp|https)://)(([a-zA-Z0-9\._-]+\.[a-zA-Z]{2,6})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,4})*(/[a-zA-Z0-9\&%_\./-~-]*)?";

A description of the regular expression:

①: A string that matches the regular expression must begin with http://, https://, or ftp://;

②: The regular expression can match a domain name or an IP address (e.g. http://www.baidu.com or http://192.168.1.1);

③: The regular expression can match through to the end of the URL, that is, it also covers the path and query string (for example, it can match http://www.baidu.com/s?wd=a&rsv_spt=1&issp=1&rsv_bp=0&ie=utf-8&tn=baiduhome_pg&inputt=1236);

④: The regular expression can match a port number (as written, {1,4} allows up to four digits).
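
As a rough illustration of the points above, the following sketch runs the expression over a piece of text and collects every match. The class name UrlFinder and the method name FindUrls are made up for this example and are not part of the original project.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlFinder
{
    // The expression from this article: scheme, then a domain name or an
    // IPv4-style address, then an optional port and an optional path/query.
    private const string Check =
        @"((http|ftp|https)://)(([a-zA-Z0-9\._-]+\.[a-zA-Z]{2,6})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,4})*(/[a-zA-Z0-9\&%_\./-~-]*)?";

    // Returns every substring of the input that the expression matches.
    public static List<string> FindUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in Regex.Matches(text, Check))
        {
            urls.Add(m.Value);
        }
        return urls;
    }
}
```

For the input "see http://www.baidu.com/s?wd=a and http://192.168.1.1:8080/x now", FindUrls returns two entries: http://www.baidu.com/s?wd=a and http://192.168.1.1:8080/x. Text without a scheme, such as a bare www.baidu.com, is not matched.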


Blocking specified URLs:

Suppose we want to block the URL http://www.baidu.com when it is entered in the text box. The traditional method is to use the regular expression above to extract every URL from the text box and then compare each one with the URLs to be blocked. The drawback is that an extracted URL extends all the way to the sub-URL, while the configuration file may list only a parent URL, so the extracted URL has to be trimmed before it can be compared. In addition, a website's default port is 80, so the port number also has to be taken into account in the comparison, and so on. I came up with a different approach:

Read the URLs to block from the configuration file and build a regular expression from them; run it against the text box content and block the input if it matches.

The configuration file should contain: <add key="Domaincheckblackurl" value="baidu.com"/>

Implemented in code:

The regular expression now consists of three parts:

1: The beginning of the regular expression: the scheme, followed by any number of leading subdomain labels;

2: The middle part of the regular expression: the part read from the configuration file;

3: The end of the regular expression: an optional port number, a subdirectory, and so on.

First, read the URLs from the configuration file: string[] serverlist = ConfigurationManager.AppSettings["Domaincheckblackurl"].Split(','); (entries in the configuration file are separated by ",")

Second, the beginning of the regular expression: string start = @"((http|ftp|https)://)([a-zA-Z0-9_-]+\.)*";

Then, the end of the regular expression: string end = @"(:[0-9]{1,4})?((/[a-zA-Z0-9\&%_\./-~-]*)|(?=[^a-zA-Z0-9\.]))";

The regular expression after combination: string check = start + @"((?<=[^a-zA-Z0-9])(" + cutStr + "))" + end; (where cutStr is one entry read from the configuration file)
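
The three parts can be put together roughly as follows. This is only a sketch under a few assumptions: the names UrlBlocker and ContainsBlockedUrl are invented for the example, and the comma-separated value is passed in as a parameter (configValue) instead of being read through ConfigurationManager, so that the snippet compiles on its own. It also deviates from the expressions above in two places, both marked in comments: each entry goes through Regex.Escape so that the dot in "baidu.com" matches only a literal dot, and the end part uses a negative lookahead so that a blocked URL at the very end of the input is still caught.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlBlocker
{
    // Beginning: scheme plus any number of leading subdomain labels.
    private const string Start = @"((http|ftp|https)://)([a-zA-Z0-9_-]+\.)*";

    // End: optional port, then either a path or a position that is NOT followed
    // by a letter, digit or dot. Deviation from the article: (?!...) is used
    // instead of (?=[^...]) so that the end of the input also qualifies.
    private const string End =
        @"(:[0-9]{1,4})?((/[a-zA-Z0-9\&%_\./-~-]*)|(?![a-zA-Z0-9\.]))";

    // configValue stands in for the comma-separated appSettings value.
    public static bool ContainsBlockedUrl(string text, string configValue)
    {
        string[] serverList = configValue.Split(',');
        foreach (string entry in serverList)
        {
            string cutStr = entry.Trim();
            if (cutStr.Length == 0) continue;

            // Deviation from the article: escape the entry before embedding it.
            string check = Start
                + @"((?<=[^a-zA-Z0-9])(" + Regex.Escape(cutStr) + "))"
                + End;

            if (Regex.IsMatch(text, check)) return true;
        }
        return false;
    }
}
```

With configValue set to "baidu.com", the call ContainsBlockedUrl("visit http://www.baidu.com/s?wd=a", "baidu.com") returns true, while "http://baidu.com.cn/x" and "http://notbaidu.com/" are not blocked, because the end part refuses a following dot or letter and the beginning only accepts whole subdomain labels.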


These are just a few small ideas of mine; I hope they are helpful to you.
