Regular expressions for every URL type: verifying that a URL complies with RFC 1738

From: http://www.pin5i.com/showtopic-25932.html

Before I read RFC 1738, I assumed that a regular expression for URLs would be very simple. I did not expect there to be so many URL types, and I certainly did not expect that even the regular expression for an ordinary HTTP URL would be far from simple.

The following is a common HTTP regular expression that I found:

    http://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?

Of course, this already meets most people's needs, but strict verification has to follow the grammar in RFC 1738.
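
To make the gap concrete, here is a small sketch (not part of the original code) that runs the simple pattern against a few inputs. The class name `SimplePatternDemo` and the sample URLs are assumptions made for this example. The pattern is anchored with `^` and `$` so that the whole string has to match, and the hyphen in the last character class is escaped to keep it literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimplePatternDemo
{
    // The simple pattern, anchored so the whole input must match.
    // The hyphen in the last character class is escaped to keep it literal.
    public const string SimpleHttp =
        @"^http://([\w-]+\.)+[\w-]+(/[\w\- ./?%&=]*)?$";

    public static bool Matches(string url)
    {
        return Regex.IsMatch(url, SimpleHttp);
    }

    public static void Demo()
    {
        // Accepted, although RFC 1738 does not allow "_" in a host name.
        Console.WriteLine(Matches("http://a_b.com"));
        // Rejected, although RFC 1738 allows an explicit port.
        Console.WriteLine(Matches("http://example.com:8080/"));
        // Accepted by the simple pattern and valid under RFC 1738 as well.
        Console.WriteLine(Matches("http://example.com/index.html"));
    }
}
```

So the simple pattern is both too loose (underscores in host names) and too strict (no port), which is why the stricter patterns below are built directly from the RFC grammar.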

RFC 1738 defines the following URL types: HTTP, FTP, news, NNTP, Telnet, Gopher, WAIS, mailto, file, Prospero, and a generic form for every other scheme (otherurl).

Enough talk. Here is the code, one region per URL type.

    #region Http
    // Basic character classes from the RFC 1738 BNF
    string lowalpha = @"[a-z]";
    string hialpha = @"[A-Z]";
    string alpha = string.Format(@"({0}|{1})", lowalpha, hialpha);
    string digit = @"[0-9]";
    string safe = @"(\$|-|_|\.|\+)";
    string extra = @"(!|\*|'|\(|\)|,)";
    // RFC 1738 allows hex digits in either case
    string hex = string.Format(@"({0}|A|B|C|D|E|F|a|b|c|d|e|f)", digit);
    string escape = string.Format(@"(%{0}{0})", hex);
    string unreserved = string.Format(@"({0}|{1}|{2}|{3})", alpha, digit, safe, extra);
    string uchar = string.Format(@"({0}|{1})", unreserved, escape);
    string reserved = @"(;|/|\?|:|@|&|=)";
    string xchar = string.Format(@"({0}|{1}|{2})", unreserved, reserved, escape);
    string digits = string.Format(@"({0}+)", digit);
    // Host and port
    string alphadigit = string.Format(@"({0}|{1})", alpha, digit);
    string domainlabel = string.Format(@"({0}|{0}({0}|-)*{0})", alphadigit);
    string toplabel = string.Format(@"({0}|{0}({1}|-)*{1})", alpha, alphadigit);
    string hostname = string.Format(@"(({0}\.)*{1})", domainlabel, toplabel);
    string hostnumber = string.Format(@"{0}\.{0}\.{0}\.{0}", digits);
    string host = string.Format(@"({0}|{1})", hostname, hostnumber);
    string port = digits;
    // {{0,1}} becomes the literal regex quantifier {0,1} after string.Format
    string hostport = string.Format(@"({0}(:{1}){{0,1}})", host, port);
    // Path and query
    string hsegment = string.Format(@"(({0}|;|:|@|&|=)*)", uchar);
    string search = string.Format(@"(({0}|;|:|@|&|=)*)", uchar);
    string hpath = string.Format(@"{0}(/{0})*", hsegment);
    string httpurl = string.Format(@"http://{0}(/{1}(\?{2}){{0,1}}){{0,1}}", hostport, hpath, search);
    #endregion
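
One detail in these patterns is easy to get wrong. Because they are assembled with `string.Format`, a regex quantifier such as `{0,1}` has to be written with doubled braces; with single braces it is parsed as a format item (index 0, alignment 1). The following minimal sketch shows the difference. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class FormatBraceDemo
{
    // Doubled braces survive string.Format as literal braces,
    // so the regex quantifier {0,1} reaches the final pattern.
    public static string Escaped(string host, string port)
    {
        return string.Format(@"({0}(:{1}){{0,1}})", host, port);
    }

    // Single braces are parsed as a format item: index 0, alignment 1.
    // The host text is inserted a second time and the quantifier is lost.
    public static string Unescaped(string host, string port)
    {
        return string.Format(@"({0}(:{1}){0,1})", host, port);
    }
}
```

With `"h"` and `"p"` as arguments, `Escaped` returns `(h(:p){0,1})` while `Unescaped` returns `(h(:p)h)`. Writing the optional group with `?` instead of `{0,1}` is an equivalent alternative that avoids the escaping altogether.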

    #region Ftp
    string user = string.Format(@"(({0}|;|\?|&|=)*)", uchar);
    string password = string.Format(@"(({0}|;|\?|&|=)*)", uchar);
    // Optional "user[:password]@" prefix followed by host and port
    string login = string.Format(@"(({0}(:{1}){{0,1}}@){{0,1}}{2})", user, password, hostport);
    string fsegment = string.Format(@"(({0}|\?|:|@|&|=)*)", uchar);
    string ftptype = @"(A|I|D|a|i|d)";
    string fpath = string.Format(@"({0}(/{0})*)", fsegment);
    string ftpurl = string.Format(@"ftp://{0}(/{1}(;type={2}){{0,1}}){{0,1}}", login, fpath, ftptype);
    #endregion

    #region News
    string group = string.Format(@"({0}({0}|{1}|-|\.|\+|_)*)", alpha, digit);
    string article = string.Format(@"(({0}|;|/|\?|:|&|=)+@{1})", uchar, host);
    string grouppart = string.Format(@"(\*|{0}|{1})", group, article);
    string newsurl = string.Format(@"(news:{0})", grouppart);
    #endregion

    #region Nntp
    string nntpurl = string.Format(@"nntp://{0}/{1}(/{2}){{0,1}}", hostport, group, digits);
    #endregion

    #region Telnet
    string telneturl = string.Format(@"telnet://{0}/{{0,1}}", login);
    #endregion

    #region Gopher
    string gtype = xchar;
    string selector = string.Format(@"({0}*)", xchar);
    string gopherplus_string = string.Format(@"({0}*)", xchar);
    // Five nested optional parts: "/", gtype, selector, "%09" search, "%09" gopher+ string
    string gopherurl = string.Format(@"gopher://{0}(/({1}({2}(%09{3}(%09{4}){{0,1}}){{0,1}}){{0,1}}){{0,1}}){{0,1}}", hostport, gtype, selector, search, gopherplus_string);
    #endregion

    #region Wais
    string database = string.Format(@"({0}*)", uchar);
    string wtype = string.Format(@"({0}*)", uchar);
    string wpath = string.Format(@"({0}*)", uchar);
    string waisdatabase = string.Format(@"(wais://{0}/{1})", hostport, database);
    string waisindex = string.Format(@"(wais://{0}/{1}\?{2})", hostport, database, search);
    string waisdoc = string.Format(@"(wais://{0}/{1}/{2}/{3})", hostport, database, wtype, wpath);
    // Grouped so that the alternation stays intact when the pattern is anchored or embedded
    string waisurl = string.Format(@"({0}|{1}|{2})", waisdatabase, waisindex, waisdoc);
    #endregion

    #region Mailto
    string encoded822addr = string.Format(@"({0}+)", xchar);
    string mailtourl = string.Format(@"mailto:{0}", encoded822addr);
    #endregion

    #region File
    string fileurl = string.Format(@"file://({0}{{0,1}}|localhost)/{1}", host, fpath);
    #endregion

    #region Prospero
    // RFC 1738 defines fieldname and fieldvalue as zero or more characters, hence the *
    string fieldname = string.Format(@"(({0}|\?|:|@|&)*)", uchar);
    string fieldvalue = string.Format(@"(({0}|\?|:|@|&)*)", uchar);
    string fieldspec = string.Format(@"(;{0}={1})", fieldname, fieldvalue);
    string psegment = string.Format(@"(({0}|\?|:|@|&|=)*)", uchar);
    string ppath = string.Format(@"({0}(/{0})*)", psegment);
    string prosperourl = string.Format(@"prospero://{0}/{1}({2})*", hostport, ppath, fieldspec);
    #endregion

    #region OtherUrl
    // otherurl is the same as genericurl
    string urlpath = string.Format(@"(({0})*)", xchar);
    string scheme = string.Format(@"(({0}|{1}|\+|-|\.)+)", lowalpha, digit);
    string ip_schemepart = string.Format(@"(//{0}(/{1}){{0,1}})", login, urlpath);
    string schemepart = string.Format(@"(({0})*|{1})", xchar, ip_schemepart);
    string genericurl = string.Format(@"{0}:{1}", scheme, schemepart);
    string otherurl = genericurl;
    #endregion

With the patterns in place, the rest is simple: it is nothing more than matching against a regular expression. Take HTTP as an example.

The HTTP pattern is the string httpurl. If the URL to verify is held in the variable url, the verification code is as follows:

    // Anchor the pattern so that the whole string must match, not just a substring
    Regex regex = new Regex("^(" + httpurl + ")$");
    bool isMatchHttp = regex.IsMatch(url);
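
As a self-contained illustration of the same check, the sketch below condenses the HTTP rules into character classes and wraps the result in `\A` and `\z`, because `Regex.IsMatch` on an unanchored pattern also succeeds when a valid URL merely appears somewhere inside a longer string. The class name `HttpUrlCheck` and the condensed pattern are assumptions made for this example, not the original author's code.

```csharp
using System.Text.RegularExpressions;

public static class HttpUrlCheck
{
    // Character classes standing in for the alternations used in the regions above.
    const string AlphaDigit = "[a-zA-Z0-9]";
    const string UChar = "(?:[a-zA-Z0-9$_.+!*'(),-]|%[0-9A-Fa-f]{2})";

    // Host: dotted labels ending in a label that starts with a letter,
    // or four dotted groups of digits.
    const string DomainLabel = AlphaDigit + "(?:[a-zA-Z0-9-]*" + AlphaDigit + ")?";
    const string TopLabel = "[a-zA-Z](?:[a-zA-Z0-9-]*" + AlphaDigit + ")?";
    const string Host = "(?:(?:" + DomainLabel + @"\.)*" + TopLabel
                      + @"|[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)";
    const string HostPort = Host + "(?::[0-9]+)?";

    // Path segments and the search part share the same character set.
    const string Segment = "(?:" + UChar + "|[;:@&=])*";
    const string HttpUrl = "http://" + HostPort
                         + "(?:/" + Segment + "(?:/" + Segment + ")*"
                         + @"(?:\?" + Segment + ")?)?";

    // \A and \z pin the match to the entire input.
    static readonly Regex Pattern = new Regex(@"\A" + HttpUrl + @"\z");

    public static bool IsHttpUrl(string url)
    {
        return Pattern.IsMatch(url);
    }
}
```

Under these assumptions, `HttpUrlCheck.IsHttpUrl("http://example.com:8080/a%20b?x=1")` is accepted, while `"see http://example.com now"` and `"http://a_b.com"` are rejected.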

(Text/pangxiaoliang [BEIJING] wanderer)
