ASP.NET custom function: filter HTML tags but keep line breaks and spaces

Source: Internet
Author: User
Tags httpcontext


I found a method for filtering HTML tags on the internet. I don't know who wrote it originally, and most of the copies in circulation are identical. I copied it as is.

The code is as follows:
// Requires: using System.Text.RegularExpressions; using System.Web;

/// <summary>
/// Remove HTML tags
/// </summary>
/// <param name="Htmlstring">Source text that contains HTML</param>
/// <returns>Text with the HTML removed</returns>
public static string NoHTML(string Htmlstring)
{
    // Delete script blocks
    Htmlstring = Regex.Replace(Htmlstring, @"<script[^>]*?>.*?</script>", "", RegexOptions.IgnoreCase);
    // Delete HTML tags
    Htmlstring = Regex.Replace(Htmlstring, @"<(.[^>]*)>", "", RegexOptions.IgnoreCase);
    // Delete a line break together with the whitespace that follows it
    Htmlstring = Regex.Replace(Htmlstring, @"([\r\n])[\s]+", "", RegexOptions.IgnoreCase);
    // Delete leftover comment markers
    Htmlstring = Regex.Replace(Htmlstring, @"-->", "", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"<!--.*", "", RegexOptions.IgnoreCase);
    // Decode common entities
    Htmlstring = Regex.Replace(Htmlstring, @"&(quot|#34);", "\"", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(amp|#38);", "&", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(lt|#60);", "<", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(gt|#62);", ">", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(nbsp|#160);", " ", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(iexcl|#161);", "\u00A1", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(cent|#162);", "\u00A2", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(pound|#163);", "\u00A3", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(copy|#169);", "\u00A9", RegexOptions.IgnoreCase);
    // Drop any remaining numeric entities
    Htmlstring = Regex.Replace(Htmlstring, @"&#(\d+);", "", RegexOptions.IgnoreCase);

    // string.Replace returns a new string, so the result must be assigned
    // (the copied version discarded it, which made these three calls no-ops)
    Htmlstring = Htmlstring.Replace("<", "");
    Htmlstring = Htmlstring.Replace(">", "");
    Htmlstring = Htmlstring.Replace("\r\n", "");
    Htmlstring = HttpContext.Current.Server.HtmlEncode(Htmlstring).Trim();

    return Htmlstring;
}


The code above is copied directly from the internet. It does filter out all HTML tags, but it is not what I want: the result is too clean. When the input comes from a textarea, I want to keep spaces and line breaks.
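To see why the first version is too aggressive for textarea input, the whitespace step can be run on its own. This is only a small sketch: the class name `WhitespaceDemo`, the helper `CollapseLineBreaks`, and the sample string are made up for illustration, while the pattern is the one used in NoHTML.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Hypothetical helper: applies only the whitespace step of NoHTML,
    // i.e. a CR or LF followed by one or more whitespace characters is deleted.
    public static string CollapseLineBreaks(string text)
    {
        return Regex.Replace(text, @"([\r\n])[\s]+", "");
    }

    public static void Main()
    {
        // Two lines typed into a textarea, the second one indented
        string input = "line1\r\n  line2";

        // The line break and the indentation are both removed
        Console.WriteLine(CollapseLineBreaks(input)); // prints "line1line2"
    }
}
```

The two lines run together, which is exactly the loss of layout described above.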

So I modified the method myself. A line break from a textarea is \n (or \r\n), so I replace these characters with <br>. That way line breaks display correctly when the page reads the content back from the database, and each space is replaced with the HTML space entity &nbsp;.

The code is as follows:
// Requires: using System.Text.RegularExpressions; using System.Web;

/// <summary>
/// Remove HTML tags but retain line breaks (as br) and spaces
/// (this method was adapted from the version by "three-volume Tianshu" on "blog garden")
/// </summary>
/// <param name="Htmlstring">Source text that contains HTML</param>
/// <returns>Text with the HTML removed</returns>
public static string NewNoHTML(string Htmlstring)
{
    // Htmlstring.Replace("\r\n", "%r%n").Replace("<br>", "%br%").Replace("<br/>", "%br&%").Replace("\n", "%n");
    // Delete script blocks
    Htmlstring = Regex.Replace(Htmlstring, @"<script[^>]*?>.*?</script>", "", RegexOptions.IgnoreCase);
    // Delete HTML tags
    Htmlstring = Regex.Replace(Htmlstring, @"<(.[^>]*)>", "", RegexOptions.IgnoreCase);

    // Delete leftover comment markers
    Htmlstring = Regex.Replace(Htmlstring, @"-->", "", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"<!--.*", "", RegexOptions.IgnoreCase);
    // Decode common entities
    Htmlstring = Regex.Replace(Htmlstring, @"&(quot|#34);", "\"", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(amp|#38);", "&", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(lt|#60);", "<", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(gt|#62);", ">", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(nbsp|#160);", " ", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(iexcl|#161);", "\u00A1", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(cent|#162);", "\u00A2", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(pound|#163);", "\u00A3", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(copy|#169);", "\u00A9", RegexOptions.IgnoreCase);
    // Drop any remaining numeric entities
    Htmlstring = Regex.Replace(Htmlstring, @"&#(\d+);", "", RegexOptions.IgnoreCase);

    // string.Replace returns a new string, so the result must be assigned
    Htmlstring = Htmlstring.Replace("<", "");
    Htmlstring = Htmlstring.Replace(">", "");
    // Htmlstring = Htmlstring.Replace("\r\n", "");
    Htmlstring = HttpContext.Current.Server.HtmlEncode(Htmlstring);
    // Encode first, then turn line breaks into <br> and whitespace into &nbsp;
    Htmlstring = Regex.Replace(Htmlstring, @"(\r\n)", "<br>");
    Htmlstring = Regex.Replace(Htmlstring, @"(\r|\n)", "<br>");
    Htmlstring = Regex.Replace(Htmlstring, @"(\s)", "&nbsp;");
    return Htmlstring;
}
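The last few replacements are the part that preserves the textarea layout, and they can be tried outside a web request. The following is a minimal sketch, not the method above: it assumes `System.Net.WebUtility.HtmlEncode` as a stand-in for `HttpContext.Current.Server.HtmlEncode`, and the names `LayoutDemo` and `PreserveLayout` are hypothetical.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class LayoutDemo
{
    // Hypothetical helper: HTML-encode the text, then convert line breaks
    // to <br> and every remaining whitespace character to &nbsp;.
    public static string PreserveLayout(string text)
    {
        // Assumption: WebUtility.HtmlEncode replaces Server.HtmlEncode here
        string encoded = WebUtility.HtmlEncode(text);
        // \r\n is listed first so a Windows line break yields a single <br>
        encoded = Regex.Replace(encoded, @"\r\n|\r|\n", "<br>");
        encoded = Regex.Replace(encoded, @"\s", "&nbsp;");
        return encoded;
    }

    public static void Main()
    {
        Console.WriteLine(PreserveLayout("a  b\r\nc")); // prints "a&nbsp;&nbsp;b<br>c"
    }
}
```

The order matters: encoding has to happen before the <br> and &nbsp; substitutions, otherwise the encoder would escape the markup that was just inserted.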


This filter can be applied to content that users enter before it is published. If you find any shortcomings, corrections are welcome!
