Some research on using regular expressions to filter scripts (ASP.NET + C#)

Source: Internet
Author: User
When building websites (forums such as a BBS in particular), you often want users to be able to enter HTML markup so that pages look richer, while preventing any script from running so that malicious code cannot execute.
HtmlEncode and HtmlDecode are of no use here, because encoding the input also disables the basic HTML markup that users should be allowed to enter.
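To make the point about encoding concrete, here is a minimal sketch. It assumes System.Net.WebUtility.HtmlEncode as the encoder (the article does not say which HtmlEncode it means), and the class name EncodeDemo is purely illustrative.

```csharp
using System;
using System.Net;

public static class EncodeDemo
{
    // Encodes every markup character, so harmless tags such as <b>
    // are neutralised together with the <script> block.
    public static string Encode(string html)
    {
        return WebUtility.HtmlEncode(html);
    }

    public static void Main()
    {
        string input = "<b>bold</b><script>alert(1)</script>";
        Console.WriteLine(Encode(input));
        // prints: &lt;b&gt;bold&lt;/b&gt;&lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```

The bold tag is escaped along with the script, so the browser shows the markup as literal text instead of rendering it.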
I did not find a good solution by searching online, but I did collect some examples of script attacks:
1. Code contained in <script> tags
2. Code in <a href=javascript:...
3. Code in the on... event attributes of other elements
4. Attacks that load other pages through iframe and frameset
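One sample input per category may make the list easier to picture. The sketch below is only an illustration: the class name AttackSamples, the Classify helper, and its detection patterns are assumptions and do not come from the original article.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AttackSamples
{
    // One hypothetical sample per category in the list above.
    public static readonly string[] Samples =
    {
        "<script>alert(1)</script>",
        "<a href=\"javascript:alert(1)\">x</a>",
        "<img src=\"a.png\" onerror=\"alert(1)\">",
        "<iframe src=\"http://example.com/\"></iframe>"
    };

    // Returns the names of the categories that the input appears to fall into.
    public static List<string> Classify(string html)
    {
        var found = new List<string>();
        RegexOptions opt = RegexOptions.IgnoreCase;
        if (Regex.IsMatch(html, @"<script\b", opt))
            found.Add("script tag");
        if (Regex.IsMatch(html, @"href\s*=\s*[""']?\s*\w*script\s*:", opt))
            found.Add("script URL in href");
        if (Regex.IsMatch(html, @"\son\w+\s*=", opt))
            found.Add("on... event attribute");
        if (Regex.IsMatch(html, @"<(iframe|frameset)\b", opt))
            found.Add("iframe or frameset");
        return found;
    }

    public static void Main()
    {
        foreach (string sample in Samples)
        {
            Console.WriteLine(sample + " => " + string.Join(", ", Classify(sample)));
        }
    }
}
```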

With this list in hand, things become much simpler. Write a simple method that uses regular expressions to replace the code listed above:
public string wipescript(string html)
{
    // All patterns are case-insensitive. The lazy quantifier (+?) stops at the
    // first closing tag, so markup between two separate blocks is preserved.
    System.Text.RegularExpressions.Regex regex1 = new System.Text.RegularExpressions.Regex(@"<script[\s\S]+?</script *>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex2 = new System.Text.RegularExpressions.Regex(@"href\s*=\s*[""']?\s*\w*script\s*:", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex3 = new System.Text.RegularExpressions.Regex(@"\son\w+\s*=", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex4 = new System.Text.RegularExpressions.Regex(@"<iframe[\s\S]+?</iframe *>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex5 = new System.Text.RegularExpressions.Regex(@"<frameset[\s\S]+?</frameset *>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    html = regex1.Replace(html, ""); // filter <script></script> tags
    html = regex2.Replace(html, ""); // filter href=javascript: (<a>) attributes
    html = regex3.Replace(html, " _disibledevent="); // rename on... event attributes of other elements
    html = regex4.Replace(html, ""); // filter iframe
    html = regex5.Replace(html, ""); // filter frameset
    return html;
}
Pass the method HTML that may contain scripts; the value it returns is the cleaned HTML.
I ran some simple tests and the method meets my needs, but I still have a few questions:
Are there other script attacks?
Is there a better solution?
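On the first question, a small check suggests the answer is yes. The sketch below restates the five replacements in compact form; the class name FilterCheck, the method name Wipe, and the exact patterns are assumptions made for this example. It then tries two inputs that fall outside the four categories collected earlier. Browsers decode character references inside attribute values, so &#106;avascript: is read as javascript:, and an <object> element can embed another page much as an iframe does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FilterCheck
{
    private const RegexOptions Opt = RegexOptions.IgnoreCase;

    // Compact restatement of the five replacements (patterns are assumed).
    public static string Wipe(string html)
    {
        html = Regex.Replace(html, @"<script[\s\S]+?</script *>", "", Opt);
        html = Regex.Replace(html, @"href\s*=\s*[""']?\s*\w*script\s*:", "", Opt);
        html = Regex.Replace(html, @"\son\w+\s*=", " _disibledevent=", Opt);
        html = Regex.Replace(html, @"<iframe[\s\S]+?</iframe *>", "", Opt);
        html = Regex.Replace(html, @"<frameset[\s\S]+?</frameset *>", "", Opt);
        return html;
    }

    public static void Main()
    {
        // A plain script block is removed and the allowed markup survives.
        Console.WriteLine(Wipe("<b>hi</b><script>alert(1)</script>"));

        // The entity-encoded scheme never spells "script:" in the source text.
        string encoded = "<a href=\"&#106;avascript:alert(1)\">x</a>";
        Console.WriteLine(Wipe(encoded) == encoded); // True: passes through unchanged

        // <object> is not covered by any of the five patterns.
        string embedded = "<object data=\"http://example.com/page.html\"></object>";
        Console.WriteLine(Wipe(embedded) == embedded); // True: passes through unchanged
    }
}
```

Because a pattern list only removes what it already knows about, cases like these tend to slip through. Allowing only a fixed set of permitted tags and attributes is generally regarded as the safer direction for the second question.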