Notes on using regular expressions to filter scripts (ASP.NET + C#)
When building websites (especially BBS forums), you often need to let users enter HTML-style markup while preventing scripts from running: the goal is to allow rich page formatting but stop malicious code from executing.
Of course, the HtmlEncode and HtmlDecode methods cannot be used for this, because encoding disables even basic HTML markup.
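To illustrate why encoding is too blunt for this purpose, here is a minimal sketch. It assumes `System.Net.WebUtility.HtmlEncode` as a stand-in for the ASP.NET `HtmlEncode` method mentioned above, so that it runs outside a web project; the class name `EncodeDemo` is hypothetical.

```csharp
using System;
using System.Net;

public static class EncodeDemo
{
    // Encodes every markup character, so harmless tags such as <b> are
    // neutralized together with the dangerous ones.
    public static string Encode(string html)
    {
        return WebUtility.HtmlEncode(html);
    }

    public static void Main()
    {
        Console.WriteLine(Encode("<b>bold</b> <script>alert(1)</script>"));
        // prints: &lt;b&gt;bold&lt;/b&gt; &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```

The `<b>` tag is escaped along with the `<script>` tag, so the browser shows the markup as literal text instead of rendering bold text.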
I searched online and did not find a good solution, but I did collect the common forms of script attack:
1. Code inside <script> tags
2. Code in <a href=javascript: ...
3. Code in the on... event attributes of other basic elements
4. Attacks that load other pages through iframe and frameset
With this information the task becomes much simpler: write a small method that uses regular expressions to replace the code listed above, as follows:
public string Wipescript(string html)
{
    // [\s\S] matches any character, including newlines. The quantifiers are lazy (+?, *?)
    // so that each match stops at the nearest terminator instead of the last one in the input.
    System.Text.RegularExpressions.Regex regex1 = new System.Text.RegularExpressions.Regex(@"<script[\s\S]+?</script *>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex2 = new System.Text.RegularExpressions.Regex(@"href *= *[\s\S]*?script *:", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    // Restricted to attribute names that start with "on" (onclick, onload, ...), so that
    // a stray "on" inside ordinary words is not matched.
    System.Text.RegularExpressions.Regex regex3 = new System.Text.RegularExpressions.Regex(@"\bon\w+\s*=", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex4 = new System.Text.RegularExpressions.Regex(@"<iframe[\s\S]+?</iframe *>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex5 = new System.Text.RegularExpressions.Regex(@"<frameset[\s\S]+?</frameset *>", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    html = regex1.Replace(html, "");                // remove <script></script> blocks
    html = regex2.Replace(html, "");                // remove href=javascript: from <a> tags
    html = regex3.Replace(html, "_disibledevent="); // rename on... event attributes so they no longer fire
    html = regex4.Replace(html, "");                // remove iframe elements
    html = regex5.Replace(html, "");                // remove frameset elements
    return html;
}
This method takes HTML that may contain scripts and returns the cleaned HTML.
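To make the behavior concrete, the following sketch runs a compact restatement of the filter on one sample input per attack type. The patterns are an assumed reading of the intended expressions: `[\s\S]` as the match-anything class, lazy quantifiers, and an event pattern narrowed to `\bon\w+\s*=`. The class name `ScriptFilterDemo`, the method casing `WipeScript`, and the sample strings are illustrative and not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScriptFilterDemo
{
    private const RegexOptions Options = RegexOptions.IgnoreCase;

    // Compact restatement of the five replacements (assumed patterns, see lead-in).
    public static string WipeScript(string html)
    {
        html = Regex.Replace(html, @"<script[\s\S]+?</script *>", "", Options);     // script blocks
        html = Regex.Replace(html, @"href *= *[\s\S]*?script *:", "", Options);     // href=javascript:
        html = Regex.Replace(html, @"\bon\w+\s*=", "_disibledevent=", Options);     // on... attributes
        html = Regex.Replace(html, @"<iframe[\s\S]+?</iframe *>", "", Options);     // iframe elements
        html = Regex.Replace(html, @"<frameset[\s\S]+?</frameset *>", "", Options); // frameset elements
        return html;
    }

    public static void Main()
    {
        Console.WriteLine(WipeScript("<b>hi</b><script>alert(1)</script>"));
        // prints: <b>hi</b>
        Console.WriteLine(WipeScript("<a href=\"javascript:alert(1)\">x</a>"));
        // prints: <a alert(1)">x</a>
        Console.WriteLine(WipeScript("<img src=\"a.gif\" onload=\"steal()\">"));
        // prints: <img src="a.gif" _disibledevent="steal()">
        Console.WriteLine(WipeScript("<iframe src=\"http://evil.example/\"></iframe>ok"));
        // prints: ok
    }
}
```

Note that the second case leaves a malformed `<a>` tag behind: the filter deletes the dangerous prefix rather than the whole attribute, which neutralizes the link but does not tidy the markup.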
I ran some simple tests and the method met my requirements, but a few questions remain:
Does the filter above cover every case, or are there other means of script attack?
Is there a better solution?
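As a partial answer to the first question, the sketch below shows two inputs that a filter of this shape appears to leave untouched: a `javascript:` URL whose `s` is written as the character reference `&#115;` (browsers decode it before following the link), and a `javascript:` URL placed in a form's `action` attribute instead of `href`. The filter is restated with assumed patterns (`[\s\S]`, lazy quantifiers, `\bon\w+\s*=`); the class name `ScriptFilterGaps` and the sample strings are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScriptFilterGaps
{
    private const RegexOptions Options = RegexOptions.IgnoreCase;

    // Same five replacements as the method in this article (assumed patterns).
    public static string WipeScript(string html)
    {
        html = Regex.Replace(html, @"<script[\s\S]+?</script *>", "", Options);
        html = Regex.Replace(html, @"href *= *[\s\S]*?script *:", "", Options);
        html = Regex.Replace(html, @"\bon\w+\s*=", "_disibledevent=", Options);
        html = Regex.Replace(html, @"<iframe[\s\S]+?</iframe *>", "", Options);
        html = Regex.Replace(html, @"<frameset[\s\S]+?</frameset *>", "", Options);
        return html;
    }

    // True when the filter returns the input unchanged, i.e. the payload survives.
    public static bool SlipsThrough(string html)
    {
        return WipeScript(html) == html;
    }

    public static void Main()
    {
        // The literal text "script:" never appears, so the href rule does not fire.
        Console.WriteLine(SlipsThrough("<a href=\"java&#115;cript:alert(1)\">x</a>"));
        // There is no href attribute at all, and "action=" is not an on... attribute.
        Console.WriteLine(SlipsThrough("<form action=\"javascript:alert(1)\"><input type=\"submit\"></form>"));
    }
}
```

Both calls print `True`, which suggests that a pattern blacklist of this kind has to be extended for each new vector it is meant to stop.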