How to Prevent XSS attacks

Source: Internet
Author: User
Tags decode all alphanumeric characters

www.2cto.com: This is an old article, but XSS has been a hot topic recently, so it is worth a look.

These rules apply to all the different types of XSS (cross-site scripting) attacks. Reflected XSS and stored XSS can both be addressed by performing proper escaping on the server. Because XSS has many special cases, we strongly recommend using an encoding library. DOM-based XSS can also be addressed by applying these rules to untrusted data on the client.

Untrusted data is usually data that comes from the HTTP request, in the form of URL parameters, form fields, headers, or cookies. From a security perspective, however, data from databases, web services, and other sources is often untrusted as well, because it may not have been fully validated. Always be wary of untrusted data and treat it as though it contains an attack. That means you should not send it anywhere without first taking steps to make sure that any attack it carries is neutralized. As applications become more and more interconnected, an attack buried in data can quickly spread to downstream interpreters.

Traditionally, input validation has been the preferred way to handle untrusted data. However, input validation is not a great solution for injection attacks. First, input validation is typically done when the data is received, before its destination is known. That means we do not know which characters are significant to the target interpreter. Second, and possibly more important, applications must often allow potentially harmful characters in. For example, should Mr. O'Malley be prevented from registering in the database simply because SQL treats ' as a special character?

Although input validation is important, it is never a complete solution to injection attacks. It is better to treat input validation as defense in depth and to use escaping as the primary line of defense.
Escaping (also known as Output Encoding)

Escaping is a technique for ensuring that characters are treated as data rather than as characters that are significant to the interpreter's parser. There are many different types of escaping, and the result is sometimes called "output encoding". Some techniques define a special "escape" character, while others use a more complex syntax that involves several characters.

Do not confuse output escaping with Unicode character encoding, which maps Unicode characters to sequences of bits. That level of encoding is decoded automatically and does nothing to defuse attacks. However, if the server and the browser disagree about the intended character set, unintended characters may be communicated, which can enable XSS attacks. This is why it is important to specify the character encoding (charset), such as UTF-8, for all communications.

Escaping is the primary means of making sure that untrusted data cannot be used to deliver an injection attack. Properly escaped data is not harmed: it is still displayed correctly in the browser. Escaping only prevents the attack from working.

Injection Attack Theory

An injection attack works by using special characters (characters that are significant to the interpreter in use) to break out of a data context and switch into a code context. XSS is a form of injection in which the interpreter is the browser and the attack is hidden in an HTML document. HTML is a notoriously bad mashup of code and data, because there are so many possible places to put code and so many different valid encodings. HTML is particularly difficult because it is not only hierarchical but also contains many different parsers (XML, HTML, JavaScript, VBScript, CSS, URL, etc.).
To really understand how injection relates to XSS, we must consider injection within the hierarchy of the HTML DOM. Given a place to insert data into an HTML document (that is, a place where the developer allows untrusted data to be included in the DOM), there are two main ways to inject code.

Injecting UP

The most common way is to close the current context and start a new code context. For example, this is what happens when you close an HTML attribute with "> and start a new <script> tag. This attack closes the original context (going up in the hierarchy) and then starts a new tag that allows script code to execute. Remember that when you try to break out of the current context, you may skip several layers up the hierarchy. For example, a </script> tag terminates a script block even if it is injected inside a quoted string within a method call within the script. This happens because the HTML parser runs before the JavaScript parser.

Injecting DOWN

The less common way to perform XSS injection is to introduce a code subcontext without closing the current context. For example, if the attacker is able to change <img src="...UNTRUSTED DATA HERE..." /> to <img src="javascript:alert(document.cookie)" />, there is no need to break out of the HTML attribute context. It is enough to introduce a context that allows scripting within the src attribute. Another example is the expression() function in CSS properties. Even though you may not be able to break out of a quoted CSS property to inject up, you may be able to introduce something like xss:expression(document.write(document.cookie)) without ever leaving the current context.

It is also possible to inject directly into the current context. For example, an application might take untrusted input and place it directly into a JavaScript context. This is more common than you might think, and it is impossible to secure with escaping (or anything else). In essence, if you do this, your application simply becomes a conduit for getting the attacker's code running in the browser.
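To make "injecting up" concrete, here is a minimal Java sketch. It assumes a hypothetical page fragment that concatenates untrusted input into a quoted value attribute; the class name, method name, tag, and payload are all made up for illustration and do not come from any particular application.

```java
final class InjectionUpDemo {
    // Hypothetical naive template: untrusted input is concatenated
    // straight into a double-quoted attribute value with no escaping.
    static String renderUnsafe(String untrusted) {
        return "<input name=\"q\" value=\"" + untrusted + "\">";
    }

    public static void main(String[] args) {
        // The payload closes the attribute and the tag ("injecting up"),
        // then opens a new <script> context.
        String payload = "\"><script>alert(1)</script>";
        System.out.println(renderUnsafe(payload));
    }
}
```

Running main prints <input name="q" value=""><script>alert(1)</script>">, where the attribute has been closed early and a script element follows it.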
The rules in this article are designed to prevent both the UP and DOWN varieties of XSS injection. To prevent injecting up, you must escape the characters that would allow the current context to be closed and a new one started. Because attacks can jump up several levels of the DOM hierarchy, you must escape all the characters that could close any enclosing context. To prevent injecting down, you must escape any characters that can be used to introduce a new subcontext within the current context.

A Positive XSS Prevention Model

This article treats an HTML page as a template with slots in which a developer is allowed to put untrusted data. Putting untrusted data anywhere else is not allowed. This is a "whitelist" model that denies everything that is not specifically allowed.

Because of the way browsers parse HTML, each type of slot has slightly different security rules. When you put untrusted data into one of these slots, you must take steps to make sure the data does not break out of that slot into a context that allows code execution. In a sense, this approach treats an HTML document like a parameterized database query: the data is kept in specific places and is isolated from code contexts by escaping.

This article sets out the most common types of slots and the rules for putting untrusted data into them safely. Based on the relevant specifications, known XSS vectors, and a great deal of manual testing with popular browsers, we believe that the rules proposed here are safe. The slots are defined by the rules. Developers should not put data into any other location without very careful analysis to ensure that it is safe. Browser parsing is extremely tricky, and many innocuous-looking characters can turn out to be significant.

Why Can't I Just HTML Entity Encode Untrusted Data?

HTML entity encoding is fine for untrusted data that you put in the body of the HTML document, such as inside a <div> tag. It also works for untrusted data that goes into attributes, particularly when the attribute values are quoted.
However, HTML entity encoding does not work when you put untrusted data inside a <script> tag, an event handler attribute (such as onmouseover), CSS, or a URL. Even if you apply HTML entity encoding everywhere, you are still vulnerable to cross-site scripting. You must use the escape syntax for the part of the HTML document into which you are putting untrusted data. That is what the rules in the rest of this article are about.

Writing these encoders is not tremendously difficult, but there are quite a few hidden pitfalls. For example, you might be tempted to use escaping shortcuts such as \" in JavaScript. These are easily misinterpreted by the nested parsers in the browser. A security-focused encoding library should be used to make sure the rules are implemented correctly.

XSS Prevention Rules

The following rules are intended to prevent all XSS in your application. Although they do not allow complete freedom in placing untrusted data into an HTML document, they cover the vast majority of common use cases. You do not have to adopt all of the rules. Many organizations may find that rules No. 1 and No. 2 are sufficient for their needs. Select rules as needed.

No. 1 - Never insert untrusted data except in allowed locations

The first rule is to deny all: do not put untrusted data into your HTML document unless it is within one of the slots defined by the other rules. The reason is that HTML contains so many strange contexts that the list of escaping rules becomes very complicated, and there is no good reason to put untrusted data in these contexts.

    <script>...NEVER PUT UNTRUSTED DATA HERE...</script>    directly in a script
    <!--...NEVER PUT UNTRUSTED DATA HERE...-->    inside an HTML comment
    <div ...NEVER PUT UNTRUSTED DATA HERE...=test />    in an attribute name
    <...NEVER PUT UNTRUSTED DATA HERE... href="/test" />    in a tag name

Most importantly, never accept JavaScript code from an untrusted source and then run it. For example, if a parameter named "callback" contains a JavaScript code snippet, no amount of escaping can fix the problem.
No. 2 - HTML-escape before inserting untrusted data into HTML element content

This rule applies when you want to put untrusted data directly into the HTML body somewhere. This includes the inside of normal tags such as div, p, b, and td. Most web frameworks have a method for HTML escaping that handles the characters listed below. However, this is far from sufficient for other HTML contexts, where you need to apply the other rules.

    <body>...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...</body>
    <div>...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...</div>
    any other normal HTML elements

Escape the following characters with HTML entity encoding to prevent switching into any execution context, such as script, style, or event handlers. Using hex entities is recommended. In addition to the five characters that are significant in XML (&, <, >, ", and '), the forward slash is included because it helps to end an HTML entity.

    & --> &amp;
    < --> &lt;
    > --> &gt;
    " --> &quot;
    ' --> &#x27;    (&apos; is not recommended)
    / --> &#x2F;    (forward slash is included as it helps end an HTML entity)

ESAPI reference implementation:

    String safe = ESAPI.encoder().encodeForHTML(request.getParameter("input"));

No. 3 - Attribute-escape before inserting untrusted data into HTML common attributes

This rule is for putting untrusted data into typical attribute values such as width, name, and value. It should not be used for complex attributes such as href, src, style, or any of the event handlers. This is important: event handler attributes must follow the rule for HTML JavaScript data values (No. 4) instead.

    <div attr=...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...>content</div>    inside an unquoted attribute
    <div attr='...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...'>content</div>    inside a single-quoted attribute
    <div attr="...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...">content</div>    inside a double-quoted attribute

Except for alphanumeric characters, escape all characters with ASCII values less than 256 using the &#xHH; format (or a named entity) to prevent switching out of the attribute.
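The character table in No. 2 and the &#xHH; scheme in No. 3 can be sketched in plain Java as follows. This is an illustrative sketch only, not the ESAPI implementation: the class and method names are hypothetical, characters at or above 256 are passed through unchanged because the rule as stated only covers values below 256, and control characters are not treated specially.

```java
final class HtmlEncodingSketch {
    // No. 2 (sketch): replace the six significant characters with entities.
    static String encodeForHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                case '/':  out.append("&#x2F;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // ASCII letters and digits are the only characters left untouched by No. 3.
    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // No. 3 (sketch): every non-alphanumeric character below 256 becomes &#xHH;
    static String encodeForHtmlAttribute(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 256 && !isAsciiAlphanumeric(c)) {
                out.append("&#x").append(String.format("%02X", (int) c)).append(';');
            } else {
                out.append(c); // assumption: values >= 256 pass through unchanged
            }
        }
        return out.toString();
    }
}
```

With this sketch, encodeForHtml("<b>O'Malley</b>") yields &lt;b&gt;O&#x27;Malley&lt;&#x2F;b&gt;, and encodeForHtmlAttribute("a b\"") yields a&#x20;b&#x22;. Note that the space is encoded too, which is what keeps the value from breaking out of an unquoted attribute.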
The attribute rule (No. 3) is this broad because developers frequently leave attributes unquoted. Properly quoted attributes can only be broken out of with the corresponding quote. Unquoted attributes can be broken out of with many characters, including [space] % * + , - / ; < = > ^ and |.

ESAPI reference implementation:

    String safe = ESAPI.encoder().encodeForHTMLAttribute(request.getParameter("input"));

No. 4 - JavaScript-escape before inserting untrusted data into HTML JavaScript data values

This rule concerns the JavaScript event handlers that are specified on various HTML elements. The only safe place to put untrusted data in these handlers is inside a "data value". Placing untrusted data in these small code blocks is quite risky because it is very easy to switch into an execution context, so use it with caution.

    <script>alert('...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...')</script>    inside a quoted string
    <script>x=...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...</script>    one side of an expression
    <div onmouseover=...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...</div>    inside an unquoted event handler
    <div onmouseover='...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...'</div>    inside a quoted event handler
    <div onmouseover="...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE..."</div>    inside a quoted event handler

Except for alphanumeric characters, escape all characters with ASCII values less than 256 using the \xHH format to prevent switching out of the data value into the script context or into another attribute. Do not use escaping shortcuts such as \" because the quote character may be matched by the HTML attribute parser, which runs first. If an event handler is quoted, breaking out requires the corresponding quote. This rule is deliberately broad because developers frequently leave event handler attributes unquoted. Properly quoted attributes can only be broken out of with the corresponding quote, whereas unquoted attributes can be broken out of with many characters, including [space] % * + , - / ; < = > ^ and |.
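The \xHH scheme for JavaScript data values can be sketched in plain Java as follows. This is an illustrative sketch under stated assumptions, not the ESAPI implementation: the class and method names are hypothetical, and characters at or above 256 are passed through unchanged because the rule as stated only covers values below 256.

```java
final class JavaScriptEncodingSketch {
    // ASCII letters and digits are the only characters left untouched.
    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // Every non-alphanumeric character below 256 becomes \xHH.
    // No shortcuts such as \" are emitted, so the output contains no quote
    // or angle-bracket characters for the HTML parser to match.
    static String encodeForJavaScript(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 256 && !isAsciiAlphanumeric(c)) {
                out.append("\\x").append(String.format("%02X", (int) c));
            } else {
                out.append(c); // assumption: values >= 256 pass through unchanged
            }
        }
        return out.toString();
    }
}
```

For the input '</script> this sketch produces \x27\x3C\x2Fscript\x3E, so neither the quote nor the closing tag survives in a form that the HTML parser could act on.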
For No. 4, remember too that because the HTML parser runs before the JavaScript parser, a closing </script> tag closes a script block even when it appears inside a quoted string.

ESAPI reference implementation:

    String safe = ESAPI.encoder().encodeForJavaScript(request.getParameter("input"));

No. 5 - CSS-escape before inserting untrusted data into HTML style property values

This rule applies when you want to put untrusted data into a stylesheet or a style tag. CSS is very powerful and can be used for many attacks. Therefore, only use untrusted data in a property value and not anywhere else in style data. Do not put untrusted data into complex properties such as url, behavior, and custom properties (-moz-binding). Likewise, do not put untrusted data into the expression property value in IE, which allows JavaScript.

    <style>selector { property : ...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...; }</style>    property value
    <span style=property : ...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...;>text</span>    property value

Except for alphanumeric characters, escape all characters with ASCII values less than 256 using the \HH format. Do not use escaping shortcuts such as \" because the quote character may be matched by the HTML attribute parser, which runs first. Escaping prevents the data value from switching into the script context or into another attribute, and also prevents switching into an expression or another property value that allows scripting. If an attribute is quoted, breaking out requires the corresponding quote. All attributes should be quoted. Unquoted attributes can be broken out of with many characters, including [space] % * + , - / ; < = > ^ and |. Also, because the HTML parser runs before the JavaScript parser, a </script> tag closes a script block even when it appears inside a quoted string.

ESAPI reference implementation:

    String safe = ESAPI.encoder().encodeForCSS(request.getParameter("input"));

No. 6 - URL-escape before inserting untrusted data into HTML URL attributes

This rule applies when you want to put untrusted data into a link to another location. This includes the href and src attributes. There are other location attributes as well, but we recommend against using untrusted data in them. Note that using untrusted data in javascript: URLs is risky; if you must do so, the HTML JavaScript data value rule (No. 4) can be applied.

    <a href=http://...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...>link</a>    a normal link
    <img src='http://...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE...' />    an image source
    <script src="http://...ESCAPE UNTRUSTED DATA BEFORE PUTTING HERE..." />    a script source

Except for alphanumeric characters, escape all characters with ASCII values less than 256 using the %HH format. Untrusted data should not be allowed in data: URLs, because there is no good way to use escaping to prevent switching out of the URL. All attributes should be quoted. Unquoted attributes can be broken out of with many characters, including [space] % * + , - / ; < = > ^ and |. Note that entity encoding is useless in this context.

ESAPI reference implementation:

    String safe = ESAPI.encoder().encodeForURL(request.getParameter("input"));
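The \HH scheme from No. 5 and the %HH scheme from No. 6 can be sketched in plain Java as follows. This is an illustrative sketch, not the ESAPI implementation, and the class and method names are hypothetical. Two details go beyond the text and are assumptions of this sketch: the CSS encoder appends a single space after each escape, since CSS lets one whitespace character terminate a hex escape so that a following hex digit is not absorbed into it, and the URL encoder percent-encodes the UTF-8 bytes of the input so that characters at or above 256 are handled as well.

```java
import java.nio.charset.StandardCharsets;

final class CssAndUrlEncodingSketch {
    static boolean isAsciiAlphanumeric(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // No. 5 (sketch): non-alphanumeric characters below 256 become \HH
    // followed by one space (assumption: the space terminates the escape).
    static String encodeForCss(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 256 && !isAsciiAlphanumeric(c)) {
                out.append('\\').append(String.format("%02X", (int) c)).append(' ');
            } else {
                out.append(c); // assumption: values >= 256 pass through unchanged
            }
        }
        return out.toString();
    }

    // No. 6 (sketch): every byte of the UTF-8 form that is not an ASCII
    // letter or digit becomes %HH. Apply this to the untrusted part of the
    // URL only, never to the whole URL.
    static String encodeForUrl(String input) {
        StringBuilder out = new StringBuilder(input.length() * 3);
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (isAsciiAlphanumeric(v)) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }
}
```

With this sketch, encodeForUrl("a b&c=d") yields a%20b%26c%3Dd, and encodeForCss("red;}") yields red\3B \7D followed by a trailing space.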
