Solutions to XSS attacks
In my previous article, "XSS attacks of front-end security", I did not give a complete solution to XSS attacks. XSS attacks come in so many varieties; is there a general approach that can counter them? After all, developers cannot cover every scenario one by one. Today, while reading the white hat Web security book, I found a better summary of the methods. They fall into two types: what the server can do, and what the client can do.
Prerequisites
Any discussion of XSS solutions rests on one premise: the browser's same-origin policy. It is the foundation of browser security, and even an attack script must obey it. The policy restricts a "document" or script from one origin from reading or setting certain attributes of a "document" from another origin. Besides the DOM, cookies, and XMLHttpRequest, some third-party plug-ins loaded by the browser have their own same-origin policies. However, tags such as script, img, iframe, and link can load resources across domains without being restricted by the same-origin policy.
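To make "same origin" concrete: two URLs share an origin when their scheme, host, and port all match. The following is only a minimal sketch of that comparison; the class and method names are made up for illustration, and real browsers apply further rules that are not modeled here.

```java
import java.net.URI;

// Sketch of the same-origin comparison: scheme, host, and port must all match.
final class SameOrigin {
    // Falls back to the default port when the URL does not name one.
    // Assumption: only http (80) and https (443) are considered.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    static boolean isSameOrigin(String a, String b) {
        URI ua = URI.create(a);
        URI ub = URI.create(b);
        return ua.getScheme() != null && ua.getHost() != null
                && ua.getScheme().equalsIgnoreCase(ub.getScheme())
                && ua.getHost().equalsIgnoreCase(ub.getHost())
                && effectivePort(ua) == effectivePort(ub);
    }
}
```

For example, http://a.example.com/x and http://a.example.com:80/y count as the same origin here, while switching to https or to another host does not.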
What the server can do
1. HttpOnly
A cookie marked HttpOnly can only be carried over the HTTP protocol (HTTPS included); JavaScript cannot read it. It is supported by IE6+, Firefox 2+, Google Chrome, and Safari 4+.
Java EE code that adds HttpOnly to a cookie:
response.setHeader("Set-Cookie","cookiename=value; Path=/;Domain=domainvalue;Max-Age=seconds;HTTPOnly");
PS: for HTTPS, you can also set the Secure attribute, so that the cookie is only sent over encrypted connections.
Strictly speaking, this does not prevent XSS; it prevents injected JS from reading cookies once an attack has happened.
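As a small sketch of the two flags together, the helper below assembles a Set-Cookie header value with HttpOnly always on and Secure optional. The class name, method name, and fixed Path=/ are assumptions made for this example, and the cookie name and value are assumed to be already safe to place in a header.

```java
// Builds a Set-Cookie header value that always carries the HttpOnly flag.
final class CookieHeader {
    static String build(String name, String value, int maxAgeSeconds, boolean secure) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append('=').append(value);
        sb.append("; Path=/");
        sb.append("; Max-Age=").append(maxAgeSeconds);
        sb.append("; HttpOnly");      // hide the cookie from document.cookie
        if (secure) {
            sb.append("; Secure");    // send the cookie over HTTPS only
        }
        return sb.toString();
    }
}
```

The result can be passed to response.setHeader("Set-Cookie", ...) as in the line above. On Servlet 3.0 and later, Cookie.setHttpOnly(true) and Cookie.setSecure(true) achieve the same thing without building the header by hand.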
2. Process Rich Text
Some data cannot simply be escaped on the server because of how it is used; rich text is the typical case. However, rich text is semantically a complete piece of HTML and is not spliced into a tag's attributes on output, so it can be treated as a special case. The approach is to configure a whitelist of rich-text tags and attributes on the server and reject every other tag or attribute (such as script, iframe, and form). This is the "XSS Filter". The data is then filtered before it is stored (I have not verified how the filtering itself works).
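Java has an open-source project, AntiSamy, which is a very good XSS Filter:
import org.owasp.validator.html.AntiSamy;
import org.owasp.validator.html.CleanResults;
import org.owasp.validator.html.Policy;
// Load the whitelist policy, scan the untrusted input, and store only the cleaned HTML.
Policy policy = Policy.getInstance(POLICY_FILE_LOCATION);
AntiSamy as = new AntiSamy();
CleanResults cr = as.scan(dirtyInput, policy);
MyUserDao.storeUserProfile(cr.getCleanHTML());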
PS: Of course, the filtering could also be done on the front end before display, but I think it is better to give the front end less to do; the server only needs to filter once.
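To illustrate the whitelist idea without any library, here is a deliberately tiny sketch. It is not how AntiSamy works: it escapes the whole input first and then restores only a handful of bare tags with no attributes. The tag list and all names are assumptions chosen for the example.

```java
import java.util.regex.Pattern;

// Toy whitelist filter: escape everything, then restore only bare whitelisted tags.
final class RichTextWhitelist {
    // Escaped form of the allowed tags; tags with attributes never match.
    private static final Pattern ALLOWED = Pattern.compile(
            "&lt;(/?)(b|i|em|strong|p|ul|ol|li|br)&gt;", Pattern.CASE_INSENSITIVE);

    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")   // must run first so later entities stay intact
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#x27;");
    }

    static String filter(String dirty) {
        String escaped = escapeHtml(dirty);
        // Turn the escaped whitelisted tags back into real tags.
        return ALLOWED.matcher(escaped).replaceAll("<$1$2>");
    }
}
```

Because anything that is not exactly a whitelisted bare tag stays escaped, script tags and event-handler attributes end up displayed as text. The price is that useful attributes such as href are lost, and the output is not guaranteed to be balanced HTML, which is why a real project would use a parser-based filter.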
What the client can do
1. Input check
The input check logic must be implemented in server code, because a check written only in JavaScript is easily bypassed by attackers. The common practice in Web development today is to implement the same input check in both the client JavaScript and the server code. The client-side JavaScript check blocks most mistakes made by normal users, which saves server resources.
PS: Simply put, the input check is done on both the server and the client.
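A server-side check can be as simple as a whitelist pattern per field. The sketch below assumes a hypothetical rule for a username field (3 to 16 letters, digits, or underscores); the rule and the names are illustrative only.

```java
import java.util.regex.Pattern;

// Server-side whitelist check for one input field.
final class InputCheck {
    // Hypothetical rule: 3-16 characters, letters, digits, or underscore only.
    private static final Pattern USERNAME = Pattern.compile("[A-Za-z0-9_]{3,16}");

    static boolean isValidUsername(String input) {
        // matches() requires the whole input to fit the pattern.
        return input != null && USERNAME.matcher(input).matches();
    }
}
```

The same rule can be mirrored in client JavaScript for quick feedback, but the server-side result is the one that counts.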
In addition, the places where an attacker can inject XSS input include:
1. All input boxes on the page
2. window.location (href, hash, etc.)
3. window.name
4. document.referrer
5. document.cookie
6. localStorage
7. Data returned by XMLHttpRequest
PS: Of course, there are more than these.
2. Output check
Generally, when a variable is output to an HTML page, encoding or escaping is used to defend against XSS attacks. The essence of XSS is "HTML injection": user data is executed as part of the HTML code, which confuses the original semantics and produces new ones.
Where XSS is triggered
1. document.write
2. xxx.innerHTML=
3. xxx.outerHTML=
4. innerHTML.replace
5. document.attachEvent
6. window.attachEvent
7. document.location.replace
8. document.location.assign
PS: If jQuery is used, the same applies to methods such as append, html, before, and after: XSS arises when variables are spliced into the HTML page. Most MVC frameworks, such as AngularJS, automatically handle XSS in the template (view layer).
Which encoding to use for escaping
The main ones are HTMLEncode and JavaScriptEncode, and both can be done on the client or on the server. However, I don't think it is very reliable to leave it all to the backend, because the same data may be used in several contexts, such as in tags, attributes, or scripts (or even on other terminals), and encoding it in only one way is too rigid.
1. HTMLEncode converts characters into HTML entities. Generally the characters &, <, >, ", ', and / are converted.
2. JavaScriptEncode uses "\" to escape special characters.
PS: in my article "HtmlEncode and JavaScriptEncode (XSS prevention)", I summarized complete front-end implementations of HTMLEncode and JavaScriptEncode, along with some examples.
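For reference, a server-side HTMLEncode covering the six characters listed above might look like the sketch below. The class and method names are assumptions for this example, and the numeric entities chosen for ' and / are one common convention rather than the only valid one.

```java
// Sketch of HTMLEncode: replaces the six characters &, <, >, ", ', / with entities.
final class HtmlEncoder {
    static String htmlEncode(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                case '/':  out.append("&#x2F;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Walking the string once, instead of chaining replace calls, avoids the classic mistake of encoding the & of an entity that was just produced.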
Which escaping is needed where
1. Output in HTML tags and attributes -- use HTMLEncode
2. Output in the script tag -- use JavaScriptEncode
3. Output in the event -- use JavaScriptEncode
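One way to implement JavaScriptEncode for the script and event cases is to leave letters and digits alone and turn every other character into a backslash escape. The sketch below takes that strict approach (an \xHH escape for characters below 256, a four-digit Unicode escape otherwise); it is an assumption about a reasonable implementation, not the only one, and the names are made up for the example.

```java
// Sketch of JavaScriptEncode: everything except ASCII letters and digits is escaped.
final class JsEncoder {
    static String javaScriptEncode(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum) {
                out.append(c);
            } else if (c < 256) {
                // Two-digit hex escape, e.g. a single quote becomes backslash-x-27.
                out.append(String.format("\\x%02X", (int) c));
            } else {
                // Four-digit hex escape for characters outside Latin-1.
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }
}
```

The encoded value is meant to be placed inside a quoted JavaScript string. Because no quotes, angle brackets, or ampersands survive the encoding, the same output can also sit inside an event attribute such as onclick without breaking out of it.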
4. Output in CSS
Use a method similar to JavaScriptEncode: every character other than letters and digits is encoded as a hexadecimal escape of the form "\HH".
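A sketch of such a CSS encoder follows. In CSS syntax an escape is written as a backslash followed by the hex digits of the code point, optionally terminated by a space, and that is the form emitted here. The class and method names are assumptions for the example.

```java
// Sketch of a CSS encoder: everything except ASCII letters and digits is escaped.
final class CssEncoder {
    static String cssEncode(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            boolean alnum = (cp >= 'a' && cp <= 'z')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= '0' && cp <= '9');
            if (alnum) {
                out.appendCodePoint(cp);
            } else {
                // Backslash + hex code point + space; the space ends the escape
                // so that a following hex-looking character is not absorbed.
                out.append('\\').append(Integer.toHexString(cp)).append(' ');
            }
        }
        return out.toString();
    }
}
```

With this, a value such as red;} can no longer close the declaration or the rule it was placed in.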
5. Output in the address
Generally, if the variable is the whole URL, first check whether it starts with "http" (if not, prepend "http://") so that pseudo-protocol XSS attacks cannot occur. Then URLEncode the variable.
PS: URLEncode converts characters to the "%HH" format.
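The two steps can be sketched as follows. The names are hypothetical, and the scheme check is stricter than a plain "starts with http" test: it accepts only http:// and https://, which is an assumption made for this example. Note that java.net.URLEncoder performs form-style encoding, so a space becomes "+" rather than "%20".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Sketch of the URL output rules: force an http(s) scheme, then percent-encode values.
final class UrlOutput {
    // Prepends "http://" unless the value already starts with http:// or https://.
    static String ensureHttpScheme(String url) {
        String trimmed = url.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return trimmed;
        }
        return "http://" + trimmed;
    }

    // Percent-encodes a single component, such as a query parameter value.
    static String encodeComponent(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

A value such as javascript:alert(1) ends up prefixed with http://, so the browser treats it as an ordinary (broken) address instead of executing it. Encoding is applied per component here because encoding a full URL in one pass would also encode its separators.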
Summary
Front-end developers should take care to use the correct encoding in the correct place. Sometimes, to defend against XSS, HTMLEncode and JavaScriptEncode have to be combined, or even layered, in a single place. There is no one fixed encoding method; each case has to be analyzed on its own.
Generally, the risk of stored XSS is higher than that of reflected XSS. Reflected XSS usually requires the attacker to trick the user into clicking a URL that contains the XSS code, whereas stored XSS only requires the user to view a normal URL: when the user opens the page, the XSS payload executes. Such vulnerabilities are well hidden inside users' normal business, which makes the risk very high. (See the white hat book on Web security.)