How to Avoid XSS attacks for Web applications built using PHP


The rise of Web 2.0 gives network users more opportunities to interact. By posting a comment on a forum or a blog, a user may intentionally or unintentionally submit destructive content that makes the page unusable for other users. Cross-site scripting is abbreviated XSS rather than CSS because CSS already stands for Cascading Style Sheets. XSS is a common way to attack websites: the attacker enters malicious content, usually a JavaScript fragment, into an input field on a web page. When that input is submitted and later sent back to a client, the browser interprets and executes the malicious script, which disrupts the normal display of the page.
This article first briefly describes how developers can test Web applications for XSS vulnerabilities and how tools can be used to bypass client-side JavaScript validation and submit malicious data. Then, for websites built with PHP, it describes how to prevent XSS attacks by encoding dynamic content on output and validating input on the server.

XSS vulnerability testing for Web applications
Test path
XSS vulnerability testing for Web applications is not limited to typing XSS attack strings into a web page and submitting them. An attacker can also bypass the page's JavaScript validation and send XSS scripts directly to the server, a path that testers often overlook. Figure 1 shows this attack path.

Figure 1. XSS attack test path: bypassing JavaScript validation


  • XSS input usually contains JavaScript, such as a script that pops up a malicious alert box: <script>alert("XSS");</script>
  • XSS input may also be an HTML fragment, for example:
    • A tag that makes the page refresh constantly: <meta http-equiv="refresh" content="0;">
    • A frame that embeds a link to another website: <iframe src=http:// width=250 height=250></iframe>
The XSS (Cross Site Scripting) Cheat Sheet maintains a list of common XSS attack scripts that can be used as test cases to check whether a Web application has XSS vulnerabilities. Developers who are new to XSS may not understand some of the inputs in that list; the second part of this article explains XSS input in different code contexts.
Test tools
Many tools can intercept the GET/POST requests that a browser sends. With such a tool, an attacker can modify the data in a request and deliver malicious data to the server without it ever passing through the JavaScript validation. Commonly used tools for intercepting HTTP requests include:
  • Paros proxy
  • Fiddler
  • Burp proxy
  • TamperIE
The author has used TamperIE to test the security of Web applications. TamperIE is small and easy to use. It intercepts the GET/POST requests sent by Internet Explorer and works even for requests sent over SSL. However, the combination of TamperIE and IE7 is unstable. IE7 adds IPv6 support, so unless you plan to test your Web application's IPv6 support, TamperIE with IE6 is the recommended combination.
As shown in Figure 2, TamperIE bypasses the JavaScript validation in the client browser and intercepts a POST request as it is submitted. You can then modify the name and message values of the form, for example by changing the message value to <script>alert("XSS hole!!");</script>, and click "Send altered data" to send the modified, malicious data to the Web server.

Figure 2. Using TamperIE to intercept a POST request
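A proxy tool is not the only way to exercise this path: a test script can also assemble the POST request itself, so no browser-side JavaScript ever sees the data. The following is a minimal sketch of that idea. The helper name build_post_options(), the field names, and the commented-out URL are illustrative assumptions, not part of any tool named above, and the request should only be sent to a test server you control.

```php
<?php
// Hypothetical helper: builds stream context options for a raw form POST.
function build_post_options(array $fields): array
{
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            // http_build_query() URL-encodes the values; the server decodes them again.
            'content' => http_build_query($fields),
        ],
    ];
}

$options = build_post_options([
    'name'    => 'tester',
    'message' => '<script>alert("XSS");</script>',
]);

echo $options['http']['content'], "\n";
// name=tester&message=%3Cscript%3Ealert%28%22XSS%22%29%3B%3C%2Fscript%3E

// To send the request to a test server (hypothetical URL):
// $html = file_get_contents('http://localhost/guestbook.php', false,
//                           stream_context_create($options));
```

If the page returned by the server contains the script unescaped, the application is vulnerable regardless of what the client-side validation does.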


Encode dynamic content at the output end
In a Web application, dynamic content may come from user input, a back-end database, hardware status changes, or network information. Dynamic content, especially content derived from user input, may contain malicious data that disrupts the normal display of a page or executes malicious scripts. How to display dynamic content safely in the browser depends on the context in which it appears, for example the HTML body, an attribute of a form element, or a JavaScript code segment. In a PHP-based Web application, statements such as echo, print, printf, and <?= indicate that dynamic content is being output. This section first describes the PHP library function htmlspecialchars(), which converts five special HTML characters into HTML entities so that they are displayed as text on the page. It then presents common XSS attack inputs for different contexts and shows how to escape and encode dynamic content on output to avoid XSS attacks.

Using PHP's htmlspecialchars() to display HTML special characters

The malicious XSS inputs listed above contain special HTML characters such as < and >. When they are sent to the client browser, the browser interprets and executes them as HTML or JavaScript instead of displaying them as text. How, then, can characters such as < > & " be displayed literally in HTML?
They are written as HTML character entities. A character entity consists of an ampersand (&), followed by either an entity name or # plus an entity number, and a terminating semicolon. Table 1 lists the encodings of some special characters. Some characters, such as the single quote, have an entity number but no corresponding entity name in HTML 4.

Table 1. HTML entity encodings of some special characters
Displayed character   Entity name   Entity number
<                     &lt;          &#60;
>                     &gt;          &#62;
&                     &amp;         &#38;
"                     &quot;        &#34;
'                     N/A           &#39;
PHP provides the htmlspecialchars() function to convert special HTML characters into character entities. Even if a user enters HTML tags, the tags are then displayed as text when they are sent back to the browser rather than being interpreted and executed. htmlspecialchars() can convert the following five special HTML characters:
  • & is converted to &amp;
  • " is converted to &quot;
  • < is converted to &lt;
  • > is converted to &gt;
  • ' is converted to &#039;
When htmlspecialchars($str, ENT_COMPAT) is called, & " < > are escaped but single quotes are not. This was the behavior of a plain htmlspecialchars($str) call before PHP 8.1; since PHP 8.1 the default flags include ENT_QUOTES.
When the ENT_QUOTES flag is set, that is, htmlspecialchars($str, ENT_QUOTES) is called, single quotes are escaped as well.
When ENT_NOQUOTES is set, neither single nor double quotes are escaped. That is, htmlspecialchars($str, ENT_NOQUOTES) escapes only & < >.
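The effect of the three flags is easiest to see side by side. The following sketch runs one sample string (chosen here purely for illustration) through htmlspecialchars() with each flag; the third argument, 'UTF-8', is passed explicitly so the result does not depend on configuration defaults.

```php
<?php
// Sample input containing all five special characters.
$str = "<a href='x'>Tom & \"Jerry\"</a>";

$compat   = htmlspecialchars($str, ENT_COMPAT, 'UTF-8');
$quotes   = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
$noquotes = htmlspecialchars($str, ENT_NOQUOTES, 'UTF-8');

echo $compat, "\n";
// &lt;a href='x'&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;

echo $quotes, "\n";
// &lt;a href=&#039;x&#039;&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;

echo $noquotes, "\n";
// &lt;a href='x'&gt;Tom &amp; "Jerry"&lt;/a&gt;
```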
XSS attacks and solutions for dynamic content in different contexts
The form of an XSS attack input depends on the code context in which the dynamic content appears, for example the value of a form element attribute, the HTML body, or a JavaScript code segment.
Dynamic content in HTML tag attributes
In Web applications, HTML tag attributes such as style, color, and the attributes of input may contain dynamic content. The value attribute of an input tag is usually dynamic.
Example 1
<form…><input type=text name="msg" id="msg" size=10 maxlength=8
value="<?= $msg ?>"></form>
XSS attack input
Hello"><script>evil_script()</script>
After replacing the dynamic content
Replacing $msg with the malicious XSS input gives:
<form…><input type=text name="msg" id="msg" size=10 maxlength=8
value="Hello"><script>evil_script()</script>"></form>
Example 2
<form…><input type=text name="msg" id="msg" size=10
maxlength=8 value=<?= $msg ?>></form>
XSS attack input
Hello onmouseover=evil_script()
After replacing the dynamic content
Replacing $msg with the malicious XSS input gives:
<form…><input type=text name="msg" id="msg" size=10
maxlength=8 value=Hello onmouseover=evil_script()></form>
Example 1 shows an XSS attack input that contains the special HTML characters < > "
Example 2 shows an XSS attack input that contains none of the five special HTML characters described earlier; it works because the value attribute is not enclosed in double quotation marks.
The solution is to call htmlspecialchars($str, ENT_QUOTES) to escape the five special HTML characters < > & " ' and to always enclose attribute values in double quotation marks. For example:
<form…><input type=text name="msg" id="msg" size=10
maxlength=8 value="<?= htmlspecialchars($msg, ENT_QUOTES) ?>"></form>
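To check the fix against both attack inputs from Examples 1 and 2, the escaping can be wrapped in a small helper. This is a sketch only; the function name attr() is a hypothetical choice made here, not a PHP built-in.

```php
<?php
// Hypothetical helper: escape a value for use inside a double-quoted HTML attribute.
function attr(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$inputs = [
    'Hello"><script>evil_script()</script>',  // attack input from Example 1
    'Hello onmouseover=evil_script()',        // attack input from Example 2
];

foreach ($inputs as $msg) {
    // The attribute value is always wrapped in double quotes.
    echo '<input type="text" name="msg" value="' . attr($msg) . '">', "\n";
}
// <input type="text" name="msg" value="Hello&quot;&gt;&lt;script&gt;evil_script()&lt;/script&gt;">
// <input type="text" name="msg" value="Hello onmouseover=evil_script()">
```

In the second line the input is unchanged, but because it now sits inside the quotation marks, onmouseover is part of the value instead of a separate attribute.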
When escaping values on output, consider the consistency between what is displayed and what is stored. The data shown in the browser and the data stored on the server may differ because of escaping. For example, suppose the original data stored on the server contains the five special characters unescaped, and the characters are escaped only when output to the browser to prevent XSS attacks. In that case:
1. When the form is submitted again, the stored content may be overwritten with the escaped value.
2. When JavaScript operates on a form element and reads its value, it must take into account that the value may have been escaped.

Dynamic content in HTML text

<b>welcome: <?= $welcome_msg ?></b>
XSS attack input
<script>evil_script()</script>
After replacing the dynamic content
Replacing $welcome_msg with the malicious XSS input gives:
<b>welcome: <script>evil_script()</script></b>
In the HTML body, < and > introduce HTML tags, and & may be interpreted as the start of a character entity. Therefore < > & must be escaped.
For brevity, use htmlspecialchars() to escape these characters, for example:
<b>welcome: <?= htmlspecialchars($welcome_msg, ENT_NOQUOTES) ?></b>
Dynamic content in URL values

If the src or href attribute of a script, style, img, ActiveX, applet, or frameset element is dynamic, make sure that the URL does not point to a malicious link.
Example 1
<script src="<?= $script_url ?>">
XSS attack input
After replacing the dynamic content
Replacing $script_url with the malicious XSS input gives:
<script src="">
Example 2
XSS attack input
javascript:evil_script()
After replacing the dynamic content
Replacing $img_url with the malicious XSS input gives:
In general, do not let users control URL values directly. If users need to customize styles or display effects, do not let them supply the whole URL; instead, offer predefined options to choose from and combine, and have the back-end program assemble a safe URL from the user's choices.
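One way to apply this advice is an allowlist: the user submits only a short key, and the server maps it to a URL it controls. The sketch below assumes a hypothetical set of stylesheets; the constant STYLESHEETS, the function stylesheet_url(), and the paths are all illustrative names introduced here.

```php
<?php
// Hypothetical allowlist: keys are what the user may choose, values are URLs the site controls.
const STYLESHEETS = [
    'light' => '/css/light.css',
    'dark'  => '/css/dark.css',
];

function stylesheet_url(string $choice): string
{
    // Anything that is not a known key falls back to a safe default.
    return STYLESHEETS[$choice] ?? STYLESHEETS['light'];
}

echo stylesheet_url('dark'), "\n";
// /css/dark.css

echo stylesheet_url('javascript:evil_script()'), "\n";
// /css/light.css

// The URL is still escaped when it is written into the attribute.
echo '<link rel="stylesheet" href="'
    . htmlspecialchars(stylesheet_url('dark'), ENT_QUOTES, 'UTF-8') . '">', "\n";
```

Because user input never becomes part of the URL, a javascript: scheme or a link to a foreign host cannot reach the page.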

Character Set Encoding

The browser needs to know the character set encoding to display a page correctly. If the encoding is not explicitly declared in the Content-Type header or a meta tag, the browser uses an algorithm to guess it. For example, the UTF-7 encoding of <script>alert(document.cookie)</script> is:
+ADw-script+AD4-alert(document.cookie)+ADw-/script+AD4-
If +ADw-script+AD4-alert(document.cookie)+ADw-/script+AD4- appears as dynamic content near the top of a page sent to the browser, IE may treat the page as UTF-7 encoded and decode the sequence into a script tag, so the page is not displayed as intended.
The solution is to declare the character set encoding of the page explicitly, for example:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
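In a PHP page the same declaration can also be made in the HTTP Content-Type header, which the browser sees before it parses any markup. The following is a minimal sketch; the variable name $contentType is an assumption made for illustration, and the headers_sent() guard is there only so the script does not emit a warning if output has already started.

```php
<?php
// Declare the encoding in the HTTP response header, before any output is sent.
$contentType = 'text/html; charset=UTF-8';

if (!headers_sent()) {
    header('Content-Type: ' . $contentType);
}

// The meta tag repeats the declaration inside the document itself.
echo '<meta http-equiv="content-type" content="' . $contentType . '">', "\n";
```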
Dynamic content in parameters of JavaScript event handlers

The parameters of JavaScript event handlers such as onClick, onLoad, onError, and onMouseOver may contain dynamic content.
<input type="button" value="go to" onClick='goto_url("<?= $target_url ?>");'>
XSS attack input
foo&quot;);evil_script(&quot;
After replacing the dynamic content
The HTML parser processes the page before the JavaScript parser does, so the &quot; entities are decoded to double quotes before the script runs. Replacing $target_url with the malicious XSS input is therefore equivalent to:
<input type="button" value="go to" onClick='goto_url("foo");evil_script("");'>
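Because the value passes through two parsers, one possible defense is to encode it twice, in the reverse order of parsing: first as a JavaScript string literal, then as an HTML attribute value. The sketch below uses json_encode() with the JSON_HEX_* flags followed by htmlspecialchars(). This is a technique suggested here, not one given by the original text, and the helper name js_attr_arg() is hypothetical.

```php
<?php
// Hypothetical helper: JavaScript string literal first, HTML attribute escaping second.
function js_attr_arg(string $value): string
{
    // JSON_HEX_* turns < > & ' " into \uXXXX escapes inside the literal.
    $js = json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
    if ($js === false) {
        $js = '""'; // json_encode() failed (for example on invalid UTF-8): use an empty string
    }
    // The quotes that delimit the literal must themselves be escaped for the attribute.
    return htmlspecialchars($js, ENT_QUOTES, 'UTF-8');
}

$target_url = 'foo&quot;);evil_script(&quot;';  // attack input from the example above

echo '<input type="button" value="go to" onclick="goto_url('
    . js_attr_arg($target_url) . ');">', "\n";
// <input type="button" value="go to" onclick="goto_url(&quot;foo\u0026quot;);evil_script(\u0026quot;&quot;);">
```

After the HTML parser decodes the attribute, the JavaScript parser sees a single string literal whose content is the attacker's text as data, so evil_script() is never called.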
Dynamic content in JavaScript code segments
<script language="javascript1.2">
var msg = '<?= $welcome_msg ?>';
XSS attack input 1
Hello'; evil_script(); //
After replacing the dynamic content
Replacing $welcome_msg with the malicious XSS input gives:
<script language="javascript1.2">
var msg = 'Hello'; evil_script(); //';
XSS attack input 2
Hello</script><script>evil_script();</script><script>
After replacing the dynamic content
Replacing $welcome_msg with the malicious XSS input gives:
<script>var msg = 'Hello</script>
<script>evil_script();</script>
<script>' // ... // do something with msg_text </script>
As these examples show, take great care when placing dynamic content in a JavaScript context. In general, avoid or minimize dynamic content in JavaScript. Where it is unavoidable, consider every possible value of the dynamic content during development and code review, and check whether any of them could lead to an XSS attack.
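Where dynamic content inside a script block cannot be avoided, one common approach is to let json_encode() produce the whole string literal instead of placing the value between hand-written quotes. The sketch below is an illustration under that assumption; the helper name js_string() is hypothetical. With the JSON_HEX_* flags, quotes and angle brackets are emitted as \uXXXX escapes, so neither attack input above can close the string or the script element.

```php
<?php
// Hypothetical helper: encode a PHP string as a complete JavaScript string literal.
function js_string(string $value): string
{
    $js = json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
    return $js === false ? '""' : $js; // empty literal if encoding fails
}

$attack1 = "Hello'; evil_script();//";
$attack2 = 'Hello</script><script>evil_script();</script><script>';

echo js_string($attack1), "\n";
// "Hello\u0027; evil_script();\/\/"

echo js_string('</script>'), "\n";
// "\u003C\/script\u003E"

// Note that the literal already includes its own quotes.
echo "<script>\nvar msg = " . js_string($attack2) . ";\n</script>\n";
```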

Creating PHP library functions to validate input

Web developers must understand that using JavaScript functions on the client to detect and filter illegal input is not enough to build secure Web applications. As described above, attackers can easily use tools to bypass JavaScript validation, even over SSL, and submit malicious data. Encoding dynamic content on output provides only a second layer of protection; more importantly, the server must validate its input. PHP provides functions such as strpos(), strstr(), and preg_match() to detect illegal characters and strings, and preg_replace() to replace them. The OWASP PHP Filters open-source project provides PHP library functions for filtering illegal input that can serve as a reference. Common checks and filters include:
  1. Whether the input contains only valid characters;
  2. If the input is a number, whether the number is within the specified range;
  3. Whether the input string exceeds the maximum length;
  4. Whether the input matches a required format, such as an email address or an IP address;
  5. Whether logical dependencies and constraints between different input fields are satisfied;
  6. Removing spaces at the beginning and end of the input.
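Several of these checks can be sketched with preg_match(), trim(), and PHP's filter_var() function. The function names, the allowed character set, the 20-character limit, and the 0 to 150 range below are illustrative assumptions; a real application would choose its own rules.

```php
<?php
// Hypothetical validators; the limits and patterns are examples only.

function valid_username(string $s): bool
{
    // Checks 1 and 3: letters, digits, and underscore only, 1 to 20 characters.
    return preg_match('/\A[A-Za-z0-9_]{1,20}\z/', $s) === 1;
}

function valid_age(string $s): bool
{
    // Check 2: an integer within a specified range.
    $opts = ['options' => ['min_range' => 0, 'max_range' => 150]];
    return filter_var($s, FILTER_VALIDATE_INT, $opts) !== false;
}

function valid_email(string $s): bool
{
    // Check 4: email address format.
    return filter_var($s, FILTER_VALIDATE_EMAIL) !== false;
}

function valid_ip(string $s): bool
{
    // Check 4: IPv4 or IPv6 address format.
    return filter_var($s, FILTER_VALIDATE_IP) !== false;
}

// Check 6: trim leading and trailing whitespace before validating.
$name = trim("  alice_01  ");

var_dump(valid_username($name));        // bool(true)
var_dump(valid_username('<script>'));   // bool(false)
var_dump(valid_age('200'));             // bool(false)
var_dump(valid_ip('192.168.0.1'));      // bool(true)
```

Validation of this kind complements output encoding; it does not replace it, because valid input for one context may still need escaping in another.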

Web application security is an important and broad topic. To prevent common XSS attacks, Web developers must understand that client-side JavaScript validation and filtering alone are not enough. A library of server-side input validation and output encoding functions should also be established: the server validates and filters its input, and special characters in dynamic content are encoded according to their context before being sent to the browser for display.
  • See the "XSS Cheat Sheet" for a list of common XSS attack scripts that can be used as test case input for XSS vulnerability testing.
  • See the "PHP function manual" to learn about the htmlspecialchars() API.
  • See "HTML character entity encoding" to learn about HTML character entities.
  • See the "Open Web Application Security Project" community to learn more about Web security.
  • Visit "TamperIE" to obtain TamperIE, which can be used as an XSS vulnerability testing tool.
  • See the "XSS Prevention Cheat Sheet" to learn the rules for preventing XSS attacks.
  • Visit "OWASP PHP Filters" to obtain open-source PHP library functions for filtering illegal input.
  • See the "Recommended PHP reading list".
  • Browse all PHP content on developerWorks.
  • Improve your PHP skills with the PHP project resources on IBM developerWorks.
  • Visit the open source area on developerWorks for a wide range of how-to information, tools, and project updates to help you develop with open source technology and use it with IBM products.
Author Profile
Zhou ting, a software engineer, is currently engaged in the development of Blade Server Management and firmware at the IBM China Software Development Technology Lab. You can contact her via
Liu Xin is currently engaged in the development of server management firmware at IBM China System Technology Lab.
Liu Jian, a software engineer and Linux enthusiast, is currently engaged in the development of blade server management firmware at the IBM China Software Development Technology Lab. You can contact him through the
