When truncating strings, broken HTML is a frequent source of errors. This is as true in ASP as it is in PHP. If the markup is simple and predictable, a plain replace will do, but an article body may contain any HTML at all. To handle that case efficiently, use the following (tested):
$search = array("'<script[^>]*?>.*?</script>'si",  // remove JavaScript
                "'<[\/\!]*?[^<>]*?>'si",           // remove HTML tags
                "'([\r\n])[\s]+'",                 // collapse white space after a line break
                "'&(quot|#34);'i",                 // replace HTML entities
                "'&(amp|#38);'i",
                "'&(lt|#60);'i",
                "'&(gt|#62);'i",
                "'&(nbsp|#160);'i",
                "'&(iexcl|#161);'i",
                "'&(cent|#162);'i",
                "'&(pound|#163);'i",
                "'&(copy|#169);'i");
$replace = array("",
                 "",
                 "\\1",
                 "\"",
                 "&",
                 "<",
                 ">",
                 " ",
                 chr(161),  // chr() returns a single byte (ISO-8859-1 code points here)
                 chr(162),
                 chr(163),
                 chr(169));
// $document is the string to be processed. If the source is a file, you can use
// $document = file_get_contents($filename);
$text = preg_replace($search, $replace, $document);
// Decode numeric entities such as &#65;. The original pattern "'&#(\d+);'e" relied on the
// /e modifier (run the replacement as PHP code), which was removed in PHP 7, so a callback is used instead.
$text = preg_replace_callback("'&#(\d+);'", function ($m) { return chr((int) $m[1]); }, $text);
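
Since the goal here is truncation, one way to apply the idea is to strip the markup first and cut the text afterwards, so a tag can never be split in half. The following is only a minimal sketch, not the code above: plain_excerpt is a hypothetical helper name, it swaps the hand-written entity table for PHP's built-in strip_tags and html_entity_decode, and it assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper (not part of the original article): strip markup first,
// then truncate the resulting plain text.
function plain_excerpt(string $html, int $length = 100): string
{
    // Drop <script> blocks together with their contents, then all remaining tags.
    $text = preg_replace('~<script[^>]*>.*?</script>~is', '', $html);
    $text = strip_tags($text);

    // Decode entities such as &amp; and &nbsp; (assumes UTF-8 input).
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of white space, including non-breaking spaces, to one space.
    $text = trim(preg_replace('~[\s\x{00A0}]+~u', ' ', $text));

    // Keep at most $length characters; the /u flag counts UTF-8 characters, not bytes.
    preg_match('~^.{0,' . $length . '}~us', $text, $m);
    return $m[0];
}

echo plain_excerpt('<p>Hello&nbsp;<b>world</b> &amp; more</p><script>alert(1)</script>', 11);
// prints "Hello world"
```

Truncating after stripping means the length limit applies to what the reader actually sees, rather than to markup that would be removed anyway.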