Using regular expressions to match HTML and filter illegal characters
To match an HTML tag, for example a table, use either of these expressions:
<table[\s\S]*<\/table>
or
<table[\s\S]*?<\/table>
The two expressions differ only in the "?" that follows the "*". What difference does it make?
In a regular expression, "?" is a quantifier, not a wildcard. On its own it matches the preceding subexpression zero or one time; placed directly after another quantifier such as "*" or "+", it makes that quantifier non-greedy.
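The two roles of "?" can be seen in a small test. This is only an illustrative sketch; the patterns and variable names ($optional, $subject, $greedy, $lazy) are made up for the example and do not come from the article.

```php
<?php
// "?" as a quantifier: the preceding "u" may appear zero or one time.
$optional = '/colou?r/';
$matchesColor  = preg_match($optional, 'color');   // 1 (matched)
$matchesColour = preg_match($optional, 'colour');  // 1 (matched)

// "?" directly after another quantifier makes that quantifier non-greedy.
$subject = 'aaaa';
preg_match('/a+/',  $subject, $greedy); // takes as many "a" as possible
preg_match('/a+?/', $subject, $lazy);   // takes as few "a" as possible

echo $greedy[0], "\n"; // aaaa
echo $lazy[0], "\n";   // a
```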
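Testing shows the difference. Without "?", the "*" is greedy: against the content below, the expression matches only once, from the first opening table tag to the last closing table tag, and so swallows the text between the tables as well. With "?", the "*" is non-greedy and each table is matched separately. The test content is: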
<table>This is the first table</table>
I'm not the content in the table.
<table>This is the second table.</table>
I'm not the content in the table.
<table>This is a third table.</table>
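The greedy and non-greedy behavior can be checked with preg_match_all. This is a sketch under two assumptions: that the patterns being compared are <table[\s\S]*<\/table> and <table[\s\S]*?<\/table>, and that the test content is three table elements separated by plain text. The variable names are invented for the example.

```php
<?php
// Assumed test input: three tables separated by non-table text.
$html = "<table>This is the first table</table>\n"
      . "I'm not the content in the table.\n"
      . "<table>This is the second table.</table>\n"
      . "I'm not the content in the table.\n"
      . "<table>This is a third table.</table>";

// Greedy: [\s\S]* runs on to the last </table> in the input.
$greedyCount = preg_match_all('/<table[\s\S]*<\/table>/i', $html, $greedy);

// Non-greedy: [\s\S]*? stops at the first </table> after each <table.
$lazyCount = preg_match_all('/<table[\s\S]*?<\/table>/i', $html, $lazy);

echo $greedyCount, "\n"; // 1
echo $lazyCount, "\n";   // 3
echo $lazy[0][1], "\n";  // <table>This is the second table.</table>
```

In the greedy case the single match is the whole input, including both "I'm not the content in the table." lines.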
The following PHP code uses non-greedy patterns of this kind to filter unwanted tags and characters out of the HTML string held in $str:
<?php
$str = preg_replace('/\s+/', ' ', $str); // Collapse runs of whitespace, including extra line breaks, into one space
$str = preg_replace('/<[ ]+/si', '<', $str); // Remove spaces directly after a "<" sign
$str = preg_replace('/<!--.*?-->/si', '', $str); // Remove comments
$str = preg_replace('/<(!.*?)>/si', '', $str); // Remove DOCTYPE
$str = preg_replace('/<(\/?html.*?)>/si', '', $str); // Remove html tags
$str = preg_replace('/<(\/?head.*?)>/si', '', $str); // Remove head tags
$str = preg_replace('/<(\/?meta.*?)>/si', '', $str); // Remove meta tags
$str = preg_replace('/<(\/?body.*?)>/si', '', $str); // Remove body tags
$str = preg_replace('/<(\/?link.*?)>/si', '', $str); // Remove link tags
$str = preg_replace('/<(\/?form.*?)>/si', '', $str); // Remove form tags
$str = preg_replace('/cookie/si', 'Cookie', $str); // Rewrite the word "cookie"
$str = preg_replace('/<(applet.*?)>(.*?)<(\/applet.*?)>/si', '', $str); // Remove applet elements and their content
$str = preg_replace('/<(\/?applet.*?)>/si', '', $str); // Remove leftover applet tags
$str = preg_replace('/<(style.*?)>(.*?)<(\/style.*?)>/si', '', $str); // Remove style elements and their content
$str = preg_replace('/<(\/?style.*?)>/si', '', $str); // Remove leftover style tags
$str = preg_replace('/<(title.*?)>(.*?)<(\/title.*?)>/si', '', $str); // Remove title elements and their content
$str = preg_replace('/<(\/?title.*?)>/si', '', $str); // Remove leftover title tags
$str = preg_replace('/<(object.*?)>(.*?)<(\/object.*?)>/si', '', $str); // Remove object elements and their content
$str = preg_replace('/<(\/?object.*?)>/si', '', $str); // Remove leftover object tags
$str = preg_replace('/<(noframes.*?)>(.*?)<(\/noframes.*?)>/si', '', $str); // Remove noframes elements and their content
$str = preg_replace('/<(\/?noframes.*?)>/si', '', $str); // Remove leftover noframes tags
$str = preg_replace('/<(i?frame.*?)>(.*?)<(\/i?frame.*?)>/si', '', $str); // Remove frame and iframe elements and their content
$str = preg_replace('/<(\/?i?frame.*?)>/si', '', $str); // Remove leftover frame and iframe tags
$str = preg_replace('/<(script.*?)>(.*?)<(\/script.*?)>/si', '', $str); // Remove script elements and their content
$str = preg_replace('/<(\/?script.*?)>/si', '', $str); // Remove leftover script tags
$str = preg_replace('/javascript/si', 'JavaScript', $str); // Rewrite "javascript" (this only changes letter case)
$str = preg_replace('/vbscript/si', 'VBScript', $str); // Rewrite "vbscript" (this only changes letter case)
$str = preg_replace('/on([a-z]+)\s*=/si', 'on\\1=', $str); // Rewrite event-handler attributes such as "onclick =" to "onclick="
$str = preg_replace('/&#/si', '&amp;#', $str); // Escape "&#" (replacement assumed) so character references cannot spell out script such as javascript:alert('aabb')
?>
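The same idea can be packaged as a reusable function. The sketch below is only one possible arrangement and is not the article's code: filter_html, $rules and $dirty are hypothetical names, only a subset of the rules is included, and the structural tags are merged into one alternation with a \b word boundary (an addition made here so that a tag such as header is not treated as head).

```php
<?php
// Hypothetical helper: applies a subset of the filtering rules in order.
function filter_html(string $str): string
{
    $rules = [
        '/<!--.*?-->/s'                                    => '',  // comments
        '/<script.*?>.*?<\/script.*?>/si'                  => '',  // script elements and their content
        '/<\/?script.*?>/si'                               => '',  // leftover script tags
        '/<\/?(?:html|head|body|meta|link|form)\b.*?>/si'  => '',  // structural tags
        '/\s+/'                                            => ' ', // collapse whitespace
    ];

    // preg_replace applies the pattern/replacement pairs one after another.
    $result = preg_replace(array_keys($rules), array_values($rules), $str);

    // preg_replace returns null on error; fall back to an empty string.
    return trim($result ?? '');
}

$dirty = "<html><body><!-- note --><p>Hello</p>\n<script>alert(1)</script></body></html>";
echo filter_html($dirty), "\n"; // <p>Hello</p>
```

Keeping the rules in an array makes their order explicit, which matters here: the paired script rule has to run before the rule that removes leftover script tags, otherwise the script body would be left behind as plain text.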