PHP regular expressions for filtering HTML tags and attributes (PHP example)

Source: Internet
Author: User
This article presents PHP regular expressions for filtering HTML tags and attributes. Each pattern is given as a code example, with a comment explaining what it filters. It is most useful to readers who use PHP to collect (scrape) data from web pages.
$str = preg_replace("/\s+/", " ", $str); // collapse excess carriage returns and whitespace
$str = preg_replace("/<[ ]+/si", "<", $str); // filter "<" followed by spaces
$str = preg_replace("/<\!--.*?-->/si", "", $str); // filter comments
$str = preg_replace("/<(\!.*?)>/si", "", $str); // filter DOCTYPE
$str = preg_replace("/<(\/?html.*?)>/si", "", $str); // filter html tags
$str = preg_replace("/<(\/?head.*?)>/si", "", $str); // filter head tags
$str = preg_replace("/<(\/?meta.*?)>/si", "", $str); // filter meta tags
$str = preg_replace("/<(\/?body.*?)>/si", "", $str); // filter body tags
$str = preg_replace("/<(\/?link.*?)>/si", "", $str); // filter link tags
$str = preg_replace("/<(\/?form.*?)>/si", "", $str); // filter form tags
$str = preg_replace("/cookie/si", "COOKIE", $str); // rewrite the word "cookie"
$str = preg_replace("/<(applet.*?)>(.*?)<(\/applet.*?)>/si", "", $str); // filter applet tags and their content
$str = preg_replace("/<(\/?applet.*?)>/si", "", $str); // filter stray applet tags
$str = preg_replace("/<(style.*?)>(.*?)<(\/style.*?)>/si", "", $str); // filter style tags and their content
$str = preg_replace("/<(\/?style.*?)>/si", "", $str); // filter stray style tags
$str = preg_replace("/<(title.*?)>(.*?)<(\/title.*?)>/si", "", $str); // filter title tags and their content
$str = preg_replace("/<(\/?title.*?)>/si", "", $str); // filter stray title tags
$str = preg_replace("/<(object.*?)>(.*?)<(\/object.*?)>/si", "", $str); // filter object tags and their content
$str = preg_replace("/<(\/?object.*?)>/si", "", $str); // filter stray object tags
$str = preg_replace("/<(noframes.*?)>(.*?)<(\/noframes.*?)>/si", "", $str); // filter noframes tags and their content
$str = preg_replace("/<(\/?noframes.*?)>/si", "", $str); // filter stray noframes tags
$str = preg_replace("/<(i?frame.*?)>(.*?)<(\/i?frame.*?)>/si", "", $str); // filter frame and iframe tags and their content
$str = preg_replace("/<(\/?i?frame.*?)>/si", "", $str); // filter stray frame and iframe tags
$str = preg_replace("/<(script.*?)>(.*?)<(\/script.*?)>/si", "", $str); // filter script tags and their content
$str = preg_replace("/<(\/?script.*?)>/si", "", $str); // filter stray script tags
$str = preg_replace("/javascript/si", "Javascript", $str); // rewrite the word "javascript"
$str = preg_replace("/vbscript/si", "Vbscript", $str); // rewrite the word "vbscript"
$str = preg_replace("/on([a-z]+)\s*=/si", "On\\1=", $str); // rewrite inline event handlers (onclick=, onload=, ...)
$str = preg_replace("/&#/si", "&＃", $str); // replace "&#" with a full-width "＃" to break encoded scripts, such as javAsCript:alert(
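To illustrate how these replacements behave on real input, here is a minimal sketch that applies a small subset of them to a sample fragment. The helper name filter_html_sample and the sample input are assumptions made for illustration; they are not part of the original listing.

```php
<?php
// Hypothetical helper: applies a small subset of the replacements listed above.
function filter_html_sample(string $str): string
{
    $str = preg_replace('/\s+/', ' ', $str);                                 // collapse runs of whitespace
    $str = preg_replace('/<!--.*?-->/s', '', $str);                          // drop comments
    $str = preg_replace('/<(script.*?)>(.*?)<(\/script.*?)>/si', '', $str);  // drop script blocks with their content
    $str = preg_replace('/<(\/?script.*?)>/si', '', $str);                   // drop stray script tags
    $str = preg_replace('/on([a-z]+)\s*=/si', 'On\\1=', $str);               // rewrite onclick= as Onclick=, etc.
    return trim($str);
}

// Sample input (an assumption for illustration).
$input  = "<p onclick=\"go()\">Hi</p>\n<!-- note -->\n<script>alert(1)</script>";
$result = filter_html_sample($input);

echo $result, PHP_EOL; // <p Onclick="go()">Hi</p>
```

Note that the order matters: the paired-tag pattern has to run before the stray-tag pattern, otherwise the tags are removed and the script body is left behind as plain text.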

Clear spaces and line feeds

function DeleteHtml($str)
{
    $str = trim($str);
    $str = strip_tags($str);
    // ereg_replace() was removed in PHP 7; str_replace() does the same job for literal strings
    $str = str_replace(array("\t", "\r\n", "\r", "\n"), "", $str);
    // Assumption: the search string on this line was lost in extraction; "&nbsp;" is the likely original
    $str = str_replace("&nbsp;", " ", $str);
    return trim($str);
}
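The same cleanup can be sketched with a single character class instead of one call per character. The helper name strip_to_text and the sample input are assumptions for illustration; this is one possible variant, not the article's function.

```php
<?php
// Hypothetical helper: strips tags, then removes tabs, carriage returns and line feeds.
function strip_to_text(string $str): string
{
    $str = strip_tags(trim($str));
    $str = preg_replace('/[\t\r\n]+/', '', $str); // one pass for all three characters
    return trim($str);
}

$plain = strip_to_text("<div>\n\t<b>Hello</b>\r\n world\n</div>");
echo $plain, PHP_EOL; // Hello world
```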

Filter HTML attributes

1. Regular expression that matches all HTML tags:

The code is as follows:

</?[^>]+>
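As a minimal sketch of how an all-tags pattern can be used, the snippet below passes `</?[^>]+>` to preg_replace on a sample string. The sample input and variable names are assumptions for illustration. PHP's built-in strip_tags covers the same basic case.

```php
<?php
// Sample input is an assumption made for illustration.
$html = '<p>Hello <b>world</b></p>';

// '#' is used as the delimiter so the '/' in the pattern needs no escaping.
$text = preg_replace('#</?[^>]+>#', '', $html);

echo $text, PHP_EOL; // Hello world
```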

2. Regular expression that strips the attributes from all HTML tags:

$html = preg_replace("/<([a-zA-Z]+)[^>]*>/", "<\\1>", $html);
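A short usage sketch for the attribute-stripping replacement follows. The sample markup and the variable name $clean are assumptions for illustration.

```php
<?php
// Sample input is an assumption made for illustration.
$html = '<a href="https://example.com" class="btn">link</a><img src="a.png" alt="A">';

// Keep only the tag name of each opening tag; closing tags are left untouched.
$clean = preg_replace('/<([a-zA-Z]+)[^>]*>/', '<\\1>', $html);

echo $clean, PHP_EOL; // <a>link</a><img>
```

Because the capture group accepts letters only, a tag such as `<h1 class="x">` would come out as `<h>`. Widening the group to `[a-zA-Z][a-zA-Z0-9]*` avoids that.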


3. Exclusion-style regular expression that filters only some HTML tags (for example, exclude <p>, that is, do not filter <p>):

The code is as follows:

</?[^pP/>]+>
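A negated character class excludes every tag whose name contains the letter p, so tags such as `<span>` or `<pre>` are skipped as well. Assuming the intent is to strip every tag except `<p>`, a negative lookahead is one way to express that more precisely. The pattern, sample input, and variable names below are assumptions for illustration, not the article's expression.

```php
<?php
// Sample input is an assumption made for illustration.
$html = '<div><p class="x">Keep</p> <pre>code</pre> <span>drop</span></div>';

// (?!p\b) rejects a tag whose name is exactly "p"; "pre" and "span" still match.
$out = preg_replace('#</?(?!p\b)[a-zA-Z][^>]*>#i', '', $html);

echo $out, PHP_EOL; // <p class="x">Keep</p> code drop
```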


4. Enumeration-style regular expression that filters only some HTML tags (for example, to filter <a>, <p>, and <b>):

The code is as follows:

</?[aApPbB][^>]*>
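An enumeration built from a single-letter character class also matches longer tag names that start with one of those letters, such as `<br>` or `<body>`. The sketch below lists the tag names as alternatives and adds a word boundary so that only the named tags are removed. The tag list (a, p, b), the sample input, and the variable names are assumptions for illustration.

```php
<?php
// Sample input is an assumption made for illustration.
$html = '<p>Para</p><a href="#">link</a><b>bold</b><br><body>';

// \b after the alternation keeps <br> and <body> from matching "b".
$out = preg_replace('#</?(?:a|p|b)\b[^>]*>#i', '', $html);

echo $out, PHP_EOL; // Paralinkbold<br><body>
```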


5. Exclusion-style regular expression that filters the attributes of HTML tags (for example, exclude the alt attribute, that is, do not filter alt):

The code is as follows:

\s(?!alt)[a-zA-Z]+=[^\s]*
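A value pattern of `[^\s]*` stops at the first space and also swallows a closing `>` that directly follows the last attribute. The sketch below keeps the same idea (a negative lookahead that spares alt) but matches quoted values as a unit. The widened value pattern, the sample tag, and the variable names are assumptions for illustration.

```php
<?php
// Sample input is an assumption made for illustration.
$tag = '<img src="a.png" alt="my logo" width="10">';

// Remove every attribute except alt; quoted values may contain spaces.
$pattern = '/\s(?!alt\b)[a-zA-Z-]+=("[^"]*"|\'[^\']*\'|[^\s>]*)/i';
$kept    = preg_replace($pattern, '', $tag);

echo $kept, PHP_EOL; // <img alt="my logo">
```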


6. Regular expression that filters specific attributes of HTML tags (such as the alt attribute):

The code is as follows:

(\s)alt=[^\s]*
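For the opposite case, removing only the alt attribute, a minimal sketch might look like the following. It uses a quoted-value alternative so that a value containing spaces is removed whole. The value pattern, the sample tag, and the variable names are assumptions for illustration.

```php
<?php
// Sample input is an assumption made for illustration.
$tag = '<img src="a.png" alt="my logo">';

// Remove the alt attribute together with its leading whitespace.
$noAlt = preg_replace('/\salt=("[^"]*"|\'[^\']*\'|[^\s>]*)/i', '', $tag);

echo $noAlt, PHP_EOL; // <img src="a.png">
```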
