Using regular expressions in Java to filter tags in HTML

Source: Internet
Author: User
Tags: getmessage, html tags

import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Assumption: the original does not say which StringUtils it uses;
// Apache Commons Lang is assumed here.
import org.apache.commons.lang3.StringUtils;

    /**
     * Remove HTML tags from text.
     *
     * @param inputString the HTML source
     * @return the text with tags removed, or null if the input is empty
     */
    public static String Html2text(String inputString) {
        if (StringUtils.isEmpty(inputString)) {
            return null;
        }
        String htmlStr = inputString;
        String textStr = "";
        Pattern p_script;
        Matcher m_script;
        Pattern p_style;
        Matcher m_style;
        Pattern p_html;
        Matcher m_html;

        Pattern p_html1;
        Matcher m_html1;

        try {
            // Regular expression for script blocks
            // (a simpler alternative: <script[^>]*?>[\\s\\S]*?<\\/script>)
            String regex_script = "<[\\s]*?script[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?script[\\s]*?>";
            // Regular expression for style blocks
            // (a simpler alternative: <style[^>]*?>[\\s\\S]*?<\\/style>)
            String regex_style = "<[\\s]*?style[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?style[\\s]*?>";
            // Regular expression for complete HTML tags
            String regex_html = "<[^>]+>";
            // Regular expression for a tag that is opened but never closed
            String regex_html1 = "<[^>]+";

            p_script = Pattern.compile(regex_script, Pattern.CASE_INSENSITIVE);
            m_script = p_script.matcher(htmlStr);
            htmlStr = m_script.replaceAll(""); // filter script blocks

            p_style = Pattern.compile(regex_style, Pattern.CASE_INSENSITIVE);
            m_style = p_style.matcher(htmlStr);
            htmlStr = m_style.replaceAll(""); // filter style blocks

            p_html = Pattern.compile(regex_html, Pattern.CASE_INSENSITIVE);
            m_html = p_html.matcher(htmlStr);
            htmlStr = m_html.replaceAll(""); // filter HTML tags

            p_html1 = Pattern.compile(regex_html1, Pattern.CASE_INSENSITIVE);
            m_html1 = p_html1.matcher(htmlStr);
            htmlStr = m_html1.replaceAll(""); // filter unclosed HTML tags

            textStr = htmlStr;

            // Replace &amp;nbsp;
            textStr = textStr.replaceAll("&amp;", "").replaceAll("nbsp;", "");

        } catch (Exception e) {
            System.err.println("Html2text: " + e.getMessage());
        }

        return textStr; // return the text string
    }
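
The method above can also be written more compactly. The following is only a minimal sketch, not the original author's code: the class name HtmlTextSketch, the constant names, and the lower-case method name html2text are assumptions introduced for illustration. It uses the same four patterns, compiles them once as constants instead of on every call, and replaces StringUtils.isEmpty with a plain null/empty check so that no external library is needed.

```java
import java.util.regex.Pattern;

/** Hypothetical helper class; its name and layout are not from the original article. */
final class HtmlTextSketch {
    // Compiled once and reused; Pattern objects are immutable and safe to share.
    private static final Pattern SCRIPT = Pattern.compile(
            "<[\\s]*?script[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?script[\\s]*?>",
            Pattern.CASE_INSENSITIVE);
    private static final Pattern STYLE = Pattern.compile(
            "<[\\s]*?style[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?style[\\s]*?>",
            Pattern.CASE_INSENSITIVE);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    private static final Pattern UNCLOSED_TAG = Pattern.compile("<[^>]+");

    static String html2text(String input) {
        if (input == null || input.isEmpty()) {
            return null; // mirrors the original method, which returns null for empty input
        }
        String text = SCRIPT.matcher(input).replaceAll("");  // drop script blocks
        text = STYLE.matcher(text).replaceAll("");           // drop style blocks
        text = TAG.matcher(text).replaceAll("");             // drop complete tags
        text = UNCLOSED_TAG.matcher(text).replaceAll("");    // drop a trailing unclosed tag
        // Literal replacement (String.replace), same effect as the original replaceAll calls.
        return text.replace("&amp;", "").replace("nbsp;", "");
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p { color: red; }</style>"
                + "<script>alert('x');</script></head>"
                + "<body><p>Hello, <b>world</b>!</p></body></html>";
        System.out.println(html2text(html)); // prints: Hello, world!
    }
}
```

Note that the unclosed-tag pattern also removes everything after a literal "<" in ordinary text (for example, "1 < 2" becomes "1 "), so this approach suits simple, well-formed markup. For arbitrary HTML, a dedicated HTML parser is usually more robust than regular expressions.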

  

  

