Removing HTML tags with regular expressions in Java

Source: Internet
Author: User

When content is entered in a rich-text editor, its style tags are submitted to the back end and saved to the database along with the text, so that the content can later be displayed faithfully. When an abstract is shown, however (for example, the first 50 characters of the body), all HTML tags have to be removed before the text is cut to 50 characters. The following method does this with Java regular expressions. The code is as follows:

Note: these are Java regular expressions for removing HTML tags.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // The class name is illustrative; the original snippet shows only the members.
    public class HtmlUtil {

        // <script ...> ... </script> blocks, including their contents
        private static final String regEx_script = "<script[^>]*?>[\\s\\S]*?<\\/script>";
        // <style ...> ... </style> blocks, including their contents
        private static final String regEx_style = "<style[^>]*?>[\\s\\S]*?<\\/style>";
        // any remaining HTML tag
        private static final String regEx_html = "<[^>]+>";
        // spaces, tabs, carriage returns and line feeds
        private static final String regEx_space = "\\s*|\t|\r|\n";
        // all w tags, including their contents
        private static final String regEx_w = "<w[^>]*?>[\\s\\S]*?<\\/w[^>]*?>";

        /**
         * @param htmlStr the HTML string to clean
         * @return the text with HTML tags deleted
         * @author LongJin
         */
        public static String delHTMLTag(String htmlStr) {
            Pattern p_w = Pattern.compile(regEx_w, Pattern.CASE_INSENSITIVE);
            Matcher m_w = p_w.matcher(htmlStr);
            htmlStr = m_w.replaceAll(""); // filter the w tags

            Pattern p_script = Pattern.compile(regEx_script, Pattern.CASE_INSENSITIVE);
            Matcher m_script = p_script.matcher(htmlStr);
            htmlStr = m_script.replaceAll(""); // filter the script tags

            Pattern p_style = Pattern.compile(regEx_style, Pattern.CASE_INSENSITIVE);
            Matcher m_style = p_style.matcher(htmlStr);
            htmlStr = m_style.replaceAll(""); // filter the style tags

            Pattern p_html = Pattern.compile(regEx_html, Pattern.CASE_INSENSITIVE);
            Matcher m_html = p_html.matcher(htmlStr);
            htmlStr = m_html.replaceAll(""); // filter the remaining HTML tags

            // Removes ALL whitespace, including the spaces between words.
            Pattern p_space = Pattern.compile(regEx_space, Pattern.CASE_INSENSITIVE);
            Matcher m_space = p_space.matcher(htmlStr);
            htmlStr = m_space.replaceAll(""); // filter spaces and line breaks

            // The literal here was lost in the source text; "&nbsp;" is an assumption.
            htmlStr = htmlStr.replaceAll("&nbsp;", ""); // filter &nbsp; entities
            return htmlStr.trim(); // return the plain text string
        }
    }
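The method above only strips the tags; cutting the result down to the first 50 characters is left to the caller. As a rough sketch of that last step, the helper below combines tag removal with truncation. The class name `HtmlSummary` and the method `summarize` are hypothetical and not part of the original code. Unlike `delHTMLTag`, this sketch collapses whitespace to a single space instead of deleting it, on the assumption that the text may contain space-separated words, and it counts by Unicode code points so that a surrogate pair is never split.

```java
import java.util.regex.Pattern;

// Hypothetical helper for building an abstract; not part of the original article.
final class HtmlSummary {
    // <script> and <style> blocks are dropped together with their contents.
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
            "<(script|style)[^>]*?>[\\s\\S]*?</\\1\\s*>", Pattern.CASE_INSENSITIVE);
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    // Runs of whitespace (spaces, tabs, line breaks).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private HtmlSummary() {}

    /** Returns at most maxChars characters (code points) of the tag-free text. */
    static String summarize(String html, int maxChars) {
        if (html == null || maxChars <= 0) {
            return "";
        }
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll("");
        // Replace each tag with a space so adjacent blocks do not run together.
        text = TAG.matcher(text).replaceAll(" ");
        text = text.replace("&nbsp;", " ");
        text = WHITESPACE.matcher(text).replaceAll(" ").trim();

        // Truncate by code points rather than by UTF-16 units.
        int count = text.codePointCount(0, text.length());
        if (count <= maxChars) {
            return text;
        }
        return text.substring(0, text.offsetByCodePoints(0, maxChars));
    }
}
```

A call such as `HtmlSummary.summarize(content, 50)` would then yield the abstract text. Replacing each tag with a space is a design choice: it keeps neighbouring paragraphs apart, but it also inserts a space where an inline tag sits in the middle of a word. Regular expressions like these are adequate for simple, well-formed markup; for arbitrary HTML a real parser is the safer option.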

P.S. This method is offered for reference only. If you have any questions or suggestions, please leave a comment.
