Filtering HTML tags and &nbsp; in JS with regular expressions

Source: Internet
Author: User
Tags: html tags

Although I have read about regular expressions, I have only just started working with them and even the basic concepts are still vague to me, so I had to find the following code online:

function SetContent(str) {
  str = str.replace(/<\/?[^>]*>/g, "");  // Remove HTML tags
  str = str.replace(/[ \t]*\n/g, "\n");  // Remove trailing blanks at line ends
  str = str.replace(/\n\s*\r/g, "\n");   // Remove extra blank lines
  return str;
}
Testing showed that this code does not remove the non-breaking space entities in a web page (that is, &nbsp;). So I rewrote it:

function removeHTMLTag(str) {
  str = str.replace(/<\/?[^>]*>/g, "");  // Remove HTML tags
  str = str.replace(/[ \t]*\n/g, "\n");  // Remove trailing blanks at line ends
  str = str.replace(/\n\s*\r/g, "\n");   // Remove extra blank lines
  str = str.replace(/&nbsp;/ig, "");     // Remove &nbsp;
  return str;
}
With that, my requirement was met.
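As a quick check, the steps can be run against a small sample. This is only a minimal sketch: the function name stripHtml and the sample string are assumptions made for illustration, and the trailing-blank class is written as [ \t], which may differ from the snippet found online.

```javascript
// Hypothetical helper that mirrors the four replace steps; not the original author's code.
function stripHtml(str) {
  str = str.replace(/<\/?[^>]*>/g, ""); // remove HTML tags
  str = str.replace(/[ \t]*\n/g, "\n"); // remove spaces/tabs before a newline
  str = str.replace(/\n\s*\r/g, "\n");  // newline + whitespace + carriage return -> newline
  str = str.replace(/&nbsp;/ig, "");    // remove &nbsp; entities
  return str;
}

// Illustrative input: tags, an entity, and trailing spaces before the newline.
const sample = "<p>Hello,&nbsp;<b>world</b>!</p>   \n<br/>done";
const cleaned = stripHtml(sample);
console.log(JSON.stringify(cleaned)); // prints "Hello,world!\ndone"
```

Note that &nbsp; is replaced with an empty string, so the words on either side of it are joined; replacing it with " " instead would keep a space.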

Now let me briefly explain the three regular expressions used above (I have only just started with regular expressions, so my explanation may not be right and is for reference only):

First: /<\/?[^>]*>/g

In JS, a regular expression literal starts with "/" and ends with "/". The g that follows is the global flag: the pattern is applied to the entire string, rather than stopping after the first match.
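The effect of the g flag can be seen by running the same pattern with and without it. The sample string here is an assumption chosen for illustration.

```javascript
const text = "<b>bold</b>";

// Without g, replace() stops after the first match, so the closing tag survives.
const firstOnly = text.replace(/<\/?[^>]*>/, "");
console.log(firstOnly); // prints bold</b>

// With g, every match in the string is replaced.
const allMatches = text.replace(/<\/?[^>]*>/g, "");
console.log(allMatches); // prints bold
```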

<\/?[^>]*> is explained piece by piece. The second character, "\", is an escape character that escapes the "/" that follows it. ? matches 0 or 1 of the character just before it. Note: this metacharacter is not supported by all software. So <\/? matches either the "</" or the "<" that begins an HTML tag.

Next, [^>]*>. [] defines a character class: it matches any single character listed between the brackets.

Outside brackets, ^ matches the start of a line. For example, the regular expression ^When in matches the start of the string "When in the course of human events", but does not match "What and When in the". As the first character inside brackets, however, ^ negates the class, so [^>] matches any single character except >.

* means: match 0 or more of the item just before it. For example, the regular expression .* matches any number of any characters.
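The three pieces discussed so far (the optional "/", the two roles of ^, and the * quantifier) can each be checked on their own. The test strings below are assumptions picked for illustration.

```javascript
// ? makes the escaped "/" optional: both "<" and "</" are accepted.
const opener = /^<\/?$/;
const opensTag = opener.test("<");     // true
const closesTag = opener.test("</");   // true
const twoSlashes = opener.test("<//"); // false: at most one "/"

// Outside brackets, ^ anchors the match to the start of the string.
const anchoredHit = /^When in/.test("When in the course of human events"); // true
const anchoredMiss = /^When in/.test("What and When in the");              // false

// As the first character inside brackets, ^ negates the class:
// [^>]* consumes characters until it reaches a ">".
const upToBracket = "div>rest".match(/[^>]*/)[0]; // "div"
const zeroChars = ">rest".match(/[^>]*/)[0];      // "" because * also allows zero characters

console.log(opensTag, closesTag, twoSlashes, anchoredHit, anchoredMiss);
console.log(JSON.stringify(upToBracket), JSON.stringify(zeroChars));
```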

So [^>] matches any single character other than >, and [^>]* matches a run of zero or more such characters. After the opening "<" or "</", [^>]* can therefore match tag contents such as:

div
p
br/

Adding the final > completes the match:

<div>
</div>
<p>
</p>
<br/>

This completes the match of a single HTML tag. Note that each match is one tag, not a pair of tags with the text between them: in <div>the text I need</div>, the pattern matches <div> and </div> separately, and the global replace removes both, leaving "the text I need". (Having said all this, I still feel unsure about this match.)
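Listing the matches makes it easier to see what the pattern actually consumes. This is a sketch with assumed sample strings; the last case shows a limitation that follows from stopping at the first ">", so the pattern should not be treated as a full HTML parser.

```javascript
const tagPattern = /<\/?[^>]*>/g;

const html = "<div>the text I need</div><br/>";

// Each tag is matched on its own; the text between tags is never part of a match.
const tags = html.match(tagPattern);
console.log(tags); // prints [ '<div>', '</div>', '<br/>' ]

const textOnly = html.replace(tagPattern, "");
console.log(textOnly); // prints the text I need

// Limitation: a ">" inside an attribute value ends the match too early.
const tricky = '<a title="x>y">link</a>';
const leftover = tricky.replace(tagPattern, "");
console.log(leftover); // prints y">link
```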

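Second: /[ \t]*\n/g. I am less sure about this one. It appears to match any run of spaces or tabs that sits immediately before a newline, together with the newline itself; replacing the match with "\n" strips the trailing blanks from each line.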
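A small experiment helps with the second pattern. The version below assumes the character class is meant to hold a space and a tab, which is a guess, and the sample text is made up for illustration.

```javascript
// Assumed form of the second pattern: spaces or tabs directly before a newline.
const trailingBlanks = /[ \t]*\n/g;

const lines = "first line   \nsecond line\t\nthird line";
const trimmed = lines.replace(trailingBlanks, "\n");
console.log(JSON.stringify(trimmed)); // prints "first line\nsecond line\nthird line"

// Blanks at the very end of the string are not followed by "\n", so they stay.
const noFinalNewline = "last line  ".replace(trailingBlanks, "\n");
console.log(JSON.stringify(noFinalNewline)); // prints "last line  "
```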

Third: /&nbsp;/ig. This looks up the literal string &nbsp; directly. The ig that follows means a global, case-insensitive search: g stands for global and i for case-insensitive.
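The difference the i flag makes can be checked with a string that spells the entity in mixed case. The input is an assumption for illustration only.

```javascript
const withEntities = "a&nbsp;b&NBSP;c&Nbsp;d";

// g alone removes only the exact lowercase spelling.
const caseSensitive = withEntities.replace(/&nbsp;/g, "");
console.log(caseSensitive); // prints ab&NBSP;c&Nbsp;d

// Adding i makes the match ignore letter case.
const caseInsensitive = withEntities.replace(/&nbsp;/ig, "");
console.log(caseInsensitive); // prints abcd
```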
