Filtering HTML tags in JS with regular expressions
I have read a little about regular expressions, but I have only just started using them and even the basic concepts are still vague to me, so I had to go looking online and found the following code:
function SetContent(str) {
  str = str.replace(/<\/?[^>]*>/g, ""); // Remove HTML tags
  str = str.replace(/[ | ]*\n/g, "\n"); // Remove trailing blanks at line ends
  str = str.replace(/\n[\s| ]*\r/g, "\n"); // Remove extra blank lines
  return str;
}
Testing showed that this code does not remove the space entity used in web pages (that is, &nbsp;). So I rewrote it:
function removeHTMLTag(str) {
  str = str.replace(/<\/?[^>]*>/g, ""); // Remove HTML tags
  str = str.replace(/[ | ]*\n/g, "\n"); // Remove trailing blanks at line ends
  str = str.replace(/\n[\s| ]*\r/g, "\n"); // Remove extra blank lines
  str = str.replace(/&nbsp;/ig, ""); // Remove &nbsp;
  return str;
}
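As a quick check, here is a minimal sketch that runs the same four replacements on a made-up fragment. The helper name stripHtml and the sample string are my own assumptions for illustration, and the patterns are written the way I understand the code above.

```javascript
// Hypothetical helper (name is illustrative only) applying the four steps in order.
function stripHtml(str) {
  str = str.replace(/<\/?[^>]*>/g, "");    // remove every HTML tag
  str = str.replace(/[ | ]*\n/g, "\n");    // remove blanks before each newline
  str = str.replace(/\n[\s| ]*\r/g, "\n"); // collapse blank lines that end in \r
  str = str.replace(/&nbsp;/ig, "");       // remove the &nbsp; entity
  return str;
}

// Made-up sample input: two tags, an entity, and trailing blanks.
const sampleHtml = "<div>Text I need&nbsp;</div>  \n<br/>";
const cleaned = stripHtml(sampleHtml);
console.log(JSON.stringify(cleaned)); // "Text I need\n"
```

Note that &nbsp; is replaced with an empty string, so two words separated only by &nbsp; would be joined together.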
That did what I needed.
Now let me explain a little about the three regular expressions used above. (I have only just started with regular expressions, so my explanations may not be right; treat them as reference only.)
The first one: /<\/?[^>]*>/g
In JS, a regular expression literal starts with "/" and ends with "/". The g after the closing "/" is the global flag: the pattern is applied to the entire string instead of stopping after the first match.
<\/?[^>]*> is explained piece by piece. The second character, "\", is an escape character that escapes the "/" following it. ? matches 0 or 1 of the character just before it. Note: this metacharacter is not supported by all software. So <\/? matches either the "</" or the "<" that begins an HTML tag.
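The effect of the g flag is easy to see by running the same pattern with and without it. This is only a small sketch; the sample string and variable names are made up.

```javascript
const html = "<b>bold</b>";

// Without g, replace() stops after the first match.
const firstOnly = html.replace(/<\/?[^>]*>/, "");

// With g, every match in the string is replaced.
const allTags = html.replace(/<\/?[^>]*>/g, "");

console.log(firstOnly); // bold</b>
console.log(allTags);   // bold
```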
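Next, [^>]*>. [] denotes a character class: it matches any single character listed between the brackets.
The meaning of ^ depends on where it appears. Outside brackets it matches the start of a line. For example, the regular expression ^When in matches the start of the string "When in the course of human events", but it does not match the "When in" inside "What and When in the", because the pattern only matches text that begins with "When in". As the first character inside brackets, however, ^ negates the class, so [^>] matches any single character except >.
* means: match 0 or more of the item just before it. For example, the regular expression .* matches any number of any characters.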
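The two roles of ^ and the behaviour of * can be tried out directly. A minimal sketch, with sample strings and variable names chosen only for illustration:

```javascript
// Outside brackets, ^ anchors the match to the start of the string.
const anchored = /^When in/;
const startsWith = anchored.test("When in the course of human events"); // true
const inMiddle = anchored.test("What and When in the");                 // false

// Inside brackets, a leading ^ negates the class: [^>] is "any character except >".
// With *, the run stops just before the first >.
const run = "div>rest".match(/[^>]*/)[0];   // "div"

// "0 or more" includes zero: the match is empty when the string starts with >.
const emptyRun = ">abc".match(/[^>]*/)[0];  // ""

// .* matches any number of any characters (except line breaks).
const everything = "abc".match(/.*/)[0];    // "abc"

console.log(startsWith, inMiddle, run, emptyRun, everything);
```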
So [^>] matches any single character other than >, and [^>]* matches a run of zero or more such characters, stopping just before the next >. After the opening "<" or "</" has been matched, [^>]* can therefore match text such as:
div
p
br/
Adding the final > completes the match, so the whole pattern matches:
<div>
</div>
<p>
</p>
<br/>
This completes the matching of a single HTML tag. Because [^>]* cannot cross a >, the opening and closing tags of a pair such as <div>the text I need</div> are matched separately; with the g flag each tag is removed in turn, and only the text between them is left. (That is a lot of words, and I am still not entirely sure I have this match right.)
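To check what the tag pattern really matches, String.prototype.match with the g flag lists every match. A small sketch on a made-up string:

```javascript
const sample = "<div>the text I need</div><p>more text</p><br/>";

// With g, match() returns an array of all matched substrings.
const tags = sample.match(/<\/?[^>]*>/g);

// Removing those matches leaves only the text between the tags.
const textOnly = sample.replace(/<\/?[^>]*>/g, "");

console.log(tags);     // [ '<div>', '</div>', '<p>', '</p>', '<br/>' ]
console.log(textOnly); // the text I needmore text
```

Each tag is a separate match; the pattern never spans from an opening tag to its closing tag.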
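The second, /[ | ]*\n/g: I did not understand this one when I found it. As far as I can tell, [ | ]* matches a run of zero or more spaces (inside brackets the | is a literal pipe character, not "or", so pipes are matched too), and \n matches the newline that follows. Replacing the match with "\n" therefore strips the blanks at the end of each line. A class such as [ \t] would probably express the intent more precisely.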
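The trailing-blank pattern can be tried on a small string. This sketch assumes the pattern in the code I found was /[ | ]*\n/g and compares it with a [ \t] class, which is my own suggested alternative; the sample string is made up.

```javascript
// Line 1 ends in two spaces, line 2 ends in " |", line 3 has no trailing blanks.
const lines = "a  \nb |\nc\n";

// Inside [], the | is a literal pipe, so the trailing " |" is stripped as well.
const withPipeClass = lines.replace(/[ | ]*\n/g, "\n");

// Assumed alternative: only spaces and tabs count as trailing blanks.
const withTabClass = lines.replace(/[ \t]*\n/g, "\n");

console.log(JSON.stringify(withPipeClass)); // "a\nb\nc\n"
console.log(JSON.stringify(withTabClass));  // "a\nb |\nc\n"
```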
The third, /&nbsp;/ig: this simply looks up the literal text &nbsp;. The ig after it means the search is global and case-insensitive: g stands for global, and i stands for case-insensitive.
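The difference the i flag makes can be seen on a made-up string that spells the entity in mixed case. A minimal sketch:

```javascript
const raw = "a&nbsp;b&NBSP;c&Nbsp;d";

// g only: just the exact lowercase spelling is removed.
const caseSensitive = raw.replace(/&nbsp;/g, "");

// ig: every spelling is removed, regardless of case.
const caseInsensitive = raw.replace(/&nbsp;/ig, "");

console.log(caseSensitive);   // ab&NBSP;c&Nbsp;d
console.log(caseInsensitive); // abcd
```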