Some time ago I posted an article on batch-crawling remote data with XMLHTTP:
http://blog.csdn.net/babyt/archive/2004/09/08/98516.aspx
Recently, someone asked me how to save the crawled articles as plain text instead of storing the raw HTML, which saves database space. So I wrote the following function to strip the HTML tags from the text.
The function is simple, but it works well for processing HTML documents.
<%
' ByVal keeps the caller's variable unchanged (VBScript passes ByRef by default)
Function RemoveHTML(ByVal strHTML)
    Dim objRegExp, Match, Matches
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ' Match any closed <...> tag (non-greedy)
    objRegExp.Pattern = "<.+?>"
    ' Run the match
    Set Matches = objRegExp.Execute(strHTML)
    ' Walk the match collection and replace each matched tag with an empty string
    For Each Match In Matches
        strHTML = Replace(strHTML, Match.Value, "")
    Next
    RemoveHTML = strHTML
    Set objRegExp = Nothing
End Function
%>
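The function above collects the matches first and then removes them one by one. As a rough sketch of an alternative, the same result can probably be reached with a single call to the RegExp object's Replace method. The name StripTags is made up for this illustration and is not part of the original article. The pattern is also swapped for "<[^>]+>", on the assumption that some tags may wrap across lines, which "." does not match in VBScript regular expressions.
<%
' Hypothetical variant for illustration; StripTags is not from the original article.
Function StripTags(ByVal strHTML)
    Dim objRegExp
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ' [^>] also matches line breaks, so tags that span several lines are removed too
    objRegExp.Pattern = "<[^>]+>"
    ' Replace every match with an empty string in one pass
    StripTags = objRegExp.Replace(strHTML, "")
    Set objRegExp = Nothing
End Function

' Usage: writes "Hello, world!" to the page
Response.Write Server.HTMLEncode(StripTags("<p>Hello, <b>world</b>!</p>"))
%>
Neither version decodes entities such as &amp;nbsp; or treats script and style bodies specially, so text taken from real pages may need further cleanup before it is stored.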