Using MarkupServices to parse HTML into a DOM tree in C#

Source: Internet
Author: User
A lightweight DOM parsing implementation. This code does not download any data from the Internet and does not execute any scripts; it only parses.
Parsing is done through the MSHTML markup services (IMarkupServices). To use this code, you need to add a reference to mshtml to your project.
Because .NET does not define the IPersistStreamInit interface, you must declare it yourself. The interface definition:

using System;
using System.Runtime.InteropServices;

[ComVisible(true), ComImport(), Guid("7FD52380-4E07-101B-AE2D-08002B2EC713"), InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
public interface IPersistStreamInit
{
    void GetClassID([In, Out] ref Guid pClassID);
    [PreserveSig]
    [return: MarshalAs(UnmanagedType.I4)]
    int IsDirty();
    // UCOMIStream is marked obsolete since .NET 2.0 in favor of
    // System.Runtime.InteropServices.ComTypes.IStream.
    void Load([In, MarshalAs(UnmanagedType.Interface)] UCOMIStream pstm);
    void Save([In, MarshalAs(UnmanagedType.Interface)] UCOMIStream pstm,
        [In, MarshalAs(UnmanagedType.I4)] int fClearDirty);
    void GetSizeMax([Out, MarshalAs(UnmanagedType.LPArray)] long pcbSize);
    void InitNew();
}



The parsing function (it must be compiled with unsafe code enabled):

unsafe IHTMLDocument2 Parse(string s)
{
    IHTMLDocument2 pDocument = new HTMLDocumentClass();
    if (pDocument != null)
    {
        // Initialize the empty document before parsing into it.
        IPersistStreamInit pPersist = pDocument as IPersistStreamInit;
        pPersist.InitNew();
        pPersist = null;

        IMarkupServices ms = pDocument as IMarkupServices;
        if (ms != null)
        {
            IMarkupContainer pMC = null;
            IMarkupPointer pStart, pEnd;
            ms.CreateMarkupPointer(out pStart);
            ms.CreateMarkupPointer(out pEnd);

            // Copy the HTML to unmanaged memory as a null-terminated UTF-16 string.
            IntPtr pSource = Marshal.StringToHGlobalUni(s);
            try
            {
                // ParseString takes "ref ushort": pass a reference to the first character.
                ms.ParseString(ref *(ushort*)pSource.ToPointer(), 0, out pMC, pStart, pEnd);
            }
            finally
            {
                // Memory from StringToHGlobalUni is freed with FreeHGlobal.
                // Marshal.Release is only for COM interface pointers.
                Marshal.FreeHGlobal(pSource);
            }
            if (pMC != null)
            {
                return pMC as IHTMLDocument2;
            }
        }
    }
    return null;
}

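One problem came up while writing the code: the first parameter of IMarkupServices::ParseString is declared as ref ushort. It is obviously meant to receive the HTML source, so this ushort must be the first wide character (WCHAR) of the string. Unsafe code is used here to get around the compiler's type checking: the string is copied to unmanaged memory, and a reference to its first character is passed.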
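
The sketch below shows why a reference to the first character is enough. It is a standalone illustration that does not touch MSHTML, and the names WideStringDemo, FirstCodeUnit and RoundTrip are made up for this example. It copies a string to unmanaged memory with Marshal.StringToHGlobalUni, reads the first 16-bit code unit (the value a ref ushort would refer to), and then recovers the whole null-terminated string from the same address. It assumes a non-null input string.

```csharp
using System;
using System.Runtime.InteropServices;

public static class WideStringDemo
{
    // Returns the first UTF-16 code unit of the unmanaged copy of the string.
    // This is the value that a "ref ushort" pointing at the buffer refers to.
    public static ushort FirstCodeUnit(string html)
    {
        IntPtr buffer = Marshal.StringToHGlobalUni(html);
        try
        {
            return unchecked((ushort)Marshal.ReadInt16(buffer));
        }
        finally
        {
            // HGlobal memory is released with FreeHGlobal.
            Marshal.FreeHGlobal(buffer);
        }
    }

    // Reads the whole null-terminated string back, starting at the same
    // address as the first code unit.
    public static string RoundTrip(string html)
    {
        IntPtr buffer = Marshal.StringToHGlobalUni(html);
        try
        {
            return Marshal.PtrToStringUni(buffer);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    public static void Main()
    {
        string html = "<p>hi</p>";
        Console.WriteLine(FirstCodeUnit(html)); // 60, the code unit for '<'
        Console.WriteLine(RoundTrip(html));     // <p>hi</p>
    }
}
```

Native code that receives the address of the first code unit can walk forward until it reaches the terminating zero, which is presumably how ParseString consumes the buffer.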

