A lightweight DOM parsing implementation. This code does not download any data from the Internet and does not execute any scripts; it only parses.
Parsing is implemented through the MSHTML markup services. To use this code, you need to add a reference to mshtml.
Because the IPersistStreamInit interface is not defined in .NET, you must declare it yourself. The interface definition:
The following is program code:

    [ComVisible(true), ComImport(),
     Guid("7FD52380-4E07-101B-AE2D-08002B2EC713"),
     InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IPersistStreamInit
    {
        void GetClassID([In, Out] ref Guid pClassID);

        // Returns an HRESULT directly (S_OK or S_FALSE) instead of throwing.
        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int IsDirty();

        void Load([In, MarshalAs(UnmanagedType.Interface)] UCOMIStream pstm);

        void Save([In, MarshalAs(UnmanagedType.Interface)] UCOMIStream pstm,
                  [In, MarshalAs(UnmanagedType.I4)] int fClearDirty);

        // The native parameter is a pointer to a 64-bit size, so it is declared
        // as "out long" here rather than as an LPArray-marshaled long.
        void GetSizeMax(out long pcbSize);

        void InitNew();
    }
The following is program code:

    unsafe IHTMLDocument2 Parse(string s)
    {
        IHTMLDocument2 pDocument = new HTMLDocumentClass();
        if (pDocument != null)
        {
            // Initialize the empty document before using the markup services.
            IPersistStreamInit pPersist = pDocument as IPersistStreamInit;
            if (pPersist != null)
                pPersist.InitNew();
            pPersist = null;

            IMarkupServices ms = pDocument as IMarkupServices;
            if (ms != null)
            {
                IMarkupContainer pMC = null;
                IMarkupPointer pStart, pEnd;
                ms.CreateMarkupPointer(out pStart);
                ms.CreateMarkupPointer(out pEnd);

                // Copy the HTML into unmanaged memory as a null-terminated UTF-16 string.
                IntPtr pSource = Marshal.StringToHGlobalUni(s);
                try
                {
                    // Pass a reference to the first wide character of the buffer.
                    ms.ParseString(ref *(ushort*)pSource.ToPointer(), 0, out pMC, pStart, pEnd);
                }
                finally
                {
                    // Memory from StringToHGlobalUni must be freed with FreeHGlobal;
                    // Marshal.Release is only for COM interface pointers.
                    Marshal.FreeHGlobal(pSource);
                }

                if (pMC != null)
                    return pMC as IHTMLDocument2;
            }
        }
        return null;
    }
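As a rough illustration of what can be done with the result, the sketch below reads a few properties from a document such as the one returned by Parse. The class name ParsedDocumentDump and the method Dump are hypothetical names chosen for this example. It assumes that the project references mshtml and that the parsed document exposes a title, a body, and a links collection, which may not hold for every input.

```csharp
using System;
using mshtml;

static class ParsedDocumentDump
{
    // Hypothetical helper: prints the title, the body text, and the link
    // targets of an already parsed document. Nothing is downloaded or executed.
    public static void Dump(IHTMLDocument2 doc)
    {
        if (doc == null)
        {
            Console.WriteLine("Parsing failed.");
            return;
        }

        Console.WriteLine("Title: " + doc.title);

        // The body can be missing, so check before reading from it.
        IHTMLElement body = doc.body;
        if (body != null)
            Console.WriteLine("Text: " + body.innerText);

        // doc.links is a collection of elements; only anchors are printed here.
        foreach (object item in doc.links)
        {
            IHTMLAnchorElement anchor = item as IHTMLAnchorElement;
            if (anchor != null)
                Console.WriteLine("Link: " + anchor.href);
        }
    }
}
```

A call could look like ParsedDocumentDump.Dump(Parse("<html><body><a href='http://example.com/'>x</a></body></html>")), with Parse being the method shown above.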
One problem came up while writing the code: the first parameter of IMarkupServices::ParseString is declared as ref ushort, yet it is obviously meant to receive the HTML text. That ushort must therefore be the first wide character of the string, so unsafe code is used here to pass a reference to the start of the buffer and get past the compiler.
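To make the "first wide character" idea concrete, here is a minimal sketch that uses only the .NET runtime and no MSHTML. It copies a string into unmanaged memory the same way Parse does and reads individual 16-bit code units back from it. The names FirstWideCharDemo and CodeUnitAt are made up for this illustration.

```csharp
using System;
using System.Runtime.InteropServices;

static class FirstWideCharDemo
{
    // Reads the UTF-16 code unit at the given index from an unmanaged buffer.
    // Each code unit is 2 bytes wide, hence the byte offset index * 2.
    public static char CodeUnitAt(IntPtr buffer, int index)
    {
        return (char)(ushort)Marshal.ReadInt16(buffer, index * 2);
    }

    static void Main()
    {
        string html = "<p>hi</p>";

        // Same allocation as in Parse: a null-terminated UTF-16 copy of the string.
        IntPtr pSource = Marshal.StringToHGlobalUni(html);
        try
        {
            // The buffer address is the address of the first code unit.
            Console.WriteLine(CodeUnitAt(pSource, 0));
            // The rest of the string follows contiguously.
            Console.WriteLine(CodeUnitAt(pSource, 1));
            // A terminating null code unit follows the last character.
            Console.WriteLine((int)CodeUnitAt(pSource, html.Length));
        }
        finally
        {
            Marshal.FreeHGlobal(pSource);
        }
    }
}
```

Because the characters are laid out contiguously and end with a null, a reference to the first ushort is enough for native code to read the whole string, which is what the cast in Parse relies on.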