If your project needs to process HTML and you are a .NET programmer, NSoup is strongly recommended; extracting data by slicing strings by hand is far too painful. NSoup is an open-source framework, a .NET port of JSoup, and its usage is basically the same.
To process a webpage, first fetch its HTML from a URL:
NSoup.Nodes.Document doc = NSoup.NSoupClient.Connect("http://blog.csdn.net/dingxiaowie2013").Get();
Or parse an HTML string that you already have:
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);
Unfortunately, NSoup decodes pages as UTF-8 by default. Pages that really are UTF-8 come out fine, but pages in another encoding such as GB2312 may produce garbled Chinese text.
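The mismatch can be reproduced without NSoup at all. The following is a minimal sketch, assuming a modern .NET runtime (Core 3.0 or later), where GB2312 only becomes available after registering the code pages provider; on .NET Framework that encoding is built in and the registration line may need to be dropped. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Decodes the same bytes twice: once as UTF-8 (the wrong charset here)
    // and once as GB2312 (the charset the bytes were written in).
    public static (string wrong, string right) DecodeBothWays(byte[] gb2312Bytes)
    {
        string wrong = Encoding.UTF8.GetString(gb2312Bytes);
        string right = Encoding.GetEncoding("GB2312").GetString(gb2312Bytes);
        return (wrong, right);
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // such as GB2312 must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        const string html = "<p>中文</p>";
        // Simulate a GB2312 page as it arrives over the network.
        byte[] pageBytes = Encoding.GetEncoding("GB2312").GetBytes(html);

        var (wrong, right) = DecodeBothWays(pageBytes);
        Console.WriteLine(wrong == html);  // False: the Chinese characters are lost
        Console.WriteLine(right == html);  // True: decoded with the matching charset
    }
}
```

The point of the sketch is that the bytes themselves are fine; only the charset chosen at decode time decides whether the Chinese text survives.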
Solutions for garbled characters when NSoup parses HTML
1. Download the page source and decode it yourself
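using System.Net;
using System.Text;

// Download the raw bytes of the page and decode them explicitly.
// Pass the page's actual encoding name here (for example "GB2312" for a GB2312 page).
WebClient webClient = new WebClient();
string htmlString = Encoding.GetEncoding("UTF-8").GetString(webClient.DownloadData("http://www.baidu.com"));
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(htmlString);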
2. Get the webpage stream
using System.Net;

// Obtain the webpage stream and tell NSoup which encoding to decode it with
WebRequest webRequest = WebRequest.Create("http://blog.csdn.net/dingxiaowei2013");
NSoup.Nodes.Document doc1 = NSoup.NSoupClient.Parse(webRequest.GetResponse().GetResponseStream(), "UTF-8");
Fetching the source of the Baidu page works the same way.
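Both solutions depend on knowing the page's encoding in advance. One possible way to avoid hard-coding it is to read the charset from the response's Content-Type header. The sketch below is an illustration under assumptions, not part of NSoup: `CharsetResolver` and `FromContentType` are hypothetical names, and the code relies only on `System.Net.Mime.ContentType` and `Encoding.GetEncoding` from the .NET base library.

```csharp
using System;
using System.Net.Mime;
using System.Text;

public static class CharsetResolver
{
    // Returns the encoding named in a Content-Type header value, or the
    // fallback when the header is missing, malformed, or names a charset
    // this runtime does not support.
    public static Encoding FromContentType(string contentType, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentType)) return fallback;
        try
        {
            string charset = new ContentType(contentType).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return fallback; }    // header could not be parsed
        catch (ArgumentException) { return fallback; }  // unknown charset name
    }

    public static void Main()
    {
        // Assumption: on .NET Core / .NET 5+, legacy code pages such as GB2312
        // must be registered first; .NET Framework has them built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding fromHeader = FromContentType("text/html; charset=gb2312", Encoding.UTF8);
        Encoding noCharset = FromContentType("text/html", Encoding.UTF8);
        Console.WriteLine(fromHeader.WebName);
        Console.WriteLine(noCharset.WebName);
    }
}
```

In a real request the header value would come from the `ContentType` property of the `WebResponse`, and the resolved encoding could then be used to decode the downloaded bytes as in solution 1. Many pages declare their charset only in a `<meta>` tag rather than in the header, which this sketch does not handle.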
==================== Published by Ding Xiaowei, CSDN blog column ====================
MyBlog: http://blog.csdn.net/dingxiaowei2013 MyQQ: 1213250243