C#: a small program to extract search results from Baidu

Source: Internet
Author: User

Baidu does not serve XHTML, so .NET's built-in XML classes are not easy to use on its pages.

(And who really likes the DOM anyway? It's so tiring to use!)

However, Baidu's pages are very irregular, so a lot of hard-coding is unavoidable.

As a result, this program makes many assumptions about Baidu's page design and will not adapt to future changes in the page structure.

Fortunately, a program this small is easy to write, so changing it later is no great burden.

In addition, this program relies heavily on regular expressions, which may make it inefficient if it is used to aggregate results from several search engines.
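
To make the regular-expression approach concrete, here is a minimal sketch of pulling a link and a title out of a result fragment. The pattern, the sample markup, and the names `ResultLink` and `RegexSketch` are assumptions made for illustration; they are not Baidu's real page structure and not the original program's patterns.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical type for this sketch only.
struct ResultLink
{
    public string Title, Link;
}

static class RegexSketch
{
    // Assumed shape: an anchor whose first attribute is a double-quoted href.
    static readonly Regex Anchor = new Regex(
        "<a\\s+href=\"(?<link>[^\"]+)\"[^>]*>(?<title>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Strips any inline tags (for example <em>) left inside the title.
    static readonly Regex Tag = new Regex("<[^>]+>");

    public static List<ResultLink> Extract(string html)
    {
        var results = new List<ResultLink>();
        foreach (Match m in Anchor.Matches(html))
        {
            results.Add(new ResultLink
            {
                Link = m.Groups["link"].Value,
                Title = Tag.Replace(m.Groups["title"].Value, "")
            });
        }
        return results;
    }

    static void Main()
    {
        string html = "<h3><a href=\"http://example.com/a\">First <em>hit</em></a></h3>";
        foreach (ResultLink r in Extract(html))
        {
            Console.WriteLine(r.Title + " -> " + r.Link);
        }
    }
}
```

Each hard-coded pattern like this is one more assumption about the page, which is why a layout change on the search engine's side would break the extraction.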

If you need to display results from several search engines on one page, I suggest using iframe tags, or having the back end send the fetched pages to the front end via AJAX and building the page there with JavaScript.

Note that the program uses the URL-encoding support in the FCL (HttpUtility), so you must add a reference to the System.Web assembly.
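
As a rough illustration of that URL-encoding step, the sketch below builds a search URL with the keyword encoded in codepage 936. The class and method names are hypothetical. The `Encoding.RegisterProvider` line assumes a modern .NET runtime (.NET Core / .NET 5+), where codepage 936 is not registered by default; on .NET Framework, which the original program targets, that line should be removed.

```csharp
using System;
using System.Text;
using System.Web;

static class QuerySketch
{
    // Hypothetical helper: builds the search URL for a keyword.
    public static string BuildSearchUrl(string keyword)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding(936) succeeds.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gbk = Encoding.GetEncoding(936);

        // Percent-encode the keyword's codepage 936 bytes, not its UTF-8 bytes.
        return "http://www.baidu.com/s?wd=" + HttpUtility.UrlEncode(keyword, gbk);
    }

    static void Main()
    {
        // In codepage 936 the two characters below are the bytes D6 D0 CE C4.
        Console.WriteLine(BuildSearchUrl("中文"));
        // For comparison: the same keyword percent-encoded as UTF-8.
        Console.WriteLine(HttpUtility.UrlEncode("中文", Encoding.UTF8));
    }
}
```

The two printed strings differ, which shows why the encoding passed to `UrlEncode` matters for a server that expects codepage 936.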

Code -- Baidu Robot
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;
using System.Net;
using System.IO;

namespace baiduRobotStrim
{
    struct BaiduEntry
    {
        public string title, brief, link;
    }

    class Program
    {
        static string GetHtml(string keyword)
        {
            string url = @"http://www.baidu.com/";
            string encodedKeyword = HttpUtility.UrlEncode(keyword, Encoding.GetEncoding(936));
            // Baidu expects the query string in codepage 936, so it really is focused on Chinese search...
            // Not that I mind; I like Microsoft too.
            // Google correctly recognizes both UTF-8 and codepage-encoded queries, but its own pages declare UTF-8 in the HTTP header.
            // So presumably Google does not hate Microsoft (or Microsoft's proprietary specifications) either.
            string query = "s?wd=" + encodedKeyword;

            HttpWebRequest req;
            HttpWebResponse response;
            Stream stream;
            req = (HttpWebRequest)WebRequest.Create(url + query);
            response = (HttpWebResponse)req.GetResponse();
            stream = response.GetResponseStream();
            int count = 0
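
The listing breaks off partway through GetHtml, so the original read loop is not available. As a hedged sketch of the remaining step, the helper below decodes a response stream into a string. It uses `StreamReader` rather than a manual byte-count loop, which avoids splitting a multi-byte character across buffer boundaries. The names `ResponseSketch` and `ReadAll` are hypothetical, and decoding the response as codepage 936 is an assumption; the source only states that the query string uses it. The `Encoding.RegisterProvider` call assumes .NET Core / .NET 5+ and is not needed on .NET Framework.

```csharp
using System;
using System.IO;
using System.Text;

static class ResponseSketch
{
    // Hypothetical helper, not part of the original listing:
    // reads the whole stream and decodes it with the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Assumption: modern .NET, where codepage 936 must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gbk = Encoding.GetEncoding(936);

        // A MemoryStream stands in for response.GetResponseStream().
        byte[] bytes = gbk.GetBytes("百度搜索");
        using (var stream = new MemoryStream(bytes))
        {
            Console.WriteLine(ReadAll(stream, gbk));
        }
    }
}
```

In the program itself, the stream returned by `response.GetResponseStream()` would take the place of the `MemoryStream`.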