There are many open-source web crawlers, and SourceForge hosts plenty of them, but few are written in C#. Today we recommend two web crawlers developed in C#.
http://www.codeproject.com/KB/IP/Crawler.aspx was written by an overseas developer and does its HTTP communication over raw sockets. It works well, but it has no handling for Chinese: downloaded Chinese pages come out garbled, and the fix belongs in the part of the code that receives data from the socket. The program is relatively complete and has all the functions of a basic crawler, which makes it a good example. It was written for VS2003 and .NET 1.1, so some of the statements are outdated and need to be adjusted.
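Garbled Chinese of this kind usually means the response bytes were decoded with the wrong character set. The following is a minimal sketch of one way to choose the charset before decoding, written in Python to keep it short; it is not code from the CodeProject crawler, and the helper names `detect_charset` and `decode_page` are made up for illustration. It assumes the page names its charset either in the Content-Type header or in a `<meta>` tag near the top of the HTML.

```python
import re

# Matches "charset=gb2312", 'charset="utf-8"', and similar forms.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)


def detect_charset(content_type, body, default="utf-8"):
    """Return the charset named in the header, else in a <meta> tag, else default.

    Hypothetical helper for illustration only.
    """
    match = _CHARSET_RE.search(content_type or "")
    if match:
        return match.group(1).lower()
    # Scan only the start of the document. latin-1 maps every byte to a
    # character, so this temporary decode can never raise.
    head = body[:2048].decode("latin-1")
    match = _CHARSET_RE.search(head)
    if match:
        return match.group(1).lower()
    return default


def decode_page(content_type, body):
    """Decode raw response bytes to text, replacing undecodable bytes."""
    charset = detect_charset(content_type, body)
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # The page named a charset Python does not know: fall back to UTF-8.
        return body.decode("utf-8", errors="replace")
```

The same idea should carry over to the C# crawler: collect the raw bytes read from the socket first, then pick an encoding (in .NET, via `System.Text.Encoding.GetEncoding`) instead of decoding everything with one fixed encoding.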
http://www.jeffheaton.com/source also comes from an overseas author; the file to look for is csspider.zip. I have not studied it carefully. It is released under the LGPL license. The author specializes in crawler research and has written a lot of books, but they are in English, which I cannot read. It targets .NET 2.0.
Both projects are complete examples that cover webpage download, parsing, multithreading, and output. With a little follow-up work they can give good results, and their implementation ideas are also worth studying if you want to write a crawler of your own.
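To make those four stages concrete, here is a small sketch in Python of how download, parsing, multithreading, and output can fit together. It is not taken from either project; the names `crawl`, `extract_links`, and `http_fetch` are hypothetical, and the default downloader assumes UTF-8 pages. The downloader is passed in as a parameter so the crawl logic can be tried without touching the network.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Parsing stage: collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute link URLs found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href))[0] for href in parser.links]


def http_fetch(url):
    """Download stage: blocking GET, decoded as UTF-8 (an assumption)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=http_fetch, max_pages=50, workers=4):
    """Crawl in FIFO order; return {url: html} for pages that downloaded."""
    seen = {start_url}
    frontier = [start_url]
    pages = {}  # output stage: collected results

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception:
            return None  # treat any download error as "skip this page"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            room = max_pages - len(pages)
            batch, frontier = frontier[:room], frontier[room:]
            # Multithreading stage: download the whole batch in parallel.
            for url, html in zip(batch, pool.map(safe_fetch, batch)):
                if html is None:
                    continue
                pages[url] = html
                for link in extract_links(url, html):
                    if link.startswith("http") and link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages
```

Calling `crawl("http://example.com/", max_pages=10)` would use the real downloader; passing a function that reads from a dictionary as `fetch` lets you test the link handling offline. A real crawler would also need politeness delays and robots.txt handling, which this sketch leaves out.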