The first big mistake was failing to release unmanaged resources in time, which caused the program to throw an OutOfMemoryException after running for a long time.
In this small demo, the main unmanaged resources are the HttpWebResponse and Stream of each HTTP request; the other is the RedisClient. The problem arose not because I didn't know that unmanaged resources must be released, but because the code was careless. I had probably had this coding habit for a long time; my earlier programs never ran long enough to expose it.
At first, I wrote it like this:
using (var reader = new StreamReader(stream, Encoding.UTF8))
{
    return reader.ReadToEnd();
}
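My reading at the time was that, because the return executes inside the using statement, the method returned without the object being released. Note that a using block does call Dispose when control leaves it through a return, so a leak of this kind is more likely to come from a disposable that has no using around it at all.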
Improved:
string source = string.Empty;
using (Stream stream = response.GetResponseStream()) // raw response stream
{
    using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
    {
        source = reader.ReadToEnd();
    }
}
...
return source;
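Both versions above rely on the same guarantee: a using block calls Dispose when control leaves it, whether normally, through a return, or through an exception, just as a try/finally would. The following is a minimal sketch of that behavior, not code from the crawler. `DisposeDemo`, `TrackedStream`, `ReadWithUsing`, and `ReadWithFinally` are hypothetical names introduced only for illustration, and an in-memory stream stands in for the HTTP response stream.

```csharp
using System;
using System.IO;
using System.Text;

public static class DisposeDemo
{
    // Hypothetical helper: an in-memory stream that records when it is disposed.
    public sealed class TrackedStream : MemoryStream
    {
        public bool Disposed { get; private set; }

        public TrackedStream(byte[] data) : base(data) { }

        protected override void Dispose(bool disposing)
        {
            Disposed = true;
            base.Dispose(disposing);
        }
    }

    // Form 1: using statement. Dispose runs when the block exits, including via return.
    public static string ReadWithUsing(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Form 2: explicit try/finally with a direct call to Dispose.
    public static string ReadWithFinally(Stream stream)
    {
        StreamReader reader = null;
        try
        {
            reader = new StreamReader(stream, Encoding.UTF8);
            return reader.ReadToEnd();
        }
        finally
        {
            // Runs even though the try block has already executed its return.
            if (reader != null)
            {
                reader.Dispose();
            }
        }
    }
}
```

With its default settings, disposing a StreamReader also closes the stream it wraps, so after either call the `Disposed` flag on a `TrackedStream` passed in should be true.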
Two recommended ways to handle unmanaged resources:
- The using statement: the resource is released as soon as execution leaves the using block.
- try{}catch{} with finally: do not forget the finally block, and call the Dispose method explicitly inside it.
In fact, both compile to the same result.
The second big mistake: treating asynchronous code as equivalent to multithreading
Async is not the same as multithreading.
Use async for I/O-intensive work and multithreading for CPU-intensive work. In a crawler, fetching network resources is I/O intensive, while parsing HTML is CPU intensive. Because the crawler has to fetch a resource before it can parse it, I did not use async.
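The I/O-bound versus CPU-bound split described above can be sketched as follows. This is not the approach the crawler in this post takes; it is only a minimal illustration of the distinction. `AsyncVsThreads`, `FetchAsync`, `ParseTitle`, and `CrawlAsync` are hypothetical names, and `FetchAsync` simulates network latency with `Task.Delay` instead of making a real request.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncVsThreads
{
    // Hypothetical stand-in for a download. Task.Delay simulates I/O latency
    // without blocking a thread while waiting.
    public static async Task<string> FetchAsync(string url)
    {
        await Task.Delay(100);
        return "<html><title>" + url + "</title></html>";
    }

    // Hypothetical CPU-bound step: extracts the text between the <title> tags.
    public static string ParseTitle(string html)
    {
        int start = html.IndexOf("<title>", StringComparison.Ordinal) + "<title>".Length;
        int end = html.IndexOf("</title>", StringComparison.Ordinal);
        return html.Substring(start, end - start);
    }

    public static async Task<string[]> CrawlAsync(string[] urls)
    {
        // I/O-bound part: start every download, then await them all together.
        string[] pages = await Task.WhenAll(urls.Select(u => FetchAsync(u)));

        // CPU-bound part: parse in parallel on thread-pool threads, keeping input order.
        return pages.AsParallel().AsOrdered().Select(p => ParseTitle(p)).ToArray();
    }
}
```

The waiting in `FetchAsync` costs no thread, which is why async suits the download step, while the parsing step only gets faster when it is given more CPU cores to run on.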
The first pitfall I ran into was HttpWebRequest's default connection limit: with the default in place, the speed stays the same no matter how many threads are opened.
The fix is to add the following to app.config:
<system.net>
  <connectionManagement>
    <add address="*" maxconnection="100000"></add>
  </connectionManagement>
</system.net>
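The same limit can also be raised from code through `ServicePointManager.DefaultConnectionLimit`, which on the .NET Framework defaults to 2 connections per host for client applications. The sketch below is an alternative to the configuration above, not something the original crawler does; `ConnectionLimitSetup` and `Raise` are hypothetical names.

```csharp
using System;
using System.Net;

public static class ConnectionLimitSetup
{
    // Sets the default per-host connection limit used by HttpWebRequest
    // and returns the value now in effect.
    public static int Raise(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }
}
```

Call it once at startup, before the first request is issued, for example `ConnectionLimitSetup.Raise(100000)` to match the value in the configuration. The setting applies only to ServicePoint objects created after the change.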
Defects:
Many of the tables in SQL Server are in one-to-many relationships, and because the data was not normalized well enough, those one-to-many relationships lead to redundant data.
Data display
The charts are drawn with ECharts. For more about ECharts, see my blog post: http://www.cnblogs.com/zuin/p/6122818.html
Male/female ratio
Distribution of follow counts
Top 10 schools by number of alumni
Top 10 companies by number of employees
Top 10 majors
Top 10 by number of likes
Summary of crawling information on millions of users