The TWebBrowser approach is not discussed here. Using Indy's TIdHTTP Get/Post methods directly makes it easy to fetch web data.
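For reference, a minimal TIdHTTP fetch looks roughly like this (Indy 10 assumed; `FetchPage` is an illustrative name, not from the original post):

```delphi
uses
  IdHTTP, SysUtils;

function FetchPage(const AUrl: string): string;
var
  Http: TIdHTTP;
begin
  Http := TIdHTTP.Create(nil);
  try
    // A realistic User-Agent avoids the most trivial bot blocking.
    Http.Request.UserAgent := 'Mozilla/5.0';
    Http.ConnectTimeout := 5000;   // milliseconds
    Http.ReadTimeout := 10000;
    Result := Http.Get(AUrl);      // returns the response body as a string
  finally
    Http.Free;                     // always free, even if Get raises
  end;
end;
```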
But it is not so easy to crawl large volumes of data without the program crashing. Having built many similar tools over the years, I have summarized a few lessons here, since a good memory is no match for written notes.
- Memory leaks. To extract page text from HTML, I estimate most Delphi programmers use the Mshtml HtmlToText approach. With large data volumes this scheme leaks memory until the program crashes, and not every programmer knows it. Solution: use a dedicated HTML parsing class instead. Here I would like to thank a CSDN user surnamed Wu; his class is very polished, has no memory leaks, and there is no page it cannot parse.
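For illustration only, the core idea of a leak-free text extractor is a plain character scan with no COM objects involved. This sketch is not the CSDN class credited above; a real parser must also handle comments, scripts, and HTML entities:

```delphi
uses SysUtils;

// Minimal tag stripper: keeps only text outside angle brackets.
// No Mshtml/COM involved, so nothing is left behind to leak.
function StripTags(const Html: string): string;
var
  I: Integer;
  InTag: Boolean;
begin
  Result := '';
  InTag := False;
  for I := 1 to Length(Html) do
    case Html[I] of
      '<': InTag := True;
      '>': InTag := False;
    else
      if not InTag then
        Result := Result + Html[I];
    end;
end;
```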
- Out of memory. Downloaded data is usually staged in memory with a TStrings, but once the volume reaches the millions the program eats all available memory and raises an out-of-memory error. The solution is simple: periodically flush the buffer to a file.
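A sketch of the periodic-flush idea, with an illustrative threshold (the routine name and the value of `FlushThreshold` are my own, not from the post):

```delphi
uses Classes, SysUtils;

const
  FlushThreshold = 10000;  // illustrative value; tune to your data size

procedure AppendAndMaybeFlush(Buffer: TStringList;
  const Line, FileName: string);
var
  F: TextFile;
  I: Integer;
begin
  Buffer.Add(Line);
  if Buffer.Count >= FlushThreshold then
  begin
    AssignFile(F, FileName);
    if FileExists(FileName) then Append(F) else Rewrite(F);
    try
      for I := 0 to Buffer.Count - 1 do
        WriteLn(F, Buffer[I]);
    finally
      CloseFile(F);
    end;
    Buffer.Clear;  // release the memory instead of growing forever
  end;
end;
```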
- Thread pool. We all want downloads to go as fast as possible, so multithreading is the natural choice. Here again I recommend using a thread pool instead of frequently creating and destroying threads.
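In modern Delphi (XE7 and later) the System.Threading unit provides a shared thread pool via TTask, so worker threads are reused automatically. A hedged sketch (the `MakeJob` helper exists only to give each closure its own copy of the URL, a well-known capture pitfall):

```delphi
uses System.Threading, System.SysUtils;

// Returns a closure bound to one specific URL.
function MakeJob(const AUrl: string): TProc;
begin
  Result :=
    procedure
    begin
      // Download and process AUrl here, e.g. with the TIdHTTP
      // routine shown earlier.
    end;
end;

procedure CrawlAll(const Urls: TArray<string>);
var
  Tasks: array of ITask;
  I: Integer;
begin
  SetLength(Tasks, Length(Urls));
  for I := 0 to High(Urls) do
    Tasks[I] := TTask.Run(MakeJob(Urls[I]));  // pool reuses threads
  TTask.WaitForAll(Tasks);  // block until every download finishes
end;
```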
- Exception handling. Crawled pages contain all kinds of strange and malformed data, so we must filter it and write robust code to ensure the program does not crash.
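The key pattern is to wrap each page's download-and-parse step in its own try/except, so one bad page is logged and skipped rather than killing the whole crawl. A minimal sketch (routine names are illustrative):

```delphi
uses SysUtils;

procedure CrawlOne(const AUrl: string);
begin
  try
    // Download and parse; both steps can raise on malformed data.
  except
    on E: Exception do
      // Log and move on; one bad page must not stop the crawl.
      WriteLn(Format('skip %s: %s', [AUrl, E.Message]));
  end;
end;
```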