A question about array loops
The full code is a bit too long to post, but I hope someone can give me an idea. Thanks in advance.
$_array_article = array("http://blog.csdn.net/anewczs/article/details/6617391");
// $_array_article[] = "http://blog.csdn.net/tianlesoftware/article/details/6723117";
foreach ($_array_article as $value) {
    $spider->begin_url = $value;
    file_get_contents($spider->begin_url);
    _spider($spider->fetch_turl($spider->begin_url));
}
This is only part of the code. Each link in the array is processed in turn, but as soon as the array holds more than one element an error occurs. My feeling is that after the first iteration finishes, some values left in memory affect the second iteration and cause the error. How can I append new elements to the two global arrays I need while clearing all the other values from memory?
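One way to stop state from leaking between iterations is to create a fresh crawler object for every URL instead of reusing one. The sketch below assumes this; the `Spider` class here is only a stub standing in for the real class, whose definition was not posted, and `fetch_turl` is stubbed to return the URL it was given.

```php
<?php
// Stub standing in for the poster's real Spider class. Its per-object
// state ($visited here) is exactly the kind of leftover data that can
// break a second loop iteration when one object is reused.
class Spider {
    public $begin_url = '';
    public $visited = array();

    public function fetch_turl($url) {
        $this->visited[] = $url;   // stub: real code would extract links
        return $url;
    }
}

$_array_article = array(
    "http://blog.csdn.net/anewczs/article/details/6617391",
    "http://blog.csdn.net/tianlesoftware/article/details/6723117",
);

$results = array();
foreach ($_array_article as $value) {
    $spider = new Spider();        // fresh object: no state from last round
    $spider->begin_url = $value;
    $results[] = $spider->fetch_turl($spider->begin_url);
    unset($spider);                // explicitly release it (optional)
}
```

With a new object per iteration, the two arrays you want to keep (`$_array_article` and your results) live outside the loop, while everything the spider accumulated is discarded each time around.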
------ Solution --------------------
Done that way, it is easy to get stuck in an endless loop. Crawling is generally done like this:
#1. Create a file to store the URLs
#2. Append each newly captured URL to the file
#3. Read a URL from the file, crawl one row of data, then repeat #2 and #3
A few small problems remain, such as how to avoid crawling the same link twice and how to restrict the crawl to one domain name, and so on. I believe you can solve these yourself.
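The steps above, plus the deduplication and domain restriction just mentioned, can be sketched like this. Everything here is illustrative: `extract_links()` is a hypothetical stand-in for real link extraction (hard-coded so the sketch runs without network access), and the two blog URLs deliberately link back to each other to show that the `$seen` check prevents an endless loop.

```php
<?php
// Hypothetical stand-in for real link extraction: in real code you would
// fetch the page with file_get_contents() and parse out its links.
// Hard-coded here so the sketch runs offline. Note the two pages link to
// each other, which would loop forever without the $seen check.
function extract_links($url) {
    $fake = array(
        "http://blog.csdn.net/anewczs/article/details/6617391" =>
            array("http://blog.csdn.net/tianlesoftware/article/details/6723117"),
        "http://blog.csdn.net/tianlesoftware/article/details/6723117" =>
            array("http://blog.csdn.net/anewczs/article/details/6617391"),
    );
    return isset($fake[$url]) ? $fake[$url] : array();
}

$queue_file = sys_get_temp_dir() . '/queue.txt';          // step 1: URL store
$allowed    = 'blog.csdn.net';                            // stay on one domain

$queue = array("http://blog.csdn.net/anewczs/article/details/6617391");
file_put_contents($queue_file, $queue[0] . "\n");

$seen = array();                                          // dedup set
for ($i = 0; $i < count($queue); $i++) {                  // count() re-checked
    $url = $queue[$i];                                    //   as queue grows
    if (isset($seen[$url])) continue;                     // already crawled
    if (parse_url($url, PHP_URL_HOST) !== $allowed) continue;
    $seen[$url] = true;

    // step 2: append newly found URLs to the file (and the in-memory queue),
    // step 3: the for loop picks them up and repeats
    foreach (extract_links($url) as $link) {
        $queue[] = $link;
        file_put_contents($queue_file, $link . "\n", FILE_APPEND);
    }
}
```

After the loop, `$seen` holds each URL exactly once even though the two pages reference each other, which is the point of the dedup check.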