The previous article, on a simple PHP-based program that collects data and stores it in a database, covered collecting the list data from the news page. This article covers collecting the content of each news item.
With the list data from the news page already collected, the next step is to read each URL to be collected from the database and fetch that page.
Create a content table
Note, however, that you cannot find the next URL simply by incrementing the id, because the ids in the data table may have gaps. For example, if the table contains id = 9 and id = 11, a request for id = 10 returns no row, so the URL is empty and empty fields may be collected.
The technique used here is a database query. After collecting one row, we check whether the database contains an id greater than the current one. If it does, we read the first such row and collect it next, which guarantees that the row being queried actually exists.
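The "next id" lookup described above can be sketched on its own. This is a minimal sketch, not the article's code: the helper name nextListId is hypothetical, it uses PDO rather than the mysql_* functions, and the in-memory SQLite database and example.com URLs are assumptions made only so the snippet runs without a MySQL server. The column names title and url are also assumed.

```php
<?php
// Hypothetical helper: returns the smallest id in `list` that is greater
// than $currentId, or null when there is no further row to collect.
function nextListId(PDO $pdo, int $currentId): ?int
{
    $stmt = $pdo->prepare(
        'SELECT id FROM list WHERE id > :id ORDER BY id ASC LIMIT 1'
    );
    $stmt->execute([':id' => $currentId]);
    $next = $stmt->fetchColumn();   // false when no row matched
    return $next === false ? null : (int) $next;
}

// Demo data with a gap between 9 and 11 (in-memory SQLite, assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE list (id INTEGER PRIMARY KEY, title TEXT, url TEXT)');
$pdo->exec(
    "INSERT INTO list (id, title, url) VALUES
     (9, 'a', 'http://example.com/a'),
     (11, 'b', 'http://example.com/b')"
);

var_dump(nextListId($pdo, 9));   // int(11), the gap at id = 10 is skipped
var_dump(nextListId($pdo, 11));  // NULL, nothing left to collect
```

Because the query asks for the first id above the current one instead of adding 1, a missing id never produces an empty row.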
The code for content.php is as follows:
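<?php
// Note: the mysql_* functions below exist only in PHP 5.x; they were removed in PHP 7.0.
include_once("conn.php");

$id = (int) $_GET['id'];

// Look up the row for this id to obtain the corresponding url address
$sql = "select * from list where id = $id";
$result = mysql_query($sql);
$row = mysql_fetch_array($result);

if ($row) {
    $html = file_get_contents($row['url']);

    // The opening tag of this pattern was lost from the original listing.
    // A <dd> element is assumed here to match the closing </dd>; adjust it to the target page.
    $pattern = "/<dd[^>]*>(.*)<\/dd>/iUs";

    // Obtain the information to store
    if ($html !== false && preg_match($pattern, $html, $info)) {
        $title = $row[1];
        $content = $info[0];
        echo $title . "<br>";
        echo $content . "<br>";

        // Insert into the database, escaping both values first
        $add = "insert into content (title, content) values ('"
            . mysql_real_escape_string($title) . "', '"
            . mysql_real_escape_string($content) . "')";
        mysql_query($add);
    }
}

// Find the next existing id, skipping any gaps in the list table
$sql2 = "select * from list where id > $id order by id asc limit 1";
$result2 = mysql_query($sql2);
$row2 = mysql_fetch_array($result2);

// If a next row exists, send the browser on to collect it
if ($row2 && $row2['id']) {
    echo "<script>window.location='content.php?id=$row2[0]'</script>";
}
?>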
In this way, the news content we want is collected into the database. All that remains is to tidy up the formatting of the collected data.
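The extraction step in content.php can be tried in isolation to see what the iUs modifiers do. This is only a sketch: the sample markup is made up, and the opening dd tag in the pattern is an assumption, since only the closing tag of the original pattern is known.

```php
<?php
// Made-up markup standing in for a downloaded news page (assumption).
$html = "<dl><dt>Title</dt><dd class=\"body\">First\nparagraph</dd><dd>Second</dd></dl>";

// i: case-insensitive match
// s: "." also matches newlines, so multi-line content is captured
// U: quantifiers are ungreedy, so the match stops at the first </dd>
$pattern = '/<dd[^>]*>(.*)<\/dd>/iUs';

$withTags = null;
$inner = null;
if (preg_match($pattern, $html, $info)) {
    $withTags = $info[0]; // whole match, including the <dd> tags
    $inner = $info[1];    // captured text between the tags
}

var_dump($withTags);
var_dump($inner);
```

The article's code stores $info[0], which keeps the surrounding tags. Storing $info[1] instead would keep only the inner text, which may reduce the cleanup needed afterwards.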