A few days ago a friend asked me to help him develop a program for collecting news information. I took some time to write a PHP version and am recording it here for future reference.
Collection boils down to: fetch the information remotely -> extract the required content -> classify and store it -> read it back -> display it.
It is really just an enhanced version of the simple "thief program".
The following is the core code (please don't use it for anything bad ^_^).
The content to be collected is the announcement list on a game website.
Work out the basic information first, then collect it into the database:
/iUs"; // regular expression
preg_match_all($pattern, $conn, $arr); // match the content into the $arr array
// print_r($arr); die;
foreach ($arr[1] as $key => $value) { // the keys of $arr[2] match those of $arr[1], so index it with $key
    $url = "http://www.93moli.com/" . $arr[2][$key];
    $sql = "insert into list (title, url) value ('$value', '$url')";
    mysql_query($sql);
    // echo "<a href='$url'>$value</a>" . "<br/>";
}
$id++;
echo "Collecting URL data list $id... please wait...";
echo "<script>window.location='list.php?id=$id'</script>";
} else {
    echo "Data collection ends.";
}
?>
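Since the start of the pattern is cut off above, here is a minimal sketch of the matching step on its own. The sample HTML, the pattern, and the function name `extract_links` are assumptions made for illustration, not the markup of the real site. In this sketch the first capture group is the path and the second is the title, which may not be the order the original pattern used.

```php
<?php
// Hypothetical helper: pull (title, url) pairs out of a chunk of HTML.
// The pattern is an assumption; adapt it to the page being collected.
function extract_links(string $html, string $base): array
{
    // i = case-insensitive, U = ungreedy, s = dot also matches newlines
    $pattern = '/<a href="(.*)">(.*)<\/a>/iUs';
    preg_match_all($pattern, $html, $arr);

    $links = [];
    // $arr[1] (paths) and $arr[2] (titles) share the same keys
    foreach ($arr[1] as $key => $path) {
        $links[] = [
            'title' => trim(strip_tags($arr[2][$key])),
            'url'   => $base . $path,
        ];
    }
    return $links;
}

// Made-up sample markup, only to show the shape of the result
$sample = '<ul><li><a href="news/1.html">Server maintenance</a></li>'
        . '<li><a href="news/2.html">New event</a></li></ul>';
print_r(extract_links($sample, 'http://www.93moli.com/'));
```

Uncommenting a `print_r($arr)` while tuning the pattern, as the original code does, is a quick way to check which capture group holds what.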
conn.php is the database connection file.
list.php is the current page.
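One caveat about the storage step: `mysql_query()` belongs to the old mysql extension, which was removed in PHP 7.0, and building the SQL by string interpolation breaks as soon as a title contains a quote. As a rough alternative, the insert could be done with PDO and a prepared statement. The function name `save_links` and the in-memory SQLite database used for the demonstration are assumptions; the real conn.php presumably connects to MySQL, and only the `list (title, url)` table comes from the code above.

```php
<?php
// Hypothetical helper: store collected rows with a prepared statement.
function save_links(PDO $db, array $links): int
{
    $stmt = $db->prepare('INSERT INTO list (title, url) VALUES (:title, :url)');
    $saved = 0;
    foreach ($links as $link) {
        // The driver handles quoting of bound values, so a quote in a title is safe
        $stmt->execute([':title' => $link['title'], ':url' => $link['url']]);
        $saved++;
    }
    return $saved;
}

// Demo with an in-memory SQLite database (assumes the pdo_sqlite driver is available).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE list (id INTEGER PRIMARY KEY, title TEXT, url TEXT)');

echo save_links($db, [
    ['title' => "Player's guide", 'url' => 'http://www.93moli.com/news/1.html'],
]), " row(s) saved\n";
```

For MySQL only the connection line would change, for example to a `mysql:host=...;dbname=...` DSN with the credentials kept in conn.php.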
Because the data to be collected is displayed page by page and the page address increases in a regular way, I used a JavaScript redirect and passed the page number in the id parameter to control which page is collected next. This also avoids running one huge for loop in a single request.
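The page-by-page control can be sketched as follows. The function name `next_step`, the `$lastPage` limit and its value are assumptions, since the condition that ends the collection is not visible in the code above; the redirect to `list.php?id=...` and the two messages follow the original.

```php
<?php
// Hypothetical helper: decide what list.php prints after handling page $id.
// $lastPage is an assumed limit; the original stopping condition is not shown.
function next_step(int $id, int $lastPage): string
{
    if ($id < $lastPage) {
        $next = $id + 1;
        // Let the browser request the next page, so each request handles one page only
        return "Collecting URL data list $next... please wait..."
             . "<script>window.location='list.php?id=$next'</script>";
    }
    return 'Data collection ends.';
}

// In list.php the current page number would come from the query string.
$id = isset($_GET['id']) ? (int) $_GET['id'] : 1;
echo next_step($id, 5), "\n";
```

Since every request processes a single page, no one request has to run long enough to hit PHP's execution time limit, which is the practical gain over looping through all pages at once.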
That is all it takes to get the list data into the database. In the next article I will write up the process of collecting the content behind each specific URL.