Collecting news from Sina's "Sina Technology" channel for a custom date (prototype)
The code itself is not very difficult. For me the main obstacle is that I don't use regular expressions much. Or rather, I should say I don't really know them, haha. I also don't know exactly how the array of results from preg_match(_all) is formed. What is its structure? I'll look into it when I have time. If it is a two-dimensional array, I don't know where the outer array of the two dimensions comes from. For now I can at least collect news, and I can collect it from any given day.
The code is as follows:
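<?php
$date = '2017-10-06'; // format: YYYY-MM-DD
// Fetch the index page that lists that day's news.
$msg = file_get_contents('http://tech.sina.com.cn/tele/oldnews/' . $date . '.shtml');
if ($msg === false) {
    exit('Could not fetch the list page.');
}
preg_match_all('/<a href=(.*?) target=_blank>(.*?)<\/a><font style="font-size:11px" color=#6666cc>(.*?)<\/font>/', $msg, $arr);
// $arr[0] holds the full matches. De-duplicate that list, not $arr itself:
// array_unique() on the outer array would compare the inner arrays as strings.
$links = array_values(array_unique($arr[0]));
$count = count($links);
// echo $count;
for ($i = 0; $i < $count; $i++) {
    echo $links[$i] . '<br/>';
    // Group 2 captures the article id from /t.../<date>/<id>.shtml.
    if (!preg_match('/<a href=\/t(.*?)\/' . preg_quote($date, '/') . '\/(.*?)\.shtml target=_blank>/', $links[$i], $url)) {
        continue; // not an article link for this date
    }
    $content = file_get_contents('http://tech.sina.com.cn/t/' . $date . '/' . $url[2] . '.shtml');
    // The article body sits between the publish_helper comment markers.
    // '原始正文' ("original body") must match the page byte for byte, including its encoding.
    if ($content !== false && preg_match("/<!--\s*publish_helper name='原始正文'(.*?)-->(.*?)<!--\s*publish_helper_end\s*-->/is", $content, $cont)) {
        echo $cont[0];
    }
}
?>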
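On the question of how preg_match_all arranges its results, here is a minimal sketch. The sample markup and the variable names ($html, $byGroup, $byMatch) are made up for illustration and are not taken from the Sina pages. With the default flag, PREG_PATTERN_ORDER, the outer array has one entry per capture group: index 0 holds every full match, index 1 every first group, and so on. With PREG_SET_ORDER the outer array has one entry per match instead.

```php
<?php
// Hypothetical sample markup, made up for illustration.
$html = '<a href=/t/1.shtml target=_blank>First</a> '
      . '<a href=/t/2.shtml target=_blank>Second</a>';
$pattern = '/<a href=(.*?) target=_blank>(.*?)<\/a>/';

// Default flag PREG_PATTERN_ORDER: one inner array per capture group.
// $byGroup[0] = all full matches, $byGroup[1] = all hrefs, $byGroup[2] = all titles.
$n = preg_match_all($pattern, $html, $byGroup);
print_r($byGroup);

// PREG_SET_ORDER: one inner array per match, which is often easier to loop over.
preg_match_all($pattern, $html, $byMatch, PREG_SET_ORDER);
foreach ($byMatch as $m) {
    // $m[0] is the full match, $m[1] the href, $m[2] the title.
    echo $m[1], ' => ', $m[2], "\n";
}
```

So the "first big array" is simply the list indexed by group number (or by match number under PREG_SET_ORDER), which is why the script above reads its links from index 0.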
My main idea is to first fetch the day's list page, match the article links in the response, and then fetch each article in turn. (A for loop is used for the display.)
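That idea can be sketched as two small functions that are easy to try without touching the network. The helper names extractArticleIds and articleUrl are hypothetical, and the list-page fragment is made up. The link pattern is a simplified version of the one in the script, assuming relative links of the form /t/<date>/<id>.shtml.

```php
<?php
// Hypothetical helper: pull the unique article ids for one date out of list-page HTML.
function extractArticleIds(string $listHtml, string $date): array
{
    $pattern = '/<a href=\/t\/' . preg_quote($date, '/') . '\/(.*?)\.shtml target=_blank>/';
    preg_match_all($pattern, $listHtml, $m);
    // $m[1] holds the captured ids; drop duplicates and reindex from 0.
    return array_values(array_unique($m[1]));
}

// Hypothetical helper: build the article URL the same way the script does.
function articleUrl(string $date, string $id): string
{
    return 'http://tech.sina.com.cn/t/' . $date . '/' . $id . '.shtml';
}

// Made-up list-page fragment, so the sketch runs without network access.
$sample = '<a href=/t/2017-10-06/001.shtml target=_blank>A</a>'
        . '<a href=/t/2017-10-06/002.shtml target=_blank>B</a>'
        . '<a href=/t/2017-10-06/001.shtml target=_blank>A</a>';

foreach (extractArticleIds($sample, '2017-10-06') as $id) {
    echo articleUrl('2017-10-06', $id), "\n";
}
```

In a real run, each printed URL would be passed to file_get_contents and the body extracted as in the script above.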
It is a simple implementation and still has many shortcomings. I hope you can give me some advice.