[2] news and news
Baidu API: Channel News API _ yiyuan
I am covering this API second because it ranks second on Baidu APIStore when free APIs are sorted by the comprehensive ranking, and also because my company once had to collect news and our own attempt did not go well. Let's take a look at how much news is available:
As of today there are 44 channels in total, and the first page of each channel returns 20 news records. Very powerful. The code is simple:
As mentioned in the previous article, we used file_get_contents() to retrieve the content and preg_match() to pull data out of it with regular expressions. I am still a beginner with regular expressions.
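To make the fetch-and-match approach concrete, here is a minimal sketch, not the code from the original project. The helper names fetch_page() and extract_links(), the sample markup, and the regular expression are all assumptions chosen for illustration; real list pages will usually need a pattern tuned to their own HTML.

```php
<?php
// Hypothetical helper: download a page, returning null on failure.
// file_get_contents() returns false when the request fails.
function fetch_page(string $url): ?string
{
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}

// Hypothetical helper: collect the URL and text of every anchor tag.
function extract_links(string $html): array
{
    $links = [];
    // (.*?) is non-greedy; /i ignores case, /s lets "." match newlines.
    $pattern = '/<a\s[^>]*href=["\']([^"\']+)["\'][^>]*>(.*?)<\/a>/is';
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $links[] = [
                'url'   => $m[1],
                'title' => trim(strip_tags($m[2])),
            ];
        }
    }
    return $links;
}

// Usage on an inline sample instead of a live page.
$sample = '<ul>'
        . '<li><a href="/n/1.html">First <b>story</b></a></li>'
        . '<li><a class="x" href="/n/2.html">Second story</a></li>'
        . '</ul>';
$links = extract_links($sample);
```

For a live page, the result of fetch_page() would be passed to extract_links() after checking it is not null.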
Let's talk about the news collection approach we used at the time: find a news list page (for example People's Daily Online > Beijing > Districts and Counties), extract all the links from it, and then collect each article's title, content, and images from those links. Database storage: a 'tag' table stores the extraction markers for each source, with fields for the name, the link address, the list start and end markers, the news title start and end markers, and the content start and end markers. For later collection runs you simply select a name and query its row. The markers must be updated whenever the target website is redesigned.
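The marker-driven idea can be sketched roughly as follows. This is only an illustration under assumptions: the column names in $rule, the helper names between() and parse_article(), and the sample markup are all hypothetical, and the original table layout may have differed.

```php
<?php
// Return the text between the first $start marker and the next $end marker,
// or null when either marker is missing. preg_quote() escapes the markers
// so they are matched literally.
function between(string $html, string $start, string $end): ?string
{
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// One hypothetical row of the 'tag' table (column names are illustrative).
$rule = [
    'name'          => 'example-site',
    'link'          => 'http://news.example.com/list.html',
    'list_start'    => '<ul class="list">',
    'list_end'      => '</ul>',
    'title_start'   => '<h1>',
    'title_end'     => '</h1>',
    'content_start' => '<div id="content">',
    'content_end'   => '</div>',
];

// Apply a rule to one article page.
function parse_article(string $html, array $rule): array
{
    return [
        'title'   => between($html, $rule['title_start'], $rule['title_end']),
        'content' => between($html, $rule['content_start'], $rule['content_end']),
    ];
}

// Usage on an inline sample page.
$page = '<html><h1>Sample headline</h1>'
      . '<div id="content"><p>Body text.</p></div></html>';
$article = parse_article($page, $rule);
```

Because the match stops at the first end marker, content that contains a nested closing tag identical to the end marker would be cut short, which is one reason the markers have to be chosen per site.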
Difficulties encountered: 1. Encoding: converting GB2312 to UTF-8. At first I used iconv(), and later switched to detecting the encoding with mb_detect_encoding(). 2. Regular expressions: extracting images and so on. Debug page for getting news
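One way the encoding step might look is sketched below. It is not the original code: the helper name to_utf8(), the candidate encoding list, and the GBK fallback are assumptions, and it requires the mbstring extension.

```php
<?php
// Hypothetical helper: return $raw as UTF-8, converting from a Chinese
// legacy encoding when needed.
function to_utf8(string $raw): string
{
    // Already valid UTF-8 (this includes plain ASCII): nothing to do.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    // Strict detection (third argument true) among the assumed candidates.
    $encoding = mb_detect_encoding($raw, ['GB2312', 'GBK'], true);
    if ($encoding === false) {
        $encoding = 'GBK'; // assumption: fall back to the wider GBK set
    }
    // The earlier approach with a fixed source encoding would instead be:
    // iconv('GB2312', 'UTF-8//IGNORE', $raw)
    return mb_convert_encoding($raw, 'UTF-8', $encoding);
}

// Usage: the two characters of "Chinese text" as GB2312 bytes (D6D0 CEC4).
$gb2312 = "\xD6\xD0\xCE\xC4";
$utf8   = to_utf8($gb2312);
```

Checking for valid UTF-8 first avoids asking the detector to choose between UTF-8 and GBK, since many UTF-8 byte sequences also happen to be valid GBK.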
The code is too messy and ugly to show here.