Demo of using the PHP data collection program to collect weather network data

Source: Internet
Author: User


Preface
When we write a Web program, we always want our website to look better and offer more functions. Sometimes writing a small tool or adding a small plug-in makes the site more complete: a perpetual calendar, for example, or the weather forecast function discussed here.

Of course, we cannot use professional satellites to receive data, so our weather data comes from an existing weather forecast website. Using the data published by that website, we can write a PHP crawler that dynamically collects the data we need. In addition, our program stays in sync with the target site: when the site updates its data, the program obtains the new data automatically.

The following describes how to write a simple PHP data collection program (a PHP crawler).

Principle
Given the URL of a webpage, use PHP to download the webpage and obtain the webpage content. Then, extract the data we are interested in using a regular expression and output the data.

In this example, the webpage we want to capture is http://www.weather.com.cn/weather/101050101.shtml. What we are interested in is the weather forecast for the next seven days.

Implementation
0. Download the weather forecast webpage from its URL:

The code is as follows:
$url = "http://www.weather.com.cn/weather/101050101.shtml";
$page_content = file_get_contents($url);

Here, the file_get_contents() function downloads the webpage pointed to by $url and returns its content as a string. The $page_content variable therefore contains all the HTML code of the webpage to be crawled. Next, we need to extract the data we want from it.
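file_get_contents() returns false when the download fails, so it can help to check the result before matching against it. The following is a minimal sketch of that check, not part of the original program: fetch_page() is a hypothetical helper name, and the 10-second timeout is an assumed value.

```php
<?php
// Hypothetical helper (not part of the original article): download a page
// and return null, rather than false, when the download fails.
function fetch_page(string $url, int $timeoutSeconds = 10): ?string
{
    // These "http" options take effect for http:// and https:// URLs.
    // The timeout value is an assumption chosen for illustration.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // "@" silences the warning PHP emits on failure; the return value
    // is checked explicitly instead.
    $content = @file_get_contents($url, false, $context);

    return $content === false ? null : $content;
}

// Example use (performs a real network request, so it is left commented out):
// $page_content = fetch_page($url);
// if ($page_content === null) {
//     exit("Could not download the forecast page.");
// }
```

Returning null makes the failure case explicit, so later steps never run a regular expression against a boolean.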

1. Use a regular expression to match the required string
Output the value of $page_content first, and then view the source code of the webpage. It contains these two comment lines:

<!--day 1-->
......
<!--day 7-->

The data we need lies between these two comments.

Use a regular expression to capture everything between <!--day 1--> and <!--day 7-->:

The code is as follows:

// eregi() was removed in PHP 7; preg_match() with the "i" (case-insensitive)
// and "s" (dot matches newline) modifiers performs the same match.
preg_match('/<!--day 1-->(.*)<!--day 7-->/is', $page_content, $res);
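To see the matching step in isolation, here is a small sketch that runs on a made-up HTML string instead of the live page. The helper name extract_between_days() and the sample markup are assumptions for illustration; the pattern also tolerates optional spaces inside the comment markers, which the real page may or may not need.

```php
<?php
// Hypothetical helper: return the text between two "day N" comment markers,
// or null when the markers are not found.
function extract_between_days(string $html, int $start, int $end): ?string
{
    // "i" = case-insensitive (as eregi() was), "s" = let "." match newlines.
    // "(.*?)" is non-greedy, so matching stops at the first closing marker.
    $pattern = '/<!--\s*day\s*' . $start . '\s*-->(.*?)<!--\s*day\s*' . $end . '\s*-->/is';

    if (preg_match($pattern, $html, $res) !== 1) {
        return null;
    }

    return $res[1]; // $res[0] is the whole match, $res[1] the captured group
}

// Made-up sample markup standing in for the real page.
$sample = "<div><!--day 1-->\n<p>Sunny</p>\n<!--day 2--><p>Cloudy</p><!--day 7--></div>";

$between = extract_between_days($sample, 1, 7);
echo $between === null ? "markers not found\n" : $between . "\n";
```

Checking that preg_match() returned 1 guards against the case where the site changes its markup and the markers disappear.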

2. Complete the image paths on the page
Because the image paths on the remote webpage are relative paths such as /m2/i/icon_weather/29x20/d01.gif, we need to complete them by prepending http://www.weather.com.cn.

The code is as follows:

$forecast = str_replace("

Now $forecast holds the weather forecast information we need, and this simple PHP crawler is complete.
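As an illustration of the path-completion step, the sketch below prefixes root-relative src attributes with the site address. It is one possible approach under stated assumptions, not the article's original call: absolutize_src() is a hypothetical helper, and it uses preg_replace_callback() rather than str_replace() so that already-absolute URLs are left alone.

```php
<?php
// Hypothetical helper: prefix root-relative src attributes ("/path/...")
// with the site root, leaving absolute and protocol-relative URLs untouched.
function absolutize_src(string $html, string $siteRoot): string
{
    $root = rtrim($siteRoot, '/');

    // Matches src="/ or src='/ ; the (?!/) lookahead skips "//host/..." URLs.
    $result = preg_replace_callback(
        '~(\bsrc\s*=\s*["\'])/(?!/)~i',
        function (array $m) use ($root): string {
            return $m[1] . $root . '/';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Made-up fragment using the relative path quoted in the article.
$fragment = '<img src="/m2/i/icon_weather/29x20/d01.gif">';
echo absolutize_src($fragment, 'http://www.weather.com.cn'), "\n";
```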

Source code
The following is the complete source code of the weather forecast capture program. Some code is added to measure the running time of each part of the program. You can set the values of $start and $end to control which days of the forecast are captured.

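The code is as follows:

<?php
$url = "http://www.weather.com.cn/weather/101050101.shtml";
$t1 = microtime(true); // seconds as a float; time() only counts whole seconds

$page_content = file_get_contents($url);
$t2 = microtime(true);

// $start and $end select the range of forecast days to capture.
$start = 1;
$end = 3;

if ($end > 7) {
    echo "Exceeds the forecast range. Please reset it!";
} else {
    echo "Weather forecast for Harbin for the next " . ($end - $start) . " days ("
        . date('Y-m-j') . " published)";

    // eregi() was removed in PHP 7; preg_match() with the "i" and "s"
    // modifiers performs the same case-insensitive, multi-line match.
    preg_match("/--day {$start}--(.*)--day {$end}--/is", $page_content, $res);

    $forecast = str_replace(""
    $t3 = microtime(true);

    echo $forecast;

    // Convert the elapsed seconds to milliseconds before printing.
    echo 'First step costs ' . round(($t2 - $t1) * 1000) . ' ms.';
    echo 'Last step costs ' . round(($t3 - $t2) * 1000) . ' ms.';
}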
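Measuring each step in milliseconds needs a finer clock than time(), which counts whole seconds. The sketch below shows one way to time a step with hrtime(), available since PHP 7.3. The timed() helper is hypothetical, and the usleep() call merely stands in for a slow step such as the page download.

```php
<?php
// Hypothetical helper: run a callable and return [result, elapsed milliseconds].
function timed(callable $step): array
{
    $begin = hrtime(true);                       // monotonic clock, in nanoseconds
    $result = $step();
    $elapsedMs = (hrtime(true) - $begin) / 1e6;  // nanoseconds to milliseconds

    return [$result, $elapsedMs];
}

// Stand-in for a slow step; a real crawler would download the page here.
[$value, $ms] = timed(function () {
    usleep(20000); // sleep for about 20 ms
    return 'done';
});

printf("Step returned '%s' in %.1f ms\n", $value, $ms);
```

Because hrtime() is monotonic, the measurement is not affected if the system clock is adjusted while the program runs.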
