Crawling Historical Weather Data with Python

Source: Internet
Author: User
Environment Description

Site crawled: the Tianqi weather site, http://lishi.tianqi.com/
Python version: 2.7
Operating system: Windows 7
Required Packages (install any that are missing)

Package Name | Description | Official Address
bs4 | A Python library that extracts data from HTML or XML files | https://pypi.python.org/pypi/beautifulsoup4
requests | An HTTP library built on urllib, released under the open-source Apache2 license | https://pypi.python.org/pypi/requests/
xlwt | Writes Excel files; this article stores the weather data in XLS format | https://pypi.python.org/pypi/xlwt
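
Before running the crawler, it can help to check which of these packages are actually importable. The following is a minimal sketch, not part of the original script; the helper name missing_packages and the REQUIRED list are assumptions made for illustration.

```python
import importlib


def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names of the three packages listed above.
REQUIRED = ["bs4", "requests", "xlwt"]

missing = missing_packages(REQUIRED)
if missing:
    print("Install these packages first: " + ", ".join(missing))
else:
    print("All required packages are available")
```

Any name it reports can then be installed from the official address given in the table.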
Code Description
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests
import xlwt
import os


# Get the weather data for one month
def getlistbyurl(url):
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")
    weathers = soup.select("#tool_site")
    # The second matched block holds the month title and the data rows
    title = weathers[1].select("h3")[0].text
    weatherinfors = weathers[1].select("ul")
    weatherlist = list()
    for weatherinfor in weatherinfors:
        # Each <ul> is one row; each <li> is one cell of that row
        singleweather = list()
        for li in weatherinfor.select('li'):
            singleweather.append(li.text)
        weatherlist.append(singleweather)
    print(title)
    return weatherlist, title


# @par: addressurl     URL of the data for one region
# @par: excelsavepath  path where the data is saved
def getlistbyaddress(addressurl, excelsavepath):
    # url = "http://lishi.tianqi.com/beijing/index.html"
    url = addressurl
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")
    # One link per month
    dates = soup.select(".tqtongji1 ul li a")
    workbook = xlwt.Workbook(encoding='utf-8')
    for d in dates:
        weatherlist, title = getlistbyurl(d["href"])
        # One sheet per month, named after the month title
        booksheet = workbook.add_sheet(title, cell_overwrite_ok=True)
        for i, row in enumerate(weatherlist):
            for j, col in enumerate(row):
                booksheet.write(i, j, col)
    workbook.save(excelsavepath)


if __name__ == "__main__":
    addressname = raw_input("Enter the city whose weather data you want:\n")
    addresses = BeautifulSoup(requests.get('http://lishi.tianqi.com/').text, "html.parser")
    queryaddress = addresses.find_all('a', text=addressname)
    if len(queryaddress):
        savepath = raw_input("City found. Enter the path to save the weather data (leave empty to save to c:/weather/" + addressname + ".xls):\n")
        if not savepath.strip():
            if not os.path.exists('c:/weather'):
                os.makedirs('c:/weather')
            savepath = "c:/weather/" + addressname + ".xls"
        for q in queryaddress:
            getlistbyaddress(q["href"], savepath)
        print("Weather data saved to: " + savepath)
    else:
        print("No data for this city")
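
To make the data flow easier to follow (each ul becomes one row, each li one cell, and each cell is written at its row and column index), here is a minimal sketch. It is not the article's code: it swaps BeautifulSoup for the standard-library HTMLParser so that it runs without extra packages, and SAMPLE is a made-up fragment that only imitates the h3 / ul / li layout the script's selectors rely on. RowParser, SAMPLE, rows, and cells are hypothetical names, and the values in SAMPLE are placeholders, not real weather data.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class RowParser(HTMLParser):
    """Collect the text of every <li>, grouped by its enclosing <ul>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.rows = []      # one inner list per <ul>
        self.in_li = False  # True while inside an <li> element

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.rows.append([])
        elif tag == "li" and self.rows:
            self.in_li = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        # Text nested inside an <li> (for example in a link) is kept too.
        if self.in_li:
            self.rows[-1][-1] += data.strip()


# Hypothetical markup with placeholder values.
SAMPLE = """
<div id="tool_site">
  <h3>Sample month</h3>
  <ul><li>Date</li><li>High</li><li>Low</li></ul>
  <ul><li><a href="#">2016-01-01</a></li><li>5</li><li>-3</li></ul>
</div>
"""

parser = RowParser()
parser.feed(SAMPLE)
rows = parser.rows

# Same double loop as the script: (row index, column index, value).
cells = [(i, j, col) for i, row in enumerate(rows) for j, col in enumerate(row)]
for cell in cells:
    print(cell)
```

In the script itself, each such triple corresponds to one booksheet.write(i, j, col) call.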

This code was written on Windows. To run it on Linux, add a #! line that points to the Python interpreter (for example, #!/usr/bin/env python) at the top of the file.
What the code does:

Input: the city name; if the city is not found, the program prints "No data for this city". Then the save path; if none is entered, the data is saved to "c:/weather/<city name>.xls" by default.

Output: the weather data for that city.

Code Demonstration

This program was developed in the PyCharm IDE and is run directly from PyCharm here; you can also run it from another editor. (Note: because of character-encoding problems, running it directly from a DOS prompt raises an error.)
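
The article does not say which error the DOS prompt produces. Assuming it is an encoding error raised when Chinese text such as a month title is printed to a console that uses a legacy code page, one possible workaround is to replace characters the console cannot encode before printing. The helper console_safe and the sample title below are hypothetical and not part of the original script.

```python
import sys


def console_safe(text, encoding=None):
    """Return `text` with characters the console cannot encode replaced by '?'."""
    # Use the console's reported encoding; fall back to ASCII if none is known.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "replace").decode(encoding)


# Hypothetical title: two Chinese characters followed by a month.
title = u"\u5317\u4eac 2016-01"

print(console_safe(title))           # uses the console's own encoding
print(console_safe(title, "ascii"))  # prints "?? 2016-01"
```

Calling print(console_safe(title)) instead of print(title) trades the exception for question marks in the console output; the text written to the XLS file is not affected.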

Import the getweatherbyquery.py file into PyCharm, as shown in the following illustration:

Follow the prompts: enter the city name. Once the city is found, the program asks for the save path.
Enter the save path here, or just press Enter to accept the default. In this example I enter "D:\weather\bj.xls". Make sure the directory already exists, because the program does not create the directory for a custom path.
The program then exports the data for each month.
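
The save-path step relies on the target directory already existing when a custom path is entered. A small helper could apply the default and create the missing directory in one place. This is a sketch under that assumption, not the author's code; resolve_save_path and its default_dir parameter are hypothetical names.

```python
import os


def resolve_save_path(user_input, city, default_dir="c:/weather"):
    """Return the .xls path to write to, creating its directory if needed."""
    path = user_input.strip()
    if not path:
        # Same default as the article: <default_dir>/<city>.xls
        path = default_dir + "/" + city + ".xls"
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    return path
```

With such a helper, the main block could call savepath = resolve_save_path(raw_input(...), addressname) and drop its own os.makedirs branch, so that a custom path such as "D:\weather\bj.xls" would work even when the directory is missing.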
After execution, the results are shown below:
The data in Excel is shown in the following illustration:
