Environment Description
Target site: the Tianqi historical weather site, http://lishi.tianqi.com/
Python version: 2.7
Operating system: Windows 7
Required packages (install any that are missing)
Package Name | Description | Official Address
bs4 | A Python library for extracting data from HTML or XML files | https://pypi.python.org/pypi/beautifulsoup4
requests | An HTTP library built on urllib3 and released under the Apache 2 open source license | https://pypi.python.org/pypi/requests/
xlwt | A library for writing Excel files; this article stores the scraped weather data in XLS format | https://pypi.python.org/pypi/xlwt
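Before running the script, it can help to confirm that the three packages are importable. The following is a minimal sketch, not part of the original script; the helper name missing_packages is hypothetical.

```python
import importlib

# Import names of the packages listed in the table above.
REQUIRED = ["bs4", "requests", "xlwt"]


def missing_packages(names):
    """Return the names from `names` that cannot be imported (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_packages(REQUIRED):
        print("Missing package: " + name)
```

Anything reported as missing can be installed with pip; note that the bs4 module is distributed on PyPI under the name beautifulsoup4.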
Code Description
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests
import xlwt
import os


# Get the weather data for one month
def getlistbyurl(url):
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")
    weathers = soup.select("#tool_site")
    # The second matching block holds the month title and the data rows
    title = weathers[1].select("h3")[0].text
    weatherinfors = weathers[1].select("ul")
    weatherlist = list()
    for weatherinfor in weatherinfors:
        # Each <ul> is one row; each <li> is one cell
        singleweather = list()
        for li in weatherinfor.select('li'):
            singleweather.append(li.text)
        weatherlist.append(singleweather)
    print(title)
    return weatherlist, title


# @par: addressurl     URL of the page for one region
# @par: excelsavepath  path where the data is saved
def getlistbyaddress(addressurl, excelsavepath):
    # url = "http://lishi.tianqi.com/beijing/index.html"
    url = addressurl
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")
    dates = soup.select(".tqtongji1 ul li a")
    workbook = xlwt.Workbook(encoding='utf-8')
    for d in dates:
        # One worksheet per month, named after the month title
        weatherlist, title = getlistbyurl(d["href"])
        booksheet = workbook.add_sheet(title, cell_overwrite_ok=True)
        for i, row in enumerate(weatherlist):
            for j, col in enumerate(row):
                booksheet.write(i, j, col)
    workbook.save(excelsavepath)


if __name__ == "__main__":
    addressname = raw_input("Enter the city whose weather you want to fetch:\n")
    addresses = BeautifulSoup(requests.get('http://lishi.tianqi.com/').text, "html.parser")
    queryaddress = addresses.find_all('a', text=addressname)
    if len(queryaddress):
        savepath = raw_input("City data found. Enter the path to save the weather data (if left empty, it is saved to c:/weather/" + addressname + ".xls):\n")
        if not savepath.strip():
            # Only the default directory is created automatically
            if not os.path.exists('c:/weather'):
                os.makedirs('c:/weather')
            savepath = "c:/weather/" + addressname + ".xls"
        for q in queryaddress:
            getlistbyaddress(q["href"], savepath)
        print("Weather data saved to: " + savepath)
    else:
        print("No data for this city")
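To see what the month-parsing step returns without touching the network, the same selectors can be run against a small inline HTML sample. This is only a sketch: the sample markup is an assumption modelled on the selectors used in the script (two elements with the id tool_site, the second holding an h3 title and one ul per row), not a copy of the real page, and the function name parse_month is hypothetical.

```python
from bs4 import BeautifulSoup

# Assumed sample markup: the real page layout may differ.
SAMPLE_HTML = u"""
<div id="tool_site"><h3>Navigation</h3></div>
<div id="tool_site">
  <h3>Beijing weather 2016-01</h3>
  <ul><li>Date</li><li>High</li><li>Low</li></ul>
  <ul><li>2016-01-01</li><li>5</li><li>-4</li></ul>
</div>
"""


def parse_month(html):
    """Return (rows, title), where each row is the list of <li> texts in one <ul>."""
    soup = BeautifulSoup(html, "html.parser")
    block = soup.select("#tool_site")[1]  # second match, as in the script
    title = block.select("h3")[0].text
    rows = [[li.text for li in ul.select("li")] for ul in block.select("ul")]
    return rows, title


rows, title = parse_month(SAMPLE_HTML)
print(title)
for row in rows:
    print(row)
```

Each inner list corresponds to one spreadsheet row, which is why the script can write the result cell by cell with two nested enumerate loops.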
This code was written on Windows. To run it on Linux, add a shebang line for the Python interpreter (for example, #!/usr/bin/env python) at the top of the file.
What the code does:
Input: the city name (if the city is not found, the program prints "No data for this city") and the save path (if left empty, the data is saved to "c:/weather/<city name>.xls" by default)
Output: the weather data for that city
Code Demonstration
This program was developed in the PyCharm IDE and is run directly from PyCharm here; you can also run it from another editor. (Note: because of character-encoding issues, running it directly from the DOS command prompt produces an error.)
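The article does not say what exactly goes wrong in the DOS prompt. One plausible cause, stated here as an assumption, is that the console passes the city name in its own encoding (for example GBK) while the page text is Unicode, so the name never matches. A minimal sketch of decoding console input before using it follows; the helper name to_text is hypothetical.

```python
import sys


def to_text(value, encoding=None):
    """Decode a byte string to text; text input is returned unchanged.

    Assumption: the bytes are in the console encoding unless `encoding` is given.
    """
    if isinstance(value, bytes):
        encoding = encoding or getattr(sys.stdin, "encoding", None) or "utf-8"
        return value.decode(encoding)
    return value


# Example: the bytes a GBK console would send for the city name "Beijing" in Chinese.
gbk_bytes = u"\u5317\u4eac".encode("gbk")
city = to_text(gbk_bytes, "gbk")
```

In Python 2.7, raw_input returns a byte string, so the value could be passed through such a helper before calling find_all. Whether this resolves the error described above has not been verified against the original setup.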
Import the getweatherbyquery.py file into PyCharm, as shown in the following illustration:
Follow the prompts: enter the city name; once the city is found, the program asks for the save path.
Enter the save path here (or just press Enter to accept the default). I entered "D:\weather\bj.xls" (make sure the directory already exists, because the program does not create a custom directory automatically).
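Since the script only creates the default c:/weather directory, a custom path fails when its folder is missing. The following sketch shows one way to resolve the path and create the parent directory in either case; the function name prepare_save_path and its default_dir parameter are hypothetical and not part of the original script.

```python
import os


def prepare_save_path(user_input, city_name, default_dir="c:/weather"):
    """Choose the output path and make sure its parent directory exists."""
    path = user_input.strip()
    if not path:
        # Empty input: fall back to <default_dir>/<city name>.xls, as the script does
        path = os.path.join(default_dir, city_name + ".xls")
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        os.makedirs(parent)
    return path
```

With a helper like this, the value typed at the prompt could be passed straight through before calling getlistbyaddress, and the directory check in the main block would no longer need to be special-cased for the default path.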
The program then exports the data month by month, printing each month's title as it goes.
After execution, the results are shown below:
The data in Excel is shown in the following illustration: