"Hadoop authoritative Guide" weather data can be downloaded in the FTP://FTP3.NCDC.NOAA.GOV/PUB/DATA/NOAA, the Internet to see this data good happy, open ftp found a problem, ah ah, so many documents AH, I went to point to save as, Have to point to when Ah, Thunder should have bulk download, but I did not find, it is estimated that my browser to the Thunderbolt banned, simply use Python to write a realization download well, the internet early, found very simple AH
The code is as follows:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
from ftplib import FTP

def ftpconnect():
    ftp_server = 'ftp3.ncdc.noaa.gov'
    username = ''
    password = ''
    ftp = FTP()
    ftp.set_debuglevel(2)  # debug level 2: display more information
    ftp.connect(ftp_server, 21)  # connect
    ftp.login(username, password)  # log in; empty strings mean anonymous login
    return ftp

def downloadfile():
    ftp = ftpconnect()
    # print(ftp.getwelcome())  # show the FTP server's welcome message
    datapath = "/pub/data/noaa/"
    if not os.path.isdir('weatherdata'):
        os.makedirs('weatherdata')  # the local output directory must exist
    bufsize = 1024  # buffer block size
    year = 1911
    while year <= 1930:
        path = datapath + str(year)
        li = ftp.nlst(path)
        for eachfile in li:
            localpaths = eachfile.split("/")
            localpath = localpaths[len(localpaths) - 1]
            # put the year in front so the files sort easily
            localpath = 'weatherdata/' + str(year) + '--' + localpath
            # open the local file for binary writing; it is closed after each download
            with open(localpath, 'wb') as fp:
                # receive the file from the server and write it to the local file
                ftp.retrbinary('RETR ' + eachfile, fp.write, bufsize)
        year = year + 1
    ftp.set_debuglevel(0)  # turn off debugging
    ftp.quit()  # leave the FTP server

if __name__ == "__main__":
    downloadfile()
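
The naming step in the script above (take the last part of the remote path and put the year in front) can be tried out without connecting to the server. The following is only a small sketch of that step. The helper names `remote_dir`, `local_name` and `plan` are hypothetical and not part of the original script, and the file name in the usage note is just an illustrative value.

```python
import posixpath

DATA_PATH = "/pub/data/noaa/"  # same remote directory as in the script above


def remote_dir(year, data_path=DATA_PATH):
    """Remote directory that holds one year's files."""
    return posixpath.join(data_path, str(year))


def local_name(remote_file, year, out_dir="weatherdata"):
    """Local path for a remote file, with the year in front for easy sorting.

    posixpath.basename also works when the server lists bare file names
    instead of full paths.
    """
    return "%s/%s--%s" % (out_dir, year, posixpath.basename(remote_file))


def plan(listings):
    """Turn {year: [remote paths]} into a list of (remote, local) pairs."""
    return [(f, local_name(f, y)) for y in sorted(listings) for f in listings[y]]
```

For example, `local_name("/pub/data/noaa/1911/example.gz", 1911)` gives `weatherdata/1911--example.gz`. Keeping this logic in small functions makes it easy to check the local names before starting a long download.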