NCDC 1929-2011 data for Hadoop: The Definitive Guide
ftp://ftp.ncdc.noaa.gov/pub/data/gsod/
Instructions:
The data are available via:
1) WWW -- http://www.ncdc.noaa.gov/cgi-bin/res40.pl?page=gsod.html
2) FTP -- ftp://ftp.ncdc.noaa.gov/pub/data/gsod via browser
3) Command-line FTP:
A) Enter: open ftp.ncdc.noaa.gov
B) Login is: ftp
C) Password is: your email address
D) To move to the correct subdirectory, enter:
cd /pub/data/gsod
The files included in this subdirectory are:
Data files --
Annual files:
e.g., gsod_2006.tar -- all 2006 files (compressed) by station, in one tar file.
etc., etc. -- for each annual volume.
Note: Each year's data are contained in subdirectories/folders by year.
Station files:
e.g., 010010-999-2006.op.gz -- files by station-year, identified by WMO number,
WBAN number (if appropriate), and year. For a cross-reference of
filenames with location, see:
ish-history.txt
Informational/utility files --
country-list.txt -- a list showing the station number range
for each country.
ish-history.txt -- a station list to be used with the data files,
showing the names and locations for each station.
Note: Global Summary of the Day contains a subset of
the stations listed in this station history.
readme.txt -- a description of the data and its format.
E) To get a copy of the data description, enter:
get readme.txt destination
(destination is your output location and name), e.g.:
get readme.txt c:readme.txt -- copies to hard drive c:
F) Then, to get a copy of any of the other files, use
the same procedure, such as:
get gsod_2006.tar c:data.txt
G) To log off the system when finished, enter:
bye
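
The command-line session above can also be scripted. The following is a minimal Python sketch using the standard ftplib module, assuming the server is still reachable at the host and directory named above and still accepts anonymous logins. The helper names (annual_archive_path, parse_station_filename, download) are illustrative and not part of the NCDC documentation, and the per-year path layout is only inferred from the note about yearly subdirectories, so check it against the actual directory listing.

```python
from ftplib import FTP

HOST = "ftp.ncdc.noaa.gov"   # host from step A
BASE_DIR = "/pub/data/gsod"  # subdirectory from step D


def annual_archive_path(year):
    """Build the remote path of one year's tar file.

    Assumption: the archive sits inside the per-year subdirectory,
    as the note about yearly folders suggests.
    """
    return f"{BASE_DIR}/{year}/gsod_{year}.tar"


def parse_station_filename(name):
    """Split a 'WMO-WBAN-YEAR.op.gz' station file name into its parts."""
    suffix = ".op.gz"
    if not name.endswith(suffix):
        raise ValueError(f"not a station file name: {name!r}")
    wmo, wban, year = name[:-len(suffix)].split("-")
    return wmo, wban, int(year)


def download(remote_path, local_path, email):
    """Fetch one file over anonymous FTP in binary mode."""
    with FTP(HOST) as ftp:                    # A) open the connection
        ftp.login(user="ftp", passwd=email)   # B) and C) anonymous login
        with open(local_path, "wb") as out:
            # E) and F): the scripted equivalent of "get"
            ftp.retrbinary(f"RETR {remote_path}", out.write)
    # Leaving the with-block sends QUIT, the equivalent of "bye" in G)


# Example call (needs network access, so it is left commented out):
# download(annual_archive_path(2006), "gsod_2006.tar", "you@example.com")
```

Binary mode is used because the .tar and .gz files are not text; a transfer in ASCII mode may corrupt them. The same caution applies to the interactive client, where the "binary" command is typically entered before "get".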