I. Reading from files
pandas supports the following file types: CSV, general delimited text files, Excel files, JSON, HTML tables, HDF5, and STATA.
1. Comma-separated value (CSV) files can be read with read_csv:
>>> from pandas import read_csv
>>> csv_data = read_csv('ftse_1984_2012.csv')
>>> csv_data = csv_data.values
>>> csv_data[:4]
array([['...', 5899.9, 5923.8, 5880.6, 5892.2, 801550000, 5892.2],
       ['...', 5905.7, 5920.6, 5877.2, 5899.9, 832567200, 5899.9],
       ['...', 5852.4, 5920.1, 5852.4, 5905.7, 643543000, 5905.7],
       ['...', 5895.5, 5895.5, 5839.9, 5852.4, 948790200, 5852.4]], dtype=object)
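As a self-contained sketch of the same pattern, the snippet below reads a small in-memory CSV instead of the FTSE file. The column names and numbers are invented for illustration; only read_csv and .values come from the text above.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = (
    "date,open,close,volume\n"
    "2020-01-02,10.5,10.9,1000\n"
    "2020-01-03,10.9,10.7,1200\n"
)

# read_csv accepts a path or a file-like object; the first line becomes the header.
frame = pd.read_csv(io.StringIO(csv_text))

# .values returns a NumPy array; mixing string and numeric columns gives dtype=object.
csv_values = frame.values

print(list(frame.columns))
print(csv_values.shape)
print(csv_values.dtype)
```

The object dtype is the reason the array printed above holds both date strings and numbers in the same rows.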
2. Excel files
The read_excel function takes two parameters: a file name and a sheet name. By default the first row is treated as the header, so it is left out of the data.
from pandas import read_excel
exceldate = read_excel('score.xlsx', 'Sheet1')
exceldate = exceldate.values
print(type(exceldate))
print(exceldate.shape)
exceldate[0, :]
<class 'numpy.ndarray'>
(4, 7)
Out[6]:
array([...], dtype=int64)
3. STATA files
>>> from pandas import read_stata
>>> stata_data = read_stata('Ftse_1984_2012.dta')
>>> stata_data = stata_data.values
>>> stata_data[:4, :2]
array([[0.00000000e+00, 4.09540000e+04],
       [1.00000000e+00, 4.09530000e+04],
       [2.00000000e+00, 4.09520000e+04],
       [3.00000000e+00, 4.09490000e+04]])
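For readers without a .dta file at hand, here is a minimal round-trip sketch: it writes a small DataFrame with to_stata and reads it back with read_stata. The data, column names, and file name are hypothetical. The first column in the output above (0, 1, 2, 3) looks like a saved row index, although the source does not say so; write_index=False is one way to keep such a column out of a file you write yourself.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data; the column names are illustrative only.
frame = pd.DataFrame({"price": [5899.9, 5905.7, 5852.4],
                      "volume": [8.0, 8.3, 6.4]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.dta")
    # write_index=False keeps the row index out of the file, so it does not
    # come back as an extra first column when the file is read.
    frame.to_stata(path, write_index=False)
    stata_values = pd.read_stata(path).values

print(stata_values.shape)
print(stata_values[:2])
```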
4. Reading file contents without pandas
Excel files can be read with xlrd. xlrd is the module responsible for reading Excel files, and xlwt is the module responsible for writing them.
import numpy as np
import xlrd

# Note: xlrd 2.0 and later read only .xls files; reading .xlsx needs an older xlrd.
wb = xlrd.open_workbook('score.xlsx')
sheetnames = wb.sheet_names()
sheet = wb.sheet_by_name(sheetnames[0])

# Collect every row of the sheet as a list of cell values.
exceldate = []
for i in range(sheet.nrows):
    exceldate.append(sheet.row_values(i))
print('%d rows,' % len(exceldate), '%d columns' % len(exceldate[0]))

# Copy the first column into a NumPy array.
adate = np.empty(len(exceldate))
for i in range(len(exceldate)):
    adate[i] = exceldate[i][0]
print(adate.shape)
print(adate)

5 rows, 7 columns
(5,)
[...]
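If xlrd is not available, the same row-by-row pattern can be sketched with the standard library's csv module. This swaps a CSV source in for the Excel workbook, so it is an analogy to the code above, not a replacement for it; the contents are invented for illustration.

```python
import csv
import io

import numpy as np

# Hypothetical contents; a real script would open a file instead.
csv_text = "1,90,85\n2,70,65\n3,88,92\n"

# Collect every row as a list of strings, much like sheet.row_values(i) above.
rows = []
for row in csv.reader(io.StringIO(csv_text)):
    rows.append(row)
print('%d rows, %d columns' % (len(rows), len(rows[0])))

# Copy the first column into a float array, converting each string explicitly.
first_column = np.empty(len(rows))
for i in range(len(rows)):
    first_column[i] = float(rows[i][0])
print(first_column.shape)
```

Unlike xlrd, csv.reader returns every cell as a string, which is why the explicit float() conversion is needed here.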
II. Saving data
1. Saving data in NumPy's own .npz format
savez_compressed compresses the data as it is saved.
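import numpy as np

x = np.arange(10)
y = np.zeros((100, 100))  # shape lost in the source; 100 x 100 is an assumed placeholder

# Positional arrays are stored under the names arr_0, arr_1, ...
np.savez_compressed('Date1', x, y)
date = np.load('Date1.npz')
print(date['arr_0'])

# Keyword arguments set the stored names explicitly.
np.savez_compressed('Date2', x=x, ontherdate=y)
date2 = np.load('Date2.npz')
print(date2['x'])

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]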
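To see which names an archive holds before indexing into it, the .files attribute of the loaded object can be inspected. The sketch below is a small variation on the code above, using a temporary directory and a hypothetical file name so that it leaves nothing behind.

```python
import os
import tempfile

import numpy as np

x = np.arange(10)
y = np.zeros((3, 3))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays.npz")
    # Keyword arguments become the names stored in the archive.
    np.savez_compressed(path, x=x, y=y)

    # np.load returns a lazy archive object; using it as a context manager
    # closes the file before the temporary directory is removed.
    with np.load(path) as archive:
        names = sorted(archive.files)
        x_loaded = archive["x"]
        y_loaded = archive["y"]

print(names)
print(x_loaded)
```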
2. Saving as a CSV file with np.savetxt
Note: the read_csv and read_excel methods in pandas treat the first line as the header by default, so that line is left out of the data.
from pandas import read_csv
import numpy as np

x = np.random.randn(10, 10)
np.savetxt('Date1.csv', x, delimiter=',')
# read_csv consumes the first line as the header, so one row is lost.
date = read_csv('Date1.csv')
date = date.values

print(x.shape)
print(date.shape)
print(x)
print(date[0])
(10, 10)
(9, 10)
[[ 1.77015084 -1.80554159  1.28403537  0.2009891   0.26291606  0.08448012  1.66140115  0.17728159  0.88959083  0.56291309]
 [ 0.58518743  1.44373927  0.54993558  0.01054313  0.59017053 -0.35133822 -0.42014888 -0.3079049   0.94373013  1.35954942]
 [-0.54426668  0.04622141 -0.66634713  0.45793767 -0.63685413  0.99976971 -0.39326027 -0.93163258 -0.79656236  0.72966639]
 [-0.39963295 -1.79753906  0.32433359  0.82947734  1.54987769  2.77115954  0.22080235 -0.60776182  2.57004264  0.59011931]
 [-0.19130441 -0.12465107  1.40619987 -0.61049826 -0.39827838 -1.25752483 -0.91058091  0.36020845 -0.10908816  1.45316786]
 [ 0.47408008 -0.28463786 -1.92910625 -0.50288128 -0.06007105 -0.12408027 -0.84164768 -0.42411635  0.69954835 -0.41664136]
 [ 0.42336169  0.23625584  1.11511232 -1.08894244 -0.79186067 -1.71206423 -0.02372556 -0.71933255 -1.33979181 -0.41698675]
 [-0.06578197  1.04509307  0.1279905   1.03185255  1.15403322 -0.18110707 -0.60340346 -0.33581049  0.02637558 -1.06997906]
 [-1.84514777  1.19496964 -1.70550266  1.30863094 -1.48711603  1.55044598  0.64066525  0.39086305  0.15076543  1.42276444]
 [-1.23244051 -0.03354092  0.84729912  0.15254869 -0.33402971 -0.59486921 -0.28056973 -1.72189462 -0.0156615  -1.22688771]]
[ 0.58518743  1.44373927  0.54993558  0.01054313  0.59017053 -0.35133822 -0.42014888 -0.3079049   0.94373013  1.35954942]
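The missing row follows from the note above: the file written by np.savetxt has no header line, yet read_csv takes the first line as one. A minimal sketch of two ways to keep all ten rows, assuming a temporary file with a hypothetical name: pass header=None to read_csv, or read the file back with np.loadtxt.

```python
import os
import tempfile

import numpy as np
import pandas as pd

x = np.random.randn(10, 10)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.csv")
    np.savetxt(path, x, delimiter=",")

    # Default: the first line is consumed as the header, leaving 9 rows.
    with_header = pd.read_csv(path).values
    # header=None treats every line as data, so all 10 rows survive.
    without_header = pd.read_csv(path, header=None).values
    # np.loadtxt does not expect a header line.
    via_numpy = np.loadtxt(path, delimiter=",")

print(with_header.shape)
print(without_header.shape)
print(via_numpy.shape)
```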
Study notes 4: reading and saving data