In addition to Series and DataFrame, the two most commonly used data structures in the pandas library, there is also a Panel data structure. A Panel object is typically created from a dictionary of DataFrame objects or from a three-dimensional array.
# @author: Jeremy

import numpy as np
from pandas import Series, DataFrame, Panel
import pandas as pd
# Create a Panel object from a dictionary of DataFrames
df = np.random.binomial(100, 0.95, (9, 2))
dm = np.random.binomial(100, 0.95, (12, 2))
dff = DataFrame(df, columns=['Physics', 'Math'])
dfm = DataFrame(dm, columns=['Physics', 'Math'])
score_panel = Panel({'Girls': dff, 'Boys': dfm})
Use print() to view information about the newly created Panel object score_panel. A Panel object has three axes: items, major_axis, and minor_axis. The output gives the size along each of the three axes: 2 x 12 x 2.
print(score_panel)

<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 12 (major_axis) x 2 (minor_axis)
Items axis: Boys to Girls
Major_axis axis: 0 to 11
Minor_axis axis: Physics to Math
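Panel was deprecated in pandas 0.20 and removed in 0.25, so the listing above only runs on older releases. As a rough sketch of the same idea on current versions, assuming a DataFrame with a two-level row index is an acceptable stand-in for a Panel, the dictionary of DataFrames can be combined with pd.concat. The names scores, full_index and shape_3d are illustrative, and the fixed seed is only there to make the sketch repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for repeatability only

# Same shapes as above: 9 girls and 12 boys, two subjects each
dff = pd.DataFrame(rng.binomial(100, 0.95, (9, 2)), columns=['Physics', 'Math'])
dfm = pd.DataFrame(rng.binomial(100, 0.95, (12, 2)), columns=['Physics', 'Math'])

# The dict keys become the outer index level, playing the role of the items axis
scores = pd.concat({'Girls': dff, 'Boys': dfm}, names=['group', 'student'])

# Align both groups on a common student index, padding the shorter one with NaN
full_index = pd.MultiIndex.from_product(
    [['Boys', 'Girls'], range(12)], names=['group', 'student'])
scores = scores.reindex(full_index)

# Sizes along the items, major_axis and minor_axis equivalents
shape_3d = (scores.index.levels[0].size,
            scores.index.levels[1].size,
            scores.shape[1])
print(shape_3d)
```

The reindex step mirrors what the Panel constructor did implicitly: the girls group has only 9 rows, so rows 9 to 11 are filled with NaN.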
Indexing a Panel object is similar to indexing a two-dimensional array or a DataFrame. The default first axis is the items axis, which can be indexed directly:
# Extract the physics and math scores of the girls group
score_panel['Girls']
Physics Math
0 96
1 98 98
2 97
3 97
4 95 97
5 95
6 95
7 96
8 96
9 NaN NaN
10 NaN NaN
11 NaN NaN
Label-based indexing with ix extends to three dimensions, so we can select the data we want along all three axes. For example:
"""
Find the physics and math scores of the girls whose math score
is not less than 93, and return a DataFrame
"""
score_panel.ix['Girls', score_panel.Girls.Math >= 93, :]
Physics Math
0 96
1 98 98
2 97
3 97
4 97
5 95
6 95
7 96
8 96
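The ix indexer was deprecated in pandas 0.20 and removed in 1.0, along with Panel itself. The sketch below shows one way the two selections above might look on current pandas, assuming the scores are kept in a DataFrame with a (group, student) row index built by pd.concat. The names scores, girls_scores and girls_high_math are illustrative, and the data is random.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for repeatability only
girls = pd.DataFrame(rng.binomial(100, 0.95, (9, 2)), columns=['Physics', 'Math'])
boys = pd.DataFrame(rng.binomial(100, 0.95, (12, 2)), columns=['Physics', 'Math'])
scores = pd.concat({'Girls': girls, 'Boys': boys}, names=['group', 'student'])

# Counterpart of score_panel['Girls']: select one item by its label
girls_scores = scores.loc['Girls']

# Counterpart of the three-axis ix call: rows where Math >= 93, all columns
girls_high_math = girls_scores.loc[girls_scores['Math'] >= 93, :]
print(girls_high_math)
```

Selecting the item first and filtering second keeps the boolean mask aligned with the rows it is applied to.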
Next, we introduce the methods of Panel objects by creating a Panel that holds price data for several stocks over a range of dates.
import pandas.io.data as web
pdata = Panel(dict((symbol, web.DataReader(symbol, data_source='yahoo',
                                           start='1/1/2009', end='6/1/2012'))
                   for symbol in ['AAPL', 'GOOG', 'MSFT', 'DELL']))
print(pdata)
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 868 (major_axis) x 6 (minor_axis)
Items axis: AAPL to MSFT
Major_axis axis: 2009-01-02 00:00:00 to 2012-06-01 00:00:00
Minor_axis axis: Open to Adj Close
# Extract the market data of the four stocks for '2012-06-01'
pdata.ix[:, '2012-06-01', :]
                   AAPL            DELL            GOOG             MSFT
Open       5.691600e+02        12.15000      571.790972        28.760000
High       5.726500e+02        12.30000      572.650996        28.959999
Low        5.605200e+02        12.04500      568.350996        28.440001
Close      5.609900e+02        12.07000      570.981000        28.450001
Volume     1.302469e+08  19397600.00000  6138700.000000  56634300.000000
Adj Close  7.421812e+01        11.67592      285.205295        25.598227
# Extract the closing price (Close) of the four stocks from '2012-05-30' to '2012-06-01'
pdata.ix[:, '2012-05-30':'2012-06-01', 'Close']
                  AAPL   DELL        GOOG       MSFT
Date
2012-05-30  579.169998  12.56  588.230992  29.340000
2012-05-31  577.730019  12.33  580.860990  29.190001
2012-06-01  560.989983  12.07  570.981000  28.450001
# Extract the market data of the four stocks from '2012-05-30' to '2012-06-01'
pdata.ix[:, '2012-05-30':'2012-06-01', :]
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 3 (major_axis) x 6 (minor_axis)
Items axis: AAPL to MSFT
Major_axis axis: 2012-05-30 00:00:00 to 2012-06-01 00:00:00
Minor_axis axis: Open to Adj Close
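The three selections above (one date, a date range with a single field, and a date range with all fields) can be approximated on current pandas, where Panel and ix no longer exist. The sketch below is only an illustration under stated assumptions: the prices are made-up random numbers rather than real quotes, the data lives in a DataFrame with a (symbol, Date) row index, and the names prices, one_day, closes and window are hypothetical.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2012-05-28', '2012-06-01', freq='D')
fields = ['Open', 'High', 'Low', 'Close']
symbols = ['AAPL', 'DELL', 'GOOG', 'MSFT']

# Made-up numbers, NOT real quotes: one small frame per symbol
rng = np.random.default_rng(1)
frames = {s: pd.DataFrame(rng.uniform(10, 600, (len(dates), len(fields))),
                          index=dates, columns=fields)
          for s in symbols}
prices = pd.concat(frames, names=['symbol', 'Date']).sort_index()

idx = pd.IndexSlice
start, end = pd.Timestamp('2012-05-30'), pd.Timestamp('2012-06-01')

# One date, all symbols and fields: fields as rows, symbols as columns
one_day = prices.xs(end, level='Date').T

# A date range and a single field: dates as rows, symbols as columns
closes = prices.loc[idx[:, start:end], 'Close'].unstack('symbol')

# A date range and all fields: still indexed by both symbol and date
window = prices.loc[idx[:, start:end], :]

print(one_day.shape, closes.shape, window.shape)
```

As with the Panel, fixing one axis to a single label drops that dimension, while slicing a range keeps it.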
This result differs from the previous one: it is still a Panel object. In the previous example the extracted data included only the closing price, so the third dimension (axis) disappeared and the result was a two-dimensional DataFrame. To view the complete data in DataFrame form, use the to_frame() method, which presents the panel data as a "stacked" DataFrame:
# Present the panel data in DataFrame form
pdata.ix[:, '2012-05-30':'2012-06-01', :].to_frame()
                              AAPL            DELL            GOOG             MSFT
Date       Minor
2012-05-30 Open       5.692000e+02        12.59000      588.161028        29.350000
           High       5.799900e+02        12.70000      591.901014        29.480000
           Low        5.665600e+02        12.46000      583.530999        29.120001
           Close      5.791700e+02        12.56000      588.230992        29.340000
           Volume     1.323574e+08  19787800.00000  3827600.000000  41585500.000000
           Adj Close  7.662330e+01        12.14992      293.821674        26.399015
2012-05-31 Open       5.807400e+02        12.53000      588.720982        29.299999
           High       5.815000e+02        12.54000      590.001032        29.420000
           Low        5.714600e+02        12.33000      579.001013        28.940001
           Close      5.777300e+02        12.33000      580.860990        29.190001
           Volume     1.229186e+08  19955600.00000  5958800.000000  39134000.000000
           Adj Close  7.643280e+01        11.92743      290.140354        26.264051
2012-06-01 Open       5.691600e+02        12.15000      571.790972        28.760000
           High       5.726500e+02        12.30000      572.650996        28.959999
           Low        5.605200e+02        12.04500      568.350996        28.440001
           Close      5.609900e+02        12.07000      570.981000        28.450001
           Volume     1.302469e+08  19397600.00000  6138700.000000  56634300.000000
           Adj Close  7.421812e+01        11.67592      285.205295        25.598227
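The "stacked" layout shown above, with a (Date, minor) row index and one column per stock, can also be produced without Panel. The following is a minimal sketch assuming the per-stock data is held in a dictionary of DataFrames; the numbers are made up, only two stocks and two fields are used to keep it short, and the names wide and stacked are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2012-05-30', '2012-06-01', freq='D')
fields = ['Open', 'Close']
rng = np.random.default_rng(2)  # made-up numbers, NOT real quotes
frames = {s: pd.DataFrame(rng.uniform(10, 600, (3, 2)), index=dates, columns=fields)
          for s in ['AAPL', 'MSFT']}

# Put symbols and fields on a two-level column index: one row per date
wide = pd.concat(frames, axis=1, names=['symbol', 'minor'])
wide.index.name = 'Date'

# Move the field level into the rows: (Date, minor) index, one column per symbol
stacked = wide.stack(level='minor')
print(stacked)
```

Depending on the pandas version, the order of the minor labels within each date may differ, and some 2.x releases print a FutureWarning about the stack implementation; the resulting layout is the same.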
DataFrame has a corresponding to_panel() method, which is the inverse of to_frame():
stacked = pdata.ix[:, '2012-05-30':'2012-06-01', :].to_frame()
stacked.to_panel()
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 3 (major_axis) x 6 (minor_axis)
Items axis: AAPL to MSFT
Major_axis axis: 2012-05-30 00:00:00 to 2012-06-01 00:00:00
Minor_axis axis: Open to Adj Close
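Since to_panel() was removed together with Panel, the reverse trip from a stacked frame has to be written differently on current pandas. One possible sketch, assuming a stacked frame with a (Date, minor) row index like the one to_frame() returns, rebuilds either one DataFrame per item or a plain three-dimensional NumPy array. The values are placeholders and the names items and cube are hypothetical.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2012-05-30', '2012-06-01', freq='D')
index = pd.MultiIndex.from_product([dates, ['Open', 'Close']],
                                   names=['Date', 'minor'])

# Placeholder values, NOT real quotes: a stacked frame with one column per stock
stacked = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2),
                       index=index, columns=['AAPL', 'MSFT'])

# Back to one frame per item, a rough counterpart of to_panel()
items = {symbol: stacked[symbol].unstack('minor') for symbol in stacked.columns}

# Or a plain 3-D array ordered items x major_axis x minor_axis
cube = np.stack([items[symbol].to_numpy() for symbol in stacked.columns])
print(cube.shape)
```

The dictionary keeps the axis labels, while the array keeps only the values, so the choice depends on whether label-based selection is still needed afterwards.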
Summary: One benefit of Panel objects is that a single Panel can hold multi-level, multidimensional data, and whenever we need data along any dimension for modeling or analysis, we can extract a Series or DataFrame from it.
Resources:
"Python for Data Analysis", Wes McKinney
Computational Statistics in Python: http://people.duke.edu/~ccc14/sta-663/index.html