I'm going to take these notes starting from the later chapters.
Chapter 9: Data Aggregation and Group Operations
Grouping
import numpy as np
import pandas as pd
# Generate data: five rows, four columns
df = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'], 'key2': ['one', 'two', 'one', 'two', 'one'], 'data1': np.random.randn(5), 'data2': np.random.randn(5)})
df
# Compute the mean of data1, grouped by key1
df.loc[:, 'data1'].groupby(df.loc[:, 'key1']).mean()
# Compute the mean of data1, grouped by key1 and key2
df.loc[:, 'data1'].groupby([df.loc[:, 'key1'], df.loc[:, 'key2']]).mean()
# After grouping by two keys, the result can be unstacked
temp = df.loc[:, 'data1'].groupby([df.loc[:, 'key1'], df.loc[:, 'key2']]).mean()
temp.unstack()
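To make the grouping calls above checkable by hand, here is a minimal sketch that swaps the random columns for fixed values. The numbers are made up for illustration; only the layout follows the frame above.

```python
import pandas as pd

# Same layout as the frame above, but with fixed (made-up) values instead of random ones
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['one', 'two', 'one', 'two', 'one'],
    'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
    'data2': [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Mean of data1 per key1 group: a -> (1 + 2 + 5) / 3, b -> (3 + 4) / 2
by_key1 = df['data1'].groupby(df['key1']).mean()

# Mean of data1 per (key1, key2) pair; the result has a two-level index
by_both = df['data1'].groupby([df['key1'], df['key2']]).mean()

# unstack() moves the inner level (key2) into the columns
wide = by_both.unstack()

print(by_key1)
print(wide)
```

After unstack(), key1 labels the rows and key2 labels the columns, so each cell holds the mean for one key pair.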
Note: grouping ignores null values in the group keys. groupby() also accepts axis=0 or axis=1 to group along rows or along columns (newer pandas releases deprecate the axis argument). If you call df['column name'].groupby(), only that column is grouped; otherwise all of the data is grouped. groupby() can also be passed a function such as len, which is called on each index label to produce the group key.
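Two of the points in this note can be sketched on a tiny frame. The frame, its column names, and its index labels are hypothetical and chosen only so the results are easy to verify.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one group key is missing (NaN)
df = pd.DataFrame(
    {'key': ['a', np.nan, 'b', 'a'], 'value': [1, 2, 3, 4]},
    index=['x', 'yy', 'zzz', 'w'],
)

# The row whose key is NaN is left out of the result by default
sums = df.groupby('key')['value'].sum()

# Passing a function: it is called on each index label,
# and its return value (here the label length) becomes the group key
by_len = df['value'].groupby(len).sum()

print(sums)
print(by_len)
```

Here the labels 'x' and 'w' both have length 1, so their values land in the same group.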
Aggregation
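# Select the numeric columns so the string column key2 is skipped
df.groupby('key1')[['data1', 'data2']].std()  # also available: count(), sum(), mean(), median(), std(), var(), min(), max(), prod(), first(), last()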
# Custom functions can be used
df.groupby('key1')[['data1', 'data2']].agg([lambda x: x.max() - x.min(), np.mean, np.std])
# You can also give each function a custom name
df.groupby('key1')[['data1', 'data2']].agg([('custom function', lambda x: x.max() - x.min()), ('mean', np.mean), ('standard deviation', np.std)])
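As a small check of the (name, function) form, the sketch below uses fixed, made-up values and a hypothetical helper called peak_to_peak. It passes the string 'mean' instead of np.mean, which is a choice made here rather than something the notes prescribe.

```python
import pandas as pd

# Fixed (made-up) values so the result can be verified by hand
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
})

def peak_to_peak(x):
    # Range of the group: largest value minus smallest value
    return x.max() - x.min()

# Each (name, function) tuple sets the label of one result column
result = df.groupby('key1')['data1'].agg([
    ('range', peak_to_peak),
    ('mean', 'mean'),
])

print(result)
```

The result has one row per key1 value and one column per tuple, labelled 'range' and 'mean'.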
# Apply different operations to different columns: one takes the maximum, the other takes the minimum
df.groupby('key1').agg({'data1': np.max, 'data2': np.min})
df.groupby('key1').agg({'data1': [np.max, np.size, np.mean], 'data2': np.min})  # this one is super cool
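The per-column dictionary form is easier to follow with known numbers. This is a minimal sketch with made-up values; it uses the string names 'max', 'size', 'mean', and 'min' in place of the NumPy functions, which is an assumption made for this example.

```python
import pandas as pd

# Fixed (made-up) values so each cell of the result can be checked
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
    'data2': [10.0, 20.0, 30.0, 40.0, 50.0],
})

# data1 gets three aggregations, data2 gets one;
# columns not named in the dict are left out of the result
result = df.groupby('key1').agg({
    'data1': ['max', 'size', 'mean'],
    'data2': 'min',
})

print(result)
```

Because data1 receives a list of functions, the result columns form two levels, such as ('data1', 'max') and ('data2', 'min').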
Pivot Tables and Cross-Tabulations
tips.pivot_table(index='sex', columns='time', values='total_bill', aggfunc=np.sum, margins=True, fill_value=0)
This one is easy to read: index sets the rows, columns sets the columns, values is the column being aggregated, aggfunc is the aggregation function (sum here; the default is mean), margins controls whether the "All" summary row and column are shown, and fill_value fills in missing values.
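The tips data is not shown in these notes, so the sketch below builds a tiny hypothetical stand-in with made-up values to show what each argument does.

```python
import pandas as pd

# Hypothetical stand-in for the tips dataset (values are made up)
tips = pd.DataFrame({
    'sex': ['Female', 'Male', 'Male', 'Female'],
    'time': ['Dinner', 'Dinner', 'Lunch', 'Dinner'],
    'total_bill': [10.0, 20.0, 30.0, 40.0],
})

# Rows: sex, columns: time, cells: sum of total_bill.
# margins=True adds the 'All' row and column; fill_value=0 fills empty cells.
table = tips.pivot_table(index='sex', columns='time', values='total_bill',
                         aggfunc='sum', margins=True, fill_value=0)

print(table)
```

There is no Female/Lunch row in this stand-in data, so that cell is where fill_value takes effect.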
Python for Data Analysis, Chapter 9: Data Aggregation and Group Operations