# Pivot tables: pivot_table
# pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean',
#                fill_value=None, margins=False, dropna=True, margins_name='All')
import numpy as np
import pandas as pd

date = ['2017-5-1', '2017-5-2', '2017-5-3'] * 3
rng = pd.to_datetime(date)
df = pd.DataFrame({'date': rng,
                   'key': list('abcdabcda'),
                   'values': np.random.rand(9) * 10})
print(df)
print('-----')

print(pd.pivot_table(df, values='values', index=['date'], columns='key', aggfunc=np.sum))  # aggfunc='sum' also works
print('-----')
# data: the DataFrame object
# values: the column, or list of columns, to aggregate
# index: the index (rows) of the pivot table, chosen from the columns of the original data
# columns: the columns of the pivot table, chosen from the columns of the original data
# aggfunc: the aggregation function; the default is the mean, and NumPy functions are supported

print(pd.pivot_table(df, values='values', index=['date', 'key'], aggfunc=len))
print('------')
# Here date and key together form the index of the pivot, and the result is
# the number of values for each (date, key) combination
# aggfunc=len (or 'count'): count the entries
Results:
        date key    values
0 2017-05-01   a  2.562157
1 2017-05-02   b  9.604823
2 2017-05-03   c  4.770968
3 2017-05-01   d  0.654878
4 2017-05-02   a  8.839281
5 2017-05-03   b  1.211138
6 2017-05-01   c  9.570886
7 2017-05-02   d  9.915021
8 2017-05-03   a  8.551166
-----
key                a         b         c         d
date
2017-05-01  2.562157       NaN  9.570886  0.654878
2017-05-02  8.839281  9.604823       NaN  9.915021
2017-05-03  8.551166  1.211138  4.770968       NaN
-----
                values
date       key
2017-05-01 a       1.0
           c       1.0
           d       1.0
2017-05-02 a       1.0
           b       1.0
           d       1.0
2017-05-03 a       1.0
           b       1.0
           c       1.0
------
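The example above uses random values, so its numbers change on every run, and it never exercises the fill_value and margins parameters listed in the signature. The following is a minimal sketch of those two parameters. It assumes fixed, made-up values (1.0 to 9.0) in place of np.random.rand, and the names df_fixed, filled, with_totals and counts are illustrative only.

```python
import pandas as pd

# Hypothetical fixed values; the original example draws them with np.random.rand.
dates = pd.to_datetime(['2017-5-1', '2017-5-2', '2017-5-3'] * 3)
df_fixed = pd.DataFrame({'date': dates,
                         'key': list('abcdabcda'),
                         'values': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]})

# Sum per (date, key); fill_value=0 replaces the NaN of combinations that never occur.
filled = pd.pivot_table(df_fixed, values='values', index='date', columns='key',
                        aggfunc='sum', fill_value=0)
print(filled)

# margins=True appends an 'All' row and an 'All' column holding the subtotals.
with_totals = pd.pivot_table(df_fixed, values='values', index='date', columns='key',
                             aggfunc='sum', fill_value=0, margins=True)
print(with_totals)

# aggfunc='count' counts the rows in each (date, key) group, like aggfunc=len above.
counts = pd.pivot_table(df_fixed, values='values', index=['date', 'key'], aggfunc='count')
print(counts)
```

With this data, the bottom-right cell of with_totals is the grand total of all nine values, and every (date, key) pair occurs exactly once, so counts contains only ones.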
# Cross tabulation: crosstab
# By default, crosstab computes a frequency table of the factors, e.g. for analysing string (categorical) data
# pd.crosstab(index, columns, values=None, rownames=None, colnames=None,
#             aggfunc=None, margins=False, dropna=True, normalize=False)
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 2, 2],
                   'B': [3, 3, 4, 4, 4],
                   'C': [1, 1, np.nan, 1, 1]})
print(df)
print('------')

print(pd.crosstab(df['A'], df['B']))
print('------')
# If crosstab receives only two Series, it returns a frequency table:
# for each unique value of A, it counts how often each unique value of B occurs,
# e.g. (A, B) = (1, 3) appears 1 time and (A, B) = (2, 4) appears 3 times

print(pd.crosstab(df['A'], df['B'], normalize=True))  # show proportions of the total instead of counts
print('--------')

print(pd.crosstab(df['A'], df['B'], values=df['C'], aggfunc=np.sum))
# values: an array of values to aggregate according to the factors
# aggfunc: if no values array is passed, a frequency table is computed;
#          if an array is passed, it is aggregated with the given function
# This is equivalent to grouping by A and B and aggregating the third Series, C, within each group
print('--------')

print(pd.crosstab(df['A'], df['B'], values=df['C'], aggfunc=np.sum, margins=True))
print('--------')
# margins: boolean, default False; adds row/column margins (subtotals)
Result:
   A  B    C
0  1  3  1.0
1  2  3  1.0
2  2  4  NaN
3  2  4  1.0
4  2  4  1.0
------
B  3  4
A
1  1  0
2  1  3
------
B    3    4
A
1  0.2  0.0
2  0.2  0.6
--------
B    3    4
A
1  1.0  NaN
2  1.0  2.0
--------
B      3    4  All
A
1    1.0  NaN  1.0
2    1.0  2.0  3.0
All  2.0  2.0  4.0
--------
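The code comments above describe crosstab as grouping by A and B and then aggregating. A minimal sketch of that equivalence for the plain frequency table follows, using the same data as above. The variable names freq, freq_groupby and row_share are illustrative, and the groupby version is one possible formulation, not necessarily how crosstab is implemented internally. The last call also shows normalize='index', which normalizes within each row rather than over the whole table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 2, 2],
                   'B': [3, 3, 4, 4, 4],
                   'C': [1, 1, np.nan, 1, 1]})

# Frequency table: how often each (A, B) pair occurs.
freq = pd.crosstab(df['A'], df['B'])

# The same counts via groupby: size() counts the rows per (A, B) group,
# unstack() moves B into the columns, and fill_value=0 fills pairs that never occur.
freq_groupby = df.groupby(['A', 'B']).size().unstack(fill_value=0)

# normalize='index' divides each row by its row total, so every row sums to 1.
row_share = pd.crosstab(df['A'], df['B'], normalize='index')

print(freq)
print(freq_groupby)
print(row_share)
```

Here freq and freq_groupby hold the same counts, and row_share shows that for A = 2 a quarter of the rows have B = 3 and three quarters have B = 4.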
2018.03.29 python-pandas: pivot table (pivot_table) / cross tabulation (crosstab)