Python Data Analysis, Chapter 9: Data Aggregation and Grouping Operations

Source: Internet
Author: User

I'm taking these notes starting from the later chapters.

Chapter 9: Data Aggregation and Grouping Operations

Grouping
# Generate data: five rows, four columns
import numpy as np
import pandas as pd

df = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
                   'key2': ['one', 'two', 'one', 'two', 'one'],
                   'data1': np.random.randn(5),
                   'data2': np.random.randn(5)})
df

# Mean of data1, grouped by key1
df.loc[:, 'data1'].groupby(df.loc[:, 'key1']).mean()

# Mean of data1, grouped by key1 and key2
df.loc[:, 'data1'].groupby([df.loc[:, 'key1'], df.loc[:, 'key2']]).mean()

# After grouping by two keys, you can unstack the result
temp = df.loc[:, 'data1'].groupby([df.loc[:, 'key1'], df.loc[:, 'key2']]).mean()
temp.unstack()
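
Because the data above is random, the results change on every run. Here is a minimal sketch with fixed, made-up values (the frame name fixed and its numbers are illustrative assumptions, not from the book), so the grouped means and the unstacked shape can be checked by hand:

```python
import pandas as pd

# Same layout as the frame above, but with fixed, made-up values
# so the results can be verified by hand.
fixed = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
                      'key2': ['one', 'two', 'one', 'two', 'one'],
                      'data1': [1.0, 2.0, 3.0, 4.0, 5.0]})

# Mean of data1 per key1 group: a -> (1 + 2 + 5) / 3, b -> (3 + 4) / 2
by_key1 = fixed['data1'].groupby(fixed['key1']).mean()

# Two keys give a Series with a two-level index (key1, key2) ...
by_both = fixed['data1'].groupby([fixed['key1'], fixed['key2']]).mean()

# ... and unstack() moves the inner level (key2) into the columns.
wide = by_both.unstack()
print(wide)
```

The unstacked result has one row per key1 value and one column per key2 value.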

Note: grouping ignores null values in the group keys. groupby() also accepts axis=0 or axis=1, which means grouping by rows or by columns (recent pandas releases deprecate the axis argument). Calling df['column name'].groupby(...) groups only that column; without the column selection, all of the data is grouped. groupby() can also take a function, such as len, as the grouping key.
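
Two of the points in this note can be sketched with a small Series. The values, index labels, and variable names below are made up for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0],
              index=['ab', 'cd', 'xyz', 'efg'])

# A missing value in the group key: that row is left out of every group.
keys = pd.Series(['a', np.nan, 'a', 'b'], index=s.index)
by_key = s.groupby(keys).sum()

# A function passed to groupby is called on each index label;
# here len groups the rows by the length of their label.
by_len = s.groupby(len).sum()

print(by_key)
print(by_len)
```

The row whose key is NaN (value 2.0) does not appear in by_key, and by_len has one group for the two-character labels and one for the three-character labels.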

Aggregation


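# Select the numeric columns first; key2 holds strings and has no standard deviation
df.groupby('key1')[['data1', 'data2']].std()  # also available: count(), sum(), mean(), median(), std(), var(), min(), max(), prod(), first(), last()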

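# You can pass custom functions; string names such as 'mean' select the built-in aggregations
df.groupby('key1')[['data1', 'data2']].agg([lambda x: x.max() - x.min(), 'mean', 'std'])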

# You can name the output columns by passing (name, function) pairs
df.groupby('key1')[['data1', 'data2']].agg([('custom function', lambda x: x.max() - x.min()), ('mean', 'mean'), ('standard deviation', 'std')])
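
As a minimal sketch of the (name, function) form, assuming a small frame with made-up values (fixed and peak_to_peak are hypothetical names introduced here), each pair becomes one output column labeled with the given name:

```python
import pandas as pd

fixed = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
                      'data1': [1.0, 2.0, 3.0, 4.0, 5.0]})


def peak_to_peak(x):
    """Range of the group: largest value minus smallest value."""
    return x.max() - x.min()


# Each (name, function) pair becomes one output column with that name.
result = fixed.groupby('key1')['data1'].agg([('range', peak_to_peak),
                                             ('mean', 'mean')])
print(result)
```

A named function is used instead of a lambda so that the intent is readable; the column label still comes from the first element of each pair.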

# Apply different operations to different columns: one takes the maximum, the other the minimum
df.groupby('key1').agg({'data1': 'max', 'data2': 'min'})
df.groupby('key1').agg({'data1': ['max', 'size', 'mean'], 'data2': 'min'})  # this one is extremely handy
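
The shape of the result is worth a look when a column maps to a list of aggregations. The sketch below uses a hypothetical frame named fixed with made-up values:

```python
import pandas as pd

fixed = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
                      'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
                      'data2': [10.0, 20.0, 30.0, 40.0, 50.0]})

# data1 gets three aggregations, data2 gets one.
result = fixed.groupby('key1').agg({'data1': ['max', 'size', 'mean'],
                                    'data2': 'min'})

# When any column maps to a list, the result has two-level columns:
# (column name, aggregation name).
print(result.columns.tolist())
print(result)
```

Individual cells are then addressed with a tuple, for example result.loc['a', ('data1', 'max')].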

Pivot Tables and Cross-Tabulations
# tips is assumed to be a DataFrame that is already loaded, with sex, time, and total_bill columns
tips.pivot_table(index='sex', columns='time', values='total_bill', aggfunc='sum', margins=True, fill_value=0)

The call is easy to read: index gives the rows, columns gives the columns, values gives the values to aggregate, aggfunc is the aggregation to apply (sum here, or mean), margins controls whether summary totals are shown, and fill_value fills in missing values.
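
To see these arguments at work without the real data, here is a minimal sketch on a hypothetical stand-in frame (tips_demo and its rows are made up; only the column names match the call above):

```python
import pandas as pd

# Hypothetical stand-in for the tips data: made-up rows, same column names.
tips_demo = pd.DataFrame({
    'sex': ['Female', 'Male', 'Male', 'Female', 'Male'],
    'time': ['Dinner', 'Dinner', 'Lunch', 'Dinner', 'Dinner'],
    'total_bill': [10.0, 20.0, 15.0, 30.0, 25.0],
})

# Rows come from sex, columns from time, cells are sums of total_bill.
# margins=True adds an 'All' row and column; fill_value=0 fills the
# empty Female/Lunch cell.
table = tips_demo.pivot_table(index='sex', columns='time',
                              values='total_bill', aggfunc='sum',
                              margins=True, fill_value=0)
print(table)
```

There is no Female/Lunch row in the made-up data, so that cell shows the fill value rather than NaN, and the All row and column hold the totals.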
