Pandas Sorting and Statistics

Last Update: 2018-07-24 Source: Internet

Author: User

"Python for Data Analysis": Sorting

sort_index()

Use sort_index() to sort by row or column index labels; it returns a new, sorted object.

In [1]: import pandas as pd

In [2]: from pandas import DataFrame, Series

In [3]: obj = Series(range(4), index=['d', 'a', 'b', 'c'])

In [4]: obj
Out[4]:
d    0
a    1
b    2
c    3
dtype: int64

In [5]: obj.sort_index()
Out[5]:
a    1
b    2
c    3
d    0
dtype: int64

In [6]: import numpy as np

In [8]: frame = DataFrame(np.arange(8).reshape((2, 4)), index=['three', 'one'],
   ...:                   columns=['d', 'a', 'b', 'c'])

In [9]: frame
Out[9]:
       d  a  b  c
three  0  1  2  3
one    4  5  6  7

In [10]: frame.sort_index()
Out[10]:
       d  a  b  c
one    4  5  6  7
three  0  1  2  3

In [11]: frame.sort_index(axis=1)
Out[11]:
       a  b  c  d
three  1  2  3  0
one    5  6  7  4

In [12]: frame.sort_index(axis=1, ascending=False)
Out[12]:
       d  c  b  a
three  0  3  2  1
one    4  7  6  5

sort_values

sort_values() sorts a Series by its values. Missing values (NaN) are placed at the end of the Series by default.

In [18]: obj = Series([4, np.nan, 6, np.nan, -3, 2])

In [19]: obj
Out[19]:
0    4.0
1    NaN
2    6.0
3    NaN
4   -3.0
5    2.0
dtype: float64

In [20]: obj.sort_values()
Out[20]:
4   -3.0
5    2.0
0    4.0
2    6.0
1    NaN
3    NaN
dtype: float64
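As a brief sketch of the default described above, the snippet below sorts the same Series in descending order and shows that the missing values still end up last. It also uses the na_position parameter of sort_values to move them to the front. The variable names desc and nan_first are illustrative only.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, np.nan, 6, np.nan, -3, 2])

# Descending sort: NaN values are still placed at the end by default.
desc = obj.sort_values(ascending=False)
print(desc)

# na_position='first' puts the missing values at the front instead.
nan_first = obj.sort_values(na_position='first')
print(nan_first)
```

In both calls the original obj is left unchanged, because sort_values returns a new Series.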

On a DataFrame, you can sort by the values in one or more columns by passing a column name, or a list of column names, to the by option:

In [16]: frame.sort_values(by='b')
Out[16]:
       d  a  b  c
three  0  1  2  3
one    4  5  6  7
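The session above sorts by a single column. A minimal sketch of sorting by several columns follows, assuming a small made-up frame (df and its values are hypothetical and chosen so that column a contains ties):

```python
import pandas as pd

# Hypothetical data: column 'a' has ties that column 'b' breaks.
df = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})

# Sort by 'a' first, then by 'b' within equal values of 'a'.
by_two = df.sort_values(by=['a', 'b'])
print(by_two)

# Each column can be given its own direction via a list for ascending.
mixed = df.sort_values(by=['a', 'b'], ascending=[True, False])
print(mixed)
```

The order of the names in by matters: the first column is the primary key and later columns only break ties.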

Summary and Statistics

sum, mean, max

Options for reduction methods

Option	Description
axis	Axis to reduce over: 0 for DataFrame rows, 1 for columns
skipna	Exclude missing values; the default is True
level	If the axis is hierarchically indexed (MultiIndex), reduce grouped by level. This option was removed from the reduction methods in pandas 2.0; use groupby(level=...) there instead.
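The effect of the axis and skipna options can be sketched as follows. The frame df and its values are made up for illustration, and the level option is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a few missing values.
df = pd.DataFrame(
    [[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
    index=['a', 'b', 'c', 'd'],
    columns=['one', 'two'],
)

# axis=0 (the default) reduces over the rows: one result per column.
col_sums = df.sum()

# axis=1 reduces over the columns: one result per row.
row_sums = df.sum(axis=1)

# skipna=False lets any missing value propagate into the result.
row_means_strict = df.mean(axis=1, skipna=False)

print(col_sums)
print(row_sums)
print(row_means_strict)
```

With the default skipna=True, the all-NaN row c sums to 0.0; with skipna=False, every row that contains a NaN yields NaN.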

Indirect Statistics

idxmin, idxmax: Return the index label at which the minimum or maximum value occurs.

Cumulative Statistics

cumsum: Cumulative sum of the values.

Summary Statistics for a Column

df.describe(): Produces several summary statistics at once; the output differs for numeric and non-numeric data.

Correlation Coefficients and Covariance

corr(): Correlation coefficient

cov(): Covariance

Unique Values, Value Counts, and Membership

unique: Returns an array of the unique values in a Series.

isin: Tests whether each value in a Series belongs to a given collection of values and returns a boolean Series.

value_counts: Counts how often each value occurs in a Series. Pass normalize=True to get relative frequencies instead of counts.
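The methods listed in this section, from idxmin through value_counts, can be tried together in one short script. This is only a sketch: the frame df, the Series s, and all of their values are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0], 'y': [2.0, 4.0, 6.0, 9.0]},
                  index=['a', 'b', 'c', 'd'])

# Indirect statistics: the index label where the extreme value occurs.
label_of_max = df['y'].idxmax()
label_of_min = df['y'].idxmin()

# Cumulative statistics: running total down each column.
running = df.cumsum()

# Summary statistics: count, mean, std, min, quartiles, max for numeric data;
# count, unique, top, freq for non-numeric data.
numeric_summary = df.describe()
text_summary = pd.Series(['a', 'a', 'b', 'c']).describe()

# Correlation coefficient and covariance between two columns.
corr_xy = df['x'].corr(df['y'])
cov_xy = df['x'].cov(df['y'])

# Unique values, membership, and value counts on a Series.
s = pd.Series(['c', 'a', 'd', 'a', 'a', 'b', 'b', 'c', 'c'])
uniques = s.unique()                     # in order of first appearance
mask = s.isin(['b', 'c'])                # True where the value is 'b' or 'c'
counts = s.value_counts()                # occurrences of each value
shares = s.value_counts(normalize=True)  # relative frequencies

print(label_of_max, label_of_min)
print(running)
print(numeric_summary)
print(text_summary)
print(corr_xy, cov_xy)
print(uniques, mask.sum())
print(counts)
print(shares)
```

Calling df.corr() or df.cov() on the whole frame returns the full correlation or covariance matrix instead of a single number.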

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
