Pandas Sorting and Statistics

Source: Internet
Author: User

From "Python for Data Analysis"

Sorting

sort_index()

Sorts a Series or DataFrame by its row or column index.

In [1]: import pandas as pd

In [2]: from pandas import DataFrame, Series

In [3]: obj = Series(range(4), index=['d', 'a', 'b', 'c'])

In [4]: obj
Out[4]:
d    0
a    1
b    2
c    3
dtype: int64

In [5]: obj.sort_index()
Out[5]:
a    1
b    2
c    3
d    0
dtype: int64

In [6]: import numpy as np

In [8]: frame = DataFrame(np.arange(8).reshape((2, 4)), index=['three', 'one'],
   ...:                   columns=['d', 'a', 'b', 'c'])

In [9]: frame
Out[9]:
       d  a  b  c
three  0  1  2  3
one    4  5  6  7

In [10]: frame.sort_index()
Out[10]:
       d  a  b  c
one    4  5  6  7
three  0  1  2  3

In [11]: frame.sort_index(axis=1)
Out[11]:
       a  b  c  d
three  1  2  3  0
one    5  6  7  4

In [12]: frame.sort_index(axis=1, ascending=False)
Out[12]:
       d  c  b  a
three  0  3  2  1
one    4  7  6  5

sort_values

Sorts a Series by its values. Any missing values are placed at the end of the Series by default.

In [18]: obj = Series([4, np.nan, 6, np.nan, -3, 2])

In [19]: obj
Out[19]:
0    4.0
1    NaN
2    6.0
3    NaN
4   -3.0
5    2.0
dtype: float64

In [20]: obj.sort_values()
Out[20]:
4   -3.0
5    2.0
0    4.0
2    6.0
1    NaN
3    NaN
dtype: float64
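
Because the end-of-Series placement of missing values is only a default, it may help to see how it interacts with the sort direction. The following is a minimal sketch reusing the values from the session above; it relies on the ascending and na_position parameters of Series.sort_values, and the variable names desc and nan_first are illustrative only.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, np.nan, 6, np.nan, -3, 2])

# Descending sort: missing values are still placed last by default.
desc = obj.sort_values(ascending=False)

# na_position="first" moves the missing values to the front instead.
nan_first = obj.sort_values(na_position="first")

print(desc)
print(nan_first)
```

In both calls only the position of the NaN entries changes; the non-missing values are ordered as usual.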

On a DataFrame, sort_values sorts by the values in one or more columns. Pass a single column name, or a list of column names, to the by option:

In [16]: frame.sort_values(by='b')
Out[16]:
       d  a  b  c
three  0  1  2  3
one    4  5  6  7
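
Sorting by more than one column works the same way with a list of names. A minimal sketch, assuming a small made-up frame (the columns "a" and "b" and their values are illustrative and are not the frame from the session above):

```python
import pandas as pd

# Hypothetical data chosen so that column "a" contains ties.
frame = pd.DataFrame({"a": [0, 1, 0, 1], "b": [4, 7, -3, 2]})

# Sort by "a" first, then break ties using "b".
by_two = frame.sort_values(by=["a", "b"])

# A list for ascending sets the direction per column:
# "a" ascending, "b" descending within each value of "a".
mixed = frame.sort_values(by=["a", "b"], ascending=[True, False])

print(by_two)
print(mixed)
```

The first name in the list is the primary sort key; later names are only consulted where earlier ones tie.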

Summary and Statistics

sum, mean, max

Options for reduction methods

Option   Description
axis     Axis to reduce over. For a DataFrame, 0 for rows and 1 for columns.
skipna   Exclude missing values. The default is True.
level    If the axis is hierarchically indexed (MultiIndex), group the reduction by level.
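
The options above can be sketched on a small frame containing missing values. The data and variable names here are illustrative assumptions. Note that pandas 2.0 and later no longer accept level on reduction methods, so the last part of the sketch uses groupby(level=...) to get the same per-level result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
    index=["a", "b", "c", "d"],
    columns=["one", "two"],
)

# Default axis=0: reduce over the rows, giving one value per column.
col_sums = df.sum()

# axis=1: reduce over the columns, giving one value per row.
row_sums = df.sum(axis=1)

# skipna=False: any missing value in a row makes that row's result NaN.
row_means_strict = df.mean(axis=1, skipna=False)

# Reduction grouped by one level of a MultiIndex.
idx = pd.MultiIndex.from_tuples(
    [("x", 1), ("x", 2), ("y", 1)], names=["key", "num"]
)
s = pd.Series([1.0, 2.0, 3.0], index=idx)
by_level = s.groupby(level="key").sum()

print(col_sums, row_sums, row_means_strict, by_level, sep="\n\n")
```

With the default skipna=True, a row made up entirely of missing values sums to 0 rather than NaN.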

Indirect Statistics

idxmin, idxmax: return the index label at which the minimum or maximum value occurs.

Cumulative Type

cumsum: the cumulative sum of the values.

Summary statistics for a column

df.describe(): produces several summary statistics in one call. The output differs for numeric and non-numeric data.

Correlation coefficients and covariance

corr(): correlation coefficient

cov(): covariance

Unique values, value counts, and membership

unique: returns an array of the unique values in a Series.

isin: tests whether each value in a Series belongs to a given collection of values.

value_counts: counts how many times each value occurs in a Series. Pass normalize=True to get relative frequencies instead of counts.
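
The methods listed in this section can be tried together on a small example. This is a minimal sketch; the frame df, the Series labels, and all of their values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, 6.0, 8.0]})

# Indirect statistics: index labels of the extreme values, per column.
idx_of_min = df.idxmin()
idx_of_max = df.idxmax()

# Cumulative statistic: running total of column "x".
running = df["x"].cumsum()

# Summary statistics for numeric columns (count, mean, std, quartiles, ...).
summary = df.describe()

# Correlation between two columns, and the full covariance matrix.
corr_xy = df["x"].corr(df["y"])
cov_matrix = df.cov()

labels = pd.Series(["c", "a", "d", "a", "a", "b", "b", "c", "c"])

# For non-numeric data, describe reports count, unique, top, and freq.
label_summary = labels.describe()

uniques = labels.unique()            # unique values, in order of appearance
counts = labels.value_counts()       # occurrences of each value
mask = labels.isin(["b", "c"])       # Boolean membership mask

print(summary)
print(label_summary)
print(counts)
```

The Boolean mask returned by isin can be used directly to filter the Series, for example labels[mask].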
