[Reading notes] Python for Data Analysis (V): Getting Started with pandas


pandas: a data analysis library built on NumPy

pandas data structures: Series, DataFrame

Series: a one-dimensional array-like object with data labels (it can also be thought of as a dictionary)

Attributes: values, index

Missing data detection: pd.isnull(), pd.notnull(); both are also available as instance methods of Series objects

A Series object and its index each have a name attribute, which is closely tied to other key pandas functionality
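The Series notes above can be tried out with a minimal sketch like the following. The labels and values are made up for illustration and do not come from the original notes.

```python
import pandas as pd

# Hypothetical sample data: four labelled values, one of them missing.
s = pd.Series([4.0, 7.0, None, 3.0], index=["d", "b", "a", "c"])

values = s.values   # the underlying NumPy array
labels = s.index    # the Index holding the data labels

# Missing-data detection, as a top-level function and as an instance method.
mask_fn = pd.isnull(s)
mask_method = s.isnull()
present = s.notnull()

# Both the Series and its index carry a name attribute.
s.name = "score"
s.index.name = "key"
```

The two masks are the same boolean Series; which spelling to use is a matter of taste.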

DataFrame: a tabular data structure with both a row index and a column index

Get a DataFrame column: dictionary-style indexing or attribute access (frame2['state'] / frame2.state)

Get a DataFrame row: the ix indexer in the pandas version these notes are based on; ix was removed in pandas 1.0, so use loc (by label) or iloc (by position) instead

A column returned by indexing is a view on the underlying data, not a copy; to copy the column explicitly, use the Series copy method

A DataFrame's index and columns also have a name attribute, which you can set yourself
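A short sketch of the column and row access described above. The name frame2 comes from the notes, but its contents here are invented for illustration. Because ix no longer exists in current pandas, the sketch uses loc and iloc for row access.

```python
import pandas as pd

# Hypothetical contents for frame2; only the name appears in the notes.
frame2 = pd.DataFrame(
    {
        "state": ["Ohio", "Ohio", "Nevada"],
        "year": [2000, 2001, 2001],
        "pop": [1.5, 1.7, 2.4],
    },
    index=["one", "two", "three"],
)

# Column access: dictionary-style and attribute-style return the same Series.
by_key = frame2["state"]
by_attr = frame2.state

# Row access: by label with loc, by integer position with iloc.
row_by_label = frame2.loc["two"]
row_by_pos = frame2.iloc[1]

# An explicit copy can be modified without touching the DataFrame.
pop_copy = frame2["pop"].copy()
pop_copy["one"] = 0.0

# The row index and the column index each have a settable name.
frame2.index.name = "row"
frame2.columns.name = "field"
```

Attribute access only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute. The "pop" column, for example, has to be read as frame2["pop"], because frame2.pop is a method.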

Index objects: pandas Index objects are responsible for managing axis labels and other metadata. When a Series or DataFrame is built, any array or other sequence of labels that is used is converted to an Index. Index objects are immutable, so they cannot be modified after creation.
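The immutability of Index objects can be checked with a small sketch like this one; the sample labels are made up.

```python
import pandas as pd

obj = pd.Series(range(3), index=["a", "b", "c"])
index = obj.index

# Assigning to an element of an Index raises TypeError.
try:
    index[1] = "d"
    mutated = True
except TypeError:
    mutated = False

# An existing Index can be reused when building another object.
other = pd.Series([1.5, -2.5, 0.0], index=index)
same_labels = other.index.equals(index)
```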

Index Property

Basic functions

Reindexing: reindex() creates a new object conformed to the new index
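A minimal reindex sketch with made-up data. Labels missing from the original object come back as NaN unless a fill value is supplied.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=["d", "b", "a", "c"])

# Rearranges the data to follow the new index; "e" has no value, so it is NaN.
obj2 = obj.reindex(["a", "b", "c", "d", "e"])

# Same call, but missing labels are filled with 0.0 instead of NaN.
obj3 = obj.reindex(["a", "b", "c", "d", "e"], fill_value=0.0)
```

reindex returns a new object; obj itself still has its original four labels.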

Dropping entries from an axis: drop()

Index selection and filtering: the ix indexer in older pandas; in current pandas use loc (labels) and iloc (integer positions)
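Dropping and selection can be sketched as follows. The data is invented, and since ix is gone from current pandas, loc and iloc stand in for it here.

```python
import pandas as pd

data = pd.DataFrame(
    [[0, 1, 2], [3, 4, 5], [6, 7, 8]],
    index=["Ohio", "Colorado", "Utah"],
    columns=["one", "two", "three"],
)

# drop returns a new object without the given row or column labels.
without_row = data.drop("Colorado")
without_col = data.drop("two", axis=1)

# Label-based and position-based selection of the same cells.
by_label = data.loc["Utah", ["one", "three"]]
by_position = data.iloc[2, [0, 2]]

# Filtering rows with a boolean condition, then picking one column.
filtered = data.loc[data["three"] > 4, "two"]
```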

Arithmetic operations and data alignment

pandas can perform arithmetic on objects with different indexes; the result uses the union of the indexes, and labels that do not overlap are filled with NA

Fill values in arithmetic methods: fill_value
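A small sketch of alignment and fill_value, using two made-up Series whose indexes only partly overlap.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# The + operator aligns on labels; "a" and "d" exist on one side only, so they are NaN.
aligned = s1 + s2

# The add method accepts fill_value, which stands in for the missing side.
filled = s1.add(s2, fill_value=0)
```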

Operations between DataFrame and Series: broadcasting

By default, arithmetic between a DataFrame and a Series matches the Series index against the DataFrame's columns and broadcasts down the rows. To match on the rows and broadcast across the columns instead, you must use one of the arithmetic methods (such as sub) and pass axis=0
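Both broadcasting directions can be seen in a sketch like this. The frame and its labels are made up for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(12.0).reshape(4, 3),
    columns=list("bde"),
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

# Default: the Series index (b, d, e) is matched against the columns,
# and the subtraction is repeated for every row.
row = frame.iloc[0]
by_columns = frame - row

# Matching on the row labels instead requires an arithmetic method with axis=0.
col = frame["d"]
by_rows = frame.sub(col, axis=0)
```

In by_rows the "d" column is all zeros, because each row had its own "d" value subtracted.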

Function application and mapping

NumPy ufuncs (element-wise array methods) can also be used on pandas objects

The DataFrame apply() method applies a function to the one-dimensional arrays formed by each column or each row
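A brief sketch of a ufunc and of apply on a made-up frame. The helper spread is a hypothetical name introduced only for this example.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    [[-1.0, 2.0], [3.0, -4.0], [-5.0, 6.0]],
    columns=["x", "y"],
    index=["r1", "r2", "r3"],
)

# A NumPy ufunc works element-wise and returns a DataFrame with the same labels.
absolute = np.abs(frame)


def spread(values):
    """Hypothetical helper: range of a one-dimensional Series."""
    return values.max() - values.min()


# apply passes each column to the function by default, or each row with axis=1.
per_column = frame.apply(spread)
per_row = frame.apply(spread, axis=1)
```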

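Sorting and ranking

Sorting:

sort_index(): sorts by row or column labels (in lexicographic order)

sort_values(by=...): sorts by the values in one or more columns (older pandas versions spelled this sort_index(by=...))

To sort a Series by its values, use sort_values() (the order method in older pandas versions)

Ranking:

rank()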
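The sorting and ranking calls can be sketched as below, with made-up data. The sketch uses sort_values, which is the current pandas spelling of what older versions exposed as sort_index(by=...) and Series.order.

```python
import pandas as pd

frame = pd.DataFrame(
    {"b": [4, 7, -3, 2], "a": [0, 1, 0, 1]},
    index=["d", "a", "c", "b"],
)

by_row_labels = frame.sort_index()          # rows in label order
by_col_labels = frame.sort_index(axis=1)    # columns in label order
by_values = frame.sort_values(by=["a", "b"])  # sort by column a, then b

s = pd.Series([7, -5, 7, 4])
sorted_s = s.sort_values()

# rank assigns positions 1..n; tied values share the average rank by default.
ranks = s.rank()
# method="first" breaks ties by order of appearance instead.
first_ranks = s.rank(method="first")
```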

Axis indexes with duplicate values

The is_unique property of an index tells you whether its values are unique
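A quick sketch with a made-up Series whose index contains repeated labels. Note that is_unique is a property, so it is read without parentheses.

```python
import pandas as pd

obj = pd.Series(range(5), index=["a", "a", "b", "b", "c"])

unique_labels = obj.index.is_unique

# A repeated label selects a Series; a label that occurs once selects a scalar.
repeated = obj["a"]
single = obj["c"]
```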

Summarizing and computing descriptive statistics

sum()

mean()

describe()

Descriptive and summary statistics functions
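The three methods listed above can be sketched on a made-up frame containing missing values. By default these reductions skip NaN; skipna=False turns that off.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
    index=["a", "b", "c", "d"],
    columns=["one", "two"],
)

col_sums = df.sum()                       # one value per column, NaN skipped
row_means = df.mean(axis=1)               # one value per row, NaN skipped
strict = df.mean(axis=1, skipna=False)    # any NaN in a row gives NaN
summary = df.describe()                   # count, mean, std, min, quartiles, max
```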

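Correlation and covariance

These statistics are computed from pairs of arguments: the Series corr and cov methods take a second Series, while the DataFrame corr and cov methods return a full correlation or covariance matrix.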
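A minimal sketch of pairwise statistics. The columns are made up so that the results are easy to check by hand: y is exactly 2 * x and z falls as x rises.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [2.0, 4.0, 6.0, 8.0],
        "z": [4.0, 3.0, 2.0, 1.0],
    }
)

# Series methods take the second Series as an argument.
xy_corr = df["x"].corr(df["y"])
xy_cov = df["x"].cov(df["y"])

# DataFrame methods return a matrix over all column pairs.
corr_matrix = df.corr()
cov_matrix = df.cov()

# corrwith correlates every column with a single Series.
with_x = df.corrwith(df["x"])
```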

Unique values, value counts, and membership

Unique values: the unique() method

Value counts: the value_counts() method computes how often each value appears in a Series

Membership: isin performs a vectorized set-membership check, which can be used to select a subset of the data in a Series or a DataFrame column
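The three methods above, sketched on a made-up Series of letters.

```python
import pandas as pd

obj = pd.Series(["c", "a", "d", "a", "a", "b", "b", "c", "c"])

# Distinct values, in order of first appearance.
uniques = obj.unique()

# Frequency of each value, indexed by the value.
counts = obj.value_counts()

# Boolean mask marking the elements that belong to the given set.
mask = obj.isin(["b", "c"])
subset = obj[mask]
```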

Handling missing data

Filtering out missing data: dropna

For DataFrame objects, dropna by default drops any row that contains a missing value; dropna(how='all') drops only the rows in which every value is NA.

To drop columns in the same way, pass axis=1.
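A sketch of the dropna variants on made-up data. The extra "empty" column is added only to demonstrate dropping along axis=1.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    [
        [1.0, 6.5, 3.0],
        [1.0, np.nan, np.nan],
        [np.nan, np.nan, np.nan],
        [np.nan, 6.5, 3.0],
    ],
    columns=["a", "b", "c"],
)

any_na = data.dropna()            # keeps only rows with no missing values
all_na = data.dropna(how="all")   # drops only rows that are entirely NA

# Same idea for columns: add an all-NA column, then drop it with axis=1.
with_empty_col = data.copy()
with_empty_col["empty"] = np.nan
cols_dropped = with_empty_col.dropna(axis=1, how="all")
```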

Filling in missing data: fillna

Passing a constant: every NA is replaced with that value

Passing a dictionary: different columns are filled with different values

A new object is returned by default, but the existing object can also be modified in place with inplace=True
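The fillna options above, sketched on a small made-up frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, np.nan]})

# Constant: every NA becomes 0.
const_filled = df.fillna(0)

# Dictionary: a different fill value per column.
dict_filled = df.fillna({"a": -1.0, "b": 99.0})

# Both calls returned new objects, so df still has its three missing values.
still_missing = int(df.isnull().sum().sum())

# inplace=True modifies df itself and returns None.
df.fillna(0, inplace=True)
```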

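Hierarchical indexing: important for data reshaping and for group-based operations (such as pivot tables)

stack and unstack

For a DataFrame, either axis can have a hierarchical index.

Summary statistics by level: in the pandas version these notes are based on, many descriptive and summary statistics on DataFrame and Series take a level option; in current pandas the same result is obtained with groupby(level=...).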
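A sketch of a two-level index with reshaping and per-level summaries. The data and the level names key and num are made up. The per-level sums go through groupby(level=...), on the assumption that a recent pandas release is in use where the level argument of sum is no longer available.

```python
import pandas as pd

data = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    index=[["a", "a", "b", "b", "c", "c"], [1, 2, 1, 2, 1, 2]],
)
data.index.names = ["key", "num"]  # hypothetical level names

# unstack moves the inner index level into the columns.
wide = data.unstack()

# stack is the inverse: columns go back into the inner index level.
long_again = wide.stack()

# Summaries per index level.
by_key = data.groupby(level="key").sum()
by_num = data.groupby(level="num").sum()
```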

Using a column as the row index: set_index(); the opposite, moving the row index back into the DataFrame's columns: reset_index()
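A round trip through set_index and reset_index, using a made-up frame.

```python
import pandas as pd

frame = pd.DataFrame(
    {"a": [0, 1, 2], "b": ["one", "two", "three"], "c": [10, 20, 30]}
)

# Column b becomes the row index and is removed from the columns.
indexed = frame.set_index("b")

# reset_index moves the index back into a regular column (placed first).
restored = indexed.reset_index()
```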
