Advanced Course, Lesson 16: The Python pandas Module

Source: Internet
Author: User
Tags: arithmetic, diff

Reposted

This lesson is reposted from an expert's original write-up. Sample code will be added incrementally.

Pandas

pandas is a NumPy-based library created for data analysis tasks. It brings together a number of libraries and standard data models, and it provides the tools needed to manipulate large datasets efficiently, along with many functions and methods for processing data quickly and easily.

>>> from pandas import Series, DataFrame

>>> import pandas as pd

A. pandas

Function | Description
pd.isnull(series) | Determines whether each value is missing (NaN)
pd.notnull(series) | Determines whether each value is not missing (not NaN)

2.2.a.1 Common pandas functions
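
As a minimal sketch of the two functions above, the following builds a small Series with one missing value and tests it. The sample data and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the value at label "b" is missing.
s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])

missing = pd.isnull(s)    # True where the value is NaN
present = pd.notnull(s)   # True where the value is not NaN

print(missing)
print(present)
print(s[present])         # keep only the non-missing values
```

Both functions return a Boolean Series with the same index as the input, so the result can be used directly as a filter.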

B. Series

A Series combines the advantages of a dictionary and an ndarray, and it supports almost all of the indexing operations and functions of both.

Property | Description
values | The data as an array
index | The index
name | The name of the values
index.name | The name of the index

2.2.b.1 Common Series properties
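
The properties above can be tried out on a small Series. This is only an illustrative sketch: the labels, values, and names are invented.

```python
import pandas as pd

# Build a Series from a dictionary; the keys become the index.
s = pd.Series({"a": 1, "b": 2, "c": 3}, name="counts")
s.index.name = "letter"

print(s.values)        # the underlying NumPy array
print(s.index)         # the Index object
print(s.name)          # name of the values
print(s.index.name)    # name of the index

print(s["b"])          # dictionary-style lookup by label
print(s[s > 1])        # ndarray-style Boolean filtering
```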

Function | Description
Series([x, y, ...]), Series({'a': x, 'b': y, ...}, index=param1) | Creates a Series
Series.copy() | Copies a Series
Series.reindex([x, y, ...], fill_value=NaN) | Returns a new object conformed to the new index, filling missing values with fill_value
Series.reindex([x, y, ...], method=None) | Returns a new object conformed to the new index, filling missing values using method
df.reindex(columns=[x, y, ...]) | Re-indexes the columns (applies to a DataFrame; a Series has no columns)
Series.drop(index) | Discards the specified entries
Series.map(f) | Applies a function to each element

Sort function | Description
Series.sort_index(ascending=True) | Returns a new object sorted by index
Series.order(ascending=True) | Returns a new object sorted by value, with NaN values at the end (named Series.sort_values in later pandas releases)
Series.rank(method='average', ascending=True, axis=0) | Ranks the values; by default, tied values each receive the average rank of their group
Series.argmax() | Returns the index position of the maximum value
Series.argmin() | Returns the index position of the minimum value

2.2.b.2 Common Series functions
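
The sketch below exercises several of the functions listed above on made-up data. It covers reindex with a fill value and with a fill method, plus drop, map, sort_index, and rank with different tie-breaking methods.

```python
import pandas as pd

# Hypothetical data; the value 7 appears twice so that rank has a tie to break.
s = pd.Series([7, 3, 7, 1], index=["d", "b", "a", "c"])

# reindex: labels that are missing from the original receive fill_value.
reindexed = s.reindex(["a", "b", "c", "d", "e"], fill_value=0)

# reindex with a fill method needs a monotonic index, so use a second Series.
t = pd.Series(["x", "y"], index=[0, 2])
filled = t.reindex(range(4), method="ffill")

by_index = s.sort_index()          # sorted by label
dropped = s.drop("d")              # new Series without label "d"
doubled = s.map(lambda v: v * 2)   # element-wise function

# rank: the two 7s occupy ranks 3 and 4; the method decides what each one gets.
avg_rank = s.rank(method="average")
min_rank = s.rank(method="min")
max_rank = s.rank(method="max")
first_rank = s.rank(method="first")

print(reindexed, filled, avg_rank, first_rank, sep="\n\n")
```

None of these calls modifies s; each one returns a new object.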

Method options for reindex:

ffill or pad: fill forward (carry the last valid value forward)

bfill or backfill: fill backward (carry the next valid value backward)

Method options for rank:

'average': the default; assigns the group's average rank to each value in a group of ties

'min': assigns the smallest rank in the group to every tied value

'max': assigns the largest rank in the group to every tied value

'first': ranks tied values in the order in which they appear in the original data

C. DataFrame

A DataFrame is a tabular data structure containing an ordered collection of columns, each of which can hold a different value type (numeric, string, Boolean, and so on). It has both a row index and a column index, and it can be viewed as a dictionary of Series that all share the same index.

A column can be retrieved as a Series either with dictionary-style notation (frame['xxx']) or as an attribute (frame.xxx). Rows can also be retrieved, by position or by label.

Assigning a value to a column that does not exist creates a new column.

>>> del frame['xxx']  # delete a column
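
The following sketch walks through the column and row operations just described. The frame contents are invented, and rows are selected with loc (by label) and iloc (by position), which is an assumption about the installed pandas version rather than something the original lesson uses.

```python
import pandas as pd

# Hypothetical frame with two columns and labeled rows.
frame = pd.DataFrame(
    {"name": ["x", "y"], "score": [1.5, 2.5]},
    index=["one", "two"],
)

col_by_key = frame["score"]        # dictionary-style column access
col_by_attr = frame.score          # attribute-style column access
row_by_label = frame.loc["two"]    # row selected by label
row_by_pos = frame.iloc[0]         # row selected by integer position

# Assigning to a column name that does not exist creates the column.
frame["passed"] = frame["score"] > 2.0

# del removes a column in place.
del frame["name"]

print(frame)
```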

Property | Description
values | The values of the DataFrame
index | The row index
index.name | The name of the row index
columns | The column index
columns.name | The name of the column index
ix | Returns a row of the DataFrame (ix was removed in pandas 1.0; loc and iloc replace it)
ix[[x, y, ...], [x, y, ...]] | Re-indexes the rows and the columns
T | Transposes the rows and columns

2.2.c.1 Common DataFrame properties

Function | Description
DataFrame(dict, columns=dict.index, index=[dict.columnnum]) | Builds a DataFrame
DataFrame(two-dimensional ndarray) | A matrix of data; row and column labels can also be passed
DataFrame(dictionary of arrays, lists, or tuples) | Each sequence becomes a column of the DataFrame; all sequences must be the same length
DataFrame(NumPy structured/record array) | Treated like a dictionary of arrays
DataFrame(dictionary of Series) | Each Series becomes a column; if no index is passed explicitly, the indexes of the Series are unioned to form the row index of the result
DataFrame(dictionary of dictionaries) | Each inner dictionary becomes a column; the keys are unioned to form the row index of the result
DataFrame(list of dictionaries or Series) | Each item becomes a row of the DataFrame; the union of the dictionary keys or Series indexes becomes the column labels
DataFrame(list of lists or tuples) | Treated like a two-dimensional ndarray
DataFrame(DataFrame) | The DataFrame's own indexes are used unless different ones are passed
DataFrame(NumPy MaskedArray) | Treated like a two-dimensional ndarray, except that masked values become NA/missing values
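
A few of the constructor inputs listed above can be compared side by side. This is an illustrative sketch with made-up data, not an exhaustive tour of the constructor.

```python
import numpy as np
import pandas as pd

# Dictionary of equal-length lists: each list becomes a column.
from_lists = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Dictionary of Series: the Series indexes are unioned into the row index.
from_series = pd.DataFrame({
    "a": pd.Series([1, 2], index=["x", "y"]),
    "b": pd.Series([3, 4], index=["y", "z"]),
})

# List of dictionaries: each dictionary becomes a row; keys become columns.
from_records = pd.DataFrame([{"a": 1, "b": 2}, {"a": 3, "c": 4}])

# Two-dimensional ndarray with explicit row and column labels.
from_array = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["r1", "r2"],
    columns=["a", "b", "c"],
)

print(from_lists, from_series, from_records, from_array, sep="\n\n")
```

Cells that no input supplies, such as row "x" of column "b" in from_series, come out as NaN.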

df.reindex([x, y, ...], fill_value=NaN, limit=None) | Returns a new object conformed to the new index, filling missing values with fill_value; limit caps the number of consecutive entries filled when a fill method is used
df.reindex([x, y, ...], method=None) | Returns a new object conformed to the new index, filling missing values using method
df.reindex([x, y, ...], columns=[x, y, ...], copy=True) | Re-indexes the rows and columns together; the new object is a copy by default
df.drop(index, axis=0) | Discards the specified entries on the specified axis

Sort function | Description
df.sort_index(axis=0, ascending=True) | Sorts by index
df.sort_index(by=[a, b, ...]) | Sorts by the values in the listed columns (written df.sort_values(by=[a, b, ...]) in later pandas releases)
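
As a rough sketch of the re-indexing, dropping, and sorting functions above, using invented data. Sorting by column values is written with sort_values here, on the assumption that a recent pandas release is installed.

```python
import pandas as pd

# Hypothetical frame with unsorted row and column labels.
df = pd.DataFrame(
    {"b": [3, 1, 2], "a": [9, 8, 7]},
    index=["z", "x", "y"],
)

# Re-index the rows; the new label "w" is filled with fill_value.
rows = df.reindex(["x", "y", "z", "w"], fill_value=0)

# Re-index rows and columns together; the new column "c" is all NaN.
both = df.reindex(index=["x", "y"], columns=["a", "b", "c"])

no_row = df.drop("z", axis=0)           # discard a row label
no_col = df.drop("a", axis=1)           # discard a column label

by_index = df.sort_index(axis=0)        # sort rows by row label
by_columns = df.sort_index(axis=1)      # sort columns by column label
by_values = df.sort_values(by=["b"])    # sort rows by the values in column "b"

print(rows, both, by_values, sep="\n\n")
```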
Summary statistics function | Description
df.count() | Number of non-NaN values
df.describe() | Generates multiple summary statistics at once
df.min() | Minimum value
df.max() | Maximum value
df.idxmax(axis=0, skipna=True) | Returns a Series holding the index label of the maximum value
df.idxmin(axis=0, skipna=True) | Returns a Series holding the index label of the minimum value
df.quantile(axis=0) | Computes the sample quantile
df.sum(axis=0, skipna=True, level=None) | Returns a Series containing the sums
df.mean(axis=0, skipna=True, level=None) | Returns a Series containing the means
df.median(axis=0, skipna=True, level=None) | Returns a Series containing the medians (50% quantile)
df.mad(axis=0, skipna=True, level=None) | Returns a Series containing the mean absolute deviation from the mean
df.var(axis=0, skipna=True, level=None) | Returns a Series containing the variances
df.std(axis=0, skipna=True, level=None) | Returns a Series containing the standard deviations
df.skew(axis=0, skipna=True, level=None) | Returns the skewness of the sample values (third moment)
df.kurt(axis=0, skipna=True, level=None) | Returns the kurtosis of the sample values (fourth moment)
df.cumsum(axis=0, skipna=True) | Returns the cumulative sum of the sample
df.cummin(axis=0, skipna=True) | Returns the cumulative minimum of the sample
df.cummax(axis=0, skipna=True) | Returns the cumulative maximum of the sample
df.cumprod(axis=0, skipna=True) | Returns the cumulative product of the sample
df.diff(axis=0) | Returns the first-order difference of the sample
df.pct_change(axis=0) | Returns the percent change of the sample
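
The effect of axis and skipna is easiest to see on a small frame with one missing value. The data below is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column "two" has one missing value.
df = pd.DataFrame(
    {"one": [1.0, 2.0, 4.0], "two": [10.0, np.nan, 30.0]},
    index=["a", "b", "c"],
)

col_sums = df.sum()                  # per column; NaN is skipped (skipna=True)
row_means = df.mean(axis=1)          # axis=1 aggregates across the columns
strict_sums = df.sum(skipna=False)   # any NaN makes that column's sum NaN
top_labels = df.idxmax()             # index label of each column's maximum
counts = df.count()                  # number of non-NaN values per column

running = df["one"].cumsum()         # cumulative sum
steps = df["one"].diff()             # first-order difference
growth = df["one"].pct_change()      # fractional change from the previous row

print(df.describe())
print(col_sums, row_means, top_labels, sep="\n\n")
```

The first element of diff and pct_change is NaN because the first row has no predecessor.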

Calculation function | Description
df.add(df2, fill_value=None, axis=1) | Element-wise addition; where an element is missing from one operand, fill_value is used in its place
df.sub(df2, fill_value=None, axis=1) | Element-wise subtraction; where an element is missing from one operand, fill_value is used in its place
df.div(df2, fill_value=None, axis=1) | Element-wise division; where an element is missing from one operand, fill_value is used in its place
df.mul(df2, fill_value=None, axis=1) | Element-wise multiplication; where an element is missing from one operand, fill_value is used in its place
df.apply(f, axis=0) | Applies function f to the one-dimensional array formed by each column (axis=0) or each row (axis=1)
df.applymap(f) | Applies function f to each individual element
df.cumsum(axis=0, skipna=True) | Accumulates values and returns the cumulative-sum DataFrame

2.2.c.2 Common DataFrame functions
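
The sketch below contrasts the plain + operator with add and fill_value, and shows apply along each axis. The frames are made up. Because applymap was renamed to DataFrame.map in later pandas releases, the example picks whichever of the two the installed version provides; that lookup is a compatibility assumption, not part of the original lesson.

```python
import pandas as pd

# Two hypothetical frames whose labels only partly overlap.
df1 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}, index=["x", "y"])
df2 = pd.DataFrame({"a": [10.0, 20.0]}, index=["x", "z"])

plain = df1 + df2                      # cells missing on either side become NaN
filled = df1.add(df2, fill_value=0)    # a cell missing on one side counts as 0

# apply: f receives one column at a time (axis=0) or one row at a time (axis=1).
spread_per_column = df1.apply(lambda col: col.max() - col.min(), axis=0)
spread_per_row = df1.apply(lambda row: row.max() - row.min(), axis=1)

# Element-wise application: use DataFrame.map when it exists, else applymap.
elementwise = getattr(df1, "map", None) or df1.applymap
formatted = elementwise(lambda v: f"{v:.1f}")

print(plain, filled, formatted, sep="\n\n")
```

A cell that is missing from both operands, such as row "z" of column "b", stays NaN even with fill_value.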

Indexing mode | Description
df[val] | Selects a single column or a set of columns from the DataFrame
df.ix[val] | Selects a single row or a group of rows from the DataFrame
df.ix[:, val] | Selects a single column or a subset of columns
df.ix[val1, val2] | Selects rows and columns at the same time
reindex method | Conforms one or more axes to a new index
xs method | Selects a single row or column by label and returns it as a Series
icol, irow methods | Select a single column or row, respectively, by integer position and return it as a Series
get_value, set_value methods | Get or set a single value by row label and column label

2.2.c.3 Common DataFrame indexing methods
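
The selections in the table can be sketched as follows. Since ix, icol, irow, and get_value are not available in recent pandas releases, this example substitutes loc for label-based selection, iloc for position-based selection, and at for single values. The data is invented.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]},
    index=["x", "y", "z"],
)

one_col = df["a"]                         # single column -> Series
some_cols = df[["a", "c"]]                # list of columns -> DataFrame
one_row = df.loc["y"]                     # single row by label
some_rows = df.loc[["x", "z"]]            # group of rows by label
col_subset = df.loc[:, ["b", "c"]]        # subset of columns
block = df.loc[["x", "y"], ["a", "b"]]    # rows and columns together
first_row = df.iloc[0]                    # row by integer position
first_col = df.iloc[:, 0]                 # column by integer position
cross = df.xs("z")                        # row by label, returned as a Series
cell = df.at["y", "b"]                    # single value by row and column label

print(block)
print(cell)
```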

Arithmetic:

By default, arithmetic between a DataFrame and a Series matches the index of the Series against the columns of the DataFrame and broadcasts the operation down the rows. If a label is not found in both the DataFrame's columns and the Series' index, the objects are re-indexed to form the union of the two.
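
A small illustration of this broadcasting rule, using made-up data. The last lines show the method form with axis=0, which matches on the row index instead.

```python
import pandas as pd

frame = pd.DataFrame(
    {"a": [1, 2, 3], "b": [10, 20, 30]},
    index=["x", "y", "z"],
)
first_row = frame.loc["x"]     # a Series indexed by the column labels "a", "b"

# The Series index is matched to the columns; the subtraction repeats per row.
centered = frame - first_row

# Label "c" exists only in the Series and "b" only in the frame, so the result
# covers the union a, b, c, with NaN wherever one side had no value.
partial = pd.Series({"a": 1, "c": 100})
unioned = frame + partial

# To match on the row index and broadcast across the columns, pass axis=0.
by_rows = frame.sub(frame["a"], axis=0)

print(centered, unioned, by_rows, sep="\n\n")
```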

D. Index

The Index objects in pandas are responsible for holding the axis labels and other metadata (such as the axis name). When you build a Series or DataFrame, any array or other sequence of labels that you use is converted to an Index. Index objects are immutable, which is what allows them to be shared safely among multiple data structures.
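
Immutability can be checked directly. In this sketch (with invented labels), an attempt to assign to one position of an Index raises TypeError, and the same Index is then reused to label a second Series.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])
idx = s.index                  # the list of labels was converted to an Index

try:
    idx[0] = "z"               # in-place modification is rejected
    mutated = True
except TypeError:
    mutated = False

# Because an Index cannot change, it can be passed to other structures safely.
other = pd.Series([10, 20, 30], index=idx)

print(type(idx).__name__)
print(mutated)
print(other)
```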


Main Index objects | Description
Index | The most general Index object; represents axis labels as a NumPy array of Python objects
Int64Index | Specialized index for integer values
MultiIndex | A hierarchical Index object representing multiple levels of indexing on a single axis; can be viewed as an array of tuples
DatetimeIndex | Stores nanosecond timestamps (represented with NumPy's datetime64 type)
PeriodIndex | Specialized index for period data (time spans)

2.2.d.1 Main Index objects

Function | Description
Index([x, y, ...]) | Creates an Index
append(index) | Concatenates another Index object, producing a new Index
diff(index) | Computes the set difference, producing a new Index
intersection(index) | Computes the set intersection
union(index) | Computes the set union
isin(index) | Returns a Boolean array indicating whether each value is contained in the passed collection
delete(i) | Deletes the element at position i, producing a new Index
drop(str) | Deletes the passed value, producing a new Index
insert(i, str) | Inserts an element at position i, producing a new Index
is_monotonic | True if each element is greater than or equal to the previous element
is_unique | True if the Index has no duplicate values
unique() | Computes the array of unique values in the Index

2.2.d.2 Common Index functions
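
The set-style operations above can be sketched on two small, made-up Index objects. Two names differ from the table because current pandas releases spell them differently: the set difference is called difference, and the monotonicity check is is_monotonic_increasing.

```python
import pandas as pd

left = pd.Index(["a", "b", "c"])
right = pd.Index(["c", "d"])

appended = left.append(right)           # concatenation; duplicates are kept
difference = left.difference(right)     # labels in left but not in right
common = left.intersection(right)       # labels present in both
combined = left.union(right)            # labels present in either
membership = left.isin(right)           # Boolean array, one entry per label

without_first = left.delete(0)          # remove by integer position
without_b = left.drop("b")              # remove by value
with_z = left.insert(1, "z")            # insert at integer position 1

increasing = left.is_monotonic_increasing
all_distinct = appended.is_unique
distinct = appended.unique()

print(appended, difference, common, combined, sep="\n")
print(membership, increasing, all_distinct)
```

Every call returns a new object; left and right are unchanged throughout.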
