Pandas series: filtering DataFrame row and column data
I. Understanding the DataFrame
A DataFrame is essentially a row index and a column index plus multiple columns of data.
To simplify our understanding, let's look at it from a different angle.
In practice, to simplify the description of a thing, we select several features.
For example, we might portray a person from the perspectives of gender, height, education, occupation, hobbies, and so on; these "angles" are "features".
Different rows represent different records, and columns represent features. Records differ from one another in the values of their features.
The default index of a DataFrame is a serial number (0, 1, 2, ...), which can be understood as a positional index. Generally, we identify different records by an id and leave the index unchanged. To make the meaning of the different features (columns) clear, however, we often re-specify the column names.
Some simple but not rigorous ways to think about it:
Rows and columns
  Row - index - record (the default index is generally used)
  Column - feature (custom index)
Index
  Default index - serial number - position - easy to index by but hard to interpret
  Custom index - feature name - attribute - easy to interpret
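The row/column picture above can be sketched with a tiny example. The records and feature names below (gender, height, occupation) are made up for illustration only; the point is that the rows keep the default positional index while the columns carry custom feature names.

```python
import pandas as pd

# Hypothetical records: each row is one person, each column is one feature.
people = pd.DataFrame(
    [["F", 165, "engineer"], ["M", 180, "teacher"]],
    columns=["gender", "height", "occupation"],  # custom column index
)

default_index = list(people.index)    # default row index: serial numbers
feature_names = list(people.columns)  # custom column index: feature names

print(default_index)
print(feature_names)
```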
II. Filtering the row and column data of a DataFrame
import pandas as pd
import numpy as np
from pandas import DataFrame
df = DataFrame(np.arange(20).reshape((4, 5)), columns=list('abcde'))
1. df[] and df.column: select column data
df.a
df[['a', 'b']]
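As a minimal sketch of column selection, assuming the same 4x5 frame with columns a to e that this section uses, the following shows that a single name returns a Series while a list of names returns a DataFrame. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

col_a = df["a"]           # one column by name -> Series
col_a_attr = df.a         # attribute access; works when the name is a valid identifier
cols_ab = df[["a", "b"]]  # list of names -> DataFrame with those columns

print(type(col_a).__name__, type(cols_ab).__name__)
```

Bracket access is the safer habit of the two, since attribute access fails for column names that contain spaces or collide with DataFrame methods.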
2. df.loc[[index], [column]]: select data by label
When you do not filter rows, enter ":" in [index] (it cannot be left blank); that is, df.loc[:, 'a'] selects all the data in column a.
df.loc[0, 'a']
df.loc[0:1, ['a', 'b']]
df.loc[[0, 2], ['a', 'c']]
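The label-based calls above can be tried out as follows. This is a small sketch on the same 4x5 frame; note that a label slice such as 0:1 in df.loc includes both endpoints, so it returns two rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

all_a = df.loc[:, "a"]               # every row of column 'a'
cell = df.loc[0, "a"]                # single value at row label 0, column 'a'
block = df.loc[0:1, ["a", "b"]]      # row labels 0 through 1 inclusive
picked = df.loc[[0, 2], ["a", "c"]]  # explicit lists of row and column labels

print(block)
print(picked)
```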
3. df.iloc[[index], [column]]: select data by position
When no rows are filtered, [index] cannot be left empty, just as with df.loc[]; enter ":" in [index].
df.iloc[0, 0]
df.iloc[,]
df.iloc[[0, 2], [1, 3]]
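A comparable sketch for position-based selection, again on the same 4x5 frame. The slice bounds 0:2 are chosen here purely for illustration; unlike df.loc, a positional slice in df.iloc excludes its end, as ordinary Python slicing does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

first = df.iloc[0, 0]              # value at row position 0, column position 0
first_col = df.iloc[:, 0]          # ':' keeps every row
block = df.iloc[0:2, 0:2]          # positions 0 and 1 only (end excluded)
picked = df.iloc[[0, 2], [1, 3]]   # explicit lists of row and column positions

print(block)
print(picked)
```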
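4. df.ix[[index], [column]]: select data by label or position
df.ix[] combines label and position selection. Note that within each of [index] and [column] the selectors must all be of the same kind, either all labels or all positions. Also note that df.ix was deprecated in pandas 0.20.0 and removed in pandas 1.0, so in current versions use df.loc or df.iloc instead.
df.ix[0:1, ['a', 3]]  (error: labels and positions are mixed in [column])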
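Because df.ix is not available in pandas 1.0 and later, a mixed selection like the one above has to be written with df.loc or df.iloc. The sketch below shows one possible approach, not the only one: convert everything to positions with df.columns.get_loc, or convert everything to labels through df.index and df.columns. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

# Goal: the first two rows, column 'a' (a label) and column 3 (a position).

# Option 1: turn the label into a position, then use iloc throughout.
col_positions = [df.columns.get_loc("a"), 3]
by_position = df.iloc[0:2, col_positions]

# Option 2: turn the positions into labels, then use loc throughout.
row_labels = df.index[0:2]
col_labels = ["a", df.columns[3]]
by_label = df.loc[row_labels, col_labels]

print(by_position)
print(by_label)
```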