Data conversion refers to filtering, cleaning, and other transformation operations on data.
Removing duplicate data
Duplicate rows often appear in a DataFrame. DataFrame provides a duplicated() method that detects whether each row is a duplicate, and a drop_duplicates() method that discards duplicate rows:
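A minimal sketch of both methods follows. The column names k1 and k2 and the values are made up for illustration and are not from the original article.

```python
import pandas as pd

# Hypothetical sample data containing some repeated rows
df = pd.DataFrame({"k1": ["one"] * 3 + ["two"] * 4,
                   "k2": [1, 1, 2, 3, 3, 4, 4]})

# Boolean Series: True where a row repeats an earlier row
dup_mask = df.duplicated()

# New DataFrame with the repeated rows discarded
deduped = df.drop_duplicates()

print(dup_mask)
print(deduped)
```

Both calls return new objects; df itself is left unchanged.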
By default, duplicated() and drop_duplicates() consider all columns. To judge duplicates on only some columns, pass a list of column names as the argument, for example:
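As a sketch, the example below checks duplicates on a single column. The frame and its column names (k1, k2, v1) are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; v1 is unique in every row
df = pd.DataFrame({"k1": ["one"] * 3 + ["two"] * 4,
                   "k2": [1, 1, 2, 3, 3, 4, 4],
                   "v1": range(7)})

# Only the k1 column is compared, so one row per distinct k1 value remains
by_k1 = df.drop_duplicates(subset=["k1"])

# The same column restriction works for duplicated()
dup_on_k1 = df.duplicated(subset=["k1"])

print(by_k1)
```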
By default, duplicated() and drop_duplicates() keep the first occurrence of each duplicated value. Passing keep='last' (take_last=True in older pandas versions) keeps the last occurrence instead:
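A short sketch using the keep parameter of current pandas releases; the sample frame is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; v1 shows which of the repeated rows survives
df = pd.DataFrame({"k1": ["one"] * 3 + ["two"] * 4,
                   "k2": [1, 1, 2, 3, 3, 4, 4],
                   "v1": range(7)})

# Compare on k1 and k2, keeping the last row of each duplicate group
last_kept = df.drop_duplicates(subset=["k1", "k2"], keep="last")

print(last_kept)
```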
Data transformation with mappings
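One way to illustrate a mapping-based transformation is Series.map with a dict. The food data and the meat_to_animal dict below are invented for this sketch.

```python
import pandas as pd

# Hypothetical data: foods with inconsistent capitalization
data = pd.DataFrame({"food": ["bacon", "Pulled Pork", "corned beef", "Bacon"],
                     "ounces": [4.0, 3.0, 7.5, 8.0]})

# Hypothetical mapping from each (lowercase) food to its source animal
meat_to_animal = {"bacon": "pig", "pulled pork": "pig", "corned beef": "cow"}

# Normalize case first, then look each value up in the dict
data["animal"] = data["food"].str.lower().map(meat_to_animal)

print(data)
```

Values that are missing from the dict would become NaN in the new column.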
The same effect can be achieved with functions:
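For instance, Series.map also accepts a function. The sketch below uses a lambda with the same hypothetical food data and dict as assumptions.

```python
import pandas as pd

# Hypothetical data and mapping, as in a dict-based lookup
data = pd.DataFrame({"food": ["bacon", "Pulled Pork", "corned beef", "Bacon"],
                     "ounces": [4.0, 3.0, 7.5, 8.0]})
meat_to_animal = {"bacon": "pig", "pulled pork": "pig", "corned beef": "cow"}

# The function receives one element at a time and returns its replacement
data["animal"] = data["food"].map(lambda food: meat_to_animal[food.lower()])

print(data)
```

Unlike the dict form, this lambda raises KeyError for a food that is not in the dict.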
Replacing values
The replace() method is used for substitution:
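A minimal sketch, assuming a Series in which -999 is a sentinel for missing data; the values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical data where -999 marks a missing measurement
s = pd.Series([1.0, -999.0, 2.0, -999.0, -1000.0, 3.0])

# Replace every -999 with NaN; a new Series is returned
cleaned = s.replace(-999, np.nan)

print(cleaned)
```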
Replace multiple values at once:
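As a sketch, passing a list of values replaces all of them with the same substitute. The data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two different sentinel values
s = pd.Series([1.0, -999.0, 2.0, -999.0, -1000.0, 3.0])

# Both -999 and -1000 become NaN
cleaned = s.replace([-999, -1000], np.nan)

print(cleaned)
```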
Different substitutions for different values:
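One way to do this is to pass a dict that maps each old value to its own replacement. The values here are again invented.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two different sentinel values
s = pd.Series([1.0, -999.0, 2.0, -999.0, -1000.0, 3.0])

# -999 becomes NaN, while -1000 becomes 0
cleaned = s.replace({-999: np.nan, -1000: 0})

print(cleaned)
```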
Renaming axis indexes
To rename the columns of a DataFrame:
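A short sketch using DataFrame.rename with the columns argument; the frame and its labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with labeled rows and columns
df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  index=["Ohio", "Colorado", "New York"],
                  columns=["one", "two", "three", "four"])

# A dict renames only the listed columns
renamed = df.rename(columns={"three": "third"})

# A function is applied to every column label
upper = df.rename(columns=str.upper)

print(renamed.columns.tolist())
print(upper.columns.tolist())
```

rename returns a new DataFrame, so df keeps its original labels.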
To rename an index:
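The row labels can be renamed the same way through the index argument of DataFrame.rename. The frame below is a hypothetical example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with labeled rows and columns
df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  index=["Ohio", "Colorado", "New York"],
                  columns=["one", "two", "three", "four"])

# A dict renames only the listed row labels
renamed = df.rename(index={"Ohio": "Indiana"})

# A function is applied to every row label
upper = df.rename(index=str.upper)

print(renamed.index.tolist())
print(upper.index.tolist())
```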
Divide the data into groups:
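The original example is not shown here, so the following is only one plausible reading: binning continuous values with pd.cut. The ages and the bin edges are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical ages to be divided into groups
ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]

# Assumed bin edges; each bin is open on the left and closed on the right,
# e.g. (18, 25] contains 25 but not 18
bins = [18, 25, 35, 60, 100]

# pd.cut assigns each age to its bin and returns a Categorical
cats = pd.cut(ages, bins)

# Number of values that fall into each bin
counts = pd.Series(cats).value_counts(sort=False)

print(cats.codes)
print(counts)
```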
Detecting and filtering outliers
Suppose you have a set of data:
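For illustration, assume the data set is 1000 rows by 4 columns of normally distributed random numbers. The shape, the distribution, and the seed are assumptions, not taken from the original.

```python
import numpy as np
import pandas as pd

# Seeded generator so the hypothetical data is reproducible
rng = np.random.default_rng(0)

# 1000 rows, 4 columns of standard normal values
data = pd.DataFrame(rng.standard_normal((1000, 4)))

# Summary statistics give a first look at the range of each column
print(data.describe())
```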
Find the values whose absolute value is greater than 2:
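A minimal sketch that filters one column with a boolean mask. The random data and the choice of column 3 are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1000 x 4 standard normal values
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.standard_normal((1000, 4)))

# Pick one column (an arbitrary choice for this sketch)
col = data[3]

# Keep only the entries whose magnitude exceeds 2
outliers = col[col.abs() > 2]

print(outliers)
```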
Find the rows that contain a value whose absolute value is greater than 2:
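One way to select such rows is to combine a boolean frame with any(axis=1). The random data is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1000 x 4 standard normal values
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.standard_normal((1000, 4)))

# True for each row in which at least one column exceeds 2 in magnitude
row_mask = (data.abs() > 2).any(axis=1)

# Rows that contain at least one outlier
outlier_rows = data[row_mask]

print(outlier_rows)
```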
Set the outlier values to 0:
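As a sketch, assigning through a boolean mask overwrites only the matching cells. The random data is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1000 x 4 standard normal values
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.standard_normal((1000, 4)))

# Overwrite, in place, every cell whose magnitude exceeds 2
data[data.abs() > 2] = 0

# After the assignment no value lies outside [-2, 2]
print(data.describe())
```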
Data analysis using Python Pandas Fundamentals: Data Conversion