Steps for removing duplicate data with the pandas module in Python:
1) Use the duplicated method of DataFrame to return a Boolean Series indicating whether each row is a duplicate of an earlier row: False for rows that are not duplicates, True for rows that are;
2) Use the drop_duplicates method of DataFrame to return a DataFrame with the duplicate rows removed.
Notes:
If no argument is passed to duplicated or drop_duplicates, both methods compare all columns by default. If a column name (or a list of column names) is passed, for example frame.drop_duplicates(['state']), only the specified columns (here the state column) are checked for duplicates.
Specific examples are as follows:
>>> import pandas as pd
>>> data = {'pop': ['a', 'b', 'c', 'd'], 'state': [1, 1, 2, 2]}
>>> frame = pd.DataFrame(data)
>>> frame
   pop  state
0    a      1
1    b      1
2    c      2
3    d      2
>>> isduplicated = frame.duplicated()
>>> print(isduplicated)
0    False
1    False
2    False
3    False
dtype: bool
>>> frame = frame.drop_duplicates(['state'])
>>> frame
   pop  state
0    a      1
2    c      2
>>> isduplicated = frame.duplicated(['state'])
>>> print(isduplicated)
0    False
2    False
dtype: bool
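
In the example above, no two rows match in every column, so duplicated() never returns True when called without arguments. The following minimal sketch uses hypothetical sample data (not from the example above) in which rows 0 and 1 are identical, to contrast the default all-column check with a check restricted to the state column. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 are identical in every column.
frame = pd.DataFrame({"pop": ["a", "a", "b", "c"], "state": [1, 1, 1, 2]})

# No arguments: a row is flagged only if every column matches an earlier row.
full_mask = frame.duplicated()

# Restricting the check to 'state' flags every repeated state value.
state_mask = frame.duplicated(["state"])

# drop_duplicates keeps the first occurrence of each group by default.
deduped_rows = frame.drop_duplicates()
deduped_state = frame.drop_duplicates(["state"])

print(full_mask.tolist())            # [False, True, False, False]
print(state_mask.tolist())           # [False, True, True, False]
print(deduped_rows.index.tolist())   # [0, 2, 3]
print(deduped_state.index.tolist())  # [0, 3]
```

Note that the surviving rows keep their original index labels. If a fresh 0..n-1 index is needed, calling reset_index(drop=True) on the result is one option.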