Data conversion
Removing duplicate elements
The duplicated() function of the DataFrame object detects duplicate rows and returns a Series of Boolean values. Each element corresponds to one row: it is True if the row repeats an earlier row (that is, the row is not the first occurrence), and False if it does not repeat any preceding row.
A Series of Boolean values is particularly useful for filtering operations. In general, though, all duplicate rows need to be deleted from the DataFrame object. The drop_duplicates() function of the pandas library does exactly that: it returns the DataFrame with the duplicate rows removed.
For example:
    import pandas as pd

    dframe = pd.DataFrame({'color': ['white', 'white', 'red', 'red', 'white'],
                           'value': [2, 1, 3, 3, 2]})
    print(dframe)
    # Boolean Series: True for every row that repeats an earlier row
    print(dframe.duplicated())
    # The Boolean Series can be used as a filter to show only the duplicate rows
    print(dframe[dframe.duplicated()])
    # Return a DataFrame with the duplicate rows removed
    print(dframe.drop_duplicates())

Output:

       color  value
    0  white      2
    1  white      1
    2    red      3
    3    red      3
    4  white      2
    0    False
    1    False
    2    False
    3     True
    4     True
    dtype: bool
       color  value
    3    red      3
    4  white      2
       color  value
    0  white      2
    1  white      1
    2    red      3
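The example above relies on the default behavior, where the first occurrence of a row is kept. As a minimal sketch, assuming a reasonably recent pandas version, the `keep` and `subset` parameters of duplicated() and drop_duplicates() change which rows count as duplicates. The sample data is modeled on the example above; the variable names are only illustrative.

```python
import pandas as pd

# Sample data modeled on the example above.
dframe = pd.DataFrame({
    'color': ['white', 'white', 'red', 'red', 'white'],
    'value': [2, 1, 3, 3, 2],
})

# Default keep='first': later repeats are flagged, first occurrences are not.
first_mask = dframe.duplicated()

# keep='last' flags the earlier copies instead of the later ones.
last_mask = dframe.duplicated(keep='last')

# keep=False flags every row that has at least one identical twin.
all_mask = dframe.duplicated(keep=False)

# subset restricts the comparison to the listed columns (here: color only).
color_only = dframe.drop_duplicates(subset=['color'])

# Negating the default mask selects the same rows that drop_duplicates() keeps.
deduped = dframe[~first_mask]

print(last_mask)
print(all_mask)
print(color_only)
print(deduped)
```

Filtering with the negated mask and calling drop_duplicates() give the same rows; the mask form is handy when the duplicates themselves also need to be inspected.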
Replacing elements with mappings
To replace an incorrect element with a new one, you need to define a set of mapping relationships. In the mapping, the old element is the key and the new element is the value.
In the example below, two incorrect color names in the DataFrame object are replaced with the correct ones. Another common scenario is replacing NaN with another value, such as 0.
In that case you can still use the replace() function, which handles the operation gracefully.
    import numpy as np
    import pandas as pd

    frame8 = pd.DataFrame({
        'item': ['ball', 'mug', 'pen', 'pencil', 'ashtray'],
        'color': ['white', 'rosso', 'verde', 'black', 'yellow'],
        'price': [5.56, 4.20, 1.30, 0.56, 2.75]
    })
    print(frame8)
    # Mapping: old (incorrect) value -> new (correct) value
    newcolors = {
        'rosso': 'red',
        'verde': 'green'
    }
    print(frame8.replace(newcolors))
    # Replace NaN with 0
    ser = pd.Series([np.nan, 4, 6, np.nan, 3])
    print(ser.replace(np.nan, 0))
Output Result:
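Two details of replace() are easy to miss: it returns a new object and leaves the original untouched, and for the NaN case fillna() gives the same result. The following is a small sketch under those assumptions; the shortened frame and the names `replaced`, `filled_replace` and `filled_fillna` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative version of the frame used above.
frame = pd.DataFrame({
    'item': ['ball', 'mug', 'pen'],
    'color': ['white', 'rosso', 'verde'],
})
newcolors = {'rosso': 'red', 'verde': 'green'}

# replace() returns a new DataFrame; 'frame' itself is not modified.
replaced = frame.replace(newcolors)
print(replaced)
print(frame)

# For missing values, replace(np.nan, 0) and fillna(0) produce the same Series.
ser = pd.Series([np.nan, 4, 6, np.nan, 3])
filled_replace = ser.replace(np.nan, 0)
filled_fillna = ser.fillna(0)
print(filled_replace.equals(filled_fillna))
```

To keep the change, assign the result back, for example `frame = frame.replace(newcolors)`.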
Adding elements with mappings
Only a few of the features are covered here; please refer to the official documentation for details.

    import pandas as pd

    frame9 = pd.DataFrame({
        'item': ['ball', 'mug', 'pen', 'pencil', 'ashtray'],
        'color': ['white', 'red', 'green', 'black', 'yellow']
    })
    print(frame9)
    # Mapping: item name -> price
    price = {
        'ball': 5.56,
        'mug': 4.20,
        'bottle1': 1.30,
        'scissors': 3.41,
        'pen': 1.30,
        'pencil': 0.56,
        'ashtray': 2.75
    }
    # Look up each value of the 'item' column in the mapping and add the result as a new column
    frame9['price'] = frame9['item'].map(price)
    print(frame9)
Output Result:
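In the example above every item has an entry in the mapping. As a brief sketch of what happens otherwise: when map() is given a dictionary, values without a matching key become NaN, and extra keys in the dictionary are simply ignored. The item 'stapler' below is hypothetical and is included only to trigger the missing-key case.

```python
import pandas as pd

# 'stapler' is a hypothetical item with no entry in the price mapping.
items = pd.Series(['ball', 'mug', 'stapler'])
price = {'ball': 5.56, 'mug': 4.20, 'scissors': 3.41}

# Each value is looked up in the dictionary; 'stapler' has no key, so it maps to NaN.
# The unused 'scissors' key is ignored.
mapped = items.map(price)
print(mapped)

# List the items that could not be mapped.
missing = items[mapped.isna()]
print(missing)
```

Checking `mapped.isna()` after a map() call is a quick way to spot keys that are absent from the mapping.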
Example from the official documentation:

    import pandas as pd

    df = pd.DataFrame({
        'A': ['bat', 'foo', 'bait'],
        'B': ['abc', 'bar', 'xyz']
    })
    # r'^ba.$' matches three-character strings that start with 'ba';
    # ^ matches the start of the string and $ matches the end of the string
    print(df.replace(to_replace=r'^ba.$', value='new', regex=True))

Output: (see the referenced blog for more information on the regular expression above)
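To make the effect of the pattern concrete, the sketch below first checks a few strings against `^ba.$` with Python's `re` module and then applies the same pattern through replace(). The `matches` dictionary is only an illustration aid and is not part of the documentation example.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'A': ['bat', 'foo', 'bait'],
    'B': ['abc', 'bar', 'xyz'],
})
pattern = r'^ba.$'

# Which strings does the pattern match? Only those that are exactly 'ba' plus one more character.
matches = {s: bool(re.search(pattern, s)) for s in ['bat', 'bar', 'bait', 'abc']}
print(matches)

# With regex=True, every cell that matches the pattern is replaced by 'new'.
result = df.replace(to_replace=pattern, value='new', regex=True)
print(result)
```

'bat' and 'bar' are replaced, while 'bait' is left alone because it has four characters and the `$` anchor requires the string to end right after the third.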
Renaming axis indexes