Discrete features are encoded in one of two ways:
1. If the values of a discrete feature have no inherent order, such as color: [red, blue], use one-hot encoding.
2. If the values of a discrete feature have an inherent order, such as size: [X, XL, XXL], use a numeric mapping such as {X: 1, XL: 2, XXL: 3}.
pandas makes it convenient to one-hot encode discrete features:
import pandas as pd

df = pd.DataFrame([
    ['green', 'M', 10.1, 'class1'],
    ['red', 'L', 13.5, 'class2'],
    ['blue', 'XL', 15.3, 'class1']])
df.columns = ['color', 'size', 'prize', 'class label']

# Ordinal feature: map each size to an integer that preserves its order
size_mapping = {
    'XL': 3,
    'L': 2,
    'M': 1}
df['size'] = df['size'].map(size_mapping)

# Class labels: assign an arbitrary integer to each distinct label
class_mapping = {label: idx for idx, label in enumerate(set(df['class label']))}
df['class label'] = df['class label'].map(class_mapping)
Note: for discrete features whose values have an inherent order, a direct mapping is sufficient, e.g. {'XL': 3, 'L': 2, 'M': 1}.
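A direct mapping is also easy to reverse when the original labels are needed again. The following is a minimal sketch that assumes the same sample data and size_mapping as above; the names inv_size_mapping, encoded_size, decoded_size and encoded_class are hypothetical and used only for illustration. It also builds the class label mapping from a sorted list instead of a bare set, because the iteration order of a set of strings can differ from one Python run to the next.

```python
import pandas as pd

# Same sample data as in the text
df = pd.DataFrame(
    [['green', 'M', 10.1, 'class1'],
     ['red', 'L', 13.5, 'class2'],
     ['blue', 'XL', 15.3, 'class1']],
    columns=['color', 'size', 'prize', 'class label'])

# Ordinal feature: integers that preserve the order M < L < XL
size_mapping = {'XL': 3, 'L': 2, 'M': 1}
encoded_size = df['size'].map(size_mapping)

# Hypothetical helper: swap keys and values to decode the integers again
inv_size_mapping = {v: k for k, v in size_mapping.items()}
decoded_size = encoded_size.map(inv_size_mapping)

# Sorting the distinct labels makes the label-to-integer assignment reproducible
class_mapping = {label: idx
                 for idx, label in enumerate(sorted(set(df['class label'])))}
encoded_class = df['class label'].map(class_mapping)
```

The integers assigned to class labels carry no order; they only need to be distinct and stable between runs.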
get_dummies creates a new column for every unique string in a column, so use get_dummies for one-hot encoding:
pd.get_dummies(df)
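To make the effect of get_dummies concrete, here is a minimal sketch on the same sample data. It assumes the size column has already been mapped to integers, so color is the only string-valued feature left; the variable names dummies and explicit are hypothetical. By default get_dummies encodes only string-like columns and leaves numeric columns untouched. The class label column is left as strings here purely to show how the columns argument restricts the encoding to chosen columns.

```python
import pandas as pd

df = pd.DataFrame(
    [['green', 'M', 10.1, 'class1'],
     ['red', 'L', 13.5, 'class2'],
     ['blue', 'XL', 15.3, 'class1']],
    columns=['color', 'size', 'prize', 'class label'])

# Encode the ordinal feature first so it is numeric
df['size'] = df['size'].map({'XL': 3, 'L': 2, 'M': 1})

# Restrict one-hot encoding to 'color' and ask for integer 0/1 indicators
explicit = pd.get_dummies(df, columns=['color'], dtype=int)

# Without 'columns', every string-valued column is encoded,
# which here includes 'class label' as well as 'color'
dummies = pd.get_dummies(df, dtype=int)

print(explicit)
```

Each distinct color becomes its own indicator column (color_blue, color_green, color_red), and the original color column is dropped from the result.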