pandas.get_dummies: one-hot encoding of discrete features

Source: Internet
Author: User
# Note: the printed outputs below come from an older pandas release that
# returned float dummy columns; recent releases print integer or boolean
# values instead.
import numpy as np
from pandas import Series, DataFrame
import pandas as pd

df = DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'b'],
                'data1': range(6)})
pd.get_dummies(df['key'])
print(df)
'''
   data1 key
0      0   b
1      1   b
2      2   a
3      3   c
4      4   a
5      5   b
'''
dummies = pd.get_dummies(df['key'], prefix='key')
df_with_dummy = df[['data1']].join(dummies)
print(df_with_dummy)
'''
   data1  key_a  key_b  key_c
0      0    0.0    1.0    0.0
1      1    0.0    1.0    0.0
2      2    1.0    0.0    0.0
3      3    0.0    0.0    1.0
4      4    1.0    0.0    0.0
5      5    0.0    1.0    0.0
'''

'''
Encoding discrete features falls into two cases:
1. The values have no ordering, such as color: [red, blue]. Use one-hot encoding.
2. The values have an ordering, such as size: [X, XL, XXL]. Map the values to
   numbers, e.g. {X: 1, XL: 2, XXL: 3}.
pandas makes one-hot encoding of discrete features very convenient.

Parameters of pandas.get_dummies:
data:       array-like, Series, or DataFrame.
prefix:     string, list of strings, or dict of strings used as the column-name
            prefix. Default None.
prefix_sep: string, default '_'. The separator to use when a prefix is added.
            A list or dict may also be passed, as with prefix.
dummy_na:   bool, default False. Whether to add a column that marks NaN values.
columns:    list-like, default None. The column names in the DataFrame to encode.
            If None, all columns with object or category dtype are converted.
sparse:     bool, default False. Whether the dummy columns are stored in sparse form.
drop_first: bool, default False. Whether to remove the first level, leaving k-1
            dummy columns for k levels.
'''
df = pd.DataFrame([
    ['green', 'M', 10.1, 'class1'],
    ['red', 'L', 13.5, 'class2'],
    ['blue', 'XL', 15.3, 'class1']])
df.columns = ['color', 'size', 'prize', 'class label']

# 'size' is ordered, so map it to integers
size_mapping = {'XL': 3, 'L': 2, 'M': 1}
df['size'] = df['size'].map(size_mapping)

# Class labels only need distinct integers; iterating a set has no guaranteed
# order, so which label gets 0 or 1 may differ between runs
class_mapping = {label: idx for idx, label in enumerate(set(df['class label']))}
df['class label'] = df['class label'].map(class_mapping)
print(df)
'''
   color  size  prize  class label
0  green     1   10.1            1
1    red     2   13.5            0
2   blue     3   15.3            1
'''
df = pd.get_dummies(df)  # "dummy" here means a stand-in (indicator) variable
print(df)
# One-hot encoding with get_dummies; note how the color column changes
'''
   size  prize  class label  color_blue  color_green  color_red
0     1   10.1            1         0.0          1.0        0.0
1     2   13.5            0         0.0          0.0        1.0
2     3   15.3            1         1.0          0.0        0.0
'''

# Sparse (indicator) matrix from a Series
s = pd.Series(list('abca'))
print(pd.get_dummies(s))
'''
     a    b    c
0  1.0  0.0  0.0
1  0.0  1.0  0.0
2  0.0  0.0  1.0
3  1.0  0.0  0.0
'''
s1 = ['a', 'b', np.nan]
print(pd.get_dummies(s1))
'''
     a    b
0  1.0  0.0
1  0.0  1.0
2  0.0  0.0
'''
### Show the NaN column
print(pd.get_dummies(s1, dummy_na=True))
'''
     a    b  NaN
0  1.0  0.0  0.0
1  0.0  1.0  0.0
2  0.0  0.0  1.0
'''
### drop_first=True removes the first column
print(pd.get_dummies(s1, dummy_na=True, drop_first=True))
'''
     b  NaN
0  0.0  0.0
1  1.0  0.0
2  0.0  1.0
'''
### Prefixes correspond one-to-one with the encoded columns
demo_1 = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]})
print_demo_1 = pd.get_dummies(demo_1, prefix=['col1', 'col2'])
print(print_demo_1)
'''
   C  col1_a  col1_b  col2_a  col2_b  col2_c
0  1     1.0     0.0     0.0     1.0     0.0
1  2     0.0     1.0     1.0     0.0     0.0
2  3     1.0     0.0     0.0     0.0     1.0
'''
# Details: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.get_dummies.html
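
The two cases described in this article (an integer mapping for ordered values, one-hot encoding for unordered ones) can be combined in one step. The following is only a minimal sketch: the helper name `encode_features` and its arguments are hypothetical and not part of pandas. It passes `columns=` so that only the named columns are encoded, and `dtype=int` so that the dummy columns hold 0/1 regardless of which default dtype the installed pandas version uses.

```python
import pandas as pd


def encode_features(df, ordinal_maps, nominal_cols):
    """Hypothetical helper: map ordered categories to integers,
    then one-hot encode the unordered ones."""
    out = df.copy()
    # Ordered features: replace each label with its rank
    for col, mapping in ordinal_maps.items():
        out[col] = out[col].map(mapping)
    # Unordered features: one indicator column per distinct value.
    # columns= limits the encoding to the listed columns;
    # dtype=int requests 0/1 integers explicitly.
    return pd.get_dummies(out, columns=nominal_cols, dtype=int)


df = pd.DataFrame({
    'color': ['green', 'red', 'blue'],
    'size': ['M', 'L', 'XL'],
    'prize': [10.1, 13.5, 15.3],
})

encoded = encode_features(
    df,
    ordinal_maps={'size': {'M': 1, 'L': 2, 'XL': 3}},
    nominal_cols=['color'],
)
print(encoded)
```

Keeping the ordinal mapping in a dict makes the ordering explicit and reusable, whereas one-hot encoding an ordered feature such as size would discard that ordering.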
