pandas.get_dummies: discrete feature encoding

Last Update:2018-07-29 Source: Internet

Author: User


# Note: the printed outputs below are as given in the original article, which used
# an older pandas release. Newer releases keep dict insertion order for columns,
# and pandas 2.0 and later return boolean dummy columns unless dtype is passed.
import numpy as np
from pandas import Series, DataFrame
import pandas as pd

df = DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'b'],
                'data1': range(6)})
pd.get_dummies(df['key'])
print(df)
'''
   data1 key
0      0   b
1      1   b
2      2   a
3      3   c
4      4   a
5      5   b
'''

# Encode the 'key' column and join the result back onto 'data1'
dummies = pd.get_dummies(df['key'], prefix='key')
df_with_dummy = df[['data1']].join(dummies)
print(df_with_dummy)
'''
   data1  key_a  key_b  key_c
0      0    0.0    1.0    0.0
1      1    0.0    1.0    0.0
2      2    1.0    0.0    0.0
3      3    0.0    0.0    1.0
4      4    1.0    0.0    0.0
5      5    0.0    1.0    0.0
'''

'''
Encoding discrete features falls into two cases:
1. The values of the feature have no inherent order, such as color: [red, blue].
   Use one-hot encoding.
2. The values of the feature have an inherent order, such as size: [X, XL, XXL].
   Map the values to integers, for example {X: 1, XL: 2, XXL: 3}.

pandas makes one-hot encoding of discrete features very convenient.

Parameters of pandas.get_dummies:
data: array-like, Series, or DataFrame.
prefix: string, list of strings, or dict of strings; default None (no prefix).
prefix_sep: string, default '_'. The separator to use when a prefix is added.
    A list or dict may also be passed, as with prefix.
dummy_na: bool, default False. Whether to add a column indicating NaN values.
columns: list-like, default None. Column names in the DataFrame to encode.
    If None, all columns with object or category dtype are converted.
sparse: bool, default False. Whether the dummy columns use a sparse representation.
drop_first: bool, default False. Whether to drop the first level.
'''

df = pd.DataFrame([
    ['green', 'M', 10.1, 'class1'],
    ['red', 'L', 13.5, 'class2'],
    ['blue', 'XL', 15.3, 'class1']])
df.columns = ['color', 'size', 'prize', 'class label']

# 'size' is ordered, so map it to integers
size_mapping = {'XL': 3, 'L': 2, 'M': 1}
df['size'] = df['size'].map(size_mapping)

# The class labels are mapped to arbitrary integers. A set has no guaranteed
# order, so which label receives 0 and which receives 1 can vary between runs.
class_mapping = {label: idx for idx, label in enumerate(set(df['class label']))}
df['class label'] = df['class label'].map(class_mapping)
print(df)
'''
   color  size  prize  class label
0  green     1   10.1            1
1    red     2   13.5            0
2   blue     3   15.3            1
'''

# "dummies" refers to dummy (stand-in) variables.
# get_dummies one-hot encodes the remaining string column; compare the output
# before and after and note how the color column changes.
df = pd.get_dummies(df)
print(df)
'''
   size  prize  class label  color_blue  color_green  color_red
0     1   10.1            1         0.0          1.0        0.0
1     2   13.5            0         0.0          0.0        1.0
2     3   15.3            1         1.0          0.0        0.0
'''

# Sparse (indicator) matrix from a Series
s = pd.Series(list('abca'))
print(pd.get_dummies(s))
'''
     a    b    c
0  1.0  0.0  0.0
1  0.0  1.0  0.0
2  0.0  0.0  1.0
3  1.0  0.0  0.0
'''

s1 = ['a', 'b', np.nan]
print(pd.get_dummies(s1))
'''
     a    b
0  1.0  0.0
1  0.0  1.0
2  0.0  0.0
'''

# Show the NaN column
print(pd.get_dummies(s1, dummy_na=True))
'''
     a    b  NaN
0  1.0  0.0  0.0
1  0.0  1.0  0.0
2  0.0  0.0  1.0
'''

# drop_first=True removes the first column
print(pd.get_dummies(s1, dummy_na=True, drop_first=True))
'''
     b  NaN
0  0.0  0.0
1  1.0  0.0
2  0.0  1.0
'''

# Prefixes correspond one-to-one with the encoded columns
demo_1 = pd.DataFrame({'A': ['a', 'b', 'a'],
                       'B': ['b', 'a', 'c'],
                       'C': [1, 2, 3]})
print_demo_1 = pd.get_dummies(demo_1, prefix=['col1', 'col2'])
print(print_demo_1)
'''
   C  col1_a  col1_b  col2_a  col2_b  col2_c
0  1     1.0     0.0     0.0     1.0     0.0
1  2     0.0     1.0     1.0     0.0     0.0
2  3     1.0     0.0     0.0     0.0     1.0
'''

# Details: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.get_dummies.html
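
The parameter list above mentions columns and prefix_sep, but none of the examples exercise them. The following is a minimal sketch of how they might be used together, assuming a reasonably recent pandas release. The demo frame and its column names are made up for illustration, and the dtype argument is not part of the article's parameter list; it is passed here only so the result contains integers regardless of the pandas default.

```python
import pandas as pd

# Hypothetical toy frame: two string columns and one numeric column.
demo = pd.DataFrame({
    'color': ['red', 'blue', 'red'],
    'size': ['M', 'L', 'M'],
    'price': [10.1, 13.5, 15.3],
})

# Encode only 'color'. 'size' is left untouched because it is not listed in
# `columns`, even though it holds strings. prefix_sep changes the separator
# between the prefix (the column name by default) and the category value.
encoded = pd.get_dummies(demo, columns=['color'], prefix_sep='.', dtype=int)

print(encoded.columns.tolist())
print(encoded)
```

Restricting the encoding with columns is useful when some string columns are ordered (like size in the article) and should be mapped to integers separately rather than one-hot encoded.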

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
