The previous article used the XGBoost CLI for binary classification; this time we do multi-class classification.
The dataset is the UCI dermatology set.
It has 34 attributes and 6 class labels. Apart from family history, which is a nominal value (0 or 1), and age, which is linear, the attributes take values from 0 to 3.
Attribute information (complete attribute documentation):

Clinical attributes (take values 0, 1, 2, 3, unless otherwise indicated):
1: erythema
2: scaling
3: definite borders
4: itching
5: koebner phenomenon
6: polygonal papules
7: follicular papules
8: oral mucosal involvement
9: knee and elbow involvement
10: scalp involvement
11: family history (0 or 1)
34: age (linear)

Histopathological attributes (take values 0, 1, 2, 3):
12: melanin incontinence
13: eosinophils in the infiltrate
14: PNL infiltrate
15: fibrosis of the papillary dermis
16: exocytosis
17: acanthosis
18: hyperkeratosis
19: parakeratosis
20: clubbing of the rete ridges
21: elongation of the rete ridges
22: thinning of the suprapapillary epidermis
23: spongiform pustule
24: munro microabcess
25: focal hypergranulosis
26: disappearance of the granular layer
27: vacuolisation and damage of basal layer
28: spongiosis
29: saw-tooth appearance of retes
30: follicular horn plug
31: perifollicular parakeratosis
32: inflammatory mononuclear infiltrate
33: band-like infiltrate
The raw data looks like this:
1,1,2,3,2,2,0,3,0,0,0,2,0,0,0,2,2,1,2,0,0,0,0,0,3,0,3,0,3,1,0,2,3,50,3
3,2,1,2,0,0,0,0,1,2,0,0,0,1,0,0,2,0,3,2,2,2,1,2,0,2,0,0,0,0,0,1,0,50,1
3,2,0,2,0,0,0,0,0,0,0,0,1,2,0,2,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,10,2
2,3,3,3,3,0,0,0,3,3,0,0,0,0,0,0,3,2,2,3,3,3,1,3,0,0,0,0,0,0,0,1,0,34,1
2,2,1,0,0,0,0,0,1,0,1,0,0,2,0,0,2,1,2,2,1,2,0,1,0,0,0,0,0,0,0,0,0,?,1
2,1,0,0,2,0,0,0,0,0,0,0,0,0,0,2,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,?,4
2,2,1,2,0,0,0,0,0,0,0,0,0,2,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,?,2
2,1,2,3,2,3,0,2,0,0,1,1,0,0,0,2,1,1,2,0,0,0,0,0,1,0,2,0,2,0,0,0,3,?,3
The last column is the class label; a question mark indicates that the age is unknown.
This time we train with a Python script instead of the XGBoost CLI, which is more convenient:
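Both quirks of the file, the '?' ages and the 1-based labels, can be normalized while loading. A minimal sketch using the converters argument of np.loadtxt, with an io.StringIO stand-in for dermatology.data (column indices are 0-based, so column 33 is age and column 34 is the label):

```python
import io
import numpy as np

# two sample rows from the dataset: one with a known age, one with '?'
sample = io.StringIO(
    "1,1,2,3,2,2,0,3,0,0,0,2,0,0,0,2,2,1,2,0,0,0,0,0,3,0,3,0,3,1,0,2,3,50,3\n"
    "2,1,0,0,2,0,0,0,0,0,0,0,0,0,0,2,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,?,4\n"
)

data = np.loadtxt(
    sample, delimiter=',',
    converters={
        # age column: 1 if unknown ('?'), else 0
        # (the token may arrive as str or bytes depending on NumPy version)
        33: lambda x: int(x in ('?', b'?')),
        # label column: shift 1..6 down to 0..5, as XGBoost requires
        34: lambda x: int(x) - 1,
    },
)

print(data[:, 33])  # [0. 1.]
print(data[:, 34])  # [2. 3.]
```

The same converters appear in the training script below; this just isolates what they do.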
train.py:
#!/usr/bin/python
from __future__ import division

import numpy as np
import xgboost as xgb

# labels need to be 0 to num_class - 1
data = np.loadtxt('./dermatology.data', delimiter=',',
                  converters={33: lambda x: int(x == '?'),
                              34: lambda x: int(x) - 1})
sz = data.shape

# 70/30 train/test split
train = data[:int(sz[0] * 0.7), :]
test = data[int(sz[0] * 0.7):, :]

train_X = train[:, :33]
train_Y = train[:, 34]

test_X = test[:, :33]
test_Y = test[:, 34]

xg_train = xgb.DMatrix(train_X, label=train_Y)
xg_test = xgb.DMatrix(test_X, label=test_Y)

# setup parameters for xgboost
param = {}
# use softmax multi-class classification
param['objective'] = 'multi:softmax'
param['eta'] = 0.1
param['max_depth'] = 6
param['silent'] = 1
param['nthread'] = 4
param['num_class'] = 6

watchlist = [(xg_train, 'train'), (xg_test, 'test')]
num_round = 5
bst = xgb.train(param, xg_train, num_round, watchlist)

# get prediction
pred = bst.predict(xg_test)
error_rate = np.sum(pred != test_Y) / test_Y.shape[0]
print('Test error using softmax = {}'.format(error_rate))

# do the same thing again, but output probabilities
param['objective'] = 'multi:softprob'
bst = xgb.train(param, xg_train, num_round, watchlist)

# Note: this convention has been changed since xgboost-unity
# get prediction; this is a 1D array, needs reshape to (ndata, nclass)
pred_prob = bst.predict(xg_test).reshape(test_Y.shape[0], 6)
pred_label = np.argmax(pred_prob, axis=1)
error_rate = np.sum(pred_label != test_Y) / test_Y.shape[0]
print('Test error using softprob = {}'.format(error_rate))
The code is also very simple. Worth noting are the handling of the '?' entries in the age column, the shift of the labels so that they start from 0, and the two training objectives:

multi:softprob

and

multi:softmax
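The difference between the two: multi:softmax returns the predicted class index directly, while multi:softprob returns one probability per (row, class) pair, flattened to a 1D array in older XGBoost versions. Reshaping to (ndata, nclass) and taking the argmax of each row recovers the same labels softmax would give. A small NumPy-only sketch with a hypothetical flattened output for two test rows:

```python
import numpy as np

nclass = 6
# hypothetical flattened multi:softprob output for 2 rows (6 probs each)
flat = np.array([0.05, 0.10, 0.60, 0.10, 0.10, 0.05,
                 0.70, 0.05, 0.05, 0.10, 0.05, 0.05])

prob = flat.reshape(-1, nclass)   # shape (ndata, nclass)
labels = np.argmax(prob, axis=1)  # per-row most probable class

print(labels)  # [2 0]
```

Use softprob when you need the full probability distribution (e.g. for thresholding or ranking); softmax is enough when only the predicted label matters.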