GBDT Multi-classification example

Source: Internet
Author: User

Sample  Sepal length (cm)  Sepal width (cm)  Petal length (cm)  Petal width (cm)  Species
1       5.1                3.5               1.4                0.2               Iris setosa
2       4.9                3.0               1.4                0.2               Iris setosa
3       7.0                3.2               4.7                1.4               Iris versicolor
4       6.4                3.2               4.5                1.5               Iris versicolor
5       6.3                3.3               6.0                2.5               Iris virginica
6       5.8                2.7               5.1                1.9               Iris virginica

Iris Data Set

This is a three-class classification problem with 6 samples. Given a flower's sepal length, sepal width, petal length, and petal width, we need to decide whether it is Iris setosa (mountain iris), Iris versicolor (variegated iris), or Iris virginica (Virginia iris). We apply the GBDT multi-classification algorithm to it. Each sample's label is a three-dimensional vector: [1,0,0] means the sample is Iris setosa, [0,1,0] means it is Iris versicolor, and [0,0,1] means it is Iris virginica.
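The label vectors described above can be produced with a small helper. This is a minimal sketch: the name one_hot and the use of class indices 0, 1, 2 (in the order the species appear in the table) are choices made for the illustration.

```python
# Class index per sample, in table order:
# 0 = first species, 1 = second species, 2 = third species.
class_index = [0, 0, 1, 1, 2, 2]
NUM_CLASSES = 3


def one_hot(k, num_classes):
    """Return a vector with 1 at position k and 0 elsewhere."""
    return [1 if j == k else 0 for j in range(num_classes)]


labels = [one_hot(k, NUM_CLASSES) for k in class_index]
for sample_no, vec in enumerate(labels, start=1):
    print(sample_no, vec)
```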

GBDT multi-classification trains one CART tree for each class. Here we train CART Tree 1 for Iris setosa, CART Tree 2 for Iris versicolor, and CART Tree 3 for Iris virginica. The three trees are independent of each other.

Take sample 1 as an example. For CART Tree 1 the training sample is [5.1,3.5,1.4,0.2] and the label is 1, so the row fed to the model is [5.1,3.5,1.4,0.2,1]. For CART Tree 2 the training sample is also [5.1,3.5,1.4,0.2], but the label is 0, so the row is [5.1,3.5,1.4,0.2,0]. For CART Tree 3 the training sample is again [5.1,3.5,1.4,0.2] and the label is also 0, so the row is [5.1,3.5,1.4,0.2,0].
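The per-tree rows for all six samples can be built the same way. The sketch below is illustrative only: training_sets and class_index are hypothetical names, and class indices 0, 1, 2 follow the order of the species in the table.

```python
# Feature rows for samples 1..6, in table column order.
X = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
]
class_index = [0, 0, 1, 1, 2, 2]
NUM_CLASSES = 3

# training_sets[m] holds the rows for CART Tree m+1:
# the four features followed by a 0/1 target for that tree's class.
training_sets = [
    [row + [1 if c == m else 0] for row, c in zip(X, class_index)]
    for m in range(NUM_CLASSES)
]

for m, rows in enumerate(training_sets, start=1):
    print(f"CART Tree {m}, sample 1:", rows[0])
```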

Below we look at how CART Tree 1 is built; CART Tree 2 and CART Tree 3 are built the same way. Building a CART tree means choosing one of the four features as the split node of the tree. Suppose we choose sepal length. With a split point of 5.1 cm, the samples whose sepal length is at least 5.1 cm form one group and the samples below 5.1 cm form the other. The process comes down to two questions: 1. Which feature is the best one to split on? 2. Which value of that feature should be the split point? Even after fixing sepal length as the node, sepal length itself still takes many values. The approach here is to iterate over all the possibilities and pick the feature and split value that together minimize the split criterion.
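The exhaustive search can be sketched in a few lines of Python. The quantity being minimized is not shown in the text, so this sketch assumes the squared error commonly used for regression CART (each side predicts the mean of its targets); that may not be the exact equation the article intended. The function name best_split and the rule that the left side is strictly below the split value are choices made for the illustration.

```python
# Six samples: four features each, in table column order.
X = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
]
# Targets for CART Tree 1: 1 for the first class, 0 otherwise.
y = [1, 1, 0, 0, 0, 0]


def squared_error(values):
    """Sum of squared deviations from the mean (assumed split criterion)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Try every (feature, observed value) pair and keep the lowest loss."""
    best = None  # (feature index, split value, loss)
    for j in range(len(X[0])):
        for s in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, y) if row[j] < s]
            right = [t for row, t in zip(X, y) if row[j] >= s]
            if not left or not right:
                continue  # a split must put samples on both sides
            loss = squared_error(left) + squared_error(right)
            if best is None or loss < best[2]:
                best = (j, s, loss)
    return best


feature, value, loss = best_split(X, y)
print(feature, value, loss)
```

Under these assumptions the search settles on feature 0 with split value 5.8 and a loss of 0, because only samples 1 and 2 fall below that value. Several other splits reach the same loss; the sketch simply keeps the first one it finds.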


Take the first feature value of sample 1, a sepal length of 5.1 cm, as the split point. R1 is the set of samples with sepal length less than 5.1 cm, and R2 is the set of samples with sepal length greater than or equal to 5.1 cm. So R1={2} and R2={1,3,4,5,6}.
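The partition can be checked directly. In this small sketch the dictionary first_feature (sample number mapped to the first feature column of the table) is a name introduced for the illustration.

```python
# First feature (column 0 of the table) for samples 1..6.
first_feature = {1: 5.1, 2: 4.9, 3: 7.0, 4: 6.4, 5: 6.3, 6: 5.8}
split_value = 5.1

# R1: strictly below the split value; R2: at or above it.
R1 = [n for n, v in first_feature.items() if v < split_value]
R2 = [n for n, v in first_feature.items() if v >= split_value]

print("R1 =", R1)
print("R2 =", R2)
```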

To summarize the training: every sample carries a class label. In the tree for its own class the sample's label is 1, and in the trees for the other classes its label is 0. Each round therefore trains M trees, where M is the total number of classes, and after K rounds of training there are M*K trees.
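The tree count follows from the loop structure alone. The sketch below only enumerates the trees; it does not fit anything. K = 4 is a hypothetical number of rounds, and the string stored for each tree is a placeholder for a fitted CART tree.

```python
M = 3  # number of classes in the iris example
K = 4  # number of boosting rounds; hypothetical value for illustration

# One placeholder entry per tree, keyed by (round, class). A real
# implementation would store a fitted CART tree instead of a string.
ensemble = {}
for k in range(K):
    for m in range(M):
        ensemble[(k, m)] = f"CART tree for class {m + 1}, round {k + 1}"

print(len(ensemble))
```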
