| Sample number | Sepal length (cm) | Sepal width (cm) | Petal length (cm) | Petal width (cm) | Flower type |
| --- | --- | --- | --- | --- | --- |
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | Mountain iris |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | Mountain iris |
| 3 | 7.0 | 3.2 | 4.7 | 1.4 | Variegated iris |
| 4 | 6.4 | 3.2 | 4.5 | 1.5 | Variegated iris |
| 5 | 6.3 | 3.3 | 6.0 | 2.5 | Virginia iris |
| 6 | 5.8 | 2.7 | 5.1 | 1.9 | Virginia iris |
Iris Data Set
This is a three-class classification problem with 6 samples. Given the sepal length, sepal width, petal length, and petal width, we need to determine whether a flower is a mountain iris, a variegated iris, or a Virginia iris, using the GBDT multi-classification algorithm. We use a three-dimensional one-hot vector as the sample label: [1,0,0] means the sample is a mountain iris, [0,1,0] a variegated iris, and [0,0,1] a Virginia iris.
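The one-hot labeling above can be sketched as follows; `CLASSES` and `one_hot` are hypothetical names chosen for illustration, with the class order matching the vectors in the text:

```python
# Minimal sketch of the one-hot labeling described above.
# Class order: [mountain iris, variegated iris, Virginia iris].
CLASSES = ["mountain iris", "variegated iris", "Virginia iris"]

def one_hot(flower):
    """Return the 3-dimensional label vector for a flower type."""
    vec = [0, 0, 0]
    vec[CLASSES.index(flower)] = 1
    return vec

print(one_hot("mountain iris"))    # [1, 0, 0]
print(one_hot("variegated iris"))  # [0, 1, 0]
print(one_hot("Virginia iris"))    # [0, 0, 1]
```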
GBDT handles multi-classification by training one CART tree per class, independently. So here we train CART Tree 1 for the mountain iris class, CART Tree 2 for the variegated iris class, and CART Tree 3 for the Virginia iris class; these three trees are independent of each other.
Let's take sample 1 as an example. The training sample for CART Tree 1 is [5.1, 3.5, 1.4, 0.2] with label 1, so the final input to the model is [5.1, 3.5, 1.4, 0.2, 1]. The training sample for CART Tree 2 is also [5.1, 3.5, 1.4, 0.2], but the label is 0, so the final input is [5.1, 3.5, 1.4, 0.2, 0]. The training sample for CART Tree 3 is again [5.1, 3.5, 1.4, 0.2] with label 0, giving [5.1, 3.5, 1.4, 0.2, 0].
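Building the per-tree training rows for sample 1 can be sketched like this; the dictionary keys and variable names are illustrative only:

```python
# Sketch: the per-tree training rows for sample 1, as described above.
features = [5.1, 3.5, 1.4, 0.2]
CLASSES = ["mountain iris", "variegated iris", "Virginia iris"]
label = "mountain iris"  # sample 1 is a mountain iris

# Tree k gets label 1 iff the sample belongs to class k, else 0.
rows = {f"CART Tree {k + 1}": features + [1 if cls == label else 0]
        for k, cls in enumerate(CLASSES)}

print(rows["CART Tree 1"])  # [5.1, 3.5, 1.4, 0.2, 1]
print(rows["CART Tree 2"])  # [5.1, 3.5, 1.4, 0.2, 0]
print(rows["CART Tree 3"])  # [5.1, 3.5, 1.4, 0.2, 0]
```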
[ 5.1 , 3.5 , 1.4 , 0.2 , 0 ] "> below we see how the cart Tree1 is generated, the other tree cart Tree2, the cart Tree 3 is generated the same way. The build process of the cart tree is to find a feature in these four features as the node of the cart Tree1. For example, calyx length as a node. In 6 samples, the length of the calyx is greater than 5.1 cm, which is Class A, which is less than or equal to 5.1 cm. The resulting process is very simple, question 1. Which is the most appropriate feature? 2. What is the characteristic value of this feature as a segmentation point? Even if we have determined the length of the calyx as a node. The length of the calyx itself also has many values. Our approach here is to iterate through all the possibilities, to find a best feature and its corresponding optimal eigenvalues to minimize the value of the current equation.
[ 5.1 , 3.5 , 1.4 , 0.2 , 0 ] "> [ 5.1 , 3.5 , 1.4 , 0.2 ] "> [ 5.1 , 3.5 , 1.4 , 0.2 , 0 ] ">
Let's take the first feature value as an example. Let R1 be the set of samples with sepal length less than 5.1 cm and R2 the set of samples with sepal length greater than or equal to 5.1 cm. Then R1 = {2} and R2 = {1, 3, 4, 5, 6}.
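The enumeration of candidate splits can be sketched as below. This is a minimal sketch assuming the usual CART regression criterion (total squared error of the two groups around their means); the helper names `sq_err` and `split` are invented for illustration, and the labels are those of CART Tree 1 (mountain iris = 1, others = 0):

```python
X = [[5.1, 3.5, 1.4, 0.2],
     [4.9, 3.0, 1.4, 0.2],
     [7.0, 3.2, 4.7, 1.4],
     [6.4, 3.2, 4.5, 1.5],
     [6.3, 3.3, 6.0, 2.5],
     [5.8, 2.7, 5.1, 1.9]]
y = [1, 1, 0, 0, 0, 0]  # labels for CART Tree 1 (mountain iris vs. rest)

def sq_err(vals):
    """Squared error of a group around its mean (0 for an empty group)."""
    if not vals:
        return 0.0
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def split(feature, threshold):
    """Partition sample numbers (1-based) into R1 (< t) and R2 (>= t)."""
    r1 = [i + 1 for i, row in enumerate(X) if row[feature] < threshold]
    r2 = [i + 1 for i, row in enumerate(X) if row[feature] >= threshold]
    return r1, r2

# The example from the text: sepal length (feature 0), threshold 5.1 cm.
print(split(0, 5.1))  # ([2], [1, 3, 4, 5, 6])

# Exhaustive search: try every feature and every observed value as threshold,
# keeping the pair with the smallest total squared error.
best = min(((f, t) for f in range(4) for t in {row[f] for row in X}),
           key=lambda ft: sq_err([y[i] for i, r in enumerate(X) if r[ft[0]] < ft[1]])
                        + sq_err([y[i] for i, r in enumerate(X) if r[ft[0]] >= ft[1]]))
print(best)  # best (feature, threshold) pair
```

On this toy data several splits separate the two mountain irises perfectly, so more than one pair reaches zero error.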
To summarize the training: each sample carries a label, and it is fed to every class's tree with the corresponding binary label, 1 for the tree of its own class and 0 for the trees of the other classes. Each round therefore trains m trees, where m is the total number of classes, and after K rounds of training there are m × K trees in total.
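The m × K bookkeeping can be sketched as follows. This is only a structural sketch: `fit_cart_tree` is a hypothetical placeholder for the regression-tree fitting described above, and the gradient/residual updates of full GBDT are omitted:

```python
CLASSES = ["mountain iris", "variegated iris", "Virginia iris"]

def fit_cart_tree(X, targets):
    """Placeholder for fitting one CART regression tree to the targets."""
    return ("tree", len(X))

def train_gbdt_multiclass(X, labels, K):
    """Each of the K rounds fits one tree per class: m * K trees in total."""
    trees = []  # trees[k] holds the K trees of class k
    for cls in CLASSES:
        binary = [1 if lab == cls else 0 for lab in labels]  # 0/1 labels for this class
        trees.append([fit_cart_tree(X, binary) for _ in range(K)])
    return trees

X = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]]
labels = ["mountain iris", "mountain iris"]
forest = train_gbdt_multiclass(X, labels, K=5)
print(len(forest) * len(forest[0]))  # m * K = 15 trees in total
```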
GBDT Multi-classification example