2017.03.07 Review: do the earlier trees in GBDT carry more weight? Plus Python scatter plots

1. Decided on the data transmission format.

2. Looked into the pass-rate problem.

3. In the afternoon I started looking into GBDT. My first question was whether the earlier trees carry more weight than the later ones. On my real dataset this holds for most sample points, with a few anomalies.

I then switched to a standard dataset. That dataset is so clean that the loss converges almost to 0, so the raw prediction for each point is pushed to a large positive or large negative value. Under the expit (sigmoid) function, a raw score with an absolute value of around 8 already gives a probability very close to 1 or 0. The early part of the convergence curve looked like a straight line, with no visible change in rate.

When I printed the delta values (the change each tree makes to a point's raw prediction), the earlier trees did have larger deltas. For some sample points the output was very clean: the delta shrank tree after tree and eventually settled at a stable value. The curve looked straight only because the earlier trees' weights were not large enough to tell apart by eye, so I had wrongly assumed nothing was changing. How much more the earlier trees contribute varies by dataset: one early tree may be worth three later trees, or five. The flat-looking curve is therefore not evidence against my guess, and my working conclusion is that in GBDT the earlier trees matter more than the later ones.

I also noticed a few other things. On my own dataset convergence is very slow: even 20,000 trees barely converge. The convergence curve gets smoother and smoother while the AUC on the test set keeps dropping, which clearly indicates overfitting. Occasionally the value for a single point shows an inflection.
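The per-tree deltas and the falling test AUC described above can be inspected with a sketch like the following. It assumes scikit-learn's GradientBoostingClassifier, which the notes do not name, and uses synthetic data from make_classification as a stand-in for the original datasets. The tree count, learning rate, and depth are illustrative choices, not the author's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the original datasets are not available.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Illustrative hyperparameters, not the author's settings.
model = GradientBoostingClassifier(
    n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# Raw (pre-sigmoid) score of every training sample after each boosting stage.
# ravel() copes with both (n_samples,) and (n_samples, 1) outputs.
staged = np.array(
    [s.ravel() for s in model.staged_decision_function(X_train)]
)  # shape: (n_trees, n_samples)

# Delta added by each tree after the first: difference of consecutive stages.
deltas = np.diff(staged, axis=0)              # shape: (n_trees - 1, n_samples)
mean_abs_delta = np.abs(deltas).mean(axis=1)  # average step size per tree

# Test-set AUC after each stage; a sustained drop suggests overfitting.
test_auc = [
    roc_auc_score(y_test, s.ravel())
    for s in model.staged_decision_function(X_test)
]

for i in (0, 9, 24, 48):
    print(f"tree {i + 2}: mean |delta| = {mean_abs_delta[i]:.4f}, "
          f"test AUC = {test_auc[i + 1]:.4f}")
```

Printing or plotting mean_abs_delta against the tree index shows directly whether early trees move the predictions more than later ones, which is easier to judge than the shape of the loss curve. The result depends on the dataset and learning rate, so it is worth checking on real data.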

4. Drew a scatter plot:

import matplotlib.pyplot as plt

# x and y are equal-length sequences holding the point coordinates
plt.scatter(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.show()

5. Afterwards I looked into fitting a curve through the scatter points, but did not find a ready-made Python module for it.
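One possible approach, which is not from the original notes, is a least-squares polynomial fit with numpy.polyfit. The sketch below assumes a low-degree polynomial is a reasonable shape for the data. The helper name fit_polynomial_curve and the sample data are hypothetical.

```python
import numpy as np


def fit_polynomial_curve(x, y, degree=2, num_points=200):
    """Least-squares polynomial fit through scattered points.

    Returns the coefficients (highest power first) and a dense
    (curve_x, curve_y) sampling of the fitted curve for plotting.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    curve_x = np.linspace(x.min(), x.max(), num_points)
    curve_y = np.polyval(coeffs, curve_x)
    return coeffs, curve_x, curve_y


# Illustrative noisy points scattered around y = 2x^2 - 3x + 1.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 50)
y = 2.0 * x**2 - 3.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

coeffs, curve_x, curve_y = fit_polynomial_curve(x, y, degree=2)
print("fitted coefficients (highest power first):", coeffs)
```

The fitted curve can be drawn over the scatter plot by passing curve_x and curve_y to plt.plot. If the points do not follow a polynomial shape, a different model would be needed. The degree here is a guess to be tuned.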


