GBM, GBDT, and XGBoost


Gradient boosted decision trees (GBDT) are currently among the most popular (supervised) machine learning algorithms. This article starts from the origins of GBDT and works up to the now-popular XGBoost. The companion articles "AdaBoost in Detail" and "GLM (Generalized Linear Models) and LR (Logistic Regression) in Detail" provide the background for this one.

0. Hello World

Here is the simplest and most common form of the GBDT algorithm.

For regression problems, GBDT is an ensemble composed directly of a set of decision trees, $F(x) = \sum_{t=1}^{T} f_t(x)$, fit to the target $y$.

Each iteration $t$ looks for a sub-function $f_t$ to add to $F(x)$, as follows:

$$\arg\min_{f_t(x)} \left(f_t(x) - \text{residual}\right)^2, \quad \text{residual} = y - F_{t-1}(x)$$

The residual is what the overall model still gets wrong after the previous round, so GBDT concentrates each new tree on the samples the earlier iterations handled poorly, refining the fit step by step (and, of course, raising the risk of overfitting).
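To make the loop concrete, here is a minimal sketch of this residual-fitting procedure, using scikit-learn's DecisionTreeRegressor as the weak learner. The synthetic data and hyperparameters are illustrative assumptions, and a small learning rate (shrinkage) is applied to each tree, a common practice not shown in the formula above:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Illustrative synthetic regression data (not from the article).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

n_trees, learning_rate = 50, 0.1
F = np.zeros_like(y)   # F_0(x) = 0: the ensemble's initial prediction
trees = []

for t in range(n_trees):
    residual = y - F   # what the current ensemble still gets wrong
    f_t = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    F = F + learning_rate * f_t.predict(X)   # add the new weak learner
    trees.append(f_t)

def predict(X_new):
    """Sum the (shrunken) contributions of all fitted trees."""
    return sum(learning_rate * tree.predict(X_new) for tree in trees)

print("train MSE:", np.mean((y - F) ** 2))
```

Each tree is fit to the current residuals rather than to $y$ itself, which is exactly the "focus on what was missed" behavior described above.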

In fact, GBDT embodies a very wide range of ideas and applications, which this article elaborates in detail.

1. Some Foundations

GBDT draws on a number of common machine learning concepts; several important basics are presented here.

1.1 Gradient Descent

Gradient descent is a well-known workhorse of machine learning. Given an optimization target (a loss function), we want to take a small step from the current position (the model state) in whichever direction makes the loss decrease fastest. Start from the first-order Taylor expansion of the loss:

$$\min_{\|v\|=1} E(w_t + \eta v) \approx E(w_t) + \eta\, v \cdot \nabla E(w_t)$$

Here $w_t$ is the weight vector at iteration $t$, $v$ is the direction of the step (a vector, constrained to unit length), and $\eta$ is the step size (a scalar, usually a small positive number).

The goal is to find the direction $v$ along which the loss drops fastest. It can be shown (e.g., via the Cauchy-Schwarz inequality) that the optimal direction is the negative gradient direction:

$$v = -\frac{\nabla E(w_t)}{\|\nabla E(w_t)\|}$$
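As a concrete illustration, here is a minimal sketch of this update rule on a least-squares loss. The data, the decaying step-size schedule, and all names are assumptions for the example, not from the article:

```python
import numpy as np

# Illustrative linear-regression loss E(w) = mean((Xw - y)^2).
rng = np.random.RandomState(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=100)

def loss(w):
    # E(w): the optimization target
    return np.mean((X @ w - y) ** 2)

def grad(w):
    # nabla E(w): gradient of the loss with respect to the weights
    return 2.0 * X.T @ (X @ w - y) / len(y)

w = np.zeros(3)
for t in range(200):
    g = grad(w)
    v = -g / np.linalg.norm(g)   # unit-length steepest-descent direction
    eta = 0.1 / np.sqrt(t + 1)   # decaying step size (illustrative choice)
    w = w + eta * v              # w_{t+1} = w_t + eta * v

print("final loss:", loss(w))
print("recovered weights:", w)
```

The step direction is the normalized negative gradient, matching the $\|v\|=1$ constraint in the expansion above; in practice most implementations fold the normalization into the step size and simply use $w_{t+1} = w_t - \eta \nabla E(w_t)$.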
