This article mainly explains locally weighted linear regression. Before introducing the algorithm itself, we first explain two concepts: underfitting and overfitting.
Underfitting and overfitting
Take three fitted models as an example. The first is a linear model; it does not fit the training data well, and its loss function value is large. The second adds a new feature to the linear model, and the fit improves. The third is a 5th-order polynomial model that fits the training data almost perfectly.
Model one does not fit the training data well and has a large error on both the training data and the test data; this situation is called underfitting.
Model three fits the training data very well, but its accuracy on the test data is poor. Fitting the training data well while achieving low accuracy on the test data is called overfitting.
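The contrast between model one and model three can be illustrated with a small sketch (NumPy, with made-up quadratic data; the specific points and noise level are illustrative only). A straight line leaves a noticeable training error, while a 5th-order polynomial through six points fits the training data essentially exactly:

```python
import numpy as np

# Hypothetical data: a quadratic trend with a little noise (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 6)
y = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(0, 0.05, size=x.size)

def train_error(degree):
    """Fit a polynomial of the given degree; return mean squared training error."""
    coeffs = np.polyfit(x, y, degree)
    pred = np.polyval(coeffs, x)
    return np.mean((pred - y) ** 2)

err_linear = train_error(1)   # model one: a straight line tends to underfit
err_quintic = train_error(5)  # model three: degree 5 through 6 points interpolates
print(err_linear, err_quintic)
```

Of course, near-zero training error for the degree-5 fit says nothing about test error, which is exactly the overfitting problem described above.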
Locally weighted linear regression (LWR)
From the examples of underfitting and overfitting above, we can see that in a regression model, prediction accuracy depends heavily on the choice of features; an improper choice of features often leads to vastly different prediction results. Locally weighted linear regression addresses this problem well: its predictive performance does not depend strongly on the selected features, and it also does a good job of avoiding the risks of underfitting and overfitting.
Before looking at locally weighted linear regression, first recall ordinary linear regression. Its loss function treats all samples in the training data equally and has no concept of weights. For more details, see the earlier notes on linear regression and gradient descent. Its main idea is to choose θ to minimize

J(θ) = (1/2) Σ_{i=1}^{m} (θ^T x^(i) − y^(i))^2
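For comparison with the weighted version below, here is a minimal sketch of ordinary least squares via the normal equations, θ = (XᵀX)⁻¹Xᵀy, in which every training sample counts equally (NumPy, with made-up exactly-linear data):

```python
import numpy as np

# Design matrix: an intercept column plus one feature (illustrative data)
X = np.column_stack([np.ones(5), np.arange(5.0)])
y = 2.0 + 3.0 * np.arange(5.0)  # generated from intercept 2, slope 3

# Solve the normal equations X^T X theta = X^T y; one theta for all queries
theta = np.linalg.solve(X.T @ X, X.T @ y)
# theta ≈ [2.0, 3.0]
```

Note that θ is computed once from the whole training set and then reused for every prediction; this is the parametric behavior that locally weighted regression gives up.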
In locally weighted linear regression, by contrast, a weight w^(i) is added to each term of the loss function: training samples closer to the prediction point are given larger weights, and training samples farther from the prediction point are given smaller weights. The weights take values in (0, 1].
The main idea of locally weighted linear regression is to choose θ to minimize the weighted loss

J(θ) = (1/2) Σ_{i=1}^{m} w^(i) (θ^T x^(i) − y^(i))^2

where the weights are assumed to follow

w^(i) = exp( −(x^(i) − x)^2 / (2τ^2) )

The weight of each training sample thus depends on its distance from the prediction point x: if |x^(i) − x| is small, w^(i) is close to 1, and if it is large, w^(i) is close to 0. The parameter τ, called the bandwidth, controls how quickly the weights fall off with distance.
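The weight formula is straightforward to compute; a small sketch (NumPy, with made-up one-dimensional training points) shows the behavior described above, where a training point at the query location gets weight 1 and distant points get weights near 0:

```python
import numpy as np

def lwr_weights(x_train, x_query, tau):
    """Gaussian kernel weights: close training points get weights near 1,
    distant ones near 0. The bandwidth tau controls how fast weights decay."""
    return np.exp(-((x_train - x_query) ** 2) / (2.0 * tau ** 2))

# Illustrative points at distances 0, 0.5, and 2.0 from the query point 0.0
w = lwr_weights(np.array([0.0, 0.5, 2.0]), x_query=0.0, tau=0.5)
print(w)  # weights decrease with distance; the first is exactly 1
```

Shrinking τ concentrates the fit on a narrower neighborhood of the query point, while a very large τ makes all weights close to 1 and recovers ordinary linear regression.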
The advantage of locally weighted linear regression is that it depends little on feature selection: a plain linear model is enough to achieve a good fit.
However, because locally weighted linear regression is a non-parametric learning algorithm, the loss function varies with the prediction point, so θ cannot be determined in advance; every prediction requires scanning all the training data to recompute θ, which makes the computation expensive.
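Putting the pieces together, a minimal sketch of an LWR prediction (NumPy, one feature plus an intercept, solving the weighted normal equations θ = (XᵀWX)⁻¹XᵀWy; the sine-curve data is made up for illustration) makes the cost visible: θ is recomputed inside the function for every query point.

```python
import numpy as np

def lwr_predict(X, y, x_query, tau):
    """Predict y at x_query with locally weighted linear regression.
    theta is re-solved for each query, using all training data every time."""
    Xb = np.column_stack([np.ones_like(X), X])            # add intercept column
    w = np.exp(-((X - x_query) ** 2) / (2.0 * tau ** 2))  # Gaussian weights
    W = np.diag(w)
    # Weighted normal equations: (X^T W X) theta = X^T W y
    theta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return np.array([1.0, x_query]) @ theta

# Illustrative nonlinear data: y = sin(x) sampled on [0, 3]
X = np.linspace(0, 3, 20)
y = np.sin(X)
pred = lwr_predict(X, y, x_query=1.5, tau=0.3)
print(pred)  # close to sin(1.5), even though each local model is linear
```

In contrast to ordinary linear regression, nothing is "learned" up front: the training set itself is the model, and each call to `lwr_predict` performs a fresh weighted least-squares solve.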
Copyright notice: this article is an original work by the blogger at http://www.zuiniusn.com and may not be reproduced without permission.
Locally weighted regression, underfitting, overfitting - Andrew Ng Machine Learning Open Course Notes 1.3