Bias and variance
References:
http://scott.fortmann-roe.com/docs/BiasVariance.html
http://www.cnblogs.com/kemaswill/
Bias-variance decomposition is an important analytical tool in machine learning. Given a learning target and a training set of fixed size, it decomposes the expected error of a learning algorithm into three nonnegative terms: the intrinsic noise, the bias, and the variance.
The intrinsic noise is a lower bound on the expected error of any learning algorithm on the given target; it is the error that no method can eliminate.
The bias measures how closely the average prediction of the learning algorithm (averaged over training sets) approximates the target. It is independent of the particular training sample and describes the accuracy of the fit: a high bias means a poor fit.
The variance measures how much the algorithm's predictions change across different training sets of the same size. It depends on the particular sample observed and describes how specific the fit is to that sample: a high variance means the model is overly sensitive to the training set.
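Under squared-error loss, this three-term decomposition can be written in the standard textbook form below (the notation is an assumption carried through the derivation later in this post: y(x; D) is the model learned from training set D, and h(x) is the true regression function):

\[
\mathbb{E}_{D,\,t}\!\left[(y(x;D) - t)^2\right]
= \underbrace{\left(\mathbb{E}_D[y(x;D)] - h(x)\right)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\!\left[\left(y(x;D) - \mathbb{E}_D[y(x;D)]\right)^2\right]}_{\text{variance}}
+ \underbrace{\mathbb{E}_t\!\left[(t - h(x))^2\right]}_{\text{noise}}
\]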
Bias describes the gap between the model we build and the true model. For example, if the true model underlying the data is quadratic but we fit a linear model, the predictions will always differ systematically from the true values; this error, caused by choosing an inadequate model, is the bias. In the dartboard analogy, the true model is the bullseye (every shot is aimed at the center), while a biased model aims away from the bullseye (the shots are systematically off-center).
Variance describes the stability of the built model itself. For example, suppose the data again come from a quadratic model but we fit a cubic model: different training sets of the same size yield different cubic models, and for a fixed test point these models produce different predictions. The spread among these predictions is the variance of the model. In the dartboard analogy, whether or not the shots are aimed at the bullseye, the individual shots are widely scattered. A small simulation of exactly this setup is sketched below.
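The following minimal Python sketch illustrates the two examples above by simulation (the generating function, noise level, and sample sizes are illustrative assumptions, not values from the referenced posts): data are drawn from a quadratic function, and polynomial models of degree 1 and degree 3 are fit to many independent training sets, so the squared bias and the variance at a single test point can be estimated from the spread of predictions.

# Illustrative sketch: estimate bias^2 and variance of polynomial fits
# when the true model is quadratic (constants below are arbitrary choices).
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return 1.0 + 2.0 * x - 3.0 * x ** 2          # the "true" quadratic model

def sample_dataset(n=30, noise=0.3):
    x = rng.uniform(-1.0, 1.0, n)
    t = true_fn(x) + rng.normal(0.0, noise, n)    # targets = truth + noise
    return x, t

def bias_variance_at(x0, degree, n_datasets=500):
    """Estimate squared bias and variance of degree-`degree` fits at test point x0."""
    preds = []
    for _ in range(n_datasets):
        x, t = sample_dataset()
        coeffs = np.polyfit(x, t, degree)         # least-squares polynomial fit
        preds.append(np.polyval(coeffs, x0))
    preds = np.array(preds)
    bias_sq = (preds.mean() - true_fn(x0)) ** 2   # (average prediction - truth)^2
    variance = preds.var()                        # spread of predictions across datasets
    return bias_sq, variance

x0 = 0.5
for degree in (1, 3):
    b2, v = bias_variance_at(x0, degree)
    print(f"degree {degree}: bias^2 = {b2:.4f}, variance = {v:.4f}")

On a typical run, the degree-1 fit shows a much larger squared bias (it cannot represent the quadratic shape), while the degree-3 fit shows near-zero bias but a noticeably larger variance, matching the description above.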
Bias-variance decomposition is the frequentist (statistical) viewpoint on model complexity.
Suppose we have K datasets, each drawn independently from a distribution p(t, x), where t is the variable to be predicted and x is the feature vector. From each dataset D we can train a model y(x; D) with the learning algorithm; different datasets generally yield different models. The performance of the learning algorithm is measured by the average performance of the K models trained on the K datasets, i.e.:
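Writing \(\bar{y}(x)\) for the average of the K predictions, this average performance at a test point x can be expressed in the standard form

\[
\bar{y}(x) = \frac{1}{K}\sum_{k=1}^{K} y(x; D_k), \qquad
E_{\mathrm{avg}}(x) = \frac{1}{K}\sum_{k=1}^{K} \big(y(x; D_k) - h(x)\big)^2 .
\]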
Here h(x) denotes the true function that generates the data, i.e. t = h(x).
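Expanding the average squared error around the mean prediction \(\bar{y}(x)\) gives the following identity (the cross term vanishes because the deviations \(y(x; D_k) - \bar{y}(x)\) average to zero):

\[
\frac{1}{K}\sum_{k=1}^{K} \big(y(x; D_k) - h(x)\big)^2
= \underbrace{\big(\bar{y}(x) - h(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\frac{1}{K}\sum_{k=1}^{K} \big(y(x; D_k) - \bar{y}(x)\big)^2}_{\text{variance}} .
\]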
We can see that, for a given learning algorithm, the error between the learned model and the true function h(x) across multiple datasets consists of two parts: the bias and the variance. The bias is the error between the average of the learned models and the true function, while the variance is the average squared deviation of the individual learned models from their average.