1.1. Example: Polynomial Curve Fitting
1. Motivate a number of concepts:
(1) Linear models: functions which are linear in the unknown parameters. A polynomial is a linear model. For the polynomial curve fitting problem, the model is:
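From PRML (Eq. 1.1), the polynomial model of order M is

```latex
y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = \sum_{j=0}^{M} w_j x^j
```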
This is a linear function of the coefficients w, even though it is nonlinear in x.
(2) Error function: an error function measures the misfit between the predictions and the training set points. For instance, the sum of the squares of the errors is a simple and widely used choice, given by:
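The sum-of-squares error function (PRML Eq. 1.2), where t_n are the target values of the N training points:

```latex
E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\{ y(x_n, \mathbf{w}) - t_n \right\}^2
```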
(3) Model comparison or model selection: choosing an appropriate model complexity, such as the order M of the polynomial.
(4) Over-fitting: the model attains an excellent fit to the training data but gives very poor performance on test data. This behavior is known as over-fitting.
(5) Regularization: a technique often used to control the over-fitting phenomenon. It involves adding a penalty term to the error function in order to discourage the coefficients from reaching large values. The simplest such penalty term takes the form of a sum of squares of all of the coefficients, leading to a modified error function of the form:
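The regularized error function (PRML Eq. 1.4), where ||w||^2 = w^T w = w_0^2 + w_1^2 + ... + w_M^2 and the coefficient lambda governs the relative importance of the penalty term:

```latex
\widetilde{E}(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\{ y(x_n, \mathbf{w}) - t_n \right\}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2
```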
This particular case of a quadratic regularizer is called ridge regression (Hoerl and Kennard, 1970). In the context of neural networks, this approach is known as weight decay.
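A minimal sketch of ridge regression for the polynomial model: setting the gradient of the regularized error to zero gives the standard closed form w = (Phi^T Phi + lambda*I)^{-1} Phi^T t, where Phi is the design matrix with entries Phi[n, j] = x_n**j. The function names and the sin(2*pi*x) toy data are illustrative, not from the notes.

```python
import numpy as np

def design_matrix(x, M):
    """Design matrix for a polynomial of order M: column j holds x**j."""
    return np.vander(x, M + 1, increasing=True)

def ridge_fit(x, t, M, lam):
    """Coefficients w minimizing the regularized sum-of-squares error."""
    Phi = design_matrix(x, M)
    A = Phi.T @ Phi + lam * np.eye(M + 1)
    return np.linalg.solve(A, Phi.T @ t)

# Noisy samples of sin(2*pi*x), echoing PRML's running example.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

w = ridge_fit(x, t, M=9, lam=1e-3)
print(w.shape)  # (10,)
```

Increasing lambda shrinks the coefficient vector toward zero, which is exactly how the penalty term discourages large coefficient values.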
(6) Validation set, also called a hold-out set: if we were trying to solve a practical application using this approach of minimizing an error function, we would have to find a way to determine a suitable value for the model complexity. A simple way of achieving this is to take the available data and partition it into a training set, used to determine the coefficients w, and a separate validation set, also called a hold-out set, used to optimize the model complexity.
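The partitioning idea can be sketched as follows: fit polynomials of several orders M on a training set by plain least squares, then pick the M with the lowest RMS error on the held-out points. The data and all names here are illustrative assumptions.

```python
import numpy as np

def fit_poly(x, t, M):
    """Least-squares polynomial coefficients (increasing powers)."""
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def rms_error(w, x, t):
    """Root-mean-square error of the polynomial w on points (x, t)."""
    pred = np.polyval(w[::-1], x)  # polyval wants highest power first
    return np.sqrt(np.mean((pred - t) ** 2))

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 30)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

x_train, t_train = x[:20], t[:20]  # training set: determines w
x_val, t_val = x[20:], t[20:]      # hold-out set: selects complexity

scores = {M: rms_error(fit_poly(x_train, t_train, M), x_val, t_val)
          for M in range(10)}
best_M = min(scores, key=scores.get)
print(best_M, scores[best_M])
```

An over-fitted model (large M) shows low training error but a high validation score, so the hold-out set exposes exactly the over-fitting behavior described in item (4).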
1.2. Probability theory
1. The rules of probability: the sum rule and the product rule.
2. Bayes' theorem.
3. Probability densities
4. Expectations and Covariances
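For reference, the key formulas from PRML behind these items, for discrete variables (integrals replace sums for probability densities):

```latex
\text{sum rule:} \quad p(X) = \sum_{Y} p(X, Y)
\qquad
\text{product rule:} \quad p(X, Y) = p(Y \mid X)\, p(X)
```

```latex
\text{Bayes' theorem:} \quad p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}
```

```latex
\mathbb{E}[f] = \sum_{x} p(x) f(x)
\qquad
\operatorname{cov}[x, y] = \mathbb{E}_{x,y}\!\left[ (x - \mathbb{E}[x])(y - \mathbb{E}[y]) \right]
```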
PRML Reading Notes -- Introduction