I expect that students who have taken statistics or related courses will already know much of what this lecture covers. Although the material is very basic, it is also very important.
Based on a collection of house-price data, the lecture uses the linear regression algorithm to predict house prices.
To make the derivations easier to follow, the lecture fixes a set of standard notational conventions. These are worth learning: later derivations of other algorithms can reuse the same notation and the same style of reasoning.
The key takeaways from this lecture are given below.
1. Basic framework of machine learning algorithms
2. Least squares--the common cost function of linear regression: minimize the sum of squared errors.
3. Parametric learning algorithm--gradient descent, including batch gradient descent and stochastic gradient descent.
Gradient descent converges, but possibly only to a local optimum; if the objective function is convex, however, gradient descent is guaranteed to find the global optimum.
When the training set is very large, each parameter update in batch gradient descent must traverse all samples to compute the total error, which makes learning too slow. In that case stochastic gradient descent, which updates the parameters from the error of a single sample at a time, is usually faster than
batch gradient descent. (In theory, stochastic gradient descent is not guaranteed to converge.)
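The difference between the two update rules can be sketched in NumPy. This is a minimal illustration on a hypothetical toy dataset (y = 2x with no noise), not the lecture's own code; the learning rates and epoch counts are assumptions chosen so both variants settle near the true parameters.

```python
import numpy as np

# Hypothetical toy data: a column of ones (intercept) plus one feature.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])  # y = 2x, so theta should approach [0, 2]

def batch_gradient_descent(X, y, lr=0.1, epochs=2000):
    """Each update uses the gradient of the error summed over ALL samples."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (X @ theta - y) / len(y)  # full pass over the data
        theta -= lr * grad
    return theta

def stochastic_gradient_descent(X, y, lr=0.05, epochs=2000, seed=0):
    """Each update uses the error of ONE randomly chosen sample."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        i = rng.integers(len(y))
        theta -= lr * (X[i] @ theta - y[i]) * X[i]
    return theta

print(batch_gradient_descent(X, y))       # should approach [0, 2]
print(stochastic_gradient_descent(X, y))  # should approach [0, 2], more noisily
```

For large datasets the point is the cost per update: batch gradient descent touches every sample before moving theta once, while the stochastic version moves theta after every single sample.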
4. For the least-squares linear regression problem it is not actually necessary to search for the optimum with gradient descent; matrix theory yields a closed-form solution.
The final parameter solution is: θ = (XᵀX)⁻¹Xᵀy
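The closed-form solution is a one-liner in NumPy. This sketch reuses the same hypothetical y = 2x toy data as an assumption; on noiseless data the formula recovers the parameters exactly.

```python
import numpy as np

# Hypothetical toy data: intercept column plus one feature, y = 2x exactly.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# theta = (X^T X)^{-1} X^T y. Solving the linear system is numerically
# preferable to forming the explicit inverse, but computes the same thing.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # [0. 2.] -- intercept 0, slope 2
```

Unlike gradient descent, this needs no learning rate and no iteration, but it requires inverting (or factoring) a d×d matrix, which becomes expensive when the number of features d is large.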
Source: Stanford University machine learning public class, Lecture 2: Supervised learning applications, gradient descent.