California Institute of Technology Open Course: Machine Learning and Data Mining - Radial Basis Functions (Lecture 16)

Source: Internet
Author: User
Tags: svm

Course Description: This lesson mainly introduces the RBF model and compares it with the nearest-neighbor algorithm, neural networks, and kernel methods. Finally, the regularization of the RBF model is introduced.
Course Outline:
1. What is RBF?
2. RBF and nearest neighbors
3. RBF and neural networks
4. RBF and kernel methods
5. RBF and regularization
1. What is RBF?

RBF (radial basis function) is a model in which the hypothesis h(x) is influenced by every point in the training set, and how a point influences the model varies with the problem. The focus of this lesson is the case where the influence of a training point x_n on the model depends on the distance ||x - x_n||; that is, the model is radial (based on the radius). The standard form, with a Gaussian basis function, is

h(x) = sum_{n=1}^{N} w_n exp(-gamma ||x - x_n||^2)

(Other basis functions can be used, but since the model is radial, it must contain the term ||x - x_n||.)

With the model in hand, we also need to learn its parameters. The formula above has two kinds of parameters: the weights w_n and gamma. Gamma controls the shape (width) of the Gaussian curve. Set gamma aside for now and first look at how to learn the w_n. Learning needs a criterion, and the criterion here is an exact fit on the training set: h(x_n) = y_n for every n, where y_n is the true value of the data point (for a classification problem, its label). So our problem is to solve the system of N equations

sum_{m=1}^{N} w_m exp(-gamma ||x_n - x_m||^2) = y_n,  for n = 1, ..., N.
In matrix form this is Phi w = y, where Phi_{nm} = exp(-gamma ||x_n - x_m||^2). If the matrix Phi is invertible (reportedly this can be shown by an interpolation argument), then

w = Phi^{-1} y.

So far we can use the training data to find the parameters w, and everything looks smooth. Does that mean the model is solved as long as we also obtain gamma? The answer is no, because of overfitting. The solution obtained by the method above has an error of exactly 0 on the sample data, and as said before, this is not a good thing: fitting N parameters to N points reduces the ability to generalize. The remedy used here is clustering.
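The exact-interpolation solve above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the lecture; the function names and the toy sine target are my own.

```python
import numpy as np

def gaussian_rbf_matrix(X, centers, gamma):
    """Phi[n, m] = exp(-gamma * ||X[n] - centers[m]||^2)."""
    # Squared pairwise distances via broadcasting.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_exact_rbf(X, y, gamma):
    """One Gaussian per training point: solve Phi w = y exactly."""
    Phi = gaussian_rbf_matrix(X, X, gamma)
    return np.linalg.solve(Phi, y)  # assumes Phi is invertible

def predict(X_new, centers, w, gamma):
    return gaussian_rbf_matrix(X_new, centers, gamma) @ w

# Toy data (hypothetical example): 20 points from a sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(20, 1))
y = np.sin(3 * X[:, 0])
w = fit_exact_rbf(X, y, gamma=2.0)
# The in-sample residual is numerically near zero -- exactly the
# zero-training-error behavior that leads to overfitting.
resid = np.max(np.abs(predict(X, X, w, gamma=2.0) - y))
```

Because every training point gets its own basis function and its own weight, the fit passes through all the data; that is the zero in-sample error the text warns about.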
2. RBF and nearest neighbors

For the overfitting problem mentioned in section 1, clustering can be used as a fix. Basic idea: use some method (for example, K-means) to group the training data into K clusters, and let each cluster center mu_k represent its cluster. The model then becomes:

h(x) = sum_{k=1}^{K} w_k exp(-gamma ||x - mu_k||^2)

For this model there are two questions:
1. How to choose the K center points mu_k.
2. How to learn the weights w_k.

The first question can be solved with K-means. Now look at the second. Because there are now only K parameters w_k for N data points, the model generally cannot fit the data exactly and there is an in-sample error. We solve Phi w = y in the least-squares sense, which gives w via the pseudo-inverse, w = (Phi^T Phi)^{-1} Phi^T y, where Phi_{nk} = exp(-gamma ||x_n - mu_k||^2) (linear algebra).

The remaining question is how to choose gamma. The method used is in the style of expectation-maximization for a mixture of Gaussians (the EM algorithm in mixture of Gaussians):
Step 1: fix gamma and solve for w.
Step 2: fix w and find the gamma that minimizes the model error.
Step 3: jump back to step 1 until a termination condition is met (iterate some number of times).

3. RBF and neural networks

Through the steps above, the RBF model can already be solved. Now compare it with a neural network. From the diagrams in the slides we can see:
1. The RBF network and the neural network have the same layered form.
2. In the RBF network the first-level inputs ||x - mu_k|| are fixed once the centers are chosen, while in a neural network the corresponding first-level parameters must be learned by backpropagation.
3. In the RBF network, when the first-level input ||x - mu_k|| is very large, the corresponding node's output becomes very small (Gaussian model), so faraway points have almost no influence. A neural network does not have this feature in general; it depends on the specific function used by the node.

4. RBF and kernel methods

Next, look at the comparison between the RBF network and the SVM with an RBF kernel. First, in form:

SVM kernel: g(x) = sign( sum_{alpha_n > 0} alpha_n y_n exp(-gamma ||x - x_n||^2) + b )
RBF network: h(x) = sign( sum_{k=1}^{K} w_k exp(-gamma ||x - mu_k||^2) + b )

For the RBF network, an additional bias parameter b has been added and the problem has been changed to classification, to make the comparison with the SVM kernel easier. The first question we care about is: how do they perform?
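The clustering-based RBF network of section 2 can be sketched the same way. This is a minimal sketch under assumptions of mine: plain Lloyd's-algorithm K-means for the centers, a least-squares (pseudo-inverse) solve for the weights, and a hypothetical sine target; none of this is the lecture's code.

```python
import numpy as np

def kmeans(X, K, iters=50, seed=0):
    """Plain Lloyd's algorithm: returns K cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):  # keep the old center if a cluster empties
                centers[k] = X[labels == k].mean(axis=0)
    return centers

def rbf_features(X, centers, gamma):
    """Phi[n, k] = exp(-gamma * ||X[n] - centers[k]||^2)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_rbf_network(X, y, K, gamma):
    """K centers from K-means, then the pseudo-inverse weight solve."""
    centers = kmeans(X, K)
    Phi = rbf_features(X, centers, gamma)
    # K < N, so exact interpolation is impossible; minimize the squared
    # in-sample error instead: w = (Phi^T Phi)^-1 Phi^T y.
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, w

def predict(X, centers, w, gamma):
    return rbf_features(X, centers, gamma) @ w

# Toy data: 100 noise-free points from a sine curve.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(100, 1))
y = np.sin(3 * X[:, 0])
centers, w = fit_rbf_network(X, y, K=9, gamma=8.0)
err = np.mean((predict(X, centers, w, gamma=8.0) - y) ** 2)
```

The gamma alternation described above would wrap this: fix gamma and call fit_rbf_network, then fix w and search for a gamma that lowers the error, and repeat. For the classification form of section 4, append a constant column to Phi to obtain the bias b, and take the sign of the output.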
The figure in the slides shows the performance of the two models on the same data (the green curve represents the target function). We can see that, although these are models from two different worlds, their decision boundaries are very close (the SVM does a bit better here); on a specific problem, it is hard to know in advance which will be better. Note that in the comparison, the number of clusters K used in the RBF network was set equal to the number of support vectors in the SVM.

5. RBF and regularization

In the lecture, the connection is that the RBF form can itself be derived from regularization: minimizing the in-sample error plus a smoothness penalty on the hypothesis yields radial basis functions, which is what gives the model its theoretical legitimacy.
Note: how should K be selected in the clustering? I initially wondered whether the VC dimension could be computed and used as a reference. A student asked this question at the end of the class, but the professor said it cannot be done that way: K determines the VC dimension, not the other way around (K -> VC, rather than VC -> K).
