So, here we use a two-step training method that combines SVM with logistic regression. The first step runs SVM to obtain w_SVM and b_SVM; the second step treats the SVM score w_SVM^T φ(x_n) + b_SVM as a one-dimensional feature and runs logistic regression on it, learning a scaling parameter A and a shifting parameter B. Looking at the final result, if A > 0 then w_SVM is doing a good job, and if B is close to 0 then b_SVM is also reliable.
Here we summarize Platt's model into the following steps:

1. Run SVM on the data to get (b_SVM, w_SVM), and transform every x_n into the scalar z_n = w_SVM^T φ(x_n) + b_SVM.
2. Run logistic regression on the one-dimensional data {(z_n, y_n)} to get (A, B).
3. Return the soft classifier g(x) = θ(A · (w_SVM^T φ(x) + b_SVM) + B), where θ is the logistic function.
Because of the shift B, there is a certain translation effect, so the decision boundary of this soft binary classifier differs somewhat from the original SVM boundary. The second-step logistic regression has only the two parameters (A, B) and can be solved with GD/SGD or similar methods, as in the sketch below.
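As a minimal sketch of the two-step procedure, assuming scikit-learn (the lecture does not prescribe a library, and sklearn's LogisticRegression adds its own L2 regularization by default, so this only approximates plain Platt scaling):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data; labels are +1/-1.
rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

# Step 1: run a kernel SVM to get the score w_SVM^T phi(x) + b_SVM.
svm = SVC(kernel="rbf", C=1.0)
svm.fit(X, y)
z = svm.decision_function(X).reshape(-1, 1)   # one-dimensional feature z_n

# Step 2: logistic regression on (z_n, y_n) learns the scale A and shift B.
lr = LogisticRegression()
lr.fit(z, y)
A, B = lr.coef_[0, 0], lr.intercept_[0]

# Step 3: soft classifier g(x) = theta(A * score(x) + B).
def predict_proba(X_new):
    s = svm.decision_function(X_new)
    return 1.0 / (1.0 + np.exp(-(A * s + B)))

print(f"A = {A:.3f} (A > 0 means w_SVM is good), B = {B:.3f}")
```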
An important concept behind the kernel method is that the optimal w obtained at the end is a linear combination of the z_n. This holds for SVM, PLA, and logistic regression alike.
It can be proved that for any L2-regularized linear model, the optimal solution w can be expressed as a linear combination of the z_n; this is the representer theorem. The proof is by contradiction: split the optimal w into a component w_∥ lying in the span of the z_n and a component w_⊥ orthogonal to that span; w_⊥ contributes nothing to the predictions w^T z_n but strictly increases the regularizer w^T w, so optimality forces w_⊥ = 0.
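Written out under the usual notation (z_n = φ(x_n), err an arbitrary pointwise error), the claim and the contradiction argument look roughly like this:

```latex
% Representer theorem for L2-regularized linear models (proof sketch)
\min_{w}\ \frac{\lambda}{N}\, w^{\top} w
        + \frac{1}{N}\sum_{n=1}^{N} \mathrm{err}\!\left(y_n,\; w^{\top} z_n\right)
\quad\Longrightarrow\quad
w_{*} = \sum_{n=1}^{N} \beta_n z_n .

% Decompose the optimum:
%   w_* = w_\parallel + w_\perp, \quad
%   w_\parallel \in \mathrm{span}(z_1,\dots,z_N), \quad
%   w_\perp \perp \mathrm{span}(z_1,\dots,z_N).
% Predictions are unchanged, since w_\perp^{\top} z_n = 0:
%   w_*^{\top} z_n = w_\parallel^{\top} z_n .
% But the regularizer grows if w_\perp \neq 0:
%   w_*^{\top} w_* = w_\parallel^{\top} w_\parallel + w_\perp^{\top} w_\perp
%                  > w_\parallel^{\top} w_\parallel ,
% contradicting the optimality of w_*. Hence w_\perp = 0.
```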
So, applying the representer theorem to kernel logistic regression, we write w = Σ_n β_n z_n and substitute it into the L2-regularized logistic regression objective; the resulting objective in β involves the data only through kernel values K(x_n, x_m), and GD/SGD can be used to solve it.
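As a concrete sketch (an illustration, not the lecture's reference code), here is KLR solved by plain gradient descent on β, assuming an RBF kernel, ±1 labels, and the objective min_β (λ/N) β^T K β + (1/N) Σ_n log(1 + exp(−y_n (Kβ)_n)):

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # K[i, j] = exp(-gamma * ||X1[i] - X2[j]||^2)
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def train_klr(X, y, lam=0.1, gamma=1.0, eta=0.1, n_iters=2000):
    """Gradient descent on beta for L2-regularized kernel logistic regression.

    Objective: (lam/N) * beta^T K beta
               + (1/N) * sum_n log(1 + exp(-y_n * (K beta)_n))
    Labels y are assumed to be +1/-1.
    """
    N = len(y)
    K = rbf_kernel(X, X, gamma)
    beta = np.zeros(N)
    for _ in range(n_iters):
        s = K @ beta                           # scores s_n = sum_m beta_m K(x_m, x_n)
        sig = 1.0 / (1.0 + np.exp(y * s))      # sigma(-y_n * s_n)
        grad = (2 * lam / N) * (K @ beta) - (1 / N) * (K @ (y * sig))
        beta -= eta * grad
    return beta

def predict_klr(beta, X_train, X_test, gamma=1.0):
    # Probability of label +1: theta(sum_m beta_m K(x_m, x))
    s = rbf_kernel(X_test, X_train, gamma) @ beta
    return 1.0 / (1.0 + np.exp(-s))
```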
From another point of view, KLR can be seen as a linear model in β whose transformed features are the kernel values K(x_n, x_m). But it should be noted that in KLR the β_n are typically all nonzero, unlike SVM, where α_n ≠ 0 only for the support vectors; see the small check below.
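To make the sparsity contrast concrete, a tiny check on hypothetical data (sklearn assumed):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)    # XOR-like, needs a kernel

svm = SVC(kernel="rbf", C=1.0).fit(X, y)
# Only the support vectors have alpha_n != 0, so the SVM solution is sparse.
print(f"SVM: {len(svm.support_)} of {len(X)} alpha_n are nonzero")
# By contrast, a KLR beta trained as in the sketch above is generically
# dense: every beta_n takes some nonzero value, so all N kernel terms
# K(x_n, x) are needed at prediction time.
```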
Kernel Logistic Regression