Background

This post reproduces Table 12.2 of ESL §12.3 (Support Vector Machines and Kernels). The setup is as follows: 100 observations are generated in each of two classes. The first class has four standard normal independent features \(X_1, X_2, X_3, X_4\). The second class also has four standard normal independent features, but conditioned on \(9\le \sum X_j^2\le 16\). This is a relatively easy problem. For a second, harder problem, six standard Gaussian noise features are appended to the inputs.
Generate Data
```r
########################################
## Generate dataset
## "No Noise Features":  num_noise = 0
## "Six Noise Features": num_noise = 6
########################################
genXY <- function(n = 100, num_noise = 0) {
  ## class 1: standard normal features
  m1 = matrix(rnorm(n * (4 + num_noise)), ncol = 4 + num_noise)
  ## class 2: standard normal features conditioned on 9 <= sum(x_j^2) <= 16
  m2 = matrix(nrow = n, ncol = 4 + num_noise)
  for (i in 1:n) {
    while (TRUE) {
      m2[i, ] = rnorm(4 + num_noise)
      tmp = sum(m2[i, 1:4]^2)
      if (tmp >= 9 & tmp <= 16) break
    }
  }
  x = rbind(m1, m2)
  y = rep(c(1, 2), each = n)
  return(data.frame(x = x, y = as.factor(y)))
}
```
Model Training
- SVM is fitted by directly calling the `svm()` function in the `e1071` package
- Both BRUTO and MARS are called from the `mda` package; since both are regression methods, they are converted to classifiers by comparing the fitted value with the numeric class labels and assigning the closer class
- The book states that MARS is used with no restriction on the interaction order; in the actual code the order is set to 10
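The closest-label rule used for BRUTO and MARS in the second bullet can be sketched in base R; `to_label` is a hypothetical helper name, and the fitted values below are made up for illustration:

```r
## Convert regression fits to class labels coded 1 and 2: assign the
## numerically closer label, i.e. split at the midpoint 1.5.
to_label <- function(fitted_vals) {
  ifelse(fitted_vals < 1.5, 1, 2)
}

to_label(c(1.2, 1.7, 1.49))  # -> 1 2 1
```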
Cross-validation to Select an Appropriate \(C\)

I select \(C\) in two steps:

- Coarse search: find the optimal \(C\) over a wide range
- Fine search: search again in a narrow grid around the best value from the previous step

Take care that the optimum does not land on the boundary of the search range. SVM/poly5 is used as an example below; the other methods are similar.
```r
## SVM/poly5
set.seed(123)
poly5 = tune.svm(y ~ ., data = dat, kernel = "polynomial",
                 degree = 5, cost = 2^(-4:8))
summary(poly5)
```
The optimal \(C\) selected at this stage is 32; we then refine the search around it:
```r
set.seed(1234)
poly5 = tune.svm(y ~ ., data = dat, kernel = "polynomial",
                 degree = 5, cost = seq(16, 64, by = 2))
summary(poly5)
```
So \(C\) is taken to be 28.
The optimal \(C\) for the other methods is chosen in the same way; one experimental result is as follows:
| Method | Best Cost |
| --- | --- |
| SV Classifier | 2.6 |
| SVM/poly 2 | 1 |
| SVM/poly 5 | 28 |
| SVM/poly 10 | 0.5 |
Of course, in practice we do not need to refit the model with the chosen parameters, because the result returned by `tune.svm()` already contains the optimal model; it can be used directly, e.g. `poly5$best.model`.
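As a self-contained sketch of reusing the tuned model (the data here are made up, and the small cost grid is only so the example runs quickly):

```r
library(e1071)

set.seed(123)
x <- matrix(rnorm(100 * 4), ncol = 4)
y <- factor(rep(c(1, 2), each = 50))
dat <- data.frame(x = x, y = y)

## tune.svm() keeps the model refitted at the best cost in $best.model
tuned <- tune.svm(y ~ ., data = dat, kernel = "polynomial",
                  degree = 2, cost = 2^(-2:2))
best <- tuned$best.model   # ready to use, no manual refitting needed
pred <- predict(best, newdata = dat[, -ncol(dat)])
```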
Calculate Test Error
```r
predict.mars2 <- function(model, newdata) {
  pred = predict(model, newdata)
  ifelse(pred < 1.5, 1, 2)
}

## NOTE: the defaults for n and nrep were lost in transcription and are assumed here
calcErr <- function(model, n = 100, nrep = 50, num_noise = 0, method = "SVM") {
  err = sapply(1:nrep, function(i) {
    dat = genXY(n, num_noise = num_noise)
    datx = dat[, -ncol(dat)]
    daty = dat[, ncol(dat)]
    if (method == "SVM")
      pred = predict(model, newdata = datx)
    else if (method == "MARS")
      pred = predict.mars2(model, newdata = datx)
    else if (method == "BRUTO")
      pred = predict.mars2(model, newdata = as.matrix(datx))
    sum(pred != daty) / (2 * n)  # attention!! the total number of observations is 2n, not n
  })
  return(list(testerr = mean(err), se = sd(err)))
}
```
It is worth noting that for BRUTO and MARS, since the program treats them as regression models, the fitted values need to be further converted to class labels. Because the classes are coded 1 and 2, an observation is assigned to class 1 if its fitted value is below 1.5, and to class 2 otherwise.
Results
Comparing the results with Table 12.2, the error rates of each method and the relative sizes of the standard errors are quite consistent.
Bayes Error Rate
For Category 1,
\[\sum_{j=1}^4 X_j^2\sim \chi^2(4)\]
For Category 2, \(\sum_{j=1}^4 X_j^2\) follows the truncated distribution with density
\[\frac{f(t)\, I(9\le t\le 16)}{\int_9^{16} f(t)\, dt}\,,\]
where \(f(t)\) is the density function of \(\chi^2(4)\).
Under the Bayes rule, a point is assigned to Category 2 exactly when \(9\le\sum_{j=1}^4 X_j^2\le 16\), since the truncated density exceeds \(f(t)\) on that shell; only Category 1 points falling in the shell are misclassified. The Bayes error rate is therefore
\[\frac{1}{2}\int_{9}^{16} f(t)\, dt \approx 0.029\,.\]
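This value is easy to verify numerically with the \(\chi^2(4)\) CDF in base R:

```r
## P(9 <= chi^2_4 <= 16) / 2: only Category 1 points landing in the
## shell are misclassified by the Bayes rule
bayes_err <- (pchisq(16, df = 4) - pchisq(9, df = 4)) / 2
round(bayes_err, 3)  # -> 0.029
```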
The complete code can be found in skin-of-the-orange.R.
Permanent link to this article: Simulation: Tab. 12.2, comparing SVM, MARS and BRUTO (R language) with a simple example