Continuing from the previous article, Using Support Vector Machines (SVM) for Data Mining in R (Part 1):
http://blog.csdn.net/baimafujinji/article/details/49885481
The second way to use the svm() function is to build a model directly from the given data. This form is more complex, but it lets us build models more flexibly. The function's format is as follows (note that only the main parameters are listed).
> svm(x, y = NULL, scale = TRUE, type = NULL, kernel = "radial",
+     degree = 3, gamma = if (is.vector(x)) 1 else 1 / ncol(x),
+     coef0 = 0, cost = 1, nu = 0.5, subset, na.action = na.omit)
Here, x can be a data matrix or a data vector, and it can also be a sparse matrix. y is the result label for the data in x; it can be either a character (factor) vector or a numeric vector. Together, x and y specify the training data and the basic form of the model to be built.
The parameter type specifies the kind of model to be established. A support vector machine can serve as a classification model, a regression model, or an anomaly detection model. Accordingly, type can take one of five values in the svm() function: C-classification, nu-classification, one-classification, eps-regression, and nu-regression. The first three are classification methods for categorical results; the third of these is a logical discriminant, that is, a one-class model whose output states whether a sample belongs to the learned category. The last two are regression methods for numeric result variables. A brief illustration follows below.
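As a sketch of how type is passed (the iris data is used here, and the names m1 and m2 are illustrative, not from the original post):

> library(e1071)
> # classification on a factor label: C-classification is used by default here
> m1 <- svm(iris[, -5], iris[, 5], type = "C-classification")
> # anomaly (novelty) detection: one-classification needs only the features
> m2 <- svm(iris[, -5], type = "one-classification")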
In addition, kernel specifies the kernel function used when building the model. A kernel function transforms the original features, raising them into a higher-dimensional space so that problems that are not linearly separable for the support vector machine become solvable, thereby improving the model's accuracy. The kernel parameter of the svm() function offers four choices: the linear kernel, the polynomial kernel, the Gaussian (radial basis) kernel, and the sigmoid (neural network) kernel. Among them, the Gaussian kernel and the polynomial kernel are generally considered the best and most commonly used.
Kernel functions fall into two main types: local kernels and global kernels. The Gaussian kernel is a typical local kernel, and the polynomial kernel is a typical global kernel. A local kernel influences only the data points in a small neighborhood around the test point; it has strong learning ability but weak generalization. A global kernel, conversely, has relatively strong generalization but weak learning ability.
For the chosen kernel, the degree parameter is the degree of the polynomial kernel's inner-product function; its default value is 3. The gamma parameter supplies the coefficient used by every kernel function except the linear one; its default value is 1 when x is a vector and 1/ncol(x) otherwise. The coef0 parameter is the constant term in the polynomial and sigmoid kernels; its default value is 0.
In addition, the parameter cost is the penalty placed on outliers (constraint violations) in the soft-margin model. Finally, the parameter nu is used by the nu-regression, nu-classification, and one-classification types.
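A consolidated sketch of these kernel and hyperparameter settings (the specific values and the name m_poly are illustrative only):

> # polynomial kernel with its hyperparameters spelled out
> m_poly <- svm(iris[, -5], iris[, 5], kernel = "polynomial",
+               degree = 3, gamma = 0.25, coef0 = 0, cost = 1)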
An empirical conclusion is that the svm() function tends to produce a better model when the data are standardized.
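Note that svm() standardizes the data by default through its scale argument; a quick sketch (the model names are illustrative):

> m_std <- svm(iris[, -5], iris[, 5], scale = TRUE)   # default: center and scale each column
> m_raw <- svm(iris[, -5], iris[, 5], scale = FALSE)  # use the raw feature values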
Following the second usage format of the function, the result variable and the feature variables should be extracted separately from the above data before the model is built: the results go into a vector, and the features go into a matrix. Once the data are prepared, the kernel function and its corresponding parameter values should be chosen according to the analysis at hand; by default, the Gaussian kernel is used. Sample code is given below.
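The original post's snippet is not reproduced here, so the following is a minimal reconstruction. The model name model3 is taken from the plotting code later in this article; x and y are my own names:

> library(e1071)
> data(iris)
> x <- iris[, -5]   # feature matrix: the four measurements
> y <- iris[, 5]    # result vector: the Species factor
> model3 <- svm(x, y, kernel = "radial",
+               gamma = if (is.vector(x)) 1 else 1 / ncol(x))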
When you build a model using this second format, you do not need to state explicitly what form of model you are building: the function automatically uses all of the input feature data as the feature vectors needed for the model. In the code above, the expression that determines the gamma coefficient of the kernel function means: if the feature input is a vector, gamma is 1; otherwise, gamma is the reciprocal of the number of feature columns.
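For the iris data this default can be checked directly (a small sanity check of my own):

> is.vector(iris[, -5])   # FALSE: a data frame with four feature columns
> 1 / ncol(iris[, -5])    # so the default gamma is 1/4 = 0.25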
After building the model from the sample data, we can use it to predict and discriminate accordingly. For a model built with svm(), the function predict() accomplishes this work. When using predict(), first confirm the sample data to be predicted, and combine the feature variables of those samples into a single matrix. Take a look at the sample code below.
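Again the original snippet is not shown, so here is a minimal reconstruction (the name pred is my own); it predicts the training samples themselves:

> pred <- predict(model3, x)
> pred[c(1, 51, 101)]   # inspect one prediction from each species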
It is usually necessary to check the accuracy of the model's predictions after they are made, which calls for the function table() to compare the predictions against the actual results. From the output below, you can see that the model predicts every iris of the setosa type correctly; it predicts 48 of the versicolor irises correctly but misclassifies the other two as virginica; and it predicts 48 of the virginica irises correctly but misclassifies the other two as versicolor.
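A sketch of that check; the exact layout of the printed table is approximate, but the counts match the ones described above:

> table(pred, y)
            y
pred         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         48         2
  virginica       0          2        48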
An optional parameter of predict() worth discussing briefly is decision.values. By default, this parameter is FALSE. If it is set to TRUE, the returned vector carries an attribute named "decision.values", which is an n x c matrix: n is the number of samples being predicted, and c is the number of binary classifiers. Recall that when a support vector machine classifies sample data into k categories, a binary classifier is trained between every pair of classes, so the total number of binary classifiers is k(k-1)/2. The column names of the decision-value matrix are the labels of these pairwise classifiers. Take a look at the sample code below.
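A reconstruction of that snippet, pulling out the three samples the discussion below refers to (selecting the rows this way is my own choice):

> pred <- predict(model3, x, decision.values = TRUE)
> attr(pred, "decision.values")[c(4, 77, 78), ]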
Since we are dealing with a classification problem, the final classification decision is made by applying a sign function to the decision values. From the output above, for sample 4, the value under the label setosa/versicolor is greater than 0, voting for setosa; the value under setosa/virginica is also greater than 0, again voting for setosa; and in the versicolor/virginica classifier the decision value is greater than 0, voting for versicolor. Sample 4 is therefore finally determined to belong to setosa. By the same reasoning, the signs of the decision values show that sample 77 and sample 78 belong to the versicolor and virginica categories, respectively.
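The pairwise votes can be read off directly by taking the signs (a one-line sketch, assuming the pred object from the previous snippet):

> sign(attr(pred, "decision.values")[c(4, 77, 78), ])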
To analyze the model further, it can be displayed visually; the sample code is shown below, and the result appears in Figure 14-15. The image gives a general view of the data categories after the established support vector machine model is visualized with the plot() function. In the figure, "+" marks a support vector and "o" marks an ordinary sample point.
> plot(cmdscale(dist(iris[, -5])),
+      col = c("orange", "blue", "green")[as.integer(iris[, 5])],
+      pch = c("o", "+")[1:150 %in% model3$index + 1])
> legend(1.8, -0.8, c("setosa", "versicolor", "virginica"),
+        col = c("orange", "blue", "green"), lty = 1)
As Figure 14-15 shows, the first category, setosa, is clearly separated from the other two, while the remaining versicolor and virginica categories differ very little and even overlap, making them hard to distinguish. Note that they remain hard to separate even though all four features were used. This explains, from another angle, what happened during prediction: it is quite normal that the model misclassified two versicolor flowers as virginica and two virginica flowers as versicolor.