SVM stands for Support Vector Machine, a common discriminative method. In the field of machine learning, it is a supervised learning model, usually used for pattern recognition, classification, and regression analysis.
In MATLAB, the LIBSVM toolkit written by Chih-Jen Lin supports SVM training well. In Python, we have the scikit-learn (sklearn) toolkit for training machine learning algorithms; scikit-learn implements all of the basic machine learning algorithms.
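As a quick illustration (not part of the original post), every scikit-learn estimator follows the same fit/predict pattern; a minimal sketch on made-up toy points:

```python
from sklearn import svm

# Four toy 2-D points on a line, two per class (illustrative data only).
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = [0, 0, 1, 1]

# Train a linear-kernel SVM: fit() learns the model, predict() classifies new points.
clf = svm.SVC(kernel='linear')
clf.fit(X, y)
print(clf.predict([[0.5, 0.5], [2.5, 2.5]]))  # prints [0 1]
```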
The following is adapted from the blog at https://www.cnblogs.com/luyaoblog/p/6775342.html, with the original Python 2 code updated to Python 3.
For more python and machine learning content, please visit omegaxyz.com
The following example uses the Iris dataset:
In the raw Iris dataset downloaded from the UCI database, the first four columns are feature columns and the fifth column is the category column, with three categories: Iris-setosa, Iris-versicolor, and Iris-virginica.
You need to use NumPy to split the features from the labels.
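A minimal sketch of this loading step, assuming np.loadtxt with a converters entry for the fifth column (the two sample rows and the in-memory file are illustrative stand-ins for iris.data):

```python
import io
import numpy as np

# Two rows in the same format as iris.data (values are illustrative).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "6.4,3.2,4.5,1.5,Iris-versicolor\n"
)

def iris_type(s):
    # Depending on the NumPy version, the field arrives as bytes or str.
    if isinstance(s, bytes):
        s = s.decode()
    return {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}[s]

# The converter maps column 4 (the class name) to a number, so every row is numeric.
data = np.loadtxt(sample, dtype=float, delimiter=',', converters={4: iris_type})
print(data.shape)  # prints (2, 5)
```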
Dataset download address: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/
Download iris.data.
Python 3 code:

```python
from sklearn import svm
from sklearn.model_selection import train_test_split
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import colors


def iris_type(s):
    # Depending on the NumPy version, the field arrives as bytes or str.
    if isinstance(s, bytes):
        s = s.decode()
    it = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
    return it[s]


path = 'C:\\Users\\dell\\Desktop\\iris.data'  # data file path
data = np.loadtxt(path, dtype=float, delimiter=',', converters={4: iris_type})

x, y = np.split(data, (4,), axis=1)
x = x[:, :2]  # keep only the first two features
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)

# clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
clf = svm.SVC(C=0.8, kernel='rbf', gamma=20, decision_function_shape='ovr')
clf.fit(x_train, y_train.ravel())

print(clf.score(x_train, y_train))  # training accuracy
y_hat = clf.predict(x_train)
print(clf.score(x_test, y_test))    # test accuracy
y_hat2 = clf.predict(x_test)

x1_min, x1_max = x[:, 0].min(), x[:, 0].max()  # range of column 0
x2_min, x2_max = x[:, 1].min(), x[:, 1].max()  # range of column 1
x1, x2 = np.mgrid[x1_min:x1_max:200j, x2_min:x2_max:200j]  # generate grid sampling points
grid_test = np.stack((x1.flat, x2.flat), axis=1)  # test points

mpl.rcParams['font.sans-serif'] = [u'SimHei']  # needed for the Chinese labels in the original post
mpl.rcParams['axes.unicode_minus'] = False

cm_light = mpl.colors.ListedColormap(['#A0FFA0', '#FFA0A0', '#A0A0FF'])
cm_dark = mpl.colors.ListedColormap(['g', 'r', 'b'])

grid_hat = clf.predict(grid_test)      # predicted class for each grid point
grid_hat = grid_hat.reshape(x1.shape)  # reshape to the grid's shape

alpha = 0.5
plt.pcolormesh(x1, x2, grid_hat, cmap=cm_light)  # show the predictions
plt.plot(x[:, 0], x[:, 1], 'o', alpha=alpha, color='blue', markeredgecolor='k')
plt.scatter(x_test[:, 0], x_test[:, 1], s=120, facecolors='none', zorder=10)  # circle the test-set samples
plt.xlabel(u'sepal length', fontsize=13)
plt.ylabel(u'sepal width', fontsize=13)
plt.xlim(x1_min, x1_max)
plt.ylim(x2_min, x2_max)
plt.title(u'SVM classification', fontsize=15)
plt.show()
```
np.split(data, split position, axis): axis=1 splits by columns (horizontally), axis=0 splits by rows (vertically). Here np.split(data, (4,), axis=1) puts columns 0-3 into x and column 4 into y.
x = x[:, :2] keeps only the first two feature columns for training, so that the result can be drawn more intuitively in a 2-D plot later.
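The two splitting steps above can be sketched on a toy array (values are made up for illustration):

```python
import numpy as np

# Toy array standing in for the loaded Iris data: 4 feature columns + 1 label column.
data = np.array([[5.1, 3.5, 1.4, 0.2, 0.0],
                 [6.4, 3.2, 4.5, 1.5, 1.0]])

# Split after column 4: x gets columns 0-3 (features), y gets column 4 (labels).
x, y = np.split(data, (4,), axis=1)
print(x.shape)  # prints (2, 4)
print(y.shape)  # prints (2, 1)

# Keep only the first two feature columns for 2-D plotting.
x2 = x[:, :2]
print(x2.shape)  # prints (2, 2)
```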
sklearn.model_selection.train_test_split randomly splits the data into a training set and a test set: train_test_split(train_data, train_target, test_size=..., random_state=0)
Parameter explanation:
train_data: the feature samples to be split
train_target: the corresponding labels to be split
test_size: the proportion of the test set; if an integer, the number of test samples
random_state: the random seed
Random seed: the number used to initialize the random number generator. When an experiment must be repeatable, a fixed seed guarantees the same random split every time. For example, passing random_state=1 on every run (with the other parameters unchanged) always yields the same split; any other fixed value, such as 0, is also reproducible, just with a different split. If random_state is omitted (None), a different split is produced on every run. Random number generation depends on the seed according to two rules: different seeds produce different random numbers; the same seed produces the same random numbers, even across different instances.
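A small reproducibility check (illustrative, not from the original post), assuming scikit-learn's train_test_split:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 samples with 2 features each, and 10 labels.
x = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Same seed -> identical split on every call.
a_train, a_test, _, _ = train_test_split(x, y, random_state=1, train_size=0.6)
b_train, b_test, _, _ = train_test_split(x, y, random_state=1, train_size=0.6)
print(np.array_equal(a_train, b_train))  # prints True

# A different seed almost always produces a different split.
c_train, _, _, _ = train_test_split(x, y, random_state=2, train_size=0.6)
print(np.array_equal(a_train, c_train))
```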
With kernel='linear' (a linear kernel), a larger C classifies the training data better, but may overfit (default C=1).
With kernel='rbf' (the default, a Gaussian kernel), a smaller gamma value gives a smoother, more continuous decision boundary; a larger gamma value gives a more "scattered", fragmented boundary that classifies the training data better, but may overfit.
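To see the overfitting tendency described above, here is an illustrative sketch comparing a small and a very large gamma (it uses scikit-learn's bundled copy of the Iris data instead of the downloaded iris.data file, and the exact scores depend on the split):

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled Iris data; only the first two features are kept, as in the main code.
iris = load_iris()
x = iris.data[:, :2]
y = iris.target
x_train, x_test, y_train, y_test = train_test_split(
    x, y, random_state=1, train_size=0.6)

scores = {}
for gamma in (1, 1000):
    clf = svm.SVC(C=0.8, kernel='rbf', gamma=gamma, decision_function_shape='ovr')
    clf.fit(x_train, y_train)
    # Record (training accuracy, test accuracy) for each gamma.
    scores[gamma] = (clf.score(x_train, y_train), clf.score(x_test, y_test))
    print(gamma, scores[gamma])
# A very large gamma tends to fit the training set better while generalizing worse.
```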
Linear kernel classification result:
RBF kernel classification result: