Python Support Vector Machine (SVM) Example

Source: Internet
Author: User
Tags: svm, rbf kernel

SVM (Support Vector Machine) is a common discriminative method. In the field of machine learning, it is a supervised learning model, usually used for pattern recognition, classification, and regression analysis.

For MATLAB there is the LIBSVM toolkit written by Chih-Jen Lin, which handles SVM training well. In Python we have the scikit-learn toolkit for training machine learning algorithms; the scikit-learn library implements all of the basic machine learning algorithms.

The following is adapted from the blog post at https://www.cnblogs.com/luyaoblog/p/6775342.html, with the original Python 2 code updated to Python 3.
For more python and machine learning content, please visit omegaxyz.com
The following is an example on the Iris dataset:

In the raw Iris dataset downloaded from the UCI repository, the first four columns are feature columns and the fifth column is the class column, with three classes: Iris-setosa, Iris-versicolor, and Iris-virginica.

You need to use NumPy to split the features from the labels.

Dataset download address: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/

Download iris.data.
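For reference, the first few rows of iris.data look like this (comma-separated, no header row):

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa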

Python 3 code:

from sklearn import svm
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib import colors
from sklearn.model_selection import train_test_split


def iris_type(s):
    it = {b'Iris-setosa': 0, b'Iris-versicolor': 1, b'Iris-virginica': 2}
    return it[s]


path = 'C:\\Users\\dell\\Desktop\\iris.data'  # data file path
data = np.loadtxt(path, dtype=float, delimiter=',', converters={4: iris_type})

x, y = np.split(data, (4,), axis=1)
x = x[:, :2]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)

# clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
clf = svm.SVC(C=0.8, kernel='rbf', gamma=20, decision_function_shape='ovr')
clf.fit(x_train, y_train.ravel())

print(clf.score(x_train, y_train))  # accuracy on the training set
y_hat = clf.predict(x_train)
print(clf.score(x_test, y_test))    # accuracy on the test set
y_hat2 = clf.predict(x_test)

x1_min, x1_max = x[:, 0].min(), x[:, 0].max()  # range of column 0
x2_min, x2_max = x[:, 1].min(), x[:, 1].max()  # range of column 1
x1, x2 = np.mgrid[x1_min:x1_max:200j, x2_min:x2_max:200j]  # generate grid sampling points
grid_test = np.stack((x1.flat, x2.flat), axis=1)  # test points

mpl.rcParams['font.sans-serif'] = [u'SimHei']
mpl.rcParams['axes.unicode_minus'] = False
cm_light = mpl.colors.ListedColormap(['#A0FFA0', '#FFA0A0', '#A0A0FF'])
cm_dark = mpl.colors.ListedColormap(['g', 'r', 'b'])

grid_hat = clf.predict(grid_test)      # predicted class for each grid point
grid_hat = grid_hat.reshape(x1.shape)  # reshape to match the grid

alpha = 0.5
plt.pcolormesh(x1, x2, grid_hat, cmap=cm_light)  # show the predictions
plt.plot(x[:, 0], x[:, 1], 'o', alpha=alpha, color='blue', markeredgecolor='k')
plt.scatter(x_test[:, 0], x_test[:, 1], s=120, facecolors='none', zorder=10)  # circle the test-set samples
plt.xlabel(u'sepal length', fontsize=13)
plt.ylabel(u'sepal width', fontsize=13)
plt.xlim(x1_min, x1_max)
plt.ylim(x2_min, x2_max)
plt.title(u'SVM classification', fontsize=15)
plt.show()

np.split(data, split position, axis=1 (horizontal, i.e. column-wise split) or 0 (vertical, i.e. row-wise split)). Here np.split(data, (4,), axis=1) separates the first four columns (the features x) from the fifth (the label y), as in the toy example below.
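A minimal illustration of np.split on a made-up array (the names here are purely illustrative):

import numpy as np

data = np.arange(12).reshape(3, 4)          # 3 rows, 4 columns
left, right = np.split(data, (3,), axis=1)  # split after column 3
print(left.shape, right.shape)              # (3, 3) (3, 1)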

x = x[:, :2] keeps only the first two feature columns for training, so that the result can later be drawn more intuitively in a 2-D plot.

sklearn.model_selection.train_test_split randomly splits data into a training set and a test set: train_test_split(train_data, train_target, test_size=..., random_state=0)

Parameter explanation:

train_data: the sample features to be split.

train_target: the sample labels to be split.

test_size: the proportion of samples to put in the test set; if an integer, the absolute number of test samples (the example above uses train_size=0.6 instead).

random_state: the seed for the random number generator.

About the random seed: a seed pins down which pseudo-random sequence is generated, so that an experiment can be repeated with exactly the same random numbers. For example, if you pass random_state=1 every time (with the other parameters unchanged), you get the same split on every run; if you pass None (or omit the parameter), the generator is reseeded and every run differs. In short, the random numbers depend on the seed according to two rules: different seeds produce different random numbers; the same seed produces the same random numbers, even across different instances.
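A minimal sketch of this behaviour, using toy data (the arrays here are illustrative only):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# The same seed yields the same split on every call
a1, _, _, _ = train_test_split(X, y, train_size=0.6, random_state=1)
a2, _, _, _ = train_test_split(X, y, train_size=0.6, random_state=1)
print((a1 == a2).all())   # True

# random_state=None reseeds on each call, so splits (almost always) differ
b1, _, _, _ = train_test_split(X, y, train_size=0.6, random_state=None)
b2, _, _, _ = train_test_split(X, y, train_size=0.6, random_state=None)
print((b1 == b2).all())   # very likely False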

With kernel='linear' (a linear kernel), a larger C means the training data are classified better, but the model may overfit (default C=1).
With kernel='rbf' (the default, a Gaussian kernel), the smaller gamma is, the smoother and more continuous the decision boundary; the larger gamma is, the more "scattered" the boundary, which fits the training data better but may overfit.
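As a minimal sketch of the effect of these settings (using scikit-learn's built-in copy of the Iris dataset instead of the downloaded file, and the same first two features as above), the loop below trains one classifier per kernel and prints train/test accuracy; the exact numbers depend on the split:

from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Built-in copy of Iris; keep the first two features, as in the example above
x, y = load_iris(return_X_y=True)
x = x[:, :2]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)

for clf in (svm.SVC(C=0.8, kernel='linear', decision_function_shape='ovr'),
            svm.SVC(C=0.8, kernel='rbf', gamma=20, decision_function_shape='ovr')):
    clf.fit(x_train, y_train)
    print(clf.kernel,
          'train:', round(clf.score(x_train, y_train), 3),
          'test:', round(clf.score(x_test, y_test), 3))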

Linear kernel classification result: (figure)

RBF kernel classification result: (figure)

