SVM-Based Data Classification and Prediction: Recognizing Italian Wine Categories

Source: Internet
Author: User
Tags svm

The wine data comes from the UCI database and records the chemical composition of wines from three different varieties grown in the same region of Italy, with the aim of classifying wines automatically by scientific means.

The dataset contains 178 samples. Each sample has 13 attributes and a known correct class label, which is used to verify the accuracy of the SVM classification.

First, we visualize the data:

% Load the test data: wine is a 178-by-13 matrix, wine_labels is a
% 178-by-1 column vector, and there are 3 classes (classnumber = 3)
load chapter_WineClass.mat;

% Box plot of the data, one box per attribute
% (categories is not defined here; presumably it is loaded from the .mat file)
figure;
boxplot(wine, 'orientation', 'horizontal', 'labels', categories);
title('Box plot of the wine data', 'FontSize', 12);
xlabel('Attribute value', 'FontSize', 12);
grid on;

% Per-dimension plots: the class labels first, then one panel per attribute
figure;
subplot(3, 5, 1);   % 3-by-5 grid assumed; 14 panels are needed in total
hold on;
for run = 1:178
    plot(run, wine_labels(run), '*');
end
xlabel('Sample', 'FontSize', 10);
ylabel('Class label', 'FontSize', 10);
title('class', 'FontSize', 10);
for run = 2:14
    subplot(3, 5, run);
    hold on;
    str = ['attrib ', num2str(run-1)];
    for i = 1:178
        plot(i, wine(i, run-1), '*');
    end
    xlabel('Sample', 'FontSize', 10);
    ylabel('Attribute value', 'FontSize', 10);
    title(str, 'FontSize', 10);
end

The code above produces two figures: a box plot of the wine data and a per-dimension plot of every sample. From these plots alone it is hard to tell which class a given wine belongs to, so next we try classifying the data with an SVM.

Data preprocessing

% Select the training set and the test set

% Training set: samples 1-30 of class 1, 60-95 of class 2, and 131-153 of class 3
train_wine = [wine(1:30, :); wine(60:95, :); wine(131:153, :)];
% The labels of the training set are split the same way
train_wine_labels = [wine_labels(1:30); wine_labels(60:95); wine_labels(131:153)];
% Test set: samples 31-59 of class 1, 96-130 of class 2, and 154-178 of class 3
test_wine = [wine(31:59, :); wine(96:130, :); wine(154:178, :)];
% The labels of the test set are split the same way
test_wine_labels = [wine_labels(31:59); wine_labels(96:130); wine_labels(154:178)];

% Data preprocessing: normalize the training set and test set to the [0, 1] interval
[mtrain, ntrain] = size(train_wine);
[mtest, ntest] = size(test_wine);

dataset = [train_wine; test_wine];
% mapminmax is MATLAB's normalization function (from the neural network
% toolbox); it scales each row, hence the transposes
[dataset_scale, ps] = mapminmax(dataset', 0, 1);
dataset_scale = dataset_scale';

train_wine = dataset_scale(1:mtrain, :);
test_wine = dataset_scale((mtrain+1):(mtrain+mtest), :);
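
To make the normalization step concrete, here is a minimal sketch of what min-max scaling to [0, 1] does to each attribute column, written without mapminmax. The toy matrix X and the names colMin, colMax, and X_scaled are illustrative assumptions, not part of the wine data. The subtraction relies on implicit expansion, which needs MATLAB R2016b or later.

```matlab
% Toy data (hypothetical values): 4 samples, 2 attributes
X = [1 10; 2 20; 3 30; 5 50];

% Per-column minimum and maximum
colMin = min(X, [], 1);
colMax = max(X, [], 1);

% Min-max scaling, column by column: (x - min) / (max - min)
X_scaled = (X - colMin) ./ (colMax - colMin);

disp(X_scaled);
```

Both columns map to 0, 0.25, 0.5, 1, so attributes measured on very different scales end up comparable. This sketch does not guard against a constant column, where max - min is zero.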

SVM training and prediction

% Train the SVM (-c sets the cost parameter, -g the kernel gamma)
model = svmtrain(train_wine_labels, train_wine, '-c 2 -g 1');

% Predict the classes of the test set
[predict_label, accuracy, dec_value1] = svmpredict(test_wine_labels, test_wine, model);
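
The option string passed to svmtrain sets only the cost (-c 2) and gamma (-g 1). Assuming the calls above come from libsvm's MATLAB interface, omitting -s and -t selects its defaults: C-SVC with an RBF kernel, K(u, v) = exp(-gamma * |u - v|^2). As a rough illustration of what that kernel computes, the sketch below evaluates it for two made-up, already scaled samples; x1, x2, and the variable names are assumptions, not values from the wine data.

```matlab
% RBF kernel value for two hypothetical samples, with gamma as in '-g 1'
g = 1;
x1 = [0.2 0.5 0.9];   % made-up scaled attribute values
x2 = [0.1 0.4 1.0];

sqDist = sum((x1 - x2).^2);   % squared Euclidean distance
k = exp(-g * sqDist);         % kernel value, in the range (0, 1]

fprintf('squared distance = %.4f, kernel = %.4f\n', sqDist, k);
```

Here the squared distance is 0.03 and the kernel value is about 0.97. Nearby samples give values close to 1, distant ones values close to 0, and a larger gamma makes the value fall off faster with distance.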

Result analysis

% Plot the actual and predicted classes of the test set;
% the chart shows that only one test sample is misclassified
figure;
hold on;
plot(test_wine_labels, 'o');
plot(predict_label, 'r*');
xlabel('Test set sample', 'FontSize', 12);
ylabel('Class label', 'FontSize', 12);
legend('Actual test set class', 'Predicted test set class');
title('Actual and predicted classes of the test set', 'FontSize', 12);
grid on;

The SVM reaches a classification accuracy of 98.8764%: only one of the 89 test samples is misclassified. On this dataset, the SVM proves to be a powerful classifier!
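
The reported figure can be checked by hand: 88 correct out of 89 is 98.8764%. The sketch below reproduces that arithmetic with stand-in label vectors. The vectors actual and predicted are hypothetical; they mimic the sizes of test_wine_labels and predict_label, and the position of the single error is chosen arbitrarily.

```matlab
% Hypothetical stand-ins for test_wine_labels and predict_label:
% 29 + 35 + 25 = 89 test samples with exactly one disagreement
actual = [ones(29, 1); 2 * ones(35, 1); 3 * ones(25, 1)];
predicted = actual;
predicted(40) = 3;   % one misclassified sample (position is arbitrary)

nCorrect = sum(predicted == actual);
accuracyPct = 100 * nCorrect / numel(actual);

fprintf('%d/%d correct, accuracy = %.4f%%\n', nCorrect, numel(actual), accuracyPct);
```

This prints 88/89 correct, accuracy = 98.8764%, matching the accuracy value reported for the test set.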
