Machine Learning UCI database

Last Update:2018-12-03 Source: Internet

Author: User


http://archive.ics.uci.edu/ml/

The UCI database is a machine learning repository provided by the University of California, Irvine (UCI). It currently holds 187 datasets, and the number keeps growing. UCI datasets are commonly used as standard test datasets.

The "Multiple Features" dataset on UCI is a handwritten digit recognition problem. The image of each digit is represented by 649 features divided into six groups.

UCI data can be read with MATLAB's dlmread (or textread, or MATLAB's Import Data tool). However, any category value that is not a number must first be replaced with a number such as 1/2/3; otherwise the data cannot be read.
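As a minimal sketch of that replacement step, the lines below map text class labels to integer codes with `unique` and then round-trip the all-numeric result through a file with `dlmwrite` and `dlmread`. The sample rows, the variable names, and the temporary file are illustrative assumptions and do not come from any UCI dataset.

```matlab
% Toy records: two numeric attributes plus a text class label (made-up values).
attrs  = [1.0 2.0; 3.0 4.0; 5.0 6.0; 7.0 8.0];
labels = {'classA'; 'classB'; 'classA'; 'classC'};

% unique returns the sorted distinct labels and, as its third output,
% the position of each label in that list; that position is the numeric code.
[classNames, ~, codes] = unique(labels);

% Append the code as the last column; the matrix is now purely numeric.
numericData = [attrs, codes];

% Write the matrix as comma-separated text and read it back with dlmread.
tmpFile = [tempname, '.data'];
dlmwrite(tmpFile, numericData, ',');
readBack = dlmread(tmpFile, ',');
delete(tmpFile);

disp(classNames');   % the label that each code stands for
disp(readBack);
```

Keeping `classNames` around makes it possible to translate a predicted code back into its original label later.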

Each data file (*.data) contains the records of many individual samples, each described as "attribute-value" pairs. The *.info file contains extensive documentation. (Some files _generate_ databases; they do not have *.data files.) As a supplement to the datasets and domain knowledge, the utilities directory contains useful information for working with the datasets.

The following uses Iris in UCI as an example to describe the dataset:

ucidata\iris contains three files:

Index

iris.data

iris.names

Index is a directory listing of all the files in this folder. For example, the Index file for Iris contains the following:

Index of iris

18 Mar 1996 105 Index

08 Mar 1993 4551 iris.data

30 May 1989 2604 iris.names

iris.data is the Iris data file, with the following content:

5.1,3.5,1.4,0.2,Iris-setosa

4.9,3.0,1.4,0.2,Iris-setosa

4.7,3.2,1.3,0.2,Iris-setosa

......

7.0,3.2,4.7,1.4,Iris-versicolor

6.9,3.1,4.9,1.5,Iris-versicolor

......

6.3,3.3,6.0,2.5,Iris-virginica

6.4,3.2,4.5,1.5,Iris-versicolor

5.8,2.7,5.1,1.9,Iris-virginica

7.1,3.0,5.9,2.1,Iris-virginica

......

As shown above, the attribute values are separated by commas with no spaces in between (5.1,3.5,1.4,0.2,). The last column is the class that the row belongs to, that is, the decision attribute, for example Iris-setosa.
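The layout just described can be parsed without editing the file by telling `textscan` that each record holds four numbers followed by one text field. This is a small sketch under that assumption; the three sample rows are copied from the excerpt, and the variable names are hypothetical.

```matlab
% Three rows in the iris.data layout: four numbers, then the class name.
sample = sprintf('%s\n%s\n%s', ...
    '5.1,3.5,1.4,0.2,Iris-setosa', ...
    '7.0,3.2,4.7,1.4,Iris-versicolor', ...
    '6.3,3.3,6.0,2.5,Iris-virginica');

% Four numeric fields (%f) followed by one text field (%s), comma-delimited.
cols = textscan(sample, '%f%f%f%f%s', 'Delimiter', ',');

features = [cols{1:4}];   % 3-by-4 matrix of attribute values
classes  = cols{5};       % 3-by-1 cell array holding the decision attribute

disp(features);
disp(classes);
```

For the real file, the same format string can be passed to `textscan` together with a file identifier obtained from `fopen`.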

iris.names describes the Iris data, including the title, sources, past usage, relevant information, number of instances, and attribute information:

......

7. Attribute Information:

1. sepal length in cm

2. sepal width in cm

3. petal length in cm

4. petal width in cm

5. class:

-- Iris setosa

-- Iris versicolour

-- Iris virginica

......

9. Class distribution: 33.3% for each of 3 classes.
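The class distribution can be checked by counting the rows per class. The sketch below assumes the label column has already been read into a cell array; it builds a stand-in column of 150 labels, 50 per class (the instance count is taken from the UCI description of Iris, not from the excerpt above), and all variable names are hypothetical.

```matlab
% Stand-in label column for Iris: 150 rows, 50 per class.
classNames = {'Iris-setosa', 'Iris-versicolor', 'Iris-virginica'};
labels = [repmat(classNames(1), 50, 1); ...
          repmat(classNames(2), 50, 1); ...
          repmat(classNames(3), 50, 1)];

% Map each label to a class index, then count the rows per index.
[names, ~, idx] = unique(labels);
counts  = accumarray(idx, 1);
percent = 100 * counts / numel(labels);

for k = 1:numel(names)
    fprintf('%-16s %3d rows  %.1f%%\n', names{k}, counts(k), percent(k));
end
```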

For examples of using this data, please refer to other papers or content later on this site.

The following example imports the wine data into MATLAB and tests it with libsvm.

> uiimport('wine.data')

After the import, a 178-by-14 array named wine appears in the workspace.

Extract the class labels and the attribute data, then save them to a MAT-file.

> wine_label = wine(:, 1);

> wine_data = wine(:, 2:end);

> save winedat.mat

(Next time you can simply run > load winedat)
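The interactive import can also be scripted. The sketch below is one possible way, assuming the file is purely numeric and comma-separated with the class label in the first column; it uses a made-up 4-by-4 stand-in written to a temporary file instead of the real 178-by-14 wine.data, and the temporary file names are hypothetical.

```matlab
% Made-up stand-in for wine.data: class label first, then the attributes.
toy = [1 14.2 1.7 2.4;
       1 13.2 1.8 2.1;
       2 12.4 0.9 1.4;
       3 12.9 1.4 2.5];
dataFile = [tempname, '.data'];
dlmwrite(dataFile, toy, ',');

% Scripted replacement for the interactive uiimport step.
wine = dlmread(dataFile, ',');

% Same split as in the commands above: column 1 is the label.
wine_label = wine(:, 1);
wine_data  = wine(:, 2:end);

% Save only the two variables needed later, then reload them into a struct.
matFile = [tempname, '.mat'];
save(matFile, 'wine_label', 'wine_data');
restored = load(matFile);

delete(dataFile);
delete(matFile);
disp(isequal(restored.wine_data, wine_data));
```

Naming the variables in the `save` call keeps unrelated workspace variables out of the MAT-file, which plain `save winedat.mat` would include.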

Train the SVM on the wine data to obtain the model.

> modelw = svmtrain(wine_label, wine_data);

optimization finished, #iter = 239

nu = 0.892184

obj = -61.125695, rho = 0.131965

nSV = 130, nBSV = 53

optimization finished, #iter = 193

nu = 0.882853

obj = -50.421538, rho = -0.166754

nSV = 107, nBSV = 42

optimization finished, #iter = 214

nu = 0.800233

obj = -53.411663, rho = -0.286931

nSV = 119, nBSV = 44

Total nSV = 178

Classification result (predicting on the same data that was used for training):

> [plabelw, accuracyw] = svmpredict(wine_label, wine_data, modelw);

Accuracy = 100% (178/178) (classification)
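Because the prediction is run on the same 178 rows that trained the model, the 100% figure measures how well the model fits the training data, not how it handles unseen samples. A hedged sketch of a simple hold-out check follows. It is not part of the original walkthrough: the two-class data is randomly generated as a stand-in for wine_label and wine_data, the 70/30 split is an arbitrary choice, and the libsvm calls only run if the libsvm MATLAB interface is on the path.

```matlab
% Made-up two-class data standing in for wine_label / wine_data.
rng(0);                                   % fixed seed so the split is repeatable
data  = [randn(20, 2) + 2; randn(20, 2) - 2];
label = [ones(20, 1); 2 * ones(20, 1)];

% Shuffle the row indices and keep 70% for training, 30% for testing.
n        = size(data, 1);
order    = randperm(n);
nTrain   = round(0.7 * n);
trainIdx = order(1:nTrain);
testIdx  = order(nTrain + 1:end);

% svmpredict is only available when the libsvm MATLAB interface is installed.
if exist('svmpredict', 'file')
    model = svmtrain(label(trainIdx), data(trainIdx, :));
    [~, acc, ~] = svmpredict(label(testIdx), data(testIdx, :), model);
    fprintf('Held-out accuracy: %.2f%%\n', acc(1));
else
    fprintf('libsvm not found; split is %d train / %d test rows\n', ...
            numel(trainIdx), numel(testIdx));
end
```

libsvm's own `-v n` training option, for example `svmtrain(wine_label, wine_data, '-v 5')`, is an alternative that reports n-fold cross-validation accuracy instead of returning a model.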

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
