Reading notes on "Research on the Comparison and Analysis of Data Mining Classification Algorithms Based on Neural Networks" (Master of Engineering thesis, Anhui University, by Changkai), Part II: Introduction to the Datasets

Source: Internet
Author: User

Introduction to the datasets

1. "Abalone Age" dataset (Abalone Data Set). The task is to predict the age of an abalone by predicting its number of rings. The dataset comes from the UCI (University of California, Irvine) Machine Learning Repository.

It has eight attributes in total: sex, length, diameter, and so on.
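A minimal sketch of how one row of this dataset could be prepared for the classifiers discussed here. The column names and the "rings + 1.5" age rule are taken from the UCI description of the dataset, not from the post itself; the helper names and the sample row are illustrative assumptions.

```python
# Column order as described in the UCI abalone documentation (assumed here;
# the post itself only names sex, length and diameter).
COLUMNS = ["sex", "length", "diameter", "height", "whole_weight",
           "shucked_weight", "viscera_weight", "shell_weight", "rings"]


def parse_abalone_row(line):
    """Split one comma-separated row into (features, rings)."""
    fields = line.strip().split(",")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    sex = fields[0]
    # Sex is categorical (M, F, or I for infant), so one-hot encode it.
    sex_onehot = [1.0 if sex == s else 0.0 for s in ("M", "F", "I")]
    numeric = [float(v) for v in fields[1:8]]
    rings = int(fields[8])
    return sex_onehot + numeric, rings


def rings_to_age(rings):
    """Age in years, using the UCI rule of thumb: rings + 1.5."""
    return rings + 1.5


# Illustrative row in the UCI file format.
features, rings = parse_abalone_row(
    "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15")
print(len(features), rings, rings_to_age(rings))
```

One-hot encoding turns the single sex column into three numeric inputs, so the network sees ten inputs for the eight attributes.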

Description of the specific attributes

Method one: using a BP (backpropagation) neural network

Method two: using ELM (extreme learning machine)

Method three: using SVM (support vector machine)

My note: while mapping out the functions in XMind, I found that the functions for a new method are already packaged and can be called directly. What we have to do is learn what each function means and know the approximate process. Understanding is the foundation of everything, and the basis for using the functions freely.

2. "Heart Disease or Not" dataset

(Statlog (Heart) Data Set) The task is to determine whether a subject has heart disease from the values of age, sex, blood pressure, and other attributes.

Characteristics of the specific attributes (in addition to age and sex):

Chest pain type

Resting blood pressure

Serum cholesterol

Fasting blood sugar

Resting electrocardiographic results

Maximum heart rate achieved

Exercise-induced angina

Oldpeak (ST depression induced by exercise relative to rest)

Slope of the peak exercise ST segment

Number of major vessels

Thal

Input: 13 attributes

Output: 1 for yes (heart disease), 0 for no
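A minimal sketch of splitting one record into the 13 inputs and the 0/1 output described above. It assumes the raw UCI file is whitespace-separated and codes the class as 1 (absence) and 2 (presence), which is how the UCI documentation describes it; the remapping to 1 for yes and 0 for no follows this post. The function name and the sample row are illustrative.

```python
def parse_heart_row(line):
    """Return (features, label) with label 1 = heart disease, 0 = none."""
    fields = line.split()
    if len(fields) != 14:
        raise ValueError(f"expected 14 fields, got {len(fields)}")
    features = [float(v) for v in fields[:13]]
    raw_label = int(float(fields[13]))
    if raw_label not in (1, 2):
        raise ValueError(f"unexpected class code: {raw_label}")
    # Assumed raw coding: 1 = absence, 2 = presence of heart disease.
    label = 1 if raw_label == 2 else 0
    return features, label


# Illustrative record in the assumed file format.
row = "70.0 1.0 4.0 130.0 322.0 0.0 2.0 109.0 0.0 2.4 2.0 3.0 3.0 2"
features, label = parse_heart_row(row)
print(len(features), label)
```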

The process for the three methods is:

3. "Cancer Patient Survival" dataset

(Haberman's Survival Data Set) The task is to determine a patient's survival status from three attributes recorded for each patient.

The three attributes are: the patient's age at the time of the operation, the year of the operation, and the number of positive axillary lymph nodes detected.

Survival status: 1 means the patient survived five years or longer; 2 means the patient died within five years.

Input: three attributes

Output: two labels
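A minimal sketch of reading such records and checking how the two labels are distributed. It assumes a comma-separated layout of age, year of operation, positive nodes, and survival status; the helper name and the three sample rows are illustrative, not taken from the post.

```python
from collections import Counter

# Label meanings as stated in the text above.
STATUS = {1: "survived five years or longer", 2: "died within five years"}


def parse_haberman_row(line):
    """Return ((age, year, nodes), status) from one comma-separated row."""
    age, year, nodes, status = (int(v) for v in line.strip().split(","))
    if status not in STATUS:
        raise ValueError(f"unexpected survival status: {status}")
    return (age, year, nodes), status


# Illustrative rows in the assumed format.
rows = ["30,64,1,1", "34,59,0,2", "38,69,21,2"]
parsed = [parse_haberman_row(r) for r in rows]
counts = Counter(status for _, status in parsed)
for status, n in sorted(counts.items()):
    print(status, STATUS[status], n)
```

Counting the labels first is a cheap way to see whether the two classes are balanced before comparing classifiers on accuracy.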

4. "Wheat Seeds" dataset (Seeds Data Set)

The task is to determine the variety of a wheat seed from its physical characteristics. There are three varieties: Kama, Rosa, and Canadian.

Specific attributes:

Area

Perimeter

Compactness

Length of kernel

Width of kernel

Asymmetry coefficient

Length of kernel groove

Input: the attributes above

Output: the variety the seed belongs to

5. "Do Pima Indians Have Diabetes" dataset

(Pima Indians Diabetes Data Set) The task is to determine whether a subject has diabetes by studying eight numeric attributes.

The last column of the dataset is the class attribute: 0 means no diabetes; 1 means diabetes.

The attributes include:

Plasma glucose concentration at 2 hours in an oral glucose tolerance test

Diastolic blood pressure

Triceps skin fold thickness

2-hour serum insulin

Body mass index

Diabetes pedigree function

6. "Wine Category" dataset

(Wine Data Set) It records the results of a chemical composition analysis of three different varieties of wine from the same region of Italy.

The specific attributes are:
