Learning R: the KNN (k-nearest neighbors) algorithm

Source: Internet
Author: User

Concepts

1. Supervised learning: learn a function from labeled training data, then use that function to label new data.

2. Unsupervised learning: learn a function from unlabeled training data, then label all of the data according to that function.

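KNN classification algorithm: a classification algorithm finds classification rules by analyzing a training data set whose classes are already known, so it is a form of supervised learning. KNN assigns a new observation the class that is most common among its k nearest neighbors in the training set.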
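To make the nearest-neighbor idea concrete, here is a minimal sketch written from scratch in base R. The helper name knn_one, the toy data, and the choice of Euclidean distance are assumptions made for illustration only; this is not how the class package implements knn().

```r
# Hypothetical helper, not part of any package: classify one point by
# majority vote among its k nearest training rows (Euclidean distance).
knn_one <- function(train, labels, point, k = 3) {
  # Subtract the query point from every training row, column by column
  diffs <- sweep(as.matrix(train), 2, as.numeric(point))
  dists <- sqrt(rowSums(diffs^2))
  # Labels of the k closest training rows
  nearest <- labels[order(dists)[1:k]]
  # Most frequent label among them (on a tie, the first level wins)
  names(which.max(table(nearest)))
}

# Toy data: two well-separated groups in two dimensions
train <- data.frame(x = c(1, 1.2, 0.8, 5, 5.1, 4.9),
                    y = c(1, 0.9, 1.1, 5, 5.2, 4.8))
labels <- factor(c("a", "a", "a", "b", "b", "b"))

pred_a <- knn_one(train, labels, c(1.1, 1.0), k = 3)
pred_b <- knn_one(train, labels, c(4.8, 5.0), k = 3)
print(c(pred_a, pred_b))
```

A point near the first group should come back as "a" and a point near the second as "b". There is no training step beyond storing the labeled data; all the work happens at prediction time.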

KNN concepts:

1. Training set: Data used to train the model or determine the parameters of the model.

2. Test set: Data used to verify the accuracy of the model.

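3. Cross-validation: typically 70% of the data is used as the training set and the remaining 30% as the test set. The predictions on the test set are checked against the true classes in a cross-table (contingency table). Strictly speaking, a single split like this is hold-out validation.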

Sampling method

sample(x, size, replace = FALSE)

x: the values to sample from

size: the number of items to draw

replace: whether a drawn item is put back before the next draw (sampling with replacement); defaults to FALSE
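A short sketch of how the replace argument changes the result. The seed value and the vector 1:10 are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed so the draws are repeatable

x <- 1:10

# Without replacement (the default): each value appears at most once
no_repeat <- sample(x, size = 5)

# With replacement: a value can be drawn more than once,
# so size is allowed to exceed length(x)
with_repeat <- sample(x, size = 15, replace = TRUE)

print(no_repeat)
print(with_repeat)
```

For a train/test split, sampling without replacement is the one to use, since a row should land in the training set at most once.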

KNN method (requires the class package)

Installation: install.packages("class")

knn(train, test, cl, k = 1)

train: the training data

test: the test data

cl: the correct class labels for the training data

k: the number of neighbors considered; defaults to 1 and can be adjusted, checking each value against the test set, until the best result is found
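One way to search for a good k is to run knn() once per candidate value and compare the accuracy on the test set. The sketch below assumes the built-in iris data, an arbitrary seed, and an arbitrary list of candidate values.

```r
library(class)  # provides knn()

set.seed(1)  # arbitrary seed for a repeatable split

# 70/30 split of the built-in iris data
index <- sample(nrow(iris), floor(nrow(iris) * 0.7))
train_x <- iris[index, 1:4]
test_x <- iris[-index, 1:4]
train_y <- iris$Species[index]
test_y <- iris$Species[-index]

# Share of correct predictions on the test set for each candidate k
ks <- c(1, 3, 5, 7, 9)
accuracy <- sapply(ks, function(k) {
  pred <- knn(train = train_x, test = test_x, cl = train_y, k = k)
  mean(pred == test_y)
})

print(data.frame(k = ks, accuracy = accuracy))
best_k <- ks[which.max(accuracy)]
print(best_k)
```

Picking k on the same test set that is later used to report accuracy gives a somewhat optimistic figure; a separate validation split would be a stricter setup.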

Example:

# install.packages("class")
library(class)

# https://en.wikipedia.org/wiki/iris_flower_data_set
# https://zh.wikipedia.org/wiki/%e5%ae%89%e5%be%b7%e6%a3%ae%e9%b8%a2%e5%b0%be%e8%8a%b1%e5%8d%89%e6%95%b0%e6%8d%ae%e9%9b%86

# Count the rows of iris; iris is an example data set that ships with R
total <- nrow(iris)

# Sample 70% of the row numbers
index <- sample(1:total, total * 0.7)

# Get the training set from the sampled row numbers
iris.train <- iris[index, ]

# Get the test set from the remaining rows; -index selects the other 30%
iris.test <- iris[-index, ]

# Classify the test set with knn(), using the training set as reference.
# subset(..., select = -Species) drops the Species column (the result column)
# from the training set and the test set.
result.KNN <- knn(train = subset(iris.train, select = -Species),
                  test = subset(iris.test, select = -Species),
                  cl = iris.train$Species,
                  k = 2)

# Cross-check the predictions against the true species
table(iris.test$Species, result.KNN)
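The cross-table from the example can be reduced to a single accuracy figure by dividing the diagonal (correct predictions) by the total. The sketch below repeats the same steps in self-contained form; the seed and the names cross and accuracy are additions for illustration, not part of the original example.

```r
library(class)

set.seed(123)  # arbitrary seed, not in the original example

index <- sample(nrow(iris), floor(nrow(iris) * 0.7))
iris.train <- iris[index, ]
iris.test <- iris[-index, ]

result.KNN <- knn(train = subset(iris.train, select = -Species),
                  test = subset(iris.test, select = -Species),
                  cl = iris.train$Species,
                  k = 2)

# Rows are the true species, columns are the predicted species
cross <- table(actual = iris.test$Species, predicted = result.KNN)
print(cross)

# Correct predictions sit on the diagonal of the cross-table
accuracy <- sum(diag(cross)) / sum(cross)
print(accuracy)
```

Off-diagonal cells show which species were confused with each other, which the single accuracy number hides.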

  
