KNN in Practice (Part 1)

Source: Internet
Author: User

The last two days have been full of competitions and training sessions, so today is the first time I have had a chance to read and tweak some code, and I am rather excited about it. To get to the point: last time I did a simple verification of KNN, and today we use KNN to improve the matching results of a dating site. As I understand it, the same approach carries over to other sites that cater to customer preferences, such as recommendation systems. That said, what today's example can do is quite limited.

In this example, one person has collected data about their dates. Each record holds the main features of a sample together with the class it was assigned. Based on these labeled samples, we classify records whose class is unknown.
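The idea described above can be sketched on toy data before looking at the real program. This is a minimal illustration in plain Python, not the code from date.py: the function name `knn_classify`, the sample points and the labels "A" and "B" are all made up for the example.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k):
    """Return the majority label among the k samples closest to query."""
    # Euclidean distance from the query to every labeled sample
    distances = [
        math.sqrt(sum((q - s) ** 2 for q, s in zip(query, row)))
        for row in samples
    ]
    # Indices of the k smallest distances
    nearest = sorted(range(len(samples)), key=lambda i: distances[i])[:k]
    # Majority vote among the labels of those k neighbors
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per sample, two classes
samples = [[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]]
labels = ["A", "A", "B", "B"]

print(knn_classify([0.1, 0.2], samples, labels, k=3))  # B
print(knn_classify([0.9, 1.0], samples, labels, k=3))  # A
```

The unknown point takes the label that most of its k nearest labeled neighbors carry; the dating example does the same thing with three features and three classes.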

I am using Python 3.4.3. First create a file, for example date.py, with the following code:

# coding: utf-8
from numpy import *
import operator
from collections import Counter
import matplotlib
import matplotlib.pyplot as plt


# Load the feature data
def file2matrix(filename):
    with open(filename) as fr:
        contain = fr.readlines()  # read the whole file
    count = len(contain)
    returnMat = zeros((count, 3))
    classLabelVector = []
    index = 0
    for line in contain:
        line = line.strip()  # strip the trailing newline
        listFromLine = line.split('\t')
        # the first three fields go into the feature matrix
        returnMat[index, :] = list(map(float, listFromLine[0:3]))
        # the last field is the class label
        classLabelVector.append(listFromLine[-1])
        index += 1

    # Convert the string labels to numbers for later calculations.
    # The numbers follow the key order of the Counter (order of first
    # appearance on Python 3.7+), so they do not necessarily line up
    # with the order of resultList in classifyPerson().
    dictClassLabel = Counter(classLabelVector)
    classLabel = []
    kind = list(dictClassLabel)
    for item in classLabelVector:
        if item == kind[0]:
            item = 1
        elif item == kind[1]:
            item = 2
        else:
            item = 3
        classLabel.append(item)
    return returnMat, classLabel


# Plot (shows visually how much each feature influences the classification)
datingDataMat, datingLabels = file2matrix(r'D:\python\Mechine learing in Action\knn\datingtestset.txt')
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1],
           15.0 * array(datingLabels), 15.0 * array(datingLabels))
plt.show()


# Normalize the data so that every feature carries the same weight
def autoNorm(dataSet):
    minVals = dataSet.min(0)
    maxVals = dataSet.max(0)
    ranges = maxVals - minVals
    normDataSet = zeros(shape(dataSet))  # matrix with the same shape as dataSet
    m = dataSet.shape[0]
    for i in range(m):  # start at 0 so the first row is normalized too
        normDataSet[i, :] = (dataSet[i, :] - minVals) / ranges
    return normDataSet, ranges, minVals


# KNN algorithm
def classify(inX, dataSet, label, k):
    dataSize = dataSet.shape[0]
    # Euclidean distance from inX to every sample
    diff = tile(inX, (dataSize, 1)) - dataSet
    sqDiff = diff ** 2
    squareDist = sqDiff.sum(axis=1)  # sum each row: one squared distance per sample
    dist = squareDist ** 0.5

    # argsort() returns the indices that sort the distances from small to large
    sortedDistIndex = argsort(dist)

    # Count the classes of the k nearest samples
    classCount = {}
    for i in range(k):
        voteLabel = label[sortedDistIndex[i]]
        classCount[voteLabel] = classCount.get(voteLabel, 0) + 1

    # Pick the class that occurs most often
    maxCount = 0
    for key, value in classCount.items():
        if value > maxCount:
            maxCount = value
            classes = key
    return classes


# Test (hold out 10% of the data)
def datingTest():
    rate = 0.10
    datingDataMat, datingLabels = file2matrix(r'D:\python\Mechine learing in Action\knn\datingtestset.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    m = normMat.shape[0]
    testNum = int(m * rate)
    errorCount = 0.0
    for i in range(testNum):  # start at 0 so all testNum samples are tested
        classifyResult = classify(normMat[i, :], normMat[testNum:m, :],
                                  datingLabels[testNum:m], 3)
        print("The classifier returned:", classifyResult)
        print("The true label is:", datingLabels[i])
        if classifyResult != datingLabels[i]:
            errorCount += 1.0
    print("The error rate is:", errorCount / float(testNum))


# Prediction function
def classifyPerson():
    resultList = ['not at all', 'a little bit', 'very much']
    percentTats = float(input("Percentage of time spent playing video games? "))
    miles = float(input("Frequent flyer miles earned per year? "))
    iceCream = float(input("Liters of ice cream consumed per week? "))
    datingDataMat, datingLabels = file2matrix(r'D:\python\Mechine learing in Action\knn\datingtestset2.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    inArr = array([miles, percentTats, iceCream])
    classifierResult = classify((inArr - minVals) / ranges, normMat, datingLabels, 3)
    print("How much you will like this person:", resultList[classifierResult - 1])
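The normalization step in date.py exists because the three features live on very different scales: frequent flyer miles are in the tens of thousands, while the other two features are small numbers. The sketch below shows the effect in plain Python rather than NumPy. The helper names and the four data rows are invented for illustration, and it assumes every column has a nonzero range.

```python
import math


def min_max_normalize(rows):
    """Scale every column to [0, 1] via (value - min) / (max - min)."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    ranges = [max(col) - min(col) for col in columns]  # assumed nonzero
    normalized = [
        [(v - lo) / rng for v, lo, rng in zip(row, mins, ranges)]
        for row in rows
    ]
    return normalized, ranges, mins


def euclidean(a, b):
    """Euclidean distance between two equally long feature lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical rows: [flyer miles per year, % time gaming, liters of ice cream]
rows = [
    [40000.0, 8.0, 1.0],
    [14000.0, 7.0, 1.5],
    [26000.0, 1.0, 0.5],
    [75000.0, 0.0, 0.1],
]

raw = euclidean(rows[0], rows[1])
normalized, ranges, mins = min_max_normalize(rows)
scaled = euclidean(normalized[0], normalized[1])

print("raw distance:   ", raw)     # dominated almost entirely by the miles column
print("scaled distance:", scaled)  # all three features contribute comparably
```

Without scaling, the distance between the first two rows is essentially the difference in miles; after scaling, each feature lies in [0, 1] and none of them drowns out the others.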

Then create a new file, test.py, to see the program's results. The code:

# coding: utf-8
import sys

# Make the directory that contains date.py importable
sys.path.append(r"D:\python\Mechine learing in Action\knn")

# Importing date also runs its module-level plotting code first
import date

date.classifyPerson()

The result of the run: [screenshot of the program output]

That counts as a success. Of course I ran into plenty of problems along the way: the book I am following, "Machine Learning in Action", mainly targets Python 2.7, so with Python 3 trouble was to be expected. If any of you are also working through this book, let's compare notes.
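The post does not list which problems came up. As an assumption, the sketch below shows three differences that commonly bite when Python 2.7 code is run under Python 3: `dict.iteritems()`, the `print` statement and `raw_input()`. The function name `majority_label` and the vote counts are made up for the example.

```python
import operator


def majority_label(class_count):
    """Return the label with the highest vote count."""
    # Python 2 style: sorted(class_count.iteritems(), ...)
    # Python 3: iteritems() no longer exists, use items() instead
    ranked = sorted(class_count.items(),
                    key=operator.itemgetter(1), reverse=True)
    return ranked[0][0]


# Hypothetical vote counts for two classes
votes = {"A": 1, "B": 2}

# Python 2: print "winner:", majority_label(votes)
# Python 3: print is a function and needs parentheses
print("winner:", majority_label(votes))

# Python 2: raw_input("...") returns a string
# Python 3: input("...") returns a string; wrap it in float() for numbers
```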

One last thing: when writing code, it is better to debug each section as soon as you finish it, so that no big problems are left at the end. That is what I did this time.
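As a small illustration of checking one piece at a time, the sketch below tests only the line-parsing step, before any file reading or classification is wired in. The helper `parse_record` and the sample line are hypothetical; they only assume the tab-separated layout used in date.py (three numeric fields followed by a label).

```python
def parse_record(line):
    """Split one tab-separated line into three float features and a label."""
    fields = line.strip().split('\t')
    features = [float(value) for value in fields[0:3]]
    label = fields[-1]
    return features, label


# Check this piece on its own before building on top of it.
features, label = parse_record("1000\t2.5\t0.5\tsomeLabel\n")
assert features == [1000.0, 2.5, 0.5]
assert label == "someLabel"
print("parse_record OK")
```

Once a step like this passes, the next one (normalization, then classification) can be added and checked the same way.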

------------------------------ I am the dividing line ------------------------------

Lately it still comes down to one word: "busy"! It is summer vacation, and when I think about it I have been rather hard on myself. Mom and Dad both call to urge me to come home; it seems the two of them really do miss me. Right after the final exams I planned how to spend the summer. More than half of it has passed now, and I have no regrets. Sometimes I wonder whether a distant dream is worth holding on to. My actions tell me that I am holding on. The outcome is still unknown, but time will prove everything.

I am never pessimistic. I believe in "no pains, no gains", and I believe even more that hard work pays off!

The life I like goes like this: eat when it is time to eat, play when it is time to play, and stay busy when there is work to do.

I had better hurry up. I have to get up early for class tomorrow, so I am going to rest early. Good night, everyone.
