Machine Learning in Action, Chapter 2: Improving Matches on a Dating Site

Source: Internet
Author: User

Three classes: 1 = dislike, 2 = moderately charming, 3 = very charming.

Sample features: the number of frequent flyer miles earned per year, the percentage of time spent playing video games, and the liters of ice cream consumed per week.

from numpy import *
import matplotlib
import matplotlib.pyplot as plt

def file2matrix(filename):
    """
    Input: text file name (string).
    Output: training sample matrix and class label vector.
    """
    with open(filename) as fr:
        arrayOLines = fr.readlines()  # read the whole file at once into a list of lines
    numberOfLines = len(arrayOLines)
    returnMat = zeros((numberOfLines, 3))
    classLabelVector = []
    index = 0
    for line in arrayOLines:
        line = line.strip()  # remove leading and trailing whitespace, including the newline
        listFromLine = line.split('\t')  # split the line into a list of fields
        returnMat[index, :] = [float(x) for x in listFromLine[0:3]]  # first 3 fields go into the feature matrix
        classLabelVector.append(int(listFromLine[-1]))  # last field is the class label
        index += 1
    return returnMat, classLabelVector

if __name__ == '__main__':
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1], 20.0 * array(datingLabels), 15.0 * array(datingLabels))
    plt.show()

>>> import numpy as np
>>> np.zeros((3, 2))  # a zero matrix with 3 rows and 2 columns
array([[0., 0.],
       [0., 0.],
       [0., 0.]])
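
To see how the zero matrix gets filled row by row, here is a minimal sketch of the same parsing logic run on an in-memory string instead of a file. The three records in `SAMPLE_TEXT` and the helper name `parse_lines` are made up for illustration; they are not taken from the real data set.

```python
import io
import numpy as np

# Hypothetical records: three tab-separated features followed by an integer label.
SAMPLE_TEXT = "40000\t8.0\t1.0\t3\n15000\t7.0\t1.5\t2\n26000\t1.5\t0.8\t1\n"

def parse_lines(lines):
    """Turn tab-separated lines into a feature matrix and a label list."""
    features = np.zeros((len(lines), 3))  # one row per record, pre-filled with zeros
    labels = []
    for index, line in enumerate(lines):
        fields = line.strip().split('\t')
        features[index, :] = [float(value) for value in fields[0:3]]
        labels.append(int(fields[-1]))
    return features, labels

features, labels = parse_lines(io.StringIO(SAMPLE_TEXT).readlines())
print(features.shape)
print(labels)
```

Each row of `features` holds one sample, and `labels` keeps the class of the sample at the same index.
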
ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1], 20.0 * array(datingLabels), 15.0 * array(datingLabels))  # scatter(x, y, size, color)
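
The size and color arguments are just the label vector multiplied by a constant, so each class gets its own marker size and color. A small sketch of that mapping follows; the helper names `marker_styles` and `plot_two_features` are hypothetical, and the scale factors are the ones used in the call above.

```python
import numpy as np

def marker_styles(labels, size_scale=20.0, color_scale=15.0):
    """Map integer class labels to per-point marker sizes and color values."""
    labels = np.asarray(labels, dtype=float)
    return size_scale * labels, color_scale * labels

def plot_two_features(x, y, labels):
    """Scatter plot in which both size and color encode the class label."""
    import matplotlib.pyplot as plt  # imported here so marker_styles runs without matplotlib
    sizes, colors = marker_styles(labels)
    fig, ax = plt.subplots()
    ax.scatter(x, y, s=sizes, c=colors)
    return fig, ax

sizes, colors = marker_styles([1, 2, 3])
print(sizes)   # marker size for classes 1, 2, 3
print(colors)  # color value for classes 1, 2, 3
```

Points of class 3 are drawn three times larger than points of class 1, which makes the clusters easier to tell apart.
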

def autoNorm(dataSet):
    """
    Normalize feature values: newValue = (oldValue - min) / (max - min)
    """
    minVals = dataSet.min(0)
    maxVals = dataSet.max(0)
    ranges = maxVals - minVals
    normDataSet = zeros(shape(dataSet))
    m = dataSet.shape[0]
    normDataSet = dataSet - tile(minVals, (m, 1))  # tile repeats minVals to match the size of the input matrix
    normDataSet = normDataSet / tile(ranges, (m, 1))
    return normDataSet, ranges, minVals

if __name__ == '__main__':
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(normMat[:, 0], normMat[:, 1], 30.0 * array(datingLabels), 100.0 * array(datingLabels))
    plt.xlabel('number of frequent flyer miles earned per year')
    plt.ylabel('percentage of time spent playing video games')
    plt.show()

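The formula newValue = (oldValue - min) / (max - min) is applied column by column. As a sketch, the same result can be obtained with NumPy broadcasting instead of `tile`; the function name `auto_norm_broadcast` and the toy matrix are assumptions made for this example only.

```python
import numpy as np

def auto_norm_broadcast(data_set):
    """Min-max normalize each column to [0, 1] using broadcasting."""
    min_vals = data_set.min(axis=0)            # per-column minimum
    ranges = data_set.max(axis=0) - min_vals   # per-column max - min
    norm = (data_set - min_vals) / ranges      # row vectors broadcast over every row
    return norm, ranges, min_vals

# Hypothetical 3-sample, 2-feature matrix.
data = np.array([[0.0, 10.0],
                 [5.0, 20.0],
                 [10.0, 30.0]])
norm, ranges, min_vals = auto_norm_broadcast(data)

# The tile-based version computes the same matrix.
tiled = (data - np.tile(min_vals, (3, 1))) / np.tile(ranges, (3, 1))
print(norm)
```

Both versions give the same matrix; broadcasting only avoids building the repeated copies explicitly.
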
def datingClassTest():
    """
    Test code.
    Estimate the accuracy of the algorithm: use 90% of the existing samples as training samples and the remaining 10% to test the classifier.
    Error rate = number of misclassified results / total number of test samples
    """
    hoRatio = 0.10
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    m = normMat.shape[0]
    numTestVecs = int(m * hoRatio)  # 10% of the samples are used for testing
    errorCount = 0.0
    for i in range(numTestVecs):
        # classify0 is the kNN classifier; it is not defined in this section
        classifierResult = classify0(normMat[i, :], normMat[numTestVecs:m, :],
                                     datingLabels[numTestVecs:m], 20)
        print("The classifier came back with: %d, the real answer is: %d"
              % (classifierResult, datingLabels[i]))
        if classifierResult != datingLabels[i]:
            errorCount += 1.0
    print("The total error rate is: %f" % (errorCount / float(numTestVecs)))  # print the error rate

if __name__ == '__main__':
    datingClassTest()
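
The test relies on `classify0`, which is not shown in this section. The sketch below is one possible k-nearest-neighbors implementation, assuming the signature `classify0(inX, dataSet, labels, k)` and Euclidean distance; it is not necessarily the version the original code used. The toy data and the helper `holdout_error_rate` are also made up for illustration.

```python
from collections import Counter
import numpy as np

def classify0(in_x, data_set, labels, k):
    """Return the majority label among the k training rows nearest to in_x."""
    distances = np.sqrt(((data_set - in_x) ** 2).sum(axis=1))  # Euclidean distance to every row
    nearest = distances.argsort()[:k]                          # indices of the k closest rows
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

def holdout_error_rate(data_set, labels, ho_ratio, k):
    """Use the first ho_ratio share of rows as test data and the rest as training data."""
    m = data_set.shape[0]
    num_test = int(m * ho_ratio)
    errors = 0
    for i in range(num_test):
        predicted = classify0(data_set[i], data_set[num_test:], labels[num_test:], k)
        if predicted != labels[i]:
            errors += 1
    return errors / num_test

# Hypothetical, already normalized data: two clusters near (0, 0) and (1, 1).
# The second row is deliberately labeled 1 although it sits in the class-2 cluster.
data = np.array([[0.1, 0.1], [0.85, 0.85],
                 [0.0, 0.0], [0.2, 0.1], [0.1, 0.2], [0.0, 0.2],
                 [1.0, 1.0], [0.8, 0.9], [0.9, 0.8], [1.0, 0.8]])
labels = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2]

error_rate = holdout_error_rate(data, labels, 0.2, 3)
print(error_rate)
```

With two test rows, one of which is mislabeled on purpose, the error rate is one error out of two test samples.
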

def classifyPerson():
    """
    Enter a person's information and predict how much you will like them.
    """
    resultList = ['not at all', 'in small doses', 'in large doses']
    percentTats = float(input("percentage of time spend playing video games?"))
    ffMiles = float(input("frequent flier miles earned per year?"))
    iceCream = float(input("liters of ice cream consumed per year?"))
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')
    normMat, ranges, minVals = autoNorm(datingDataMat)
    inArr = array([ffMiles, percentTats, iceCream])
    classifierResult = classify0((inArr - minVals) / ranges, normMat, datingLabels, 3)
    print("You'll probably like this person:", resultList[classifierResult - 1])

if __name__ == '__main__':
    classifyPerson()
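
Note that the new sample is scaled with the training set's `minVals` and `ranges` before it is classified; otherwise the miles feature, which is in the thousands, would dominate the distance. A minimal sketch of that step, using a made-up training matrix and a hypothetical helper `scale_query`:

```python
import numpy as np

# Hypothetical training features: miles per year, % time gaming, liters of ice cream.
train = np.array([[0.0, 0.0, 0.0],
                  [20000.0, 20.0, 1.0],
                  [40000.0, 10.0, 2.0]])
min_vals = train.min(axis=0)
ranges = train.max(axis=0) - min_vals

def scale_query(query, min_vals, ranges):
    """Apply the training set's min-max scaling to a single new sample."""
    return (np.asarray(query, dtype=float) - min_vals) / ranges

scaled = scale_query([10000.0, 10.0, 0.5], min_vals, ranges)
print(scaled)
```

After scaling, all three features of the query lie on the same 0 to 1 scale as the rows of the normalized training matrix.
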

Test:

percentage of time spend playing video games?10
frequent flier miles earned per year?10000
liters of ice cream consumed per year?0.5
You'll probably like this person: in small doses
