Machine Learning in Practice with Python on Ubuntu (1): K-Nearest Neighbor Algorithm

Source: Internet
Author: User

2018.4.18 Python machine learning notes

One. Installing NumPy on Ubuntu 14.04

1. Reference URL
2. Installation code:

It is recommended to update the package sources before installing:

sudo apt-get update

If Python 2.7 is working, you can proceed to the next step.
Now install the packages for numerical computation, plotting, and machine learning: numpy, scipy, matplotlib, pandas, and sklearn.
The apt-get commands are as follows:

sudo apt-get install python-numpy
sudo apt-get install python-scipy
sudo apt-get install python-matplotlib
sudo apt-get install python-pandas
sudo apt-get install python-sklearn

3. Testing

To test whether everything installed successfully, open the Python interpreter and enter the following commands. If no error appears, the installation succeeded.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets,linear_model

4. Writing and running Python programs on Ubuntu
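The same check can also be scripted so that a missing package is reported by name instead of stopping at the first failed import. The following is only a minimal sketch: the helper name check_modules is hypothetical (it is not part of the original post), and it assumes the module names used in the test above.

```python
import importlib

# Module names taken from the import test above; "sklearn" is the import
# name used for the scikit-learn package.
MODULES = ["numpy", "scipy", "matplotlib", "pandas", "sklearn"]


def check_modules(names):
    """Map each module name to its version string, or None if it cannot be imported.

    check_modules is a hypothetical helper written for this illustration.
    """
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to "unknown".
            results[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            results[name] = None
    return results


if __name__ == "__main__":
    for name, version in check_modules(MODULES).items():
        status = version if version is not None else "NOT INSTALLED"
        print("%-12s %s" % (name, status))
```

Any package reported as NOT INSTALLED can then be installed with the matching apt-get command from the installation step.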

(1) Create hello.py with vim, containing the code: print 'hello, welcome to Linux python'
(2) Change to the directory where the program is located: cd python
(3) Run the program: python hello.py

Two. NumPy Function Library Basics

In the Python Shell development environment, enter the following command:

>>> from numpy import *             # bring everything in the numpy library into the current namespace
>>> random.rand(4,4)                # build a 4x4 random array
array([[ 0.97134166,  0.69816709,  0.35251331,  0.32252662],
       [ 0.40798608,  0.48113781,  0.67629943,  0.12288183],
       [ 0.96055063,  0.85824686,  0.95458472,  0.40213735],
       [ 0.28604852,  0.43380204,  0.2558164 ,  0.07954809]])
>>> randmat=mat(random.rand(4,4))   # call mat() to convert the array into a matrix
>>> randmat.I                       # .I computes the matrix inverse
matrix([[ 1.12580852, -0.43470821,  2.71229992, -2.16829781],
        [-1.4600302 ,  1.65644197, -1.3742097 ,  1.6297217 ],
        [ 3.379582  ,  0.40573689,  0.84634018, -2.72232677],
        [-3.35086377, -2.64978047, -1.39459215,  4.68277082]])
>>> invarandmat=randmat.I           # store the inverse matrix
>>> randmat*invarandmat             # matrix multiplication, which yields the identity matrix
matrix([[  1.00000000e+00,   0.00000000e+00,   0.00000000e+00,   2.22044605e-16],
        [ -2.22044605e-16,   1.00000000e+00,   1.24900090e-16,   2.49800181e-16],
        [ -2.22044605e-16,  -1.11022302e-16,   1.00000000e+00,   2.22044605e-16],
        [ -4.44089210e-16,  -2.22044605e-16,  -2.22044605e-16,   1.00000000e+00]])
>>> myeye=randmat*invarandmat
>>> myeye-eye(4)                    # compute the error; eye(4) generates a 4x4 identity matrix
matrix([[ -4.44089210e-16,   0.00000000e+00,   0.00000000e+00,   2.22044605e-16],
        [ -2.22044605e-16,  -1.11022302e-16,   1.24900090e-16,   2.49800181e-16],
        [ -2.22044605e-16,  -1.11022302e-16,   0.00000000e+00,   2.22044605e-16],
        [ -4.44089210e-16,  -2.22044605e-16,  -2.22044605e-16,   4.44089210e-16]])

Three. K-Nearest Neighbor Algorithm in practice

1. Preparation: importing data with Python
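The session above relies on the mat() type and its .I attribute. As a hedged alternative, the same round-off check can be sketched with plain arrays and np.linalg.inv, since current NumPy documentation no longer recommends the matrix class for new code. The fixed seed and the variable names below are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed (an assumption) so the run is repeatable
a = rng.rand(4, 4)               # 4x4 array of uniform random numbers in [0, 1)

a_inv = np.linalg.inv(a)         # inverse of a, the array counterpart of .I
product = a.dot(a_inv)           # matrix product; should be close to the identity
error = product - np.eye(4)      # round-off error left over after the multiplication

# allclose compares within a tolerance instead of expecting exact equality,
# which is the right test when the error terms are around 1e-16.
print(np.allclose(product, np.eye(4)))
print(np.abs(error).max())
```

The nonzero entries in the error matrix come from floating-point round-off, not from a mistake in the inverse, which is why a tolerance-based comparison is used.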

Create kNN.py with vim:

from numpy import *   # import the numerical computation module
import operator

def createDataSet():
    group=array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels=['A','A','B','B']
    return group,labels

2. Testing in the Python development environment

>>> import kNN
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named kNN

Solution: change to the directory where kNN.py is stored, then start python from the terminal.

~$ cd python   # my save path
~/python$ python
>>> import kNN
>>> group,labels=kNN.createDataSet()
>>> group
array([[ 1. ,  1.1],
       [ 1. ,  1. ],
       [ 0. ,  0. ],
       [ 0. ,  0.1]])
>>> labels
['A', 'A', 'B', 'B']

3. Implementing the kNN classification algorithm

Add a function classify0() to the code above:

from numpy import *
import operator

def createDataSet():
    group=array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels=['A','A','B','B']
    return group,labels

# inx: input vector to classify; dataSet: training sample set;
# labels: label vector; k: number of nearest neighbors
def classify0(inx,dataSet,labels,k):
    # shape[0] returns the number of rows in dataSet
    dataSetSize=dataSet.shape[0]
    # repeat inx once along the column direction (horizontally) and
    # dataSetSize times along the row direction (vertically)
    diffMat=tile(inx,(dataSetSize,1))-dataSet  # I had mistyped this as dataset
    sqDiffMat=diffMat**2
    sqDistances=sqDiffMat.sum(axis=1)
    distances=sqDistances**0.5
    # indices that sort the distances in ascending order
    sortedDistIndicies=distances.argsort()
    classCount={}
    # count the labels of the k nearest points
    for i in range(k):
        voteIlabel=labels[sortedDistIndicies[i]]
        classCount[voteIlabel]=classCount.get(voteIlabel,0)+1
    # iteritems() exists only in Python 2; on Python 3 use items()
    sortedClassCount=sorted(classCount.iteritems(),
                            key=operator.itemgetter(1),reverse=True)   # I had mistyped this as true
    return sortedClassCount[0][0]
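For readers on Python 3, where iteritems() is not available, the same steps can be sketched as follows. This is a variant under stated assumptions rather than the post's own code: the name classify_knn is hypothetical, NumPy broadcasting stands in for the tile() call, and collections.Counter stands in for the dictionary plus sorted() vote count.

```python
import numpy as np
from collections import Counter


def classify_knn(query, data, labels, k):
    """Return the majority label among the k training points closest to query.

    classify_knn is a hypothetical name used only for this illustration.
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    # Broadcasting subtracts query from every row, replacing the tile() call.
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))
    # Indices of the k smallest Euclidean distances.
    nearest = distances.argsort()[:k]
    # Count how often each label occurs among the k nearest points.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']
print(classify_knn([0, 0], group, labels, 3))
```

The distance computation and the voting rule are the same as in classify0(); only the NumPy and standard-library calls used to express them differ.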

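Supplementary explanation of the code

numpy.tile(): for example, with a = np.array([0,1,2]), np.tile(a,(2,1)) first replicates a 1 time along the x axis (let us call it that), which means no replication, so it is still [0,1,2]. It then replicates that result 2 times along the y direction, which finally gives

array([[0,1,2],
       [0,1,2]])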

Similarly:

>>> b = np.array([[1, 2], [3, 4]])
>>> np.tile(b, 2)        # replicate 2 times along the x axis
array([[1, 2, 1, 2],
       [3, 4, 3, 4]])
>>> np.tile(b, (2, 1))   # replicate 1 time along the x axis (no replication), then 2 times along the y axis
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])
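To connect tile() back to classify0(), the sketch below builds the tiled copy of the input vector and compares the resulting difference matrix with the one NumPy produces through broadcasting. The variable names are assumptions chosen for this illustration; the data is the sample set from createDataSet().

```python
import numpy as np

inx = np.array([0, 0])   # the input vector to classify
data_set = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])

# tile() repeats inx on 4 rows so that it has the same shape as data_set.
tiled = np.tile(inx, (data_set.shape[0], 1))
diff_tile = tiled - data_set          # explicit copy, as in classify0()
diff_broadcast = inx - data_set       # broadcasting, no explicit copy needed

print(tiled.shape)
print(np.array_equal(diff_tile, diff_broadcast))
```

Both routes give the same difference matrix, so tile() here only makes the row-by-row subtraction explicit.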

Test:

>>> import kNN
>>> group,labels=kNN.createDataSet()
>>> kNN.classify0([0,0],group,labels,3)
'B'
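
To see why the result is 'B', the intermediate values can be printed step by step. This is a small trace written for illustration, using the same sample data and k=3; the variable names are assumptions and the code does not call the kNN module itself.

```python
import numpy as np

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']

# Euclidean distance from the query point [0, 0] to each training point.
distances = np.sqrt(((group - np.array([0, 0])) ** 2).sum(axis=1))
# Indices of the training points, nearest first.
order = distances.argsort()
# Labels of the 3 nearest points.
nearest_labels = [labels[i] for i in order[:3]]

print(distances)        # approximately [1.49, 1.41, 0.0, 0.1]
print(nearest_labels)   # ['B', 'B', 'A']
```

Two of the three nearest neighbors carry the label 'B', so the majority vote returns 'B'.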
