Machine Learning: kd-tree Practice

Source: Internet
Author: User

1. Node data structure of the kd-tree

    • left: the left subtree
    • right: the right subtree
    • fea: the selected split axis (feature)
    • datanode: the sample at the median of the selected axis
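The four fields above can be sketched as a small Python class. This is only an illustration: the use of a dataclass, the type hints, and storing the sample as a plain tuple are assumptions made here, and the three nodes are built by hand rather than by a tree-building routine.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    fea: str                        # name of the split axis (feature)
    datanode: Any                   # median sample stored at this node, label included
    left: Optional["Node"] = None   # samples with smaller values on `fea`
    right: Optional["Node"] = None  # samples with larger or equal values on `fea`


# Hand-built fragment of a tree, for illustration only
pig = Node(fea="fea1", datanode=(8, 1, "Pig"))
chicken = Node(fea="fea2", datanode=(9, 6, "Chicken"), left=pig)
root = Node(fea="fea1", datanode=(7, 2, "Monkey"), right=chicken)
```

A node without children simply keeps left and right as None, which is what the recursive search has to check before descending.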

2. The kd-tree implementation consists of two main parts:

    • 1. Build: compute the variance of every axis, select the axis with the largest variance, and split the data at the median of that axis, recursively.
    • 2. Query: compare the query sample's value on the current node's split axis with the node's own value, and recurse into the left subtree (if it is smaller) or the right subtree (otherwise). This yields dis, the smallest distance found so far. Then compare dis with the absolute difference between the query and the current node on the split axis. If dis is larger than that difference, the hypersphere of radius dis around the query intersects the hyper-rectangle on the other side of the split, so the search must also backtrack into the other subtree. Otherwise the other subtree can be skipped.
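The two decisions described above (which axis to split on, and whether to backtrack) can be sketched in isolation. The helper names pick_axis and must_backtrack are hypothetical and introduced only for this illustration. Points are plain tuples without labels, and the sample variance from the standard library is assumed as the variance measure.

```python
from statistics import variance


def pick_axis(points):
    """Return the index of the coordinate with the largest sample variance."""
    if len(points) < 2:
        return 0  # variance is undefined for a single point; default to axis 0
    dims = len(points[0])
    return max(range(dims), key=lambda d: variance([p[d] for p in points]))


def must_backtrack(query, node_point, axis, best_dist):
    """True if the sphere of radius best_dist around query crosses the split."""
    return abs(query[axis] - node_point[axis]) < best_dist


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
axis = pick_axis(points)                                 # axis 0 varies most
cross = must_backtrack((7.1, 1), (7, 2), axis, 0.9)      # gap 0.1 < 0.9
skip = not must_backtrack((7.1, 1), (9, 6), 1, 0.9)      # gap 5 >= 0.9
```

In the first check the query lies only 0.1 away from the split, so a closer sample could still sit on the other side. In the second the gap is 5, so that side can be pruned.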

3. Code implementation

# -*- coding: utf-8 -*-
"""
Created on Sun Sep 12:44:51 2018

@author: Administrator
"""
import math

import pandas as pd


# Tree node definition
class Node:
    def __init__(self, ltree, rtree, fea, datanode):
        # fea is the selected split axis; datanode is the sample that
        # separates the left and right subtrees (it includes the label)
        self.left = ltree
        self.right = rtree
        self.fea = fea
        self.datanode = datanode


# Use a DataFrame directly as the data structure
def getInfo():
    data = [[2, 3, 'Sheep'], [5, 4, 'Monkey'], [9, 6, 'Chicken'],
            [4, 7, 'Dog'], [8, 1, 'Pig'], [7, 2, 'Monkey']]
    return pd.DataFrame(data, columns=['fea1', 'fea2', 'label'])


# Compute the variance of each axis and select the axis with the largest one
def calsq(data):
    sq = data.iloc[:, :-1].var()  # skip the label column
    pos = data.columns[0]
    val = sq[pos]
    for i in data.columns[1:-1]:
        if val < sq[i]:
            val = sq[i]
            pos = i
    return pos


# Split the data along the selected axis
def splitaxis(data):
    fea = calsq(data)
    sortdata = data.sort_values(by=fea).reset_index(drop=True)  # sort by axis
    mid = len(sortdata) // 2
    datanode = sortdata.iloc[[mid]].reset_index(drop=True)  # data node
    leftset = sortdata.iloc[:mid]       # left subtree
    rightset = sortdata.iloc[mid + 1:]  # right subtree
    return fea, datanode, leftset, rightset


# Build the tree recursively
def createtree(data):
    if len(data) == 0:  # no data, no node
        return None
    fea, datanode, leftset, rightset = splitaxis(data)
    treenode = Node(None, None, fea, datanode)
    if len(leftset) > 0:   # can the left side still be split?
        treenode.left = createtree(leftset)
    if len(rightset) > 0:  # can the right side still be split?
        treenode.right = createtree(rightset)
    return treenode


# Recursive search
def search(tree, prenode):  # prenode is the sample to query
    dis = 0
    for i in tree.datanode.columns[:-1]:  # distance to the current node
        dis = dis + (tree.datanode[i][0] - prenode[i][0]) ** 2
    dis = math.sqrt(dis)
    label = tree.datanode[tree.datanode.columns[-1]][0]  # current node label
    # Signed offset of the query from the split on the current axis
    diff = prenode[tree.fea][0] - tree.datanode[tree.fea][0]
    if diff < 0:
        near, far = tree.left, tree.right
    else:
        near, far = tree.right, tree.left
    if near is not None:  # search the side that contains the query first
        disn, labeln = search(near, prenode)
        if disn < dis:  # keep the smallest distance
            dis, label = disn, labeln
    # Backtrack only if the hypersphere around the query crosses the split
    if far is not None and dis > abs(diff):
        disf, labelf = search(far, prenode)
        if disf < dis:
            dis, label = disf, labelf
    return dis, label


data = getInfo()
root = createtree(data)
test = pd.DataFrame([[7.1, 1]], columns=list(data.columns[:-1]))
dis, label = search(root, test)
print(dis, label)  # nearest sample is 'Pig' at a distance of about 0.9
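A simple way to sanity-check a kd-tree query is to compare it with an exhaustive scan over all samples. The sketch below is not part of the original code: brute_force_nearest is a hypothetical helper that works on plain tuples instead of a DataFrame, and it assumes Python 3.8 or later for math.dist.

```python
import math

# Same six samples as in the listing: (fea1, fea2, label)
samples = [(2, 3, 'Sheep'), (5, 4, 'Monkey'), (9, 6, 'Chicken'),
           (4, 7, 'Dog'), (8, 1, 'Pig'), (7, 2, 'Monkey')]


def brute_force_nearest(samples, query):
    """Scan every sample and return (distance, label) of the closest one."""
    best = min(samples, key=lambda s: math.dist(s[:-1], query))
    return math.dist(best[:-1], query), best[-1]


dis, label = brute_force_nearest(samples, (7.1, 1))
```

For the query (7.1, 1) the scan selects the 'Pig' sample at (8, 1). A tree search that returns a different label or a larger distance for the same query points to a pruning or backtracking error.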
