First, the KD-tree node data structure
- left: the left subtree
- right: the right subtree
- fea: the selected splitting axis (feature)
- datanode: the sample at the median of the selected axis
Second, the KD-tree implementation consists of two main parts:
- 1, Building the tree: compute the variance of each axis, select the axis with the largest variance, and recursively split the data in two at the median of that axis.
- 2, Query: compare the query point's value on the current node's axis with the node's value, and recursively search the left (or right) subtree accordingly to obtain the minimum distance dis found so far. Then compare dis with the absolute difference between the query point and the current node along that axis. If dis is larger than this difference (that is, the hypersphere around the query point intersects the hyperrectangle on the other side of the split), backtrack into the right (or left) subtree.
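The two steps above can be tried out on a single split. This is only a minimal sketch: it reuses the six samples and the query point (7.1, 1) from the code section, while `best_dis` is a hypothetical "best distance so far" value and the other variable names are chosen for illustration.

```python
import pandas as pd

# Six samples from the code section (feature columns only).
data = pd.DataFrame(
    [[2, 3], [5, 4], [9, 6], [4, 7], [8, 1], [7, 2]],
    columns=["fea1", "fea2"],
)

# Step 1 (build): pick the axis with the largest variance, then split at the median.
axis = data.var().idxmax()
ordered = data.sort_values(by=axis).reset_index(drop=True)
mid = len(ordered) // 2
split_point = ordered.iloc[mid]      # sample stored in the node
left_part = ordered.iloc[:mid]       # rows that go to the left subtree
right_part = ordered.iloc[mid + 1:]  # rows that go to the right subtree

# Step 2 (query): decide whether the far side of the split must be visited.
query = pd.Series({"fea1": 7.1, "fea2": 1.0})
best_dis = 0.9                                # hypothetical best distance found on the near side
gap = abs(query[axis] - split_point[axis])    # distance from the query to the splitting plane
must_backtrack = best_dis > gap               # hypersphere crosses the plane, so backtrack

print(axis, len(left_part), len(right_part), bool(must_backtrack))
```

Here the split lands on the sample (7, 2) along fea1, and because the query sits only about 0.1 away from the splitting plane, the other side cannot be skipped.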
Third, the code implementation:
    # -*- coding: utf-8 -*-
    """
    Created on Sun Sep 12:44:51 2018

    @author: Administrator
    """
    import math

    import pandas as pd


    # Tree node definition
    class Node:
        def __init__(self, ltree, rtree, fea, datanode):
            self.left = ltree          # left subtree
            self.right = rtree         # right subtree
            self.fea = fea             # selected axis (feature name)
            self.datanode = datanode   # one-row DataFrame that splits left and right; the label is included


    # A DataFrame is used directly as the data structure
    def getInfo():
        data = [[2, 3, 'Sheep'], [5, 4, 'Monkey'], [9, 6, 'Chicken'],
                [4, 7, 'Dog'], [8, 1, 'Pig'], [7, 2, 'Monkey']]
        return pd.DataFrame(data, columns=['fea1', 'fea2', 'label'])


    # Compute the variance of each feature and select the axis with the largest one
    def calSq(data):
        # Skip the label column; a single row has NaN variance, so treat it as 0
        sq = data.iloc[:, :-1].var().fillna(0)
        return sq.idxmax()


    # Split the data along the selected axis
    def splitAxis(data):
        fea = calSq(data)
        sortData = data.sort_values(by=fea).reset_index(drop=True)  # sort by the axis
        mid = len(sortData) // 2
        dataNode = sortData.iloc[[mid]].reset_index(drop=True)      # data node (median sample)
        leftSet = sortData.iloc[:mid]                               # left subtree
        rightSet = sortData.iloc[mid + 1:]                          # right subtree
        return fea, dataNode, leftSet, rightSet


    # Build the tree recursively
    def createTree(data):
        if len(data) == 0:             # no data, no node
            return None
        fea, dataNode, leftSet, rightSet = splitAxis(data)
        treeNode = Node(None, None, fea, dataNode)
        if len(leftSet) > 0:           # can the left side still be split?
            treeNode.left = createTree(leftSet)
        if len(rightSet) > 0:          # can the right side still be split?
            treeNode.right = createTree(rightSet)
        return treeNode


    # Recursive search; prenode is the sample to query (a one-row DataFrame)
    def search(tree, prenode):
        node = tree.datanode.iloc[0]
        query = prenode.iloc[0]
        dis = 0
        for i in tree.datanode.columns[:-1]:      # Euclidean distance to the current node
            dis = dis + (node[i] - query[i]) ** 2
        dis = math.sqrt(dis)
        label = node[tree.datanode.columns[-1]]   # label of the current node

        # Search first the side of the split that contains the query
        if query[tree.fea] < node[tree.fea]:
            near, far = tree.left, tree.right
        else:
            near, far = tree.right, tree.left
        if near is not None:
            disN, labelN = search(near, prenode)
            if disN < dis:                        # keep the smallest distance
                dis, label = disN, labelN
        # Backtrack only if the hypersphere around the query crosses the splitting plane
        if far is not None and dis > abs(query[tree.fea] - node[tree.fea]):
            disF, labelF = search(far, prenode)
            if disF < dis:
                dis, label = disF, labelF
        return dis, label


    data = getInfo()
    root = createTree(data)
    test = pd.DataFrame([[7.1, 1]], columns=list(data.columns[:-1]))
    dis, label = search(root, test)
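A quick way to check a KD-tree query is to compare it against a plain linear scan. The sketch below is not part of the original script: `brute_force_nearest` is a hypothetical helper, and it assumes Python 3.8+ for `math.dist`. It scans the same six samples for the same query point (7.1, 1).

```python
import math

# Same six samples and query point as in the script above.
samples = [(2, 3, "Sheep"), (5, 4, "Monkey"), (9, 6, "Chicken"),
           (4, 7, "Dog"), (8, 1, "Pig"), (7, 2, "Monkey")]
query = (7.1, 1)


def brute_force_nearest(points, q):
    """Return (distance, label) of the point closest to q by scanning every point."""
    return min((math.dist(p[:2], q), p[2]) for p in points)


dis, label = brute_force_nearest(samples, query)
print(round(dis, 3), label)   # prints: 0.9 Pig
```

A tree search that returns a different label or a larger distance for this query points to a bug in the backtracking step.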
Machine Learning: KD-tree Practice