While working through Machine Learning in Action, I reached the tree-plotting code in Chapter 3 (decision trees) and hit a recursive function that I simply could not understand. Since I want to make this field my career, I am reading the book carefully from cover to cover, so I did not skip it. I was stuck for more than a day before it finally made sense. The keys were understanding the value of plotTree.xOff in that function and how cntrPt is calculated. I believe other people have been stuck in the same place, and I hope we can exchange ideas.
First, here is the code:
    import matplotlib.pyplot as plt

    # Drawing-style definitions; these can be ignored, the algorithm below is what matters
    decisionNode = dict(boxstyle="sawtooth", fc="0.8")
    leafNode = dict(boxstyle="round4", fc="0.8")
    arrow_args = dict(arrowstyle="<-")

    # Recursively count the leaf nodes of the tree; fairly simple
    def getNumLeafs(myTree):
        numLeafs = 0
        firstStr = list(myTree.keys())[0]  # list() so this also runs on Python 3
        secondDict = myTree[firstStr]
        for key in secondDict.keys():
            # a dict is a subtree; anything else is a leaf node
            if type(secondDict[key]).__name__ == 'dict':
                numLeafs += getNumLeafs(secondDict[key])
            else:
                numLeafs += 1
        return numLeafs

    # Recursively compute the depth of the tree; fairly simple
    def getTreeDepth(myTree):
        maxDepth = 0
        firstStr = list(myTree.keys())[0]
        secondDict = myTree[firstStr]
        for key in secondDict.keys():
            # a dict is a subtree; anything else is a leaf node
            if type(secondDict[key]).__name__ == 'dict':
                thisDepth = 1 + getTreeDepth(secondDict[key])
            else:
                thisDepth = 1
            if thisDepth > maxDepth:
                maxDepth = thisDepth
        return maxDepth

    # Draw a node and the arrow to it as an annotation; can be ignored
    def plotNode(nodeTxt, centerPt, parentPt, nodeType):
        createPlot.ax1.annotate(nodeTxt, xy=parentPt, xycoords='axes fraction',
                                xytext=centerPt, textcoords='axes fraction',
                                va="center", ha="center", bbox=nodeType,
                                arrowprops=arrow_args)

    # Draw the label on the line between parent and child; simple
    def plotMidText(cntrPt, parentPt, txtString):
        xMid = (parentPt[0] - cntrPt[0]) / 2.0 + cntrPt[0]
        yMid = (parentPt[1] - cntrPt[1]) / 2.0 + cntrPt[1]
        createPlot.ax1.text(xMid, yMid, txtString, va="center", ha="center", rotation=30)

    # The key part: the recursion that lays out the whole tree; difficult (in my opinion)
    def plotTree(myTree, parentPt, nodeTxt):
        numLeafs = getNumLeafs(myTree)     # this determines the x width of this subtree
        depth = getTreeDepth(myTree)
        firstStr = list(myTree.keys())[0]  # the first key is the feature this node splits on
        cntrPt = (plotTree.xOff + (1.0 + float(numLeafs)) / 2.0 / plotTree.totalW, plotTree.yOff)
        plotMidText(cntrPt, parentPt, nodeTxt)
        plotNode(firstStr, cntrPt, parentPt, decisionNode)
        secondDict = myTree[firstStr]
        plotTree.yOff = plotTree.yOff - 1.0 / plotTree.totalD
        for key in secondDict.keys():
            if type(secondDict[key]).__name__ == 'dict':
                # a dict is a subtree: recurse
                plotTree(secondDict[key], cntrPt, str(key))
            else:
                # a leaf node: draw it
                plotTree.xOff = plotTree.xOff + 1.0 / plotTree.totalW
                plotNode(secondDict[key], (plotTree.xOff, plotTree.yOff), cntrPt, leafNode)
                plotMidText((plotTree.xOff, plotTree.yOff), cntrPt, str(key))
        plotTree.yOff = plotTree.yOff + 1.0 / plotTree.totalD

    # This does the actual drawing; the functions above are the layout logic
    def createPlot(inTree):
        fig = plt.figure(1, facecolor='white')
        fig.clf()
        axprops = dict(xticks=[], yticks=[])
        createPlot.ax1 = plt.subplot(111, frameon=False, **axprops)  # no ticks
        plotTree.totalW = float(getNumLeafs(inTree))
        plotTree.totalD = float(getTreeDepth(inTree))
        plotTree.xOff = -0.5 / plotTree.totalW
        plotTree.yOff = 1.0
        plotTree(inTree, (0.5, 1.0), '')
        plt.show()

    # Build sample data sets that are already decision trees
    def retrieveTree(i):
        listOfTrees = [
            {'no surfacing': {0: {'flippers': {0: 'no', 1: 'yes'}},
                              1: {'flippers': {0: 'no', 1: 'yes'}},
                              2: {'flippers': {0: 'no', 1: 'yes'}}}},
            {'no surfacing': {0: 'no',
                              1: {'flippers': {0: {'head': {0: 'no', 1: 'yes'}}, 1: 'no'}}}},
        ]
        return listOfTrees[i]

    createPlot(retrieveTree(0))
The graph is drawn as follows:
Overview: First, why is this recursive tree-drawing code hard to understand? The recursion itself is not the problem; it works just like the recursive functions that compute the tree depth and the number of leaf nodes. The difficulty lies in the initial values chosen for some of the coordinates and in how node coordinates are calculated, and the book does not explain this part. When reading the code you may therefore grasp the general idea without understanding the details. This article focuses on the details the book leaves out, while still covering the overall idea of the code.
Preparation: Drawing relies on the custom plotNode function, which draws one node and one arrow per call.
Idea: The author chose a very clever way to draw the tree, one that does not break down when the number of nodes or the depth changes, provided the tree is not too dense. The x-axis is divided evenly by the number of leaf nodes in the whole tree, and the y-axis is divided evenly by the depth of the tree. plotTree.xOff holds the x-coordinate of the most recently drawn leaf node and changes only when another leaf node is drawn. plotTree.yOff holds the depth currently being drawn: it decreases by one share (one of the equal divisions of the y-axis) on entering each level of recursion and is restored when that level returns. Everything else uses these two values to compute the positions of non-leaf nodes. Together they determine a point, and once the coordinates are known the node can be drawn.
The recursion in the overall algorithm is easy to understand.
Each call has three steps:
(1) Draw the current node
(2) For each child that is a non-leaf node, recurse
(3) For each child that is a leaf node, draw it
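The three steps can be sketched without any drawing at all, by recording coordinates instead of plotting them. The following is only an illustrative sketch: the names count_leaves, tree_depth, layout and the state dictionary are made up for this note and are not from the book; the arithmetic is copied from plotTree.

```python
def count_leaves(tree):
    """Count leaves in a nested-dict tree of the form {label: {branch: subtree_or_leaf}}."""
    children = tree[next(iter(tree))]
    return sum(count_leaves(c) if isinstance(c, dict) else 1 for c in children.values())


def tree_depth(tree):
    """Depth of the same nested-dict tree (a node with only leaf children has depth 1)."""
    children = tree[next(iter(tree))]
    return max(1 + tree_depth(c) if isinstance(c, dict) else 1 for c in children.values())


def layout(tree):
    """Return (label, x, y) for every node, in the order plotTree would draw them."""
    total_w = float(count_leaves(tree))
    total_d = float(tree_depth(tree))
    # Hypothetical stand-in for the plotTree.xOff / plotTree.yOff function attributes.
    state = {"x_off": -0.5 / total_w, "y_off": 1.0}
    placed = []

    def visit(node):
        label = next(iter(node))
        # Step 1: place the current (non-leaf) node above the middle of its leaves.
        x = state["x_off"] + (1.0 + count_leaves(node)) / 2.0 / total_w
        placed.append((label, x, state["y_off"]))
        state["y_off"] -= 1.0 / total_d          # go one level down
        for child in node[label].values():
            if isinstance(child, dict):
                visit(child)                      # Step 2: non-leaf child, recurse
            else:
                state["x_off"] += 1.0 / total_w   # Step 3: leaf child, next cell
                placed.append((child, state["x_off"], state["y_off"]))
        state["y_off"] += 1.0 / total_d          # restore the level on the way back

    visit(tree)
    return placed


example_tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}
for label, x, y in layout(example_tree):
    print(label, round(x, 3), round(y, 3))
```

Keeping the offsets in one shared mutable object plays the same role as the function attributes in the original code: every level of the recursion sees the same running x and y.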
Detailed analysis:
    def plotTree(myTree, parentPt, nodeTxt):
        numLeafs = getNumLeafs(myTree)     # this determines the x width of this subtree
        depth = getTreeDepth(myTree)
        firstStr = list(myTree.keys())[0]  # the first key is the feature this node splits on
        cntrPt = (plotTree.xOff + (1.0 + float(numLeafs)) / 2.0 / plotTree.totalW, plotTree.yOff)
        plotMidText(cntrPt, parentPt, nodeTxt)
        plotNode(firstStr, cntrPt, parentPt, decisionNode)
        secondDict = myTree[firstStr]
        plotTree.yOff = plotTree.yOff - 1.0 / plotTree.totalD
        for key in secondDict.keys():
            if type(secondDict[key]).__name__ == 'dict':
                # a dict is a subtree: recurse
                plotTree(secondDict[key], cntrPt, str(key))
            else:
                # a leaf node: draw it
                plotTree.xOff = plotTree.xOff + 1.0 / plotTree.totalW
                plotNode(secondDict[key], (plotTree.xOff, plotTree.yOff), cntrPt, leafNode)
                plotMidText((plotTree.xOff, plotTree.yOff), cntrPt, str(key))
        plotTree.yOff = plotTree.yOff + 1.0 / plotTree.totalD

    def createPlot(inTree):
        fig = plt.figure(1, facecolor='white')
        fig.clf()
        axprops = dict(xticks=[], yticks=[])
        createPlot.ax1 = plt.subplot(111, frameon=False, **axprops)  # no ticks
        # totalW is the number of leaf nodes in the whole tree, totalD is its depth
        plotTree.totalW = float(getNumLeafs(inTree))
        plotTree.totalD = float(getTreeDepth(inTree))
        plotTree.xOff = -0.5 / plotTree.totalW
        plotTree.yOff = 1.0
        plotTree(inTree, (0.5, 1.0), '')
        plt.show()
The highlighted (red) parts of the code above, namely the initial value of plotTree.xOff and the calculation of cntrPt, work as follows.
First, the whole canvas is divided evenly according to the number of leaf nodes and the depth; the total length of the x-axis is 1.
1. In the figure, the squares mark non-leaf node positions and @ marks leaf node positions. Each cell has width 1/plotTree.totalW, but a leaf node should sit at the @ position, the centre of its cell. That is why plotTree.xOff is initialised to -0.5/plotTree.totalW: the starting x position is half a cell to the left of the first cell. The benefit is that adding an integer multiple of 1/plotTree.totalW to it always lands exactly on an @ position.
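A quick way to see the effect of the starting value is to step through the leaf positions by hand. This is a small sketch; leaf_x_positions is a name invented for this note, and it simply repeats the update plotTree applies to plotTree.xOff for each leaf.

```python
def leaf_x_positions(total_w):
    """x-coordinates of the leaves, starting half a cell left of 0 and stepping one cell per leaf."""
    x_off = -0.5 / total_w           # same starting value as in createPlot
    positions = []
    for _ in range(int(total_w)):
        x_off += 1.0 / total_w       # same update as in the leaf branch of plotTree
        positions.append(x_off)
    return positions


print(leaf_x_positions(4))
```

For four leaves the cells are [0, 0.25], [0.25, 0.5], [0.5, 0.75] and [0.75, 1], and each value produced is the centre of one of these cells.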
2. The highlighted line in the plotTree function is:
    cntrPt = (plotTree.xOff + (1.0 + float(numLeafs)) / 2.0 / plotTree.totalW, plotTree.yOff)
plotTree.xOff is the x-coordinate of the most recently drawn leaf node. To position the current node, we only need to know how many leaf nodes it has. Its leaves span a total distance of float(numLeafs)/plotTree.totalW*1 (because the total length is 1), so the current node sits in the middle of that span, at an offset of float(numLeafs)/2.0/plotTree.totalW*1 from the span's left edge. However, plotTree.xOff does not start at the left edge of the span: it sits half a cell to the left of it, so we must add another half cell, 1/2/plotTree.totalW*1. Added together the offset is (1.0 + float(numLeafs))/2.0/plotTree.totalW*1, and the x position becomes plotTree.xOff + (1.0 + float(numLeafs))/2.0/plotTree.totalW.
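The same reasoning can be written compactly. The symbols are shorthand introduced only for this note: x_off stands for plotTree.xOff, n for numLeafs, W for plotTree.totalW, and x_left for the left edge of the span occupied by the current node's leaves.

```latex
\[
\begin{aligned}
x_{\text{left}} &= x_{\text{off}} + \frac{1}{2W} \\
x_{\text{cntr}} &= x_{\text{left}} + \frac{n}{2W}
                 = x_{\text{off}} + \frac{1 + n}{2W}
\end{aligned}
\]
```

The first line is the half-cell correction and the second adds half the width of the n leaf cells, which together give the expression used for cntrPt.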
3. The parentPt argument passed to plotTree in createPlot is (0.5, 1.0).
No line needs to be drawn to the root node, so the parent position and the current node's position must coincide. Applying the formula from point 2 to the root gives its position as (0.5, 1.0), which is why that value is passed in.
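That the root always lands at x = 0.5 can be checked numerically. This is a minimal sketch with an invented helper name, root_x; it plugs the starting value of plotTree.xOff and numLeafs = totalW (the root owns every leaf) into the cntrPt formula.

```python
def root_x(total_w):
    """x-coordinate the cntrPt formula gives the root of a tree with total_w leaves."""
    x_off = -0.5 / total_w                        # initial plotTree.xOff
    num_leafs = total_w                           # the root's subtree contains every leaf
    return x_off + (1.0 + num_leafs) / 2.0 / total_w


for w in (1.0, 3.0, 7.0, 20.0):
    print(w, root_x(w))
```

Whatever the number of leaves, the two half-cell terms cancel and only totalW/(2*totalW) remains.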
Summary: Gradually increasing the x-coordinate and gradually decreasing the y-coordinate in this way accounts for both the number of leaves and the depth of the tree, so the proportions of the plot are easy to determine. Because every position is a fraction of the axes, there is no need to care about the size of the output figure: if its shape changes, the plot is simply redrawn in proportion. Drawing in pixels would make scaling the figure much harder.
[Machine Learning & Data Mining] Machine Learning in Action: the decision tree plotTree function fully explained