The first function computes the entropy of a discrete property (label) vector:
<span style="FONT-SIZE:18PX;">
function result = CEntropy(propertyList)
% CEntropy computes the Shannon entropy (in bits) of a discrete vector.
result = 0;
totalLength = length(propertyList);
itemList = unique(propertyList);
pNum = length(itemList);
for i = 1:pNum
    itemLength = length(find(propertyList == itemList(i)));
    pItem = itemLength / totalLength;
    result = result - pItem * log2(pItem);
end
end
</span>
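As a quick sanity check, the same entropy computation can be sketched in Python (the names here are illustrative and not part of the MATLAB code above):

```python
from collections import Counter
from math import log2

def centropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total)
                for n in Counter(values).values())

print(centropy([1, 1, 0, 0]))  # a 50/50 split has entropy 1.0
```

A perfectly pure vector gives entropy 0, and the 9-yes/5-no golf labels used later give roughly 0.940 bits.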
Next, implement the main function of the decision tree model:
<span style="FONT-SIZE:18PX;">
function decisionTreeModel = decisionTree(data, label, propertyName)
% decisionTree builds the tree and returns it as a struct model.
global rootNode;
global node;
rootNode = struct('NodeName', []);
node = struct('fatherNodeName', [], 'EdgeProperty', [], 'NodeName', []);
rootIndex = CalcuteNode(data, label);   % feature with the largest gain
dataRowIndex = setdiff(1:length(propertyName), rootIndex);
rootNode.NodeName = propertyName(rootIndex);
propertyName(rootIndex) = [];
rootData = data(:, rootIndex);
sonEdge = unique(rootData);
for i = 1:length(sonEdge)
    edgeDataIndex = find(rootData == sonEdge(i));
    BuildTree(rootNode.NodeName, sonEdge(i), data(edgeDataIndex, dataRowIndex), ...
        label(edgeDataIndex, :), propertyName);
end
model.rootNode = rootNode;
model.Node = node;
decisionTreeModel = model;
end
</span>
The main function calls a recursive function to construct every node other than the root:
<span style="FONT-SIZE:18PX;">
function BuildTree(fatherNodeName, edge, data, label, propertyName)
% BuildTree recursively constructs all nodes below the root.
global rootNode;
global node;
k = length(node) + 1;
node(k).fatherNodeName = fatherNodeName;
node(k).EdgeProperty = edge;
if length(unique(label)) == 1
    node(k).NodeName = label(1);   % pure node: stop and store the class
    return;
end
sonIndex = CalcuteNode(data, label);
dataRowIndex = setdiff(1:length(propertyName), sonIndex);
node(k).NodeName = propertyName(sonIndex);
propertyName(sonIndex) = [];
sonData = data(:, sonIndex);
sonEdge = unique(sonData);
for i = 1:length(sonEdge)
    edgeDataIndex = find(sonData == sonEdge(i));
    BuildTree(node(k).NodeName, sonEdge(i), data(edgeDataIndex, dataRowIndex), ...
        label(edgeDataIndex, :), propertyName);
end
end
</span>
The next function selects the feature to use for a child node; it returns the index of the feature with the largest information gain:
<span style="FONT-SIZE:18PX;">
function nodeIndex = CalcuteNode(data, label)
% CalcuteNode returns the column index of the feature whose split
% yields the largest information gain with respect to the labels.
largeEntropy = CEntropy(label);
[m, n] = size(data);
entropyGain = largeEntropy * ones(1, n);
for i = 1:n
    pData = data(:, i);
    itemList = unique(pData);
    for j = 1:length(itemList)
        itemIndex = find(pData == itemList(j));
        entropyGain(i) = entropyGain(i) - length(itemIndex)/m * CEntropy(label(itemIndex));
    end
    % Uncomment the next line to use the gain ratio instead of the gain:
    % entropyGain(i) = entropyGain(i) / CEntropy(pData);
end
[~, nodeIndex] = max(entropyGain);
end
</span>
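To see why Outlook ends up at the root of the golf tree, the information-gain computation can be cross-checked in Python on the same encoded data (a sketch with hypothetical names, not part of the MATLAB code):

```python
from collections import Counter
from math import log2

def entropy(values):
    total = len(values)
    return -sum((n / total) * log2(n / total)
                for n in Counter(values).values())

def info_gain(feature, labels):
    """Entropy reduction from splitting the labels on a discrete feature."""
    total = len(labels)
    remainder = sum(
        (feature.count(v) / total)
        * entropy([l for f, l in zip(feature, labels) if f == v])
        for v in set(feature))
    return entropy(labels) - remainder

outlook = [1, 1, 3, 2, 2, 2, 3, 1, 1, 2, 1, 3, 3, 2]
play    = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]
print(round(info_gain(outlook, play), 3))  # -> 0.247
```

0.247 bits is larger than the gain of any other column in this dataset, so Outlook is chosen first.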
Finally, a test script that drives the main function:
<span style="FONT-SIZE:18PX;">
clear; clc;
% Numeric encodings for the categorical attributes (kept for reference):
OutlookType = struct('Sunny', 1, 'Rainy', 2, 'Overcast', 3);
TemperatureType = struct('Hot', 1, 'Warm', 2, 'Cool', 3);
HumidityType = struct('High', 1, 'Norm', 2);
WindyType = {'True', 1, 'False', 0};
PlayGolfType = {'Yes', 1, 'No', 0};

Outlook     = [1,1,3,2,2,2,3,1,1,2,1,3,3,2]';
Temperature = [1,1,1,2,3,3,3,2,3,3,2,2,1,2]';
Humidity    = [1,1,1,1,2,2,2,1,2,2,2,1,2,1]';
Windy       = [0,1,0,0,0,1,1,0,0,0,1,1,0,1]';
data = [Outlook Temperature Humidity Windy];
PlayGolf = [0,0,1,1,1,0,1,0,1,1,1,1,1,0]';
propertyName = {'Outlook', 'Temperature', 'Humidity', 'Windy'};
decisionTreeModel = decisionTree(data, PlayGolf, propertyName);
</span>
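For comparison, the same greedy recursion can be sketched end to end in Python, building the tree as a nested dict (this is an illustrative re-implementation under the same ID3 splitting rule, not the author's MATLAB model):

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def build_tree(rows, labels, names):
    """ID3: rows are feature tuples, names the corresponding feature names."""
    if len(set(labels)) == 1:                      # pure node -> leaf
        return labels[0]
    if not names:                                  # no features left -> majority
        return Counter(labels).most_common(1)[0][0]
    def gain(col):                                 # information gain of a column
        total = len(labels)
        rem = sum((sum(1 for r in rows if r[col] == v) / total)
                  * entropy([l for r, l in zip(rows, labels) if r[col] == v])
                  for v in {r[col] for r in rows})
        return entropy(labels) - rem
    best = max(range(len(names)), key=gain)
    tree = {names[best]: {}}
    for v in {r[best] for r in rows}:              # one branch per feature value
        sub_rows = [r[:best] + r[best+1:] for r in rows if r[best] == v]
        sub_labels = [l for r, l in zip(rows, labels) if r[best] == v]
        tree[names[best]][v] = build_tree(sub_rows, sub_labels,
                                          names[:best] + names[best+1:])
    return tree

data = list(zip([1,1,3,2,2,2,3,1,1,2,1,3,3,2],    # Outlook
                [1,1,1,2,3,3,3,2,3,3,2,2,1,2],    # Temperature
                [1,1,1,1,2,2,2,1,2,2,2,1,2,1],    # Humidity
                [0,1,0,0,0,1,1,0,0,0,1,1,0,1]))   # Windy
play = [0,0,1,1,1,0,1,0,1,1,1,1,1,0]
tree = build_tree(data, play, ['Outlook', 'Temperature', 'Humidity', 'Windy'])
print(next(iter(tree)))  # -> Outlook
```

On this dataset the sketch reproduces the textbook result: Outlook at the root, the Overcast branch a pure "play" leaf, and Humidity and Windy splitting the Sunny and Rainy branches respectively.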
Decision Tree Model (MATLAB)