This assignment implements a decision-tree model for weather forecasting. The plotting part is not implemented; only the framework is.
Operating system: Win 10
Editing Environment: Anaconda
Python version: 3.6
The code first:
from math import log
import operator


def calcShannonEnt(dataSet):                        # entropy of the data set
    numEntries = len(dataSet)                       # number of data rows
    labelCounts = {}
    for featVec in dataSet:
        currentLabel = featVec[-1]                  # last element of each row is the class
        if currentLabel not in labelCounts:
            labelCounts[currentLabel] = 0
        labelCounts[currentLabel] += 1              # count the size of each class
    shannonEnt = 0
    for key in labelCounts:
        prob = float(labelCounts[key]) / numEntries # probability of a single class
        shannonEnt -= prob * log(prob, 2)           # accumulate each class's entropy term
    return shannonEnt


def createDataSet1():                               # create the sample data
    dataSet = [['Sunny Day', 'High Temperature', 'Medium Humidity', 'no Wind', 'Not suitable'],
               ['Sunny Day', 'High Temperature', 'Medium Humidity', 'with Wind', 'Not suitable'],
               ['Cloudy', 'High Temperature', 'Low Humidity', 'no Wind', 'suitable'],
               ['Rainy Day', 'Low Temperature', 'High Humidity', 'no Wind', 'suitable'],
               ['Rainy Day', 'Low Temperature', 'Low Humidity', 'no Wind', 'suitable'],
               ['Rainy Day', 'Low Temperature', 'Low Humidity', 'with Wind', 'Not suitable'],
               ['Cloudy', 'Low Temperature', 'Low Humidity', 'with Wind', 'suitable'],
               ['Sunny Day', 'Medium Temperature', 'High Humidity', 'no Wind', 'Not suitable'],
               ['Sunny Day', 'Low Temperature', 'Low Humidity', 'no Wind', 'suitable'],
               ['Rainy Day', 'Medium Temperature', 'Low Humidity', 'no Wind', 'suitable'],
               ['Sunny Day', 'Medium Temperature', 'Low Humidity', 'with Wind', 'suitable'],
               ['Cloudy', 'Medium Temperature', 'Medium Humidity', 'with Wind', 'suitable'],
               ['Cloudy', 'High Temperature', 'Low Humidity', 'no Wind', 'suitable'],
               ['Rainy Day', 'Medium Temperature', 'Low Humidity', 'with Wind', 'Not suitable']]
    labels = ['Weather', 'Temperature', 'Humidity', 'Wind Conditions']  # the four features
    return dataSet, labels


def splitDataSet(dataSet, axis, value):             # rows where feature `axis` equals `value`, with that feature removed
    retDataSet = []
    for featVec in dataSet:
        if featVec[axis] == value:
            reducedFeatVec = featVec[:axis]
            reducedFeatVec.extend(featVec[axis + 1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet


def chooseBestFeatureToSplit(dataSet):              # select the optimal classification feature
    numFeatures = len(dataSet[0]) - 1
    baseEntropy = calcShannonEnt(dataSet)           # entropy before splitting
    bestInfoGain = 0
    bestFeature = -1
    for i in range(numFeatures):
        featList = [example[i] for example in dataSet]
        uniqueVals = set(featList)
        newEntropy = 0
        for value in uniqueVals:
            subDataSet = splitDataSet(dataSet, i, value)
            prob = len(subDataSet) / float(len(dataSet))
            newEntropy += prob * calcShannonEnt(subDataSet)  # weighted entropy after splitting on feature i
        infoGain = baseEntropy - newEntropy         # information gain: entropy reduction from this feature
        if infoGain > bestInfoGain:                 # the feature that reduces entropy the most is the best split
            bestInfoGain = infoGain
            bestFeature = i
    return bestFeature


def majorityCnt(classList):                         # majority vote, e.g. two 'suitable' and one 'Not suitable' -> 'suitable'
    classCount = {}
    for vote in classList:
        if vote not in classCount:
            classCount[vote] = 0
        classCount[vote] += 1
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]


def createTree(dataSet, labels):
    classList = [example[-1] for example in dataSet]    # all class labels
    if classList.count(classList[0]) == len(classList): # all rows share one class: return it
        return classList[0]
    if len(dataSet[0]) == 1:                            # no features left: fall back to majority vote
        return majorityCnt(classList)
    bestFeat = chooseBestFeatureToSplit(dataSet)        # select the optimal feature
    bestFeatLabel = labels[bestFeat]
    myTree = {bestFeatLabel: {}}                        # the tree is stored as a nested dictionary
    del(labels[bestFeat])
    featValues = [example[bestFeat] for example in dataSet]
    uniqueVals = set(featValues)
    for value in uniqueVals:
        subLabels = labels[:]
        myTree[bestFeatLabel][value] = createTree(splitDataSet(dataSet, bestFeat, value), subLabels)
    return myTree


if __name__ == '__main__':
    dataSet, labels = createDataSet1()                  # create the sample data
    print(createTree(dataSet, labels))                  # output the decision-tree model
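As a quick sanity check on calcShannonEnt (a hand calculation added here, not part of the original post): the 14-row sample data set contains 9 'suitable' and 5 'Not suitable' rows, so its entropy should come out to roughly 0.940 bits.

```python
from math import log

# Class counts in the 14-row sample data set: 9 "suitable", 5 "Not suitable".
counts = {'suitable': 9, 'Not suitable': 5}
total = sum(counts.values())

# Shannon entropy, mirroring what calcShannonEnt computes over the full data set.
entropy = -sum((n / total) * log(n / total, 2) for n in counts.values())
print(round(entropy, 3))  # about 0.940 bits
```

If calcShannonEnt(dataSet) returns a noticeably different value, something in the counting loop is wrong.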
Running the script prints the decision tree as a nested dictionary:
The same model, drawn by hand:
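The post stops at printing the tree. A natural next step is using the nested dictionary to classify a new day; the classify function below is my own sketch of that, not part of the referenced article, and the small hand-written tree is only an illustration (not the exact tree the script prints).

```python
def classify(tree, featLabels, testVec):
    # Walk the nested-dict tree until a leaf (a plain class string) is reached.
    firstFeat = next(iter(tree))             # feature name tested at this node
    featIndex = featLabels.index(firstFeat)  # position of that feature in the input vector
    subtree = tree[firstFeat][testVec[featIndex]]
    if isinstance(subtree, dict):            # internal node: recurse into the branch
        return classify(subtree, featLabels, testVec)
    return subtree                           # leaf: the predicted class

# Example with a tiny hand-written tree:
tree = {'Weather': {'Cloudy': 'suitable',
                    'Sunny Day': {'Humidity': {'Low Humidity': 'suitable',
                                               'High Humidity': 'Not suitable'}}}}
labels = ['Weather', 'Temperature', 'Humidity', 'Wind Conditions']
print(classify(tree, labels, ['Sunny Day', 'Low Temperature', 'Low Humidity', 'no Wind']))
# -> suitable
```

Note that classify needs the original labels list, so in practice createTree should be given a copy (it deletes entries from the list it receives).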
In addition, here is another write-up with a custom plotting function, which the author has not yet implemented here; readers are encouraged to explore it:
https://zhuanlan.zhihu.com/p/25428390
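Whatever plotting approach is used, the usual first step is to measure the tree: the number of leaves fixes the plot's width and the depth fixes its height. Here is a minimal sketch of those two helpers (my own addition, assuming the nested-dict format that createTree produces):

```python
def numLeaves(tree):
    # A leaf is any value that is not a dict.
    if not isinstance(tree, dict):
        return 1
    feat = next(iter(tree))  # the single feature name keyed at this node
    return sum(numLeaves(sub) for sub in tree[feat].values())

def treeDepth(tree):
    # Depth = number of decision nodes on the longest root-to-leaf path.
    if not isinstance(tree, dict):
        return 0
    feat = next(iter(tree))
    return 1 + max(treeDepth(sub) for sub in tree[feat].values())

# Small hand-written example tree:
tree = {'Weather': {'Cloudy': 'suitable',
                    'Sunny Day': {'Humidity': {'Low Humidity': 'suitable',
                                               'High Humidity': 'Not suitable'}}}}
print(numLeaves(tree), treeDepth(tree))  # -> 3 2
```

With these two numbers, each node can be assigned an (x, y) position and drawn, for example with matplotlib's annotate.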
This article references the following link:
http://blog.csdn.net/csqazwsxedc/article/details/65697652
Python implements a weather decision tree model