We will use the purchase of a house as an example to introduce the decision tree algorithm. The data set is as follows (for demonstration only; it does not represent a real situation):
| Lot | Near Subway | Area (m²) | Unit Price (10k CNY) | Buy? |
| --- | --- | --- | --- | --- |
| 3rd Ring | Yes | 60 | 8 | Yes |
| 3rd Ring | Yes | 80 | 8 | No |
| 3rd Ring | No | 60 | 7 | Yes |
| 3rd Ring | No | 80 | 7 | No |
| 5th Ring | Yes | 60 | 7 | Yes |
| 5th Ring | Yes | 80 | 7 | No |
| 5th Ring | No | 60 | 6 | Yes |
| 5th Ring | No | 80 | 6 | Yes |
| 6th Ring | Yes | 60 | 6 | Yes |
| 6th Ring | Yes | 80 | 5.5 | Yes |
| 6th Ring | No | 60 | 5 | No |
| 6th Ring | No | 80 | 5 | No |
As we can see from the table above, 7 of the 12 samples are "buy" and 5 are "don't buy". From the formula for information entropy, the entropy of this data set is:

H(D) = -(7/12)·log2(7/12) - (5/12)·log2(5/12) ≈ 0.980
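As a quick check, the entropy above can be computed with a few lines of Python (the helper name `entropy` is my own):

```python
from math import log2

def entropy(pos, neg):
    """Shannon entropy (in bits) of a binary label distribution."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes: 0 * log2(0) is taken as 0
            p = count / total
            h -= p * log2(p)
    return h

# 7 "buy" and 5 "don't buy" samples out of 12
print(round(entropy(7, 5), 3))  # ≈ 0.98
```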
Splitting by lot (denoted A1), with subsets 3rd Ring (D1), 5th Ring (D2), and 6th Ring (D3), the information gain is: Gain(D, A1) = H(D) − H(D|A1) ≈ 0.980 − 0.937 = 0.043
Splitting by whether the home is near the subway (denoted A2), with subsets yes (D1) and no (D2): Gain(D, A2) ≈ 0.980 − 0.959 = 0.021
Splitting by area (denoted A3), with subsets 60 m² (D1) and 80 m² (D2): Gain(D, A3) ≈ 0.980 − 0.784 = 0.196
Splitting by unit price (denoted A4), with subsets 5 (D1), 5.5 (D2), 6 (D3), 7 (D4), and 8 (D5), all in units of 10k CNY: Gain(D, A4) ≈ 0.980 − 0.500 = 0.480
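The four information gains can be verified with a short script (the dataset encoding and helper names below are my own):

```python
from collections import Counter
from math import log2

# The 12 rows of the table: (lot, near_subway, area, unit_price, buy)
data = [
    ("3rd Ring", "yes", 60, 8,   "yes"),
    ("3rd Ring", "yes", 80, 8,   "no"),
    ("3rd Ring", "no",  60, 7,   "yes"),
    ("3rd Ring", "no",  80, 7,   "no"),
    ("5th Ring", "yes", 60, 7,   "yes"),
    ("5th Ring", "yes", 80, 7,   "no"),
    ("5th Ring", "no",  60, 6,   "yes"),
    ("5th Ring", "no",  80, 6,   "yes"),
    ("6th Ring", "yes", 60, 6,   "yes"),
    ("6th Ring", "yes", 80, 5.5, "yes"),
    ("6th Ring", "no",  60, 5,   "no"),
    ("6th Ring", "no",  80, 5,   "no"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attr_index):
    """Entropy of the labels minus the entropy conditioned on one attribute."""
    labels = [r[-1] for r in rows]
    by_value = {}
    for r in rows:
        by_value.setdefault(r[attr_index], []).append(r[-1])
    cond = sum(len(sub) / len(rows) * entropy(sub) for sub in by_value.values())
    return entropy(labels) - cond

names = ["lot (A1)", "near subway (A2)", "area (A3)", "unit price (A4)"]
for i, name in enumerate(names):
    print(f"Gain({name}) = {info_gain(data, i):.3f}")
```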
From the results above, the attributes ranked by information gain (that is, how strongly each factor determines whether people decide to buy), from high to low, are: unit price, area, lot, and whether the home is near the subway.
This attribute-selection procedure is exactly the logic used by the ID3 decision tree algorithm.
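To make that concrete, here is a minimal ID3 sketch (my own illustrative implementation, not code from the original article) that applies this selection logic recursively to the table above:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def id3(rows, attrs):
    """Build a decision tree as nested dicts: {attr: {value: subtree_or_label}}.

    rows  -- list of (feature_dict, label) pairs
    attrs -- attribute names still available for splitting
    """
    labels = [label for _, label in rows]
    # Stop when the node is pure or no attributes remain; return majority label
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]

    def gain(attr):
        groups = {}
        for feats, label in rows:
            groups.setdefault(feats[attr], []).append(label)
        cond = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        return entropy(labels) - cond

    best = max(attrs, key=gain)  # the ID3 criterion: highest information gain
    tree = {best: {}}
    for value in {feats[best] for feats, _ in rows}:
        subset = [(f, l) for f, l in rows if f[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best])
    return tree

# The 12 rows of the table above
rows = [
    ({"lot": lot, "subway": sub, "area": area, "price": price}, buy)
    for lot, sub, area, price, buy in [
        ("3rd", "yes", 60, 8, "yes"), ("3rd", "yes", 80, 8, "no"),
        ("3rd", "no", 60, 7, "yes"), ("3rd", "no", 80, 7, "no"),
        ("5th", "yes", 60, 7, "yes"), ("5th", "yes", 80, 7, "no"),
        ("5th", "no", 60, 6, "yes"), ("5th", "no", 80, 6, "yes"),
        ("6th", "yes", 60, 6, "yes"), ("6th", "yes", 80, 5.5, "yes"),
        ("6th", "no", 60, 5, "no"), ("6th", "no", 80, 5, "no"),
    ]
]
tree = id3(rows, ["lot", "subway", "area", "price"])
print(next(iter(tree)))  # root split is the highest-gain attribute: "price"
```

Note that the root of the resulting tree is the unit price, matching the ranking derived above.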
Note: the figures above are test data for demonstration purposes only and do not represent a real basis for decision-making.