install.packages("randomForest")  # install the R package
library(party)         # provides the readingSkills data set
library(randomForest)  # load the analysis package
# build the random forest
output.forest <- randomForest(nativeSpeaker ~ age + shoeSize + score,
                              data = readingSkills)
print(output.forest)  # view the fitted model
print(importance(output.forest, type = 2))  # Gini index (mean decrease in Gini)
The Gini index measures the impurity of a node: the larger the Gini index, the less pure the node. The mean decrease in Gini is the reduction in node impurity produced by the splits on a given variable, averaged over all trees in the forest. The larger this value, the more important the variable, and this is what type = 2 reports. The permutation procedure described earlier, in which a variable's values are shuffled and the resulting change is measured, corresponds to type = 1 (mean decrease in accuracy) and is a separate importance measure.
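To make the idea of a decrease in Gini concrete, here is a minimal sketch in base R. The helper names gini and gini_decrease and the toy labels are hypothetical, made up for illustration; this is not the randomForest package's internal code, only the textbook definition of Gini impurity applied to a single split.

```r
# Gini impurity of a vector of class labels: 1 - sum of squared class shares
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Decrease in impurity when a node is split into a left and a right child,
# with each child weighted by its share of the parent's observations
gini_decrease <- function(labels, go_left) {
  n <- length(labels)
  left <- labels[go_left]
  right <- labels[!go_left]
  gini(labels) -
    (length(left) / n) * gini(left) -
    (length(right) / n) * gini(right)
}

# Hypothetical toy node with 4 "yes" and 4 "no" observations
labels <- c("yes", "yes", "yes", "yes", "no", "no", "no", "no")
perfect_split <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
useless_split <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

print(gini(labels))                          # 0.5: maximally mixed two-class node
print(gini_decrease(labels, perfect_split))  # 0.5: both children are pure
print(gini_decrease(labels, useless_split))  # 0: children are as mixed as the parent
```

A variable whose splits tend to look like perfect_split accumulates a large decrease across the trees, while one whose splits look like useless_split accumulates almost none.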
The results are as follows:
varImpPlot(output.forest)  # visualize variable importance
From the random forest shown above, we can conclude that shoe size and score are the important factors in determining whether or not someone is a native speaker. Furthermore, the model's estimated error rate (the OOB estimate reported by print) is only about 1% to 2%, which means we can expect a prediction accuracy of about 98%.
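The error rate quoted above can also be read from the fitted object rather than from the printed summary. The following is a sketch that assumes the party and randomForest packages are installed; the variable names and the seed are arbitrary choices for illustration, and the exact figures will vary slightly from run to run.

```r
library(party)         # provides the readingSkills data set
library(randomForest)

set.seed(42)  # arbitrary seed, used only to make the sketch repeatable
fit <- randomForest(nativeSpeaker ~ age + shoeSize + score,
                    data = readingSkills)

# err.rate has one row per tree; the last row is the final OOB estimate
oob_error <- fit$err.rate[fit$ntree, "OOB"]
oob_accuracy <- 1 - oob_error

# The same accuracy recomputed from the OOB confusion matrix
# (the class.error column is dropped by selecting only the class columns)
conf <- fit$confusion[, levels(readingSkills$nativeSpeaker)]
accuracy_from_conf <- sum(diag(conf)) / sum(conf)

print(oob_error)
print(oob_accuracy)
print(accuracy_from_conf)
```

Because the out-of-bag estimate is computed on observations each tree did not see during training, it gives a rough indication of accuracy on new data without a separate test set.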
Random forest algorithm in the R language