1 Environment
R 3.0 or later
Install the machine learning packages:
Description: Both are R machine learning packages. RTextTools provides text processing, and e1071 provides classifiers.
> install.packages("RTextTools")
> install.packages("e1071")
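If the script may be rerun on a machine that already has some of these packages, a check like the one below avoids reinstalling them. This is a minimal sketch: the helper name `missing_packages` is hypothetical and not part of the original post, and the install call is left commented out.

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("RTextTools", "e1071")
to_install <- missing_packages(needed)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```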
2 Experimental steps
Research object: http://www.xueqing.tv/cms/article/107#rd?sukey= 3903d1d3b699c20870d8c0b36a06c8665d146b24b47f8953d7202230c1ad9c9dd368d27959ec776c4cd0e2c94248f632
This blog post uses the R language to classify text with multiple classifiers (the picture above shows Word2vec and is unrelated to this post).
It consists of two parts, both of which perform sentiment classification of sentences. The first part uses a small amount of manually entered data.
The second part uses 80 happy and 80 sad training samples, plus 10 happy and 10 sad test samples (code path: Sentiment_analyse.R).
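The layout of the second data set can be sketched as follows. The sentences here are placeholders, since the original data is not reproduced in this post, and the variable names are assumptions. Only the counts (80 + 80 training, 10 + 10 test) come from the text above.

```r
# Counts taken from the post; everything else is illustrative.
n_train_per_class <- 80
n_test_per_class  <- 10

labels <- c(rep("happy", n_train_per_class), rep("sad", n_train_per_class),
            rep("happy", n_test_per_class),  rep("sad", n_test_per_class))

# Placeholder sentences stand in for the real happy/sad texts.
corpus <- data.frame(
  text  = paste("placeholder sentence", seq_along(labels)),
  label = factor(labels),
  stringsAsFactors = FALSE
)

# Training rows come first, test rows last.
n_train   <- 2 * n_train_per_class
train_idx <- seq_len(n_train)
test_idx  <- n_train + seq_len(2 * n_test_per_class)

table(corpus$label[train_idx])  # class balance of the training rows
```

Keeping the training rows first and the test rows last makes it easy to pass the two index ranges to a classifier later.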
3 Test Results
Experiment one: a preliminary comparison of the classifiers, in which the author manually adds data and makes predictions (code file: Sentiment_compare.R):
Prediction accuracy:
Classifier | Accuracy rate
Random Forest | 40%
Maximum entropy | 40%
Decision Tree | 40%
BAGGING | 40%
SVM | 20%
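An accuracy rate like those above is the share of test sentences whose predicted label matches the true label. The sketch below uses a hypothetical five-sentence test set. The post does not state the test set size, and these labels are not the author's data.

```r
# Hypothetical true and predicted labels for five test sentences.
actual    <- c("positive", "positive", "negative", "negative", "negative")
predicted <- c("positive", "negative", "positive", "negative", "positive")

# Accuracy = fraction of predictions that equal the true label.
accuracy <- mean(predicted == actual)
sprintf("%.0f%%", 100 * accuracy)  # "40%"

# Confusion table: rows are true labels, columns are predictions.
table(actual, predicted)
```

With so few test sentences, one extra correct prediction moves the accuracy by 20 percentage points, so differences between classifiers at this scale should be read with caution.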
Experiment two (code file: Sentiment_analyse.R):
Data file: http:///sentiment/data/
Classification uses the Bayes, MAXENT, SVM, SLDA, BAGGING, RF, and TREE classifiers.
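A run over several of these classifiers can be sketched with RTextTools as shown below. This is not the author's Sentiment_analyse.R. The function name `run_classifiers` and its arguments are hypothetical, and the preprocessing options are assumptions. Naive Bayes is not among the algorithms offered by `train_models`, so it would presumably be run separately with `naiveBayes` from e1071.

```r
# Hypothetical wrapper around the RTextTools train/classify workflow.
# texts: character vector of sentences; labels: their classes;
# train_idx / test_idx: row indices of the training and test documents.
run_classifiers <- function(texts, labels, train_idx, test_idx,
                            algorithms = c("MAXENT", "SVM", "SLDA",
                                           "BAGGING", "RF", "TREE")) {
  if (!requireNamespace("RTextTools", quietly = TRUE)) {
    stop("RTextTools is not installed")
  }
  library(RTextTools)  # attach the package

  # Document-term matrix over all sentences (training and test together).
  dtm <- create_matrix(texts, language = "english",
                       removeNumbers = TRUE, stemWords = FALSE)

  # The container expects numeric class codes.
  codes <- as.numeric(as.factor(labels))
  container <- create_container(dtm, codes,
                                trainSize = train_idx,
                                testSize  = test_idx,
                                virgin    = FALSE)

  models <- train_models(container, algorithms = algorithms)

  # Data frame with one row per test document.
  classify_models(container, models)
}
```

Assuming `texts` and `labels` hold the 180 sentences and their classes, with training rows first, a call might look like `results <- run_classifiers(texts, labels, 1:160, 161:180)`.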
The results are as follows:
Classifier Name | Accuracy rate (R) | Accuracy rate (Spark)
Bayesian | 65% | 85%
Random Forest | 85% | 80%
SVM | 85% | -
SLDA | 75% | -
BAGGING | 85% | -
Decision Tree | 100% | 95%
Maxentropy | 85% | -
GBT | - | 80%
Word2vec | - | 70%
Sentiment analysis: R vs. Spark machine learning library classification test comparison