Sentiment Analysis: Comparing Classification Tests in R and the Spark Machine Learning Library


1 Environment

R 3.0 or later

Install the machine learning packages:

Description: Both are R machine learning packages. RTextTools handles text processing, and e1071 provides classifiers. Package names are case-sensitive in R.

> install.packages("RTextTools")

> install.packages("e1071")

2 Experimental steps

Research object: http://www.xueqing.tv/cms/article/107#rd?sukey= 3903d1d3b699c20870d8c0b36a06c8665d146b24b47f8953d7202230c1ad9c9dd368d27959ec776c4cd0e2c94248f632

This blog post uses the R language to classify text with several classifiers.

It consists of two parts, both of which classify the sentiment of sentences. The first part uses a small amount of manually added data.

The second part uses 80 happy and 80 sad sentences for training, plus 10 happy and 10 sad sentences for testing (code file: Sentiment_analyse.R).
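The layout of the second data set can be sketched in base R. This is only an illustration under assumptions: the helper name `make_split` and the "training rows first, test rows last" ordering are hypothetical and do not come from Sentiment_analyse.R. Index ranges of this kind are what a function such as RTextTools' `create_container` expects for its `trainSize` and `testSize` arguments.

```r
# Hypothetical helper: lay out a balanced corpus with all training rows
# first and all test rows last, and return labels plus index ranges.
make_split <- function(n_train_per_class = 80, n_test_per_class = 10,
                       classes = c("happy", "sad")) {
  train_labels <- rep(classes, each = n_train_per_class)
  test_labels  <- rep(classes, each = n_test_per_class)
  n_train <- length(train_labels)
  n_test  <- length(test_labels)
  list(
    labels    = factor(c(train_labels, test_labels), levels = classes),
    train_idx = seq_len(n_train),            # rows 1..160 by default
    test_idx  = n_train + seq_len(n_test)    # rows 161..180 by default
  )
}

corpus_split <- make_split()
length(corpus_split$train_idx)   # 160
range(corpus_split$test_idx)     # 161 180
```

Keeping the two classes the same size in both the training and the test rows means a classifier that always predicts one class scores exactly 50%, which gives a simple baseline for the accuracy figures.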

3 Test Results

Experiment one: a preliminary comparison of the classifiers, in which the author manually adds data and makes predictions (code file: Sentiment_compare.R):

Prediction accuracy:

Classifier         Accuracy
Random Forest      40%
Maximum entropy    40%
Decision Tree      40%
BAGGING            40%
SVM                20%
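The idea behind experiment one, training a classifier on a handful of hand-written sentences, can be sketched as follows. This is not the code from Sentiment_compare.R, which is not reproduced in this post. It assumes the e1071 package is installed; the example sentences, the `tokenize` and `to_dtm` helpers, and the choice of a linear kernel are all illustrative assumptions.

```r
library(e1071)  # assumed to be installed; provides svm()

# Hypothetical hand-written training and test sentences.
train_text  <- c("i love this car", "this view is amazing",
                 "i feel great this morning",
                 "i do not like this car", "this view is horrible",
                 "i feel tired this morning")
train_label <- factor(c("happy", "happy", "happy", "sad", "sad", "sad"))
test_text   <- c("i feel amazing", "i do not like this view")

# Split lower-cased text into words on any non-letter character.
tokenize <- function(texts) strsplit(tolower(texts), "[^a-z]+")

# Build a document-term count matrix over a fixed vocabulary.
# Words outside the vocabulary are silently dropped.
to_dtm <- function(texts, vocab) {
  counts <- vapply(
    tokenize(texts),
    function(tokens) as.numeric(table(factor(tokens, levels = vocab))),
    numeric(length(vocab))
  )
  dtm <- t(counts)            # one row per document
  colnames(dtm) <- vocab
  dtm
}

vocab     <- sort(unique(unlist(tokenize(train_text))))
train_dtm <- to_dtm(train_text, vocab)
test_dtm  <- to_dtm(test_text, vocab)

# scale = FALSE avoids warnings for words that occur in every sentence.
model     <- svm(x = train_dtm, y = train_label,
                 kernel = "linear", scale = FALSE)
predicted <- predict(model, test_dtm)
print(predicted)
```

With so few sentences, a single misclassified test example shifts the accuracy by a large step, which is consistent with the coarse 20% and 40% figures in the table above.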

Experiment two (code file: Sentiment_analyse.R):

Data file: http:///sentiment/data/

Classification uses the Bayes, MAXENT, SVM, SLDA, BAGGING, RF, and TREE classifiers.

The results are as follows:

Classifier       Accuracy (R)   Accuracy (Spark)
Bayesian         65%            85%
Random Forest    85%            80%
SVM              85%            -
SLDA             75%            -
BAGGING          85%            -
Decision Tree    100%           95%
Maxentropy       85%            -
GBT              -              80%
Word2vec         -              70%
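Accuracy here is the share of test sentences that were labelled correctly. With 10 happy and 10 sad test sentences, every figure is a multiple of 5%. The following base R sketch shows the calculation; the `accuracy` helper and the three misclassified positions are hypothetical and chosen only to illustrate the arithmetic.

```r
# Hypothetical helper: fraction of predictions that match the true labels.
accuracy <- function(predicted, actual) {
  stopifnot(length(predicted) == length(actual))
  mean(as.character(predicted) == as.character(actual))
}

actual    <- rep(c("happy", "sad"), each = 10)   # 20 test labels
predicted <- actual
# Three made-up errors: two happy sentences called sad, one sad called happy.
predicted[c(1, 2, 11)] <- c("sad", "sad", "happy")

accuracy(predicted, actual)      # 0.85, i.e. 17 of 20 correct

# A confusion table shows which class the errors fall in.
confusion <- table(predicted = predicted, actual = actual)
print(confusion)
```

Because one sentence is worth 5 percentage points, small gaps between classifiers on this test set, such as 80% against 85%, correspond to a single sentence.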
