Sentiment analysis of Chinese text: an approach based on machine learning

Source: Internet
Author: User
Tags: svm

1. Common steps


2. Chinese word segmentation

1) Compared with sentiment analysis of English text, this is a preprocessing step specific to Chinese, because Chinese text has no spaces between words.

2) Common methods: dictionary-based, rule-based, statistical, character-tagging-based, and artificial-intelligence-based.

3) Common tools: HIT LTP-Cloud (Language Technology Platform Cloud), the NiuTrans statistical machine translation system from Northeastern University, ICTCLAS by Dr. Zhang Huaping of the Chinese Academy of Sciences, BosonNLP, jieba, Ansj, HanLP.
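
To make the dictionary-based method concrete, here is a minimal sketch of forward maximum matching, one classic dictionary-based segmentation strategy. It is not how any of the tools listed above work internally; the function name `forward_max_match`, the `max_word_len` parameter, and the toy dictionary are all assumptions made for illustration.

```python
def forward_max_match(text, dictionary, max_word_len=4):
    """Segment text by greedily taking the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing in the dictionary matches.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary; real lexicons contain many thousands of entries.
toy_dictionary = {"我", "喜欢", "这部", "电影", "非常"}
segmented = forward_max_match("我非常喜欢这部电影", toy_dictionary)
print(segmented)
```

Greedy matching is simple and fast but cannot resolve ambiguous boundaries, which is one reason statistical and tagging-based methods are also used.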


3. Feature Extraction

1) Decide what to use as the features of a text.

2) Common methods: by part of speech (adj, adv, v), by word combination (unigram, bigram), by position.

3) When word combinations are used to represent a text, there are two options: whether a word occurs (presence), or how many times it occurs (count).
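
The two representations in item 3 can be sketched as follows. This is only an illustration, assuming the text has already been segmented into tokens; the helper names `ngrams`, `count_features`, and `presence_features` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_features(tokens):
    """Count representation: how many times each unigram and bigram occurs."""
    return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))


def presence_features(tokens):
    """Presence representation: 1 for every unigram and bigram that occurs at all."""
    return {gram: 1 for gram in count_features(tokens)}


# Hypothetical pre-segmented sentence (roughly "good, good, very good").
tokens = ["好", "好", "非常", "好"]
counts = count_features(tokens)
presence = presence_features(tokens)
print(counts)
print(presence)
```

With the count representation the repeated word "好" gets weight 3, while the presence representation records only that it appeared.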


4. Feature Selection

1) Decide which features to keep. If every extracted feature is used, the computation becomes very expensive and the result is a high-dimensional sparse matrix.

2) Common methods: stop-word removal, chi-square, mutual information.

3) Common tools: Word2vec, Doc2vec
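
As an illustration of chi-square selection, the sketch below scores each term against a class with the standard 2x2 contingency-table statistic and keeps the highest-scoring terms. The function names, the toy corpus, and the labels `"pos"` and `"neg"` are assumptions for the example, not part of the original article.

```python
def chi_square(docs, labels, term, target_label):
    """Chi-square statistic of a term with respect to one class (2x2 table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target_label
        if has_term and in_class:
            a += 1  # term present, document in class
        elif has_term:
            b += 1  # term present, document not in class
        elif in_class:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_top_k(docs, labels, target_label, k):
    """Keep the k terms with the highest chi-square score."""
    vocab = {w for tokens in docs for w in tokens}
    ranked = sorted(vocab, key=lambda w: (-chi_square(docs, labels, w, target_label), w))
    return ranked[:k]


# Hypothetical toy corpus of segmented reviews.
docs = [["好", "电影"], ["好", "演员"], ["差", "电影"], ["差", "剧情"]]
labels = ["pos", "pos", "neg", "neg"]
print(chi_square(docs, labels, "好", "pos"))
print(select_top_k(docs, labels, "pos", 2))
```

In this toy corpus "电影" appears equally often in both classes, so its score is zero and it is dropped, while the polarity words "好" and "差" are kept.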


5. Classification model

1) Training and testing.

2) Common methods: naive Bayes, maximum entropy, SVM.
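
The train-then-test flow can be sketched with a small multinomial naive Bayes classifier using add-one (Laplace) smoothing. This is a simplified illustration, not the article's implementation; the class name `NaiveBayes`, the toy training data, and the choice to ignore words unseen in training are all assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes over token lists."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior of the class.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in tokens:
                if w not in self.vocab:
                    continue  # assumption: words unseen in training are ignored
                # Add-one smoothed log likelihood of the word given the class.
                score += math.log(
                    (self.word_counts[label][w] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical segmented training set.
train_docs = [["好", "喜欢"], ["非常", "好"], ["差", "讨厌"], ["非常", "差"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)

# Testing on unseen examples.
print(model.predict(["好", "电影"]))
print(model.predict(["讨厌"]))
```

In practice the data would be split into separate training and test sets, and the predictions on the test set would be scored with evaluation metrics.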


6. Evaluation indicators

1) Accuracy

Accuracy = TP / (TP + FN + FP + TN) + TN / (TP + FN + FP + TN) = (TP + TN) / (TP + FN + FP + TN). It reflects the classifier's ability to judge the whole sample set correctly: positives judged positive, and negatives judged negative.

2) Precision

Precision = TP / (TP + FP). It reflects the proportion of true positives among the samples the classifier judged positive.

3) Recall

Recall = TP / (TP + FN). It reflects the proportion of all actual positive samples that are correctly judged positive.
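
The three formulas above can be checked on a small example. The sketch below is illustrative only; the function names and the toy labels are assumptions, and returning 0.0 when a denominator is zero is one possible convention.

```python
def confusion_counts(y_true, y_pred, positive="pos"):
    """Count TP, FP, FN, TN for a binary task with the given positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fn + fp + tn)


def precision(tp, fp):
    # Assumption: report 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Assumption: report 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical gold labels and predictions: 4 positive and 6 negative samples.
y_true = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "neg", "neg"]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```

Here accuracy is 0.7 even though only half of the positive samples are found, which shows why precision and recall are reported alongside accuracy.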


7. Available resources

1) Introduction to basic Chinese word segmentation algorithms

2) ICTCLAS Chinese part-of-speech tag set

3) Text classification techniques

4) Text classification and SVM

5) Text classification based on the Bayesian algorithm

6) A Chinese text classification prototype based on LIBSVM

7) LDA-math: text modeling

8) Sentiment analysis resources

9) Feature extraction techniques for sentiment analysis

9.1) Stanford University natural language processing course, lecture 7: sentiment analysis

10) Deep learning, natural language processing, and representation methods

11) Deep Learning in NLP (1): word vectors and language models



