NLP: SnowNLP Practical Use Cases

Last Update:2018-07-28 Source: Internet

Author: User

SnowNLP is a class library written in Python that makes it easy to process Chinese text. It supports Chinese word segmentation, POS tagging, sentiment analysis, text classification, keyword extraction, and text similarity calculation.

# -*- coding: utf-8 -*-
from snownlp import SnowNLP

s = SnowNLP('这个东西真心很赞')  # "This thing is really great"

print('Word segmentation:')
print(s.words)       # ['这个', '东西', '真心', '很', '赞']
print()

print('POS tagging:')
print(s.tags)        # a zip object in Python 3; its pairs are
                     # [('这个', 'r'), ('东西', 'n'), ('真心', 'd'),
                     #  ('很', 'd'), ('赞', 'Vg')]

print('Sentiment analysis:')
print(s.sentiments)  # about 0.977, the probability that the text is positive
print()

print('Convert to pinyin:')  # Chinese characters to pinyin
print(s.pinyin)      # ['zhe', 'ge', 'dong', 'xi', 'zhen', 'xin', 'hen', 'zan']
print()

print('Traditional to Simplified:')
# "The terms 'traditional characters' and 'Traditional Chinese' are also common in Taiwan."
s = SnowNLP('「繁體字」「繁體中文」的叫法在臺灣亦很常見。')
print(s.han)         # '「繁体字」「繁体中文」的叫法在台湾亦很常见。'
print()

# A short Chinese passage describing natural language processing as a field
# that combines linguistics, computer science and mathematics.
text = '''
自然语言处理是计算机科学领域与人工智能领域中的一个重要方向。
它研究能实现人与计算机之间用自然语言进行有效通信的各种理论和方法。
自然语言处理是一门融语言学、计算机科学、数学于一体的科学。
因此，这一领域的研究将涉及自然语言，即人们日常使用的语言，
所以它与语言学的研究有着密切的联系，但又有重要的区别。
自然语言处理并不是一般地研究自然语言，
而在于研制能有效地实现自然语言通信的计算机系统，
特别是其中的软件系统。因而它是计算机科学的一部分。
'''

s = SnowNLP(text)

print('Extract text keywords:')
print(s.keywords(3))  # ['语言', '自然', '计算机'] (language, nature, computer)
print()

print('Extract text summary:')
print(s.summary(3))   # the three highest-ranked sentences
print()

print('Split into sentences:')
print(s.sentences)
print()

# A pre-tokenized corpus of three documents:
# [this, article, really, good], [that, paper], [this]
s = SnowNLP([['这篇', '文章', '真', '好'],
             ['那篇', '论文'],
             ['这个']])

print('Word frequency:')
print(s.tf)           # term frequency per document
print()

print('Inverse document frequency:')
print(s.idf)          # inverse document frequency per term
print()

print('Text similarity:')
print(s.sim(['文章']))        # [0.38657074230939836, 0, 0]
print(s.sim(['文章', '真']))  # [0.7731414846187967, 0, 0]
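One detail of the script is worth a note: under Python 3 the POS tagging result prints as `<zip object at 0x...>` rather than as a list of pairs, because the value is a lazy `zip`. The sketch below reproduces that behaviour with plain Python only. The `words` and `pos` lists are hypothetical stand-ins for what SnowNLP would return, so no library call is involved.

```python
# Hypothetical stand-in for s.tags: a lazy zip of words and POS tags.
words = ['这个', '东西', '真心', '很', '赞']
pos = ['r', 'n', 'd', 'd', 'Vg']
tags = zip(words, pos)

print(tags)           # only the object's repr, e.g. <zip object at 0x...>

pairs = list(tags)    # consume the iterator once and keep the pairs
print(pairs)

print(list(tags))     # [] because a zip object can be iterated only once
```

In the script itself, wrapping the attribute as `list(s.tags)` before printing should give the readable list of (word, tag) pairs. The output listed next comes from the script as written, without that change.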

Output:

Word segmentation:
['这个', '东西', '真心', '很', '赞']

POS tagging:
<zip object at 0x12638b388>

Sentiment analysis:
0.9769551298267365

Convert to pinyin:
['zhe', 'ge', 'dong', 'xi', 'zhen', 'xin', 'hen', 'zan']

Traditional to Simplified:
「繁体字」「繁体中文」的叫法在台湾亦很常见。

Extract text keywords:
['语言', '自然', '计算机']

Extract text summary:
['因而它是计算机科学的一部分', '自然语言处理是计算机科学领域与人工智能领域中的一个重要方向', '自然语言处理是一门融语言学、计算机科学、数学于一体的科学']

Split into sentences:
['自然语言处理是计算机科学领域与人工智能领域中的一个重要方向', '它研究能实现人与计算机之间用自然语言进行有效通信的各种理论和方法', '自然语言处理是一门融语言学、计算机科学、数学于一体的科学', '因此', '这一领域的研究将涉及自然语言', '即人们日常使用的语言', '所以它与语言学的研究有着密切的联系', '但又有重要的区别', '自然语言处理并不是一般地研究自然语言', '而在于研制能有效地实现自然语言通信的计算机系统', '特别是其中的软件系统', '因而它是计算机科学的一部分']

Word frequency:
[{'这篇': 1, '文章': 1, '真': 1, '好': 1}, {'那篇': 1, '论文': 1}, {'这个': 1}]

Inverse document frequency:
{'这篇': 0.5108256237659907, '文章': 0.5108256237659907, '真': 0.5108256237659907, '好': 0.5108256237659907, '那篇': 0.5108256237659907, '论文': 0.5108256237659907, '这个': 0.5108256237659907}

Text similarity:
[0.38657074230939836, 0, 0]
[0.7731414846187967, 0, 0]
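The inverse document frequency and similarity figures above can be checked by hand. They are consistent with Okapi BM25 scoring over a three-document corpus of 4, 2 and 1 tokens. The sketch below is a plain-Python recomputation, not SnowNLP code: the parameters `K1 = 1.5` and `B = 0.75` are an assumption inferred from the printed numbers, the function names are made up for illustration, and the tokens are assumed to be the Chinese originals of the translated corpus (only the shape of the corpus affects the values).

```python
import math

# Three tokenized documents with 4, 2 and 1 tokens.
docs = [['这篇', '文章', '真', '好'], ['那篇', '论文'], ['这个']]

K1, B = 1.5, 0.75  # assumed BM25 parameters, inferred from the printed scores


def term_frequencies(docs):
    """Count how often each token occurs in each document."""
    return [{w: doc.count(w) for w in doc} for doc in docs]


def inverse_document_frequencies(docs):
    """BM25-style idf: log((N - n + 0.5) / (n + 0.5)), n = documents containing the token."""
    n_docs = len(docs)
    df = {}
    for doc in docs:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    return {w: math.log((n_docs - n + 0.5) / (n + 0.5)) for w, n in df.items()}


def bm25_scores(query, docs):
    """Score every document against the query tokens."""
    tf = term_frequencies(docs)
    idf = inverse_document_frequencies(docs)
    avgdl = sum(len(d) for d in docs) / len(docs)
    scores = []
    for doc, freqs in zip(docs, tf):
        norm = K1 * (1 - B + B * len(doc) / avgdl)  # length normalization
        score = 0
        for w in query:
            f = freqs.get(w, 0)
            if f:
                score += idf[w] * f * (K1 + 1) / (f + norm)
        scores.append(score)
    return scores


print(inverse_document_frequencies(docs)['文章'])  # approximately 0.5108
print(bm25_scores(['文章'], docs))                 # approximately [0.3866, 0, 0]
print(bm25_scores(['文章', '真'], docs))           # approximately [0.7731, 0, 0]
```

Every token occurs in exactly one of the three documents, which is why all seven idf values are identical, and the two-token query scores exactly twice the one-token query because both tokens sit in the same document with the same frequency.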

About training (word segmentation, POS tagging, sentiment analysis):

from snownlp import seg
seg.train('data.txt')
seg.save('seg.marshal')
# from snownlp import tag
# tag.train('199801.txt')
# tag.save('tag.marshal')
# from snownlp import sentiment
# sentiment.train('neg.txt', 'pos.txt')
# sentiment.save('sentiment.marshal')

PS: The trained model is saved as seg.marshal. To use it, change data_path in snownlp/seg/__init__.py to point to the newly trained file, or to wherever you keep your own trained model.
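The article does not describe what the segmentation training file should contain. A common convention for character-based segmenters, assumed here for data.txt, is one sentence per line with each character written as `character/tag`, where the tag is `b` (begins a word), `m` (middle), `e` (ends a word) or `s` (single-character word). The helper below is a hypothetical sketch that turns an already segmented sentence into such a line; check the sample data shipped with your SnowNLP installation before relying on this format.

```python
def to_char_tags(words):
    """Tag each character: s = single-character word, b/m/e = begin/middle/end."""
    tagged = []
    for word in words:
        if not word:
            continue  # skip empty tokens
        if len(word) == 1:
            tagged.append(word + '/s')
        else:
            tagged.append(word[0] + '/b')
            tagged.extend(ch + '/m' for ch in word[1:-1])
            tagged.append(word[-1] + '/e')
    return ' '.join(tagged)


# "Natural language processing is computer science", already split into words.
line = to_char_tags(['自然', '语言', '处理', '是', '计算机', '科学'])
print(line)
```

Writing one such line per training sentence would produce a file in the assumed format that could then be passed to `seg.train`.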

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
