Since I am using Python 3, I install the jieba word segmentation module with pip3:
    pip3 install jieba
Once the step above completes, the jieba word segmentation module should be installed. A simple test confirms that the installation succeeded:
    # -*- coding: utf-8 -*-

    # Import the jieba word segmentation module
    import jieba

    # Define the string
    s = u"What's the weather like in Hangzhou today?"

    # Call jieba's cut method
    cut = jieba.cut(s)

    # Output the results
    print('"Output"')
    print(cut)
    print(','.join(cut))
Let's take a look at the results:
The output shows that cut returns a generator, and the final segmentation result is: Hangzhou, today's weather, how, ?
This confirms that the jieba word segmentation module has been installed successfully.
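Because cut returns a generator, its tokens can be consumed only once: joining the same object a second time gives an empty string. The sketch below illustrates this with a stand-in generator, fake_cut (a hypothetical helper, not part of jieba), and assumed sample tokens, so it runs without jieba installed. With jieba itself the same pattern applies to the object returned by jieba.cut, and jieba.lcut returns a list directly.

```python
def fake_cut(tokens):
    """Hypothetical stand-in for jieba.cut: yields tokens lazily, one at a time."""
    for token in tokens:
        yield token


# Assumed sample tokens, used only for illustration.
sample = ["Hangzhou", "today's weather", "how", "?"]

gen = fake_cut(sample)
first = ",".join(gen)    # consumes the generator
second = ",".join(gen)   # the generator is now exhausted

# Materialize the tokens in a list when they are needed more than once.
tokens = list(fake_cut(sample))
again = ",".join(tokens)

print(repr(first))
print(repr(second))
print(again == first)
```

If the tokens are needed in several places, converting the generator to a list once is usually the simplest choice.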
Three modes of jieba word segmentation
Next, let's look at how to use jieba. It offers three segmentation modes: accurate mode, full mode, and search engine mode. The following example demonstrates each of them:
    # -*- coding: utf-8 -*-

    # Import the jieba word segmentation module
    import jieba

    # Define the string
    s = u"I want to drive my girlfriend to Hangzhou West Lake to visit and play on New Year's Day. How do I get to the route?"

    # Accurate mode
    print('Accurate mode:')
    print('|'.join(jieba.cut(s)))
    print('\n')

    # Full mode
    print('Full mode:')
    print('|'.join(jieba.cut(s, cut_all=True)))
    print('\n')

    # Search engine mode
    print('Search engine mode:')
    print('|'.join(jieba.cut_for_search(s)))
As the output shows, full mode produces more tokens and covers more of the possible words in the sentence.
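To make the difference between full mode and accurate mode concrete, here is a toy sketch. It is not jieba's implementation: the dictionary, its frequencies, and the function names full_mode and accurate_mode are all assumptions made up for illustration. It only loosely mirrors the idea described in jieba's README, where candidate words come from a dictionary and dynamic programming picks the most probable path by word frequency. Full mode lists every dictionary word found in the text, while accurate mode keeps one non-overlapping segmentation.

```python
import math

# Assumed toy dictionary with made-up frequencies (not jieba's real data).
FREQ = {"杭州": 50, "西湖": 40, "杭州西湖": 5, "游玩": 30}
TOTAL = sum(FREQ.values())
MAX_LEN = max(len(w) for w in FREQ)


def candidates(text, start):
    """Return the end index of every candidate word that begins at start.

    A single character is always a candidate, so a path always exists.
    """
    ends = [start + 1]
    for end in range(start + 2, min(len(text), start + MAX_LEN) + 1):
        if text[start:end] in FREQ:
            ends.append(end)
    return ends


def full_mode(text):
    """List every multi-character dictionary word found anywhere in the text."""
    return [text[i:j] for i in range(len(text))
            for j in candidates(text, i) if j - i > 1]


def accurate_mode(text):
    """Pick the single segmentation with the highest total log-probability."""
    n = len(text)
    # best[i] = (score of the best segmentation of text[i:], end of its first word)
    best = [(0.0, n)] * (n + 1)
    for i in range(n - 1, -1, -1):
        best[i] = max(
            (math.log(FREQ.get(text[i:j], 1) / TOTAL) + best[j][0], j)
            for j in candidates(text, i)
        )
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words


text = "杭州西湖游玩"
print("|".join(full_mode(text)))
print("|".join(accurate_mode(text)))
```

In this toy setup, full mode returns overlapping words such as 杭州, 杭州西湖, and 西湖, whereas accurate mode returns tokens that join back into the original text.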
Part-of-speech tagging with jieba
Every word has a part of speech, such as verb, adjective, or noun. jieba's posseg module returns the part of speech of each word, for example:
    # -*- coding: utf-8 -*-

    # Import jieba's part-of-speech module
    import jieba.posseg as psg

    # Define the string
    s = u"I want to drive my girlfriend to Hangzhou West Lake to visit and play on New Year's Day. How do I get to the route?"

    # Get the part of speech of each word
    print('Segmentation result:')
    print([(x.word, x.flag) for x in psg.cut(s)])
We now have the part of speech of each word, which is useful for further processing of the segmentation results. We can also keep only the words of a particular part of speech. For example, to keep only the nouns in the result, we can filter:
    # -*- coding: utf-8 -*-

    # Import jieba's part-of-speech module
    import jieba.posseg as psg

    # Define the string
    s = u"I want to drive my girlfriend to Hangzhou West Lake to visit and play on New Year's Day. How do I get to the route?"

    # Keep only the words whose part of speech is a noun (tags starting with 'n')
    print('Words tagged as nouns:')
    print([(x.word, x.flag) for x in psg.cut(s) if x.flag.startswith('n')])
The result is the set of words in our sentence that may be nouns. The part of speech that each tag stands for is listed below (sorted alphabetically by tag):
1. Adjectives (1 first-level class, 4 second-level classes)
a adjective
ad adverbial adjective
an nominal adjective
ag adjectival morpheme
al adjectival idiom
2. Distinguishing words (1 first-level class, 2 second-level classes)
b distinguishing word
bl distinguishing-word idiom
3. Conjunctions (1 first-level class, 1 second-level class)
c conjunction
cc coordinating conjunction
4. Adverbs (1 first-level class)
d adverb
5. Interjections (1 first-level class)
e interjection
6. Locality words (1 first-level class)
f locality word
7. Prefixes (1 first-level class)
h prefix
8. Suffixes (1 first-level class)
k suffix
9. Numerals (1 first-level class, 1 second-level class)
m numeral
mq numeral-classifier compound
10. Nouns (1 first-level class, 7 second-level classes, 5 third-level classes)
Nouns are divided into the following sub-categories:
n noun
nr personal name
nr1 Chinese surname
nr2 Chinese given name
nrj Japanese personal name
nrf transliterated personal name
ns place name
nsf transliterated place name
nt organization or group name
nz other proper noun
nl nominal idiom
ng nominal morpheme
11. Onomatopoeia (1 first-level class)
o onomatopoeia
12. Prepositions (1 first-level class, 2 second-level classes)
p preposition
pba the preposition 把 (ba)
pbei the preposition 被 (bei)
13. Measure words (1 first-level class, 2 second-level classes)
q measure word
qv verbal measure word
qt temporal measure word
14. Pronouns (1 first-level class, 4 second-level classes, 6 third-level classes)
r pronoun
rr personal pronoun
rz demonstrative pronoun
rzt temporal demonstrative pronoun
rzs locative demonstrative pronoun
rzv predicative demonstrative pronoun
ry interrogative pronoun
ryt temporal interrogative pronoun
rys locative interrogative pronoun
ryv predicative interrogative pronoun
rg pronominal morpheme
15. Place words (1 first-level class)
s place word
16. Time words (1 first-level class, 1 second-level class)
t time word
tg time-word morpheme
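17. Particles (1 first-level class, 15 second-level classes)
u particle
uzhe 着 (zhe)
ule 了, 喽 (le, lou)
uguo 过 (guo)
ude1 的, 底 (de, di)
ude2 地 (de)
ude3 得 (de)
usuo 所 (suo)
udeng 等, 等等, 云云 ("and so on")
uyy 一样, 一般, 似的, 般 ("like", "the same as")
udh 的话 ("if")
uls 来讲, 来说, 而言, 说来 ("as for", "speaking of")
uzhi 之 (zhi)
ulian 连 ("even", as in "even elementary school students can")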
18. Verbs (1 first-level class, 9 second-level classes)
v verb
vd adverbial verb
vn nominal verb
vshi the verb 是 ("to be")
vyou the verb 有 ("to have")
vf directional verb
vx formal verb
vi intransitive verb
vl verbal idiom
vg verbal morpheme
19. Punctuation (1 first-level class, 16 second-level classes)
w punctuation mark
wkz left bracket, full-width: （ 〔 ［ ｛ 《 【 〖 〈 half-width: ( [ { <
wky right bracket, full-width: ） 〕 ］ ｝ 》 】 〗 〉 half-width: ) ] } >
wyz left quotation mark, full-width: “ ‘ 『
wyy right quotation mark, full-width: ” ’ 』
wj full stop, full-width: 。
ww question mark, full-width: ？ half-width: ?
wt exclamation mark, full-width: ！ half-width: !
wd comma, full-width: ， half-width: ,
wf semicolon, full-width: ； half-width: ;
wn enumeration comma, full-width: 、
wm colon, full-width: ： half-width: :
ws ellipsis, full-width: …… …
wp dash, full-width: —— －－ ——－ half-width: --- ----
wb percent and per-mille signs, full-width: ％ ‰ half-width: %
wh unit symbol, full-width: ￥ ＄ ￡ ° ℃ half-width: $
20. Strings (1 first-level class, 2 second-level classes)
x string
xx non-morpheme character
xu URL
21. Modal particles (1 first-level class)
y modal particle (yg removed)
22. State words (1 first-level class)
z state word
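The tag list above lends itself to a lookup table. The sketch below is a minimal illustration, assuming a small hand-picked subset of the tags and two hypothetical helpers, describe_tag and group_by_top_level (neither is part of jieba). Because the tags are hierarchical (for example, nrf is a kind of nr, which is a kind of n), an unlisted tag falls back to its longest listed prefix. The sample pairs are assumed data standing in for the (x.word, x.flag) tuples built from psg.cut.

```python
from collections import defaultdict

# A small subset of the tag list above; extend it as needed.
TAG_NAMES = {
    "a": "adjective",
    "d": "adverb",
    "n": "noun",
    "nr": "personal name",
    "ns": "place name",
    "r": "pronoun",
    "v": "verb",
    "vn": "nominal verb",
    "w": "punctuation",
}


def describe_tag(flag):
    """Return the description of the longest known prefix of flag."""
    for end in range(len(flag), 0, -1):
        name = TAG_NAMES.get(flag[:end])
        if name is not None:
            return name
    return "unknown"


def group_by_top_level(pairs):
    """Group words by the first letter of their tag (the first-level class)."""
    groups = defaultdict(list)
    for word, flag in pairs:
        groups[flag[:1]].append(word)
    return dict(groups)


# Assumed sample (word, tag) pairs, used only for illustration.
pairs = [("Hangzhou", "ns"), ("West Lake", "ns"), ("visit", "v"),
         ("route", "n"), ("?", "w")]

for word, flag in pairs:
    print(word, flag, describe_tag(flag))
print(group_by_top_level(pairs))
```

Grouping by the first letter is the same idea as the startswith('n') filter used earlier, applied to every first-level class at once.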
Python 3.6: installing and using jieba word segmentation