import jieba
import numpy as np


# Open a dictionary file and return its entries as a list
def open_dict(Dict='hahah', path=r'/users/zhangzhenghai/downloads/textming/'):
    path = path + '%s.txt' % Dict
    words = []
    with open(path, 'r', encoding='utf-8') as dictionary:
        for word in dictionary:
            words.append(word.strip('\n'))
    return words


def judgeodd(num):
    if num % 2 == 0:
        return 'even'
    else:
        return 'odd'


deny_word = open_dict(Dict='Negative Words')
posdict = open_dict(Dict='Positive')
negdict = open_dict(Dict='Negative')
degree_word = open_dict(Dict='degree level words', path=r'/users/zhangzhenghai/downloads/textming/')

mostdict = degree_word[degree_word.index('extreme') + 1:degree_word.index('very')]  # weight 4: the emotional word's score is multiplied by 4
verydict = degree_word[degree_word.index('very') + 1:degree_word.index('more')]  # weight 3
moredict = degree_word[degree_word.index('more') + 1:degree_word.index('ish')]  # weight 2
ishdict = degree_word[degree_word.index('ish') + 1:degree_word.index('last')]  # weight 0.5


def sentiment_score_list(dataset):
    seg_sentence = dataset.split('。')
    count2 = []
    for sen in seg_sentence:  # loop through each sentence
        segtmp = jieba.lcut(sen, cut_all=False)  # segment the sentence into a list of words
        if not segtmp:  # skip empty segments, e.g. after a trailing '。'
            continue
        i = 0  # position of the word being scanned
        a = 0  # position just after the previous emotional word
        poscount = 0  # initial score of a positive word
        poscount2 = 0  # score after negation reverses it
        poscount3 = 0  # final positive score (including exclamation marks)
        negcount = 0
        negcount2 = 0
        negcount3 = 0
        for word in segtmp:
            if word in posdict:  # the word is a positive emotional word
                poscount += 1
                c = 0
                for w in segtmp[a:i]:  # scan the degree words before the emotional word
                    if w in mostdict:
                        poscount *= 4.0
                    elif w in verydict:
                        poscount *= 3.0
                    elif w in moredict:
                        poscount *= 2.0
                    elif w in ishdict:
                        poscount *= 0.5
                    elif w in deny_word:
                        c += 1
                if judgeodd(c) == 'odd':  # odd number of negation words before the emotional word
                    poscount *= -1.0
                    poscount2 += poscount
                    poscount = 0
                    poscount3 = poscount + poscount2 + poscount3
                    poscount2 = 0
                else:
                    poscount3 = poscount + poscount2 + poscount3
                    poscount = 0
                a = i + 1
            elif word in negdict:  # negative emotional word, same analysis as above
                negcount += 1
                d = 0
                for w in segtmp[a:i]:
                    if w in mostdict:
                        negcount *= 4.0
                    elif w in verydict:
                        negcount *= 3.0
                    elif w in moredict:
                        negcount *= 2.0
                    elif w in ishdict:
                        negcount *= 0.5
                    elif w in deny_word:  # count negation words, not degree words
                        d += 1
                if judgeodd(d) == 'odd':
                    negcount *= -1.0
                    negcount2 += negcount
                    negcount = 0
                    negcount3 = negcount + negcount2 + negcount3
                    negcount2 = 0
                else:
                    negcount3 = negcount + negcount2 + negcount3
                    negcount = 0
                a = i + 1
            elif word == '！' or word == '!':  # the sentence has an exclamation mark
                for w2 in segtmp[:i][::-1]:  # scan backward for an emotional word, add 2 to its score, then stop
                    if w2 in posdict:
                        poscount3 += 2
                        break
                    elif w2 in negdict:
                        negcount3 += 2
                        break
            i += 1

        # Prevent negative scores: a negative value on one side is moved to the other side
        pos_count = max(poscount3, 0) + max(-negcount3, 0)
        neg_count = max(negcount3, 0) + max(-poscount3, 0)

        count2.append([[pos_count, neg_count]])  # one [positive, negative] pair per sentence
    return count2


def sentiment_score(senti_score_list):
    score = []
    for review in senti_score_list:
        score_array = np.array(review)
        Pos = np.sum(score_array[:, 0])
        Neg = np.sum(score_array[:, 1])
        AvgPos = float('%.1f' % np.mean(score_array[:, 0]))
        AvgNeg = float('%.1f' % np.mean(score_array[:, 1]))
        StdPos = float('%.1f' % np.std(score_array[:, 0]))
        StdNeg = float('%.1f' % np.std(score_array[:, 1]))
        score.append([Pos, Neg, AvgPos, AvgNeg, StdPos, StdNeg])
    return score


# Sample reviews (jieba and the dictionaries target Chinese text)
data = 'use a few days to evaluate, mobile phone is not a card, play glory of what is not the problem, charging fast, the battery is big enough, play games can play a few hours, standby should be able to two or three days , great'
data2 = "do not know how to say, really do not like, the voice is small, the new phone to the phone unexpectedly stuck, the original plan to retreat, just a mobile phone fell, and no longer, the feeling will not love, pixel do not know is I do not understand or how to drop the feeling has not z11mini good, hey want me how to evaluate how I like Nubian I'm so disappointed."

print(sentiment_score(sentiment_score_list(data)))
print(sentiment_score(sentiment_score_list(data2)))
Introduction to sentiment analysis:
Sentiment analysis determines whether a sentence is a subjective or an objective description, and whether it expresses positive or negative emotion.
Principle
Take this comment as an example: "The picture on this phone is extremely good, and the operation is fairly smooth. But the camera is really too rotten! The system is not good either."
① Emotional words
The simplest and most basic way to decide whether a sentence is positive or negative is to find the emotional words in it. Positive emotional words include "praise", "good", "handy" and "gorgeous"; negative ones include "poor", "rotten", "bad" and "rip-off". Each positive word scores +1 and each negative word scores -1.
The example contains the positive words "good" and "smooth" and the negative word "rotten"; the "good" in "not good" is also counted at this stage. Its emotional score is therefore 1+1-1+1=2. This score is clearly unreasonable, so the next steps refine it.
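The naive counting described above can be sketched as follows. The two small word sets and the pre-segmented token list are hypothetical stand-ins for the real dictionaries and for the segmenter output; this only illustrates the +1/-1 rule.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "smooth"}
NEGATIVE = {"rotten"}


def naive_score(tokens):
    """Sum +1 for each positive word and -1 for each negative word."""
    score = 0
    for tok in tokens:
        if tok in POSITIVE:
            score += 1
        elif tok in NEGATIVE:
            score -= 1
    return score


# The example comment, already split into words (assumed segmentation).
tokens = ["picture", "extremely", "good", "operation", "fairly", "smooth",
          "camera", "too", "rotten", "system", "not", "good"]
print(naive_score(tokens))  # 1 + 1 - 1 + 1 = 2
```

The "good" after "not" still counts as +1 here, which is exactly why this first score is unreasonable.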
② Degree words
In the example, "good", "smooth" and "rotten" are each preceded by a degree modifier. "Extremely good" is stronger than "fairly good" or plain "good", and "too rotten" is much stronger than "a bit rotten". So after finding an emotional word, look backward for a degree modifier and give each degree a different weight: words such as "extremely", "incomparably" and "too" multiply the emotional score by 4; "fairly" and "passably" multiply it by 2; "only" and "merely" multiply it by 0.5. The emotional score of the example then becomes: 4*1+1*2-1*4+1=3
③ Exclamation marks
"Too rotten" is followed by an exclamation mark, and an exclamation mark signals strong emotion. An exclamation mark therefore adds 2 to the strength of the emotional word before it; since "rotten" is negative, this contributes -2. The emotional score of the example becomes: 4*1+1*2-1*4-2+1 = 1
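A minimal sketch of the degree weighting and the exclamation-mark adjustment described so far, assuming hypothetical mini-lexicons and a pre-segmented token list. The names `DEGREE` and `weighted_score` are illustrative and do not come from the original code.

```python
# Hypothetical mini-lexicons; the real dictionaries are much larger.
POSITIVE = {"good", "smooth"}
NEGATIVE = {"rotten"}
DEGREE = {"extremely": 4.0, "too": 4.0, "fairly": 2.0, "only": 0.5}


def weighted_score(tokens):
    """Signed score using degree weights and a +/-2 exclamation adjustment."""
    total = 0.0
    start = 0        # index just after the previous emotional word
    last_sign = 0.0  # sign of the most recent emotional word (0 if none yet)
    for i, tok in enumerate(tokens):
        if tok in POSITIVE or tok in NEGATIVE:
            sign = 1.0 if tok in POSITIVE else -1.0
            weight = 1.0
            for w in tokens[start:i]:  # degree words since the last emotional word
                weight *= DEGREE.get(w, 1.0)
            total += sign * weight
            last_sign = sign
            start = i + 1
        elif tok in ("!", "！"):
            total += 2.0 * last_sign  # strengthen the preceding emotional word
    return total


plain = ["picture", "extremely", "good", "operation", "fairly", "smooth",
         "camera", "too", "rotten", "system", "not", "good"]
with_mark = plain[:9] + ["!"] + plain[9:]
print(weighted_score(plain))      # 4 + 2 - 4 + 1 = 3.0
print(weighted_score(with_mark))  # 4 + 2 - 4 - 2 + 1 = 1.0
```

Negation is deliberately not handled in this sketch, so the "good" after "not" still contributes +1.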
④ Negation words
A careful reader sees at once that the last "good" does not mean good, because it is preceded by "not". So after finding an emotional word, also look backward for negation words such as "not" and "can't", and count how many there are. If the count is odd, the emotional score is multiplied by -1; if it is even, the emotion is not reversed and the score is multiplied by 1. In the example there is a single "not" before "good", so the emotional value of that "good" is reversed (*-1).
So the emotional score of the example is: 4*1+1*2-1*4-2+1*-1 = -1
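The odd/even rule for negation words can be sketched like this. The `NEGATION` set and the function name `apply_negation` are assumptions made for the example, not part of the original code.

```python
# Hypothetical negation lexicon, for illustration only.
NEGATION = {"not", "no", "can't"}


def apply_negation(value, preceding_tokens):
    """Flip the sign of value when an odd number of negation words precede it."""
    count = sum(1 for w in preceding_tokens if w in NEGATION)
    return -value if count % 2 == 1 else value


print(apply_negation(1.0, ["system", "not"]))    # one negation: -1.0
print(apply_negation(1.0, ["can't", "not"]))     # two negations cancel: 1.0
```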
⑤ Separate positive and negative scores
The example clearly contains both praise and criticism, so a single score cannot express its emotional tendency. A single score is also too sensitive to the chosen weights. The better treatment is to give the sentence a positive score and a negative score; the negative score is also stored as a positive number, so no negative numbers are used. Together they describe the emotional tendency of the sentence. For the example this gives "positive score: 6, negative score: 7".
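As a small illustration of the split, the signed contributions of the example (4 and 2 from the praise, -6 from "too rotten!" and -1 from "not good") can be separated into two non-negative totals. The function name `split_scores` is hypothetical.

```python
def split_scores(contributions):
    """Split signed contributions into (positive score, negative score)."""
    pos = sum(c for c in contributions if c > 0)
    neg = sum(-c for c in contributions if c < 0)  # stored as a positive number
    return pos, neg


print(split_scores([4, 2, -6, -1]))  # (6, 7)
```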
⑥ Scoring by clause
Going one step further, the emotional score of a comment is the sum of the scores of its clauses, so the score of each clause must be computed first. The example comment has four clauses, so its structure is as follows ([positive score, negative score]): [[4, 0], [2, 0], [0, 6], [0, 1]]
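Putting the rules together, the per-clause structure above can be reproduced with a short sketch. It assumes tiny hypothetical lexicons, English text split on punctuation, and a simple regular-expression tokenizer in place of jieba; it is an illustration of the idea, not the original implementation.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "smooth"}
NEGATIVE = {"rotten"}
DEGREE = {"extremely": 4.0, "too": 4.0, "fairly": 2.0}
NEGATION = {"not"}


def score_clause(clause):
    """Return [positive score, negative score] for one clause."""
    tokens = re.findall(r"[\w']+|[!！]", clause.lower())
    pos = neg = 0.0
    start = 0    # index just after the previous emotional word
    last = None  # which side ("pos" or "neg") the last emotional word went to
    for i, tok in enumerate(tokens):
        if tok in POSITIVE or tok in NEGATIVE:
            value = 1.0
            negations = 0
            for w in tokens[start:i]:
                value *= DEGREE.get(w, 1.0)
                if w in NEGATION:
                    negations += 1
            # An odd number of negations moves the word to the other side.
            is_positive = (tok in POSITIVE) != (negations % 2 == 1)
            if is_positive:
                pos += value
            else:
                neg += value
            last = "pos" if is_positive else "neg"
            start = i + 1
        elif tok in ("!", "！") and last is not None:
            if last == "pos":
                pos += 2.0
            else:
                neg += 2.0
    return [pos, neg]


def score_comment(comment):
    """Split a comment into clauses (each kept with its end mark) and score each."""
    clauses = re.findall(r"[^,.!?，。！？]+[,.!?，。！？]?", comment)
    return [score_clause(c) for c in clauses if c.strip()]


example = ("The picture on this phone is extremely good, and the operation "
           "is fairly smooth. But the camera is really too rotten! "
           "The system is not good either.")
print(score_comment(example))
# [[4.0, 0.0], [2.0, 0.0], [0.0, 6.0], [0.0, 1.0]]
```

Keeping the end mark with its clause is what lets the exclamation mark strengthen "rotten" in the third clause.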
The above is the main process of sentiment analysis with an emotion dictionary; the algorithm design follows this idea.
Algorithm design
Step 1: Read the review data and split each comment into clauses.
Step 2: Find the emotional words in each clause; record whether each is positive or negative, and its position.
Step 3: Look backward from each emotional word for a degree word, and stop searching once one is found. Multiply the emotional value by the weight of the degree word.
Step 4: Look backward from each emotional word for negation words and count them. If the count is odd, multiply by -1; if it is even, multiply by 1.
Step 5: Check whether the clause ends with an exclamation mark. If it does, look backward for an emotional word and, if one is found, add 2 to the corresponding emotional value.
Step 6: Compute the emotional values of all clauses in a comment and record them in an array (list).
Step 7: Compute and record the emotional values of all comments.
Step 8: For each comment, compute by clause the positive mean, negative mean, positive variance and negative variance.
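The last step can be sketched with the standard library alone. This is an assumption-laden illustration: `summarize` is a hypothetical name, and `statistics.pstdev` (population standard deviation) is used here as a measure of spread because it corresponds to the default behaviour of `np.std`; replace it with `statistics.pvariance` if the variance itself is wanted.

```python
from statistics import mean, pstdev


def summarize(clause_scores):
    """Return [pos_sum, neg_sum, pos_mean, neg_mean, pos_std, neg_std] for one comment."""
    pos = [p for p, _ in clause_scores]
    neg = [n for _, n in clause_scores]
    return [
        sum(pos), sum(neg),
        round(mean(pos), 1), round(mean(neg), 1),
        round(pstdev(pos), 1), round(pstdev(neg), 1),
    ]


# Per-clause [positive, negative] scores of the example comment.
summary = summarize([[4, 0], [2, 0], [0, 6], [0, 1]])
print(summary[:2])  # totals: [6, 7]
print(summary[2:])  # means and standard deviations, rounded to one decimal
```

The function expects at least one clause; an empty list would make `mean` raise `StatisticsError`, so empty comments should be filtered out beforehand.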
Reposted from: https://zhuanlan.zhihu.com/p/23225934
The original author provides a download link: https://pan.baidu.com/s/1jirooxK Password: 6wq4
Reposted here to keep for later use. In testing, part of the code turned out not to be very robust (the program raises an error when the comment text is slightly longer) and needs to be hardened.
[Repost] Simple text sentiment analysis with Python