Natural Language Processing with Python
Chapter 6.2
import nltk
from nltk.corpus import nps_chat as nchat

def dialogue_act_features(post):
    # One boolean "contains(word)" feature per lowercased token in the post.
    features = {}
    for word in nltk.word_tokenize(post):
        features['contains(%s)' % word.lower()] = True
    return features

def test_dialogue_act_types():
    # Use the first 10,000 posts of the NPS Chat corpus.
    posts = nchat.xml_posts()[:10000]
    featuresets = [(dialogue_act_features(post.text), post.get('class'))
                   for post in posts]
    # Hold out the first 10% for testing and train on the rest.
    size = int(len(featuresets) * 0.1)
    train_set, test_set = featuresets[size:], featuresets[:size]
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(nltk.classify.accuracy(classifier, test_set))
    classifier.show_most_informative_features(5)
Output:
0.668
Most Informative Features
    contains(hi) = True       Greet : System = 408.2 : 1
    contains(>) = True        Other : System = 384.6 : 1
    contains(empty) = True    Other : System = 339.4 : 1
    contains(part) = True    System : Statem = 302 : 1
    contains(no) = True      nAnswe : System = 262.3 : 1
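Each line of the feature list reads as a likelihood ratio: how much more often the feature occurs with the first label than with the second. The sketch below shows roughly how such a ratio can be computed from counts. The function name likelihood_ratio, the counts, and the additive smoothing constant with two bins (feature present or absent) are assumptions made for illustration, not a description of NLTK's internals.

```python
def likelihood_ratio(count_a, total_a, count_b, total_b, smoothing=0.5):
    """Return P(feature | label A) / P(feature | label B).

    Additive smoothing keeps the ratio finite when the feature was
    never seen with label B. Two bins are assumed: present or absent.
    """
    p_a = (count_a + smoothing) / (total_a + 2 * smoothing)
    p_b = (count_b + smoothing) / (total_b + 2 * smoothing)
    return p_a / p_b

# Hypothetical counts: the feature appears in 80 of 100 posts with label A
# and in 0 of 500 posts with label B.
ratio = likelihood_ratio(count_a=80, total_a=100, count_b=0, total_b=500)
print("feature = True   A : B = %.1f : 1" % ratio)
```

A feature that is common under one label and almost absent under another produces a large ratio, which is why a greeting word separates the Greet and System labels so sharply.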
Identifying Dialogue Act Type
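To try the feature extraction and the 10% hold-out split without downloading the NPS Chat corpus, a minimal sketch follows. It replaces nltk.word_tokenize with plain whitespace splitting, and the posts and labels in labeled_posts are invented for illustration, so the results are not comparable to the corpus run.

```python
def dialogue_act_features(post):
    # One boolean "contains(word)" feature per lowercased token.
    # str.split() is a stand-in for nltk.word_tokenize (an assumption).
    features = {}
    for word in post.split():
        features['contains(%s)' % word.lower()] = True
    return features

# Hypothetical labeled posts, not taken from the corpus.
labeled_posts = [
    ("hi everyone", "Greet"),
    ("User12 has left the room", "System"),
    ("no thanks", "nAnswer"),
    ("hello there", "Greet"),
    ("I like this song", "Statement"),
    ("User7 has joined", "System"),
    ("no way", "nAnswer"),
    ("hi all", "Greet"),
    ("it is raining here", "Statement"),
    ("User3 has left the room", "System"),
]

featuresets = [(dialogue_act_features(text), label)
               for text, label in labeled_posts]

# Hold out the first 10% for testing and train on the rest.
size = int(len(featuresets) * 0.1)
train_set, test_set = featuresets[size:], featuresets[:size]

print(featuresets[0])
print(len(train_set), len(test_set))
```

The resulting train_set has the (feature dict, label) shape that nltk.NaiveBayesClassifier.train expects, so it can be passed in directly once NLTK is installed.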