Playing with word clouds in Python

Source: Internet
Author: User

Step 1: Import the required libraries:

    # coding: utf-8
    __author__ = 'Administrator'

    import jieba    # word segmentation package
    import numpy    # NumPy numerical package
    import codecs   # codecs.open lets you specify the file's encoding; text is decoded to Unicode as it is read
    import pandas
    import matplotlib.pyplot as plt
    %matplotlib inline
    # (the line above is an IPython/Jupyter magic; omit it in a plain script)
    from wordcloud import WordCloud  # word cloud package
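The comment on codecs says the file is decoded to Unicode as it is read. A minimal sketch of that behaviour, using a throwaway temporary file (the file name and sample text are made up for illustration); on Python 3 the built-in open with an encoding argument does the same job:

```python
import codecs
import os
import tempfile

# Hypothetical sample text; any UTF-8 content works.
sample = u"词云 word cloud"

# Write the sample as raw UTF-8 bytes to a temporary file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.txt")
with open(path, "wb") as f:
    f.write(sample.encode("utf-8"))

# codecs.open decodes the bytes to Unicode text while reading.
with codecs.open(path, "r", "utf-8") as f:
    via_codecs = f.read()

# On Python 3 the built-in open accepts the encoding directly.
with open(path, "r", encoding="utf-8") as f:
    via_builtin = f.read()

print(via_codecs == via_builtin == sample)
```

Either form works for the steps that follow; what matters is that the encoding passed in matches how the text file was saved.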

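Step 2: Read the source TXT file and segment it into words:

    # File names are kept as given in the source; substitute your own paths.
    file = codecs.open(u"journey to the.txt", 'r', 'utf-8')
    content = file.read()
    file.close()

    # Load a custom dictionary so jieba recognizes domain-specific words.
    jieba.load_userdict(u"red mansions participle.txt")

    segment = []
    segs = jieba.cut(content)
    for seg in segs:
        # Keep tokens longer than one character and skip line breaks.
        if len(seg) > 1 and seg != '\r\n':
            segment.append(seg)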
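The loop keeps only tokens longer than one character and drops line breaks. To see that filter on its own, here is a small sketch applied to a hand-written token list standing in for the output of jieba.cut (the tokens are invented for illustration and are not real jieba output):

```python
# Invented tokens standing in for what a segmenter might yield.
tokens = [u"词云", u"的", u"\r\n", u"分词", u"，", u"统计"]


def keep(token):
    """Same test as the loop: longer than one character and not a line break."""
    return len(token) > 1 and token != u"\r\n"


segment = [t for t in tokens if keep(t)]
print(segment)
```

The explicit comparison with '\r\n' is needed because a Windows line break is two characters long and would otherwise pass the length test.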

Step 3: Put the segmentation results in a DataFrame and remove stop words:

    segmentDF = pandas.DataFrame({'segment': segment})
    segmentDF.head()

    # quoting=3 means no character is treated as a quote.
    stopwords = pandas.read_csv("Stopwords.txt", index_col=False, quoting=3, sep="\t", names=['stopword'])
    stopwords.head()

    # Drop every row whose word appears in the stop-word file.
    segmentDF = segmentDF[~segmentDF.segment.isin(stopwords.stopword)]

    # Additional hand-picked stop words (shown here as translated in the source).
    wyStopWords = pandas.Series(['of', 'its', 'or', 'also', 'Square', 'in', 'that', 'both', 'because', 'still', 'therefore', 'still', '?', 'the', 'of the', 'the', 'a',
                                 'No', ' is', 'Yes', '?',
                                 ' .', 'Ah', 'put', 'Let', 'to', 'towards', 'is a', 'in the', 'more', 'again',
                                 'more', 'than', 'very', 'Partial', "Don't", 'Good', 'can be', 'will', 'just', 'but', 'son', 'and', 'also', 'All', "I'm", 'his', 'come to', '" "'])
    segmentDF = segmentDF[~segmentDF.segment.isin(wyStopWords)]
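The quoting=3 argument is the numeric value of csv.QUOTE_NONE in Python's standard csv module, which tells the parser to treat quote characters as ordinary text. That matters for a stop-word list because a quotation mark can itself be a stop word. A small sketch using only the standard library, with made-up stop-word lines and a plain list filter standing in for the ~isin(...) selection:

```python
import csv
import io

# csv.QUOTE_NONE is the constant behind quoting=3.
quote_none_value = csv.QUOTE_NONE

# Made-up stop-word file content: one entry per line, one of them a bare quote mark.
raw = u'的\n"\n了\n'

# With QUOTE_NONE each line is read literally, so the lone quote mark survives.
reader = csv.reader(io.StringIO(raw), delimiter="\t", quoting=csv.QUOTE_NONE)
stopwords = [row[0] for row in reader if row]

# Plain-Python counterpart of segmentDF[~segmentDF.segment.isin(...)].
words = [u"词云", u"的", u"分词", u'"', u"了"]
stop_set = set(stopwords)
filtered = [w for w in words if w not in stop_set]

print(quote_none_value, stopwords, filtered)
```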

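Step 4: Count word frequencies:

    # Count occurrences of each word, then sort from most to least frequent.
    # (Older pandas versions wrote this as .agg({"count": numpy.size}) followed by .sort(columns="count").)
    segStat = segmentDF.groupby(by=['segment'])['segment'].size().reset_index(name="count")
    segStat = segStat.sort_values(by="count", ascending=False)
    segStat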
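The result of this step is one row per distinct word with its count, ordered from most to least frequent. For a quick cross-check without pandas, collections.Counter from the standard library produces the same tallies; the word list below is invented for illustration:

```python
from collections import Counter

# Invented word list standing in for the filtered segments.
words = [u"词云", u"分词", u"词云", u"统计", u"词云", u"分词"]

counts = Counter(words)

# most_common() returns (word, count) pairs sorted by descending count.
ranked = counts.most_common()
print(ranked)
```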

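Step 5: Display the word cloud

    wordcloud = WordCloud(font_path="simhei.ttf", background_color="black")
    # n is the number of most frequent words to draw (the value is garbled in the source).
    # Current wordcloud releases expect a dict of word -> frequency, hence the dict(...) wrapper.
    wordcloud = wordcloud.fit_words(dict(segStat.head(n).itertuples(index=False)))
    plt.imshow(wordcloud)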
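The fit_words method takes word frequencies; current releases of the wordcloud package document that argument as a dict mapping each word to its frequency. A sketch of shaping (word, count) rows into that form, without calling wordcloud itself (the rows and the cutoff top_n are illustrative values, not taken from the article):

```python
# Illustrative (word, count) rows, already sorted by descending count.
rows = [(u"词云", 3), (u"分词", 2), (u"统计", 1)]

# Hypothetical cutoff: keep only the top_n most frequent words.
top_n = 2

# dict() accepts any iterable of 2-item pairs, which is what
# DataFrame.itertuples(index=False) yields for a two-column frame.
frequencies = dict(rows[:top_n])
print(frequencies)
```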

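Step 6: Customize the shape of the word cloud

    import matplotlib.pyplot as plt
    from wordcloud import WordCloud, ImageColorGenerator

    # The source used scipy.misc.imread, which was removed in SciPy 1.2;
    # matplotlib's imread reads the image into an array as well.
    bimg = plt.imread('3.jPG')

    # The mask image defines the shape; a raw string keeps the backslashes in the Windows path literal.
    wordcloud = WordCloud(background_color="white", mask=bimg, font_path=r'C:\Windows\Fonts\simhei.ttf')
    wordcloud = wordcloud.fit_words(dict(segStat.head(39769).itertuples(index=False)))

    # Recolor the words with the colors of the mask image.
    bimgColors = ImageColorGenerator(bimg)
    plt.axis("off")
    plt.imshow(wordcloud.recolor(color_func=bimgColors))
    plt.show()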
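The mask argument controls where words may be placed: according to the wordcloud documentation, pure white pixels are masked out and every other pixel is available for drawing. If no suitable image is at hand, a mask can also be built as an array. A sketch, assuming NumPy is available, that builds a circular mask (the size and radius are arbitrary choices for illustration):

```python
import numpy as np

size = 200    # width and height of the mask in pixels (arbitrary)
radius = 90   # radius of the drawable circle (arbitrary)
center = size // 2

# Open grids of row and column indices that broadcast to size x size.
yy, xx = np.ogrid[:size, :size]

# True for every pixel that lies outside the circle.
outside = (xx - center) ** 2 + (yy - center) ** 2 > radius ** 2

# White (255) outside the circle is masked out; 0 inside is drawable.
mask = np.where(outside, 255, 0).astype(np.uint8)
print(mask.shape, mask[0, 0], mask[center, center])
```

An array like this can be passed as mask=mask to WordCloud in place of the image read from disk. Coloring with ImageColorGenerator still needs a real color image.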
