Final exam review has kept me fairly busy, so I will write a dedicated post on using the Scrapy framework later. Today I will show how to generate a word cloud with Python. There are plenty of word cloud generators on the Web, but isn't it more satisfying to write your own in Python?
Today's word cloud is built from the lyrics of inspirational songs. I found 20 of them on Baidu Library, such as "Stubborn" and "The Sky Is", songs everybody is familiar with.
The Python libraries used are jieba (a Chinese word segmenter), wordcloud, Matplotlib, PIL, and NumPy.
The first thing we need to do is read the lyrics. I saved them in a text file of inspirational song lyrics in the same directory as the script.
Now let's read it:
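# encoding=gbk
lyric = ''
# Read the whole file in one call. Looping over f and calling f.read()
# inside the loop would consume the first line before read() runs.
with open('./inspirational song lyrics.txt', 'r') as f:
    lyric = f.read()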
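The #encoding=gbk line declares the encoding of the script itself. It is there to prevent a later error, SyntaxError: Non-UTF-8 code starting with '\xc0', which Python raises when a script saved as GBK contains Chinese characters and has no encoding declaration.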
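The declaration above only covers the script; the lyrics file has its own encoding, which open() otherwise takes from the system default. As a minimal sketch, assuming the lyrics file is saved as GBK, the encoding can be passed explicitly. The helper name read_lyrics and the temporary sample file are hypothetical and only there so the snippet runs on its own.

```python
import os
import tempfile


def read_lyrics(path, encoding="gbk"):
    """Return the whole file as one string.

    The default encoding is an assumption; pass whatever encoding
    the lyrics file was actually saved in.
    """
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Demo: a temporary GBK-encoded file stands in for the real lyrics file.
sample = "第一行\n第二行\n"
with tempfile.NamedTemporaryFile("w", encoding="gbk", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

lyric = read_lyrics(tmp_path)
os.remove(tmp_path)
print(lyric)
```

Reading with a single f.read() call also keeps the first line of the file, which a loop over the file object would have consumed.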
Next we use jieba to segment the lyrics and extract the most prominent words. jieba.analyse.textrank ranks keywords by their TextRank weight; here we keep the top 50 together with their weights.
import jieba.analyse

# Top 50 keywords as (word, weight) pairs
result = jieba.analyse.textrank(lyric, topK=50, withWeight=True)
keywords = dict()
for i in result:
    keywords[i[0]] = i[1]
print(keywords)
The result is a dictionary that maps each keyword to its weight.
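With withWeight=True, jieba.analyse.textrank returns a list of (word, weight) pairs, so the loop can also be written with dict(). The snippet below is only a sketch: the sample pairs are invented for illustration and are not the real output for these lyrics.

```python
# Hypothetical sample standing in for
# jieba.analyse.textrank(lyric, topK=50, withWeight=True);
# the words and weights are made up for illustration.
result = [("梦想", 1.0), ("世界", 0.82), ("坚持", 0.57)]

# dict() accepts an iterable of (key, value) pairs directly.
keywords = dict(result)

# List the keywords from the highest weight to the lowest.
ranked = sorted(keywords.items(), key=lambda kv: kv[1], reverse=True)
for word, weight in ranked:
    print(f"{word}\t{weight:.2f}")
```

A dictionary in this word-to-weight form is what generate_from_frequencies expects in the next step.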
Then we can generate the word cloud with wordcloud and the other libraries.
First, find a picture to use as the shape of the word cloud.
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud, ImageColorGenerator

# Load the picture and turn it into an array to use as the mask
image = Image.open('./tim.jpg')
graph = np.array(image)
# A font with Chinese glyphs is needed, otherwise the words cannot be rendered
wc = WordCloud(font_path='./fonts/simhei.ttf', background_color='white', max_words=50, mask=graph)
wc.generate_from_frequencies(keywords)
# Colour the words using the colours of the picture
image_color = ImageColorGenerator(graph)
plt.imshow(wc.recolor(color_func=image_color))
plt.axis("off")
plt.show()
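wordcloud treats pure white pixels in the mask as blocked and places words only on the remaining area, which is why the picture decides the shape. As a sketch of that idea, the following builds a circular mask with NumPy instead of loading a picture. The helper name circle_mask and the 400-pixel size are illustrative choices, not part of the original code.

```python
import numpy as np


def circle_mask(size=400):
    """Return a size x size uint8 array: 0 inside a centred circle, 255 outside."""
    y, x = np.ogrid[:size, :size]
    centre = (size - 1) / 2
    radius = size / 2
    inside = (x - centre) ** 2 + (y - centre) ** 2 <= radius ** 2
    mask = np.full((size, size), 255, dtype=np.uint8)  # start fully blocked (white)
    mask[inside] = 0  # words may be placed inside the circle
    return mask


graph = circle_mask()
print(graph.shape, graph[200, 200], graph[0, 0])
```

An array like this can be passed as mask=graph in place of np.array(image). The ImageColorGenerator step is meant for colour pictures, so it would probably need to be skipped with a single-channel mask like this one.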
Save the generated picture:
wc.to_file('Dream.png')
Full code:
# encoding=gbk
import jieba.analyse
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud, ImageColorGenerator

# Read the lyrics in one call so the first line is not lost
with open('./inspirational song lyrics.txt', 'r') as f:
    lyric = f.read()

# Extract the top 50 keywords with their TextRank weights
result = jieba.analyse.textrank(lyric, topK=50, withWeight=True)
keywords = dict()
for i in result:
    keywords[i[0]] = i[1]
print(keywords)

# Use the picture as the mask that shapes the word cloud
image = Image.open('./tim.jpg')
graph = np.array(image)
wc = WordCloud(font_path='./fonts/simhei.ttf', background_color='white', max_words=50, mask=graph)
wc.generate_from_frequencies(keywords)
image_color = ImageColorGenerator(graph)
plt.imshow(wc.recolor(color_func=image_color))
plt.axis("off")
plt.show()
wc.to_file('Dream.png')