Making a word cloud in Python (wordcloud)

Source: Internet
Author: User

1. Installation

One tutorial says to download the matching wordcloud package from [here][1] and then pip-install it from that directory. In fact, you can simply run:

    pip install wordcloud

Once that finishes, open Python and check that `import wordcloud` succeeds.
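The check can also be scripted. The following is a minimal sketch using only the standard library; the helper name `is_installed` is made up for this illustration and is not part of wordcloud.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("wordcloud"):
        print("wordcloud is installed")
    else:
        print("wordcloud is missing; run: pip install wordcloud")
```

Unlike a bare `import`, this does not raise an exception when the package is absent, which is convenient in setup scripts.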

2. Brief description of the documentation

The documentation lists three main items; this section mainly covers the WordCloud class and its related methods.

    1. WordCloud()
class wordcloud.WordCloud(font_path=None, width=400, height=200, margin=2, ranks_only=None, prefer_horizontal=0.9, mask=None, scale=1, color_func=None, max_words=200, min_font_size=4, stopwords=None, random_state=None, background_color='black', max_font_size=None, font_step=1, mode='RGB', relative_scaling=0.5, regexp=None, collocations=True, colormap=None, normalize_plurals=True)

font_path: path to the font file. For Chinese text you must point this at a font that contains Chinese glyphs.
prefer_horizontal: float. The ratio of attempts to fit a word horizontally rather than vertically. If it is less than 1, the algorithm tries rotating a word when it does not fit horizontally. In effect it controls the balance between horizontal and vertical words in the cloud.
mask: nd-array or None (default=None). Controls the shape of the word cloud. If None, the width and height parameters are used; otherwise the mask is used as the background shape.
scale: scaling factor applied to the output picture.
max_words: maximum number of words displayed.
stopwords: words to exclude from the cloud.
relative_scaling: float. This one is interesting: with relative_scaling=0 the font size depends only on the rank (order) of the words; with relative_scaling=1 the font size is proportional to word frequency, so a word that is twice as frequent is drawn twice as large. The default of 0.5 mixes the two.
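To make relative_scaling more concrete, here is a simplified sketch of the documented behavior. It is not the library's layout code; the function `next_font_size` and its parameters are made up for this illustration, and it assumes the size of each word is derived from the size of the previous, more frequent word.

```python
def next_font_size(font_size, freq, last_freq, relative_scaling):
    """Simplified model of how relative_scaling shrinks the font between words.

    font_size: size used for the previous (more frequent) word
    freq, last_freq: normalized frequencies of the current and previous word
    """
    rs = relative_scaling
    # Blend "keep the previous size" (rs=0) with "scale by frequency ratio" (rs=1).
    factor = rs * (freq / last_freq) + (1 - rs)
    return int(round(factor * font_size))


# A word half as frequent as the previous one, which was drawn at size 100.
for rs in (0, 0.5, 1):
    print(rs, next_font_size(100, freq=0.5, last_freq=1.0, relative_scaling=rs))
```

Under this model the sizes come out as 100, 75 and 50: with 0 the frequency has no direct effect, with 1 the size halves together with the frequency.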

    2. Related functions

fit_words(frequencies): generates a word cloud from words and their frequencies.
generate(text): generates a word cloud directly from text. It splits on word boundaries, so it only works for space-delimited text such as English.
generate_from_frequencies(frequencies, max_font_size=None): generates a word cloud from words and their frequencies; you can also specify the maximum font size.
generate_from_text(text): generates a word cloud directly from text. English-style text only.
process_text(text): splits the text into words, removes the stop words, and returns a dict mapping each word to its count (word -> int). Limited to English-style text.
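The "English only" restriction comes from how the text is split into words. The sketch below imitates the idea with the standard library. It is an illustration, not the library's implementation: `count_words` is a made-up name, and the regular expression is an assumption modeled on a typical word pattern.

```python
import re
from collections import Counter


def count_words(text, stopwords=()):
    """Split text into words, drop stop words, and count what is left."""
    stop = {w.lower() for w in stopwords}
    # A word here is a run of two or more word characters or apostrophes.
    words = re.findall(r"\w[\w']+", text)
    kept = [w for w in words if w.lower() not in stop]
    return dict(Counter(kept))


# Space-delimited text is split into individual words.
print(count_words("the cat and the hat and the bat", stopwords={"the", "and"}))

# Chinese has no spaces, so the whole sentence is treated as one "word".
print(count_words("今天天气很好"))
```

This is why Chinese text has to be segmented first (for example with jieba) and then joined with spaces before being passed to a text-based generator.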

About the fit_words parameter:

The documentation tells us to pass tuples that each contain a word and its frequency, but when I actually did that I got an argument error. Let's take a look at the source code:

    def fit_words(self, frequencies):
        """Create a word_cloud from words and frequencies.

        Alias to generate_from_frequencies.

        Parameters
        ----------
        frequencies : array of tuples
            A tuple contains the word and its frequency.

        Returns
        -------
        self
        """
        return self.generate_from_frequencies(frequencies)

The parameter description still says "A tuple contains the word and its frequency", yet the function simply calls self.generate_from_frequencies(frequencies):

    def generate_from_frequencies(self, frequencies, max_font_size=None):
        """Create a word_cloud from words and frequencies.

        Parameters
        ----------
        frequencies : dict from string to float
            A contains words and associated frequency.

        max_font_size : int
            Use this font-size instead of self.max_font_size

        Returns
        -------
        self

        """
        # make sure frequencies are sorted and normalized
        frequencies = sorted(frequencies.items(), key=item1, reverse=True)
        frequencies = frequencies[:self.max_words]
        # largest entry will be 1
        max_frequency = float(frequencies[0][1])

        frequencies = [(word, freq / max_frequency)
                       for word, freq in frequencies]

This time the parameter is documented as "dict from string to float". The list comprehension inside the function builds a new frequencies value, and that new value is an array of tuples. So we still have to pass a dictionary; it only becomes an array of tuples inside the function. Has the fit_words docstring simply gone unfixed for years?
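The dict requirement is easy to reproduce without the library. The sketch below copies only the sorting and normalization steps quoted above into a standalone function; `normalize_frequencies` is a made-up name for this illustration, and `itemgetter(1)` is assumed to play the role of `item1` in the source.

```python
from operator import itemgetter


def normalize_frequencies(frequencies, max_words=200):
    """Sort a word -> count dict by count, keep the top entries, scale the largest to 1."""
    ranked = sorted(frequencies.items(), key=itemgetter(1), reverse=True)
    ranked = ranked[:max_words]
    max_frequency = float(ranked[0][1])
    return [(word, freq / max_frequency) for word, freq in ranked]


pairs = [("python", 4), ("cloud", 2), ("word", 1)]

try:
    normalize_frequencies(pairs)  # a list has no .items() method
except AttributeError as exc:
    print("list of tuples fails:", exc)

# Converting the pairs to a dict first makes the same data acceptable.
print(normalize_frequencies(dict(pairs)))
```

If your data is already a list of (word, frequency) tuples, wrapping it in `dict(...)` before calling fit_words is the simplest workaround.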

3. Example: generating a word cloud from education-experience data

    # coding=utf-8
    # Import the wordcloud and matplotlib modules
    from wordcloud import WordCloud, ImageColorGenerator
    import matplotlib.pyplot as plt
    import numpy as np
    from PIL import Image  # replaces scipy.misc.imread, which newer SciPy releases no longer provide
    import jieba
    import jieba.analyse

    # data2 is a DataFrame prepared earlier
    content = ",".join(data2['education experience'].values.tolist())
    # Keep the top 200 keywords (value assumed)
    tags = jieba.analyse.extract_tags(content, topK=200, withWeight=False)
    text = " ".join(tags)
    print(tags)

    # Read the background image
    bj_pic = np.array(Image.open('1.png'))

    # Generate the word cloud
    # (fonts usually live under C:\Windows\Fonts; you can also download one yourself)
    font = r'C:\windows\fonts\stfangso.ttf'  # without this font, Chinese shows up as box-shaped garbage characters
    wordcloud = WordCloud(mask=bj_pic, background_color='white',
                          font_path=font, scale=0.5).generate_from_text(text)  # generate directly from text
    plt.imshow(wordcloud)
    plt.axis('off')
    plt.show()
    wordcloud.to_file('test2.jpg')


You can try adding stop words, for example "university", or remove the unwanted words directly from tags. The picture comes out very small because scale=0.5.
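Removing unwanted words from tags takes only a few lines. This is a minimal sketch with made-up sample data; `drop_unwanted` is a hypothetical helper, and the tag values stand in for whatever jieba returns.

```python
def drop_unwanted(tags, unwanted):
    """Return tags in their original order, skipping any word listed in unwanted."""
    blocked = set(unwanted)
    return [tag for tag in tags if tag not in blocked]


# Sample keywords standing in for the output of the keyword extraction step.
tags = ["university", "bachelor", "engineering", "master"]
text = " ".join(drop_unwanted(tags, {"university"}))
print(text)
```

The other option mentioned above is to pass the same words through the stopwords parameter of WordCloud when generating from text.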

I also tried fit_words: the argument must be passed as a dictionary!

    wordcloud = WordCloud(mask=bj_pic, background_color='white',
                          font_path=font, scale=3.5).fit_words({"SB": 3, "I Am": 4, "Damn": 10})
    plt.imshow(wordcloud)
    plt.axis('off')
    plt.show()
