Python Word Cloud (Chinese/English): A Simple Introductory Tutorial for Beginners

Source: Internet
Author: User

1. Analysis

Building a word cloud requires:

    • Raw material: an article or other text content
    • Word segmentation of that content
    • A tool that builds the word cloud from the segmented content
    • Saving the result as a picture

2. The main modules needed

    • jieba: Chinese word segmentation
    • wordcloud: builds the word cloud

3. How the modules work

How wordcloud works

    • Text preprocessing
    • Word frequency statistics
    • Rendering the high-frequency words, in color, as a picture
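The word frequency step above can be sketched with the standard library alone. This is a simplified illustration, not wordcloud's actual implementation: the helper names word_frequencies and font_sizes are hypothetical, and the linear mapping from count to font size is an assumption made for clarity.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def font_sizes(freqs, max_size=100, min_size=10):
    """Map each count to a font size proportional to the highest count.

    Assumption: plain linear scaling, clamped to min_size.
    """
    top = max(freqs.values())
    return {
        word: max(min_size, round(max_size * count / top))
        for word, count in freqs.items()
    }


sample = "cloud word cloud python cloud word"
freqs = word_frequencies(sample)
sizes = font_sizes(freqs)

# Print the words from most to least frequent with their assumed sizes.
for word, count in freqs.most_common():
    print(word, count, sizes[word])
```

The more often a word appears, the larger it is drawn; the real library additionally handles layout, colors, and its relative_scaling option.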

How jieba works

    • Chinese word segmentation (with multiple modes)

4. English word cloud

English words are already separated by spaces, so both segmentation and building the word cloud need only the wordcloud module.

The implementation is as follows:

    from wordcloud import WordCloud

    string = ('Importance of relative word frequencies for font-size. '
              'With relative_scaling=0, only word-ranks are considered. '
              'With relative_scaling=1, a word that is twice as frequent '
              'will have twice the size. If you want to consider the word '
              'frequencies and not only their rank, relative_scaling around '
              '.5 often looks good.')
    font = r'C:\Windows\Fonts\FZSTK.TTF'
    wc = WordCloud(
        font_path=font,  # required for Chinese text; otherwise characters render as boxes
        background_color='white',
        width=1000,
        height=800,
    ).generate(string)
    wc.to_file('Ss.png')  # save the picture

5. Chinese word segmentation

The implementation is as follows:

    import jieba

    # text is the string or sentence you want to segment; define it first
    cut = jieba.cut(text)
    result = ' '.join(cut)  # join the separated words with spaces
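Joining the segments with spaces matters because written Chinese has no spaces between words. The sketch below uses a hand-written list as a stand-in for jieba's output and a plain whitespace split as a simplified stand-in for a word-cloud tokenizer; both are assumptions for illustration only.

```python
# Hand-written stand-in for what a segmenter such as jieba might return.
segments = ["我", "来到", "北京", "清华大学"]

raw = "".join(segments)      # the original, unsegmented sentence
joined = " ".join(segments)  # segments separated by spaces

# A whitespace-based tokenizer sees the raw sentence as a single "word",
# but recovers every segment from the space-joined string.
raw_tokens = raw.split()
joined_tokens = joined.split()

print(len(raw_tokens), raw_tokens)
print(len(joined_tokens), joined_tokens)
```

Without the join, a whole sentence would be counted as one word and the frequency statistics would be meaningless.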

6. Chinese word cloud

A Chinese word cloud requires both the jieba and wordcloud modules.

The implementation is as follows:

    import jieba
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    font = 'Hwkt.ttf'
    with open('Job Requirements.txt', 'r', encoding='utf-8') as f:
        content = f.read()
    cut = jieba.cut(content)
    cut_content = ' '.join(cut)
    img = Image.open('22.png')  # the picture that gives the cloud its shape
    img_array = np.array(img)  # convert the picture to an array

    wc = WordCloud(
        background_color='white',
        mask=img_array,  # if omitted, a default rectangular picture is generated
        font_path=font,  # Chinese text requires a Chinese font
    )
    wc.generate_from_text(cut_content)  # draw the picture
    wc.to_file('New.png')  # save the picture
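If no suitable picture is at hand, a mask array can also be built directly. The sketch below assumes NumPy is installed and relies on wordcloud's documented convention that pure white (255) mask entries are left empty while words are drawn elsewhere; the 300-pixel size and 20-pixel margin are arbitrary choices.

```python
import numpy as np

size = 300    # width and height of the mask in pixels (arbitrary)
margin = 20   # gap between the circle and the image border (arbitrary)
center = size // 2
radius = center - margin

# Open grids of row and column indices that broadcast to a size x size array.
y, x = np.ogrid[:size, :size]

# True for pixels outside the circle, False for pixels inside it.
outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2

# 255 (white) outside the circle is masked out; 0 inside is available for words.
mask = (255 * outside).astype(np.uint8)

print(mask.shape, mask[center, center], mask[0, 0])
```

The resulting array can be passed as mask=mask in place of img_array in the code above, which would draw the words inside a circle.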

7. Results

The English word cloud produces the following result:

[Figure: English word cloud]

The Chinese word cloud produces the following result:

[Figure: Chinese word cloud]
