Using Python to Make a Simple Chinese Word Cloud


Preface

In the previous article, we covered installing Anaconda on Ubuntu and made a simple English word cloud.
Some readers may have tried replacing the English text with a Chinese one to make a Chinese word cloud. I suspect the result was not what you expected.

Chinese and English differ in many ways when it comes to encoding. In an English article the words are separated by spaces, which is what made the English word cloud straightforward,
but Chinese text does not use spaces. That is why a Chinese text does not produce a proper word cloud directly. So how do we segment Chinese text into words? We need a tool: jieba (the name means "stutter").
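
The effect of the missing spaces is easy to see with plain string splitting. A minimal sketch, using two short sample sentences of my own rather than the article's Dream.txt:

```python
# Splitting on whitespace recovers English words but not Chinese ones.
english = "I have a dream"
chinese = "我有一个梦想"  # the same sentence in Chinese, written without spaces

english_words = english.split()
chinese_words = chinese.split()

print(len(english_words), english_words)  # four separate words
print(len(chinese_words), chinese_words)  # one unbroken chunk
```

Anything that relies on spaces to find words will therefore treat a whole Chinese sentence as a single "word".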

Preparation

1. Text data to analyze. This is a must. This time I chose the Chinese version of the text used last time,
"I Have a Dream". Save it as Dream.txt and keep it in the same directory as the code.
2. The Anaconda tool set. The previous article already covered how to install and use it, so I will not repeat that here.
3. wordcloud, the Python package for generating word clouds.
4. jieba, the package for Chinese word segmentation.
5. Simsun.ttf, a Chinese font file, needed to display Chinese characters.
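
Before going further, it can save time to confirm that the packages and files from the list are actually in place. The following is a small sketch of such a check; the helper name missing_prerequisites is hypothetical, and the file names are assumed to be Dream.txt and Simsun.ttf in the current directory:

```python
import importlib.util
import os

def missing_prerequisites(packages=("jieba", "wordcloud"),
                          files=("Dream.txt", "Simsun.ttf")):
    """Return the names of packages and files that cannot be found."""
    # find_spec returns None when a top-level package is not installed.
    missing = [name for name in packages
               if importlib.util.find_spec(name) is None]
    # Files are looked up relative to the current working directory.
    missing += [path for path in files if not os.path.isfile(path)]
    return missing

print(missing_prerequisites())  # an empty list means everything was found
```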

First Step

Open the terminal and enter the following command to install the jieba package:

pip install jieba   # installation is simple, there is nothing more to say

Then continue typing in the terminal:

jupyter notebook     # opens the code editor; switch to the directory where Dream.txt is stored

If you followed the previous article on the English word cloud, you can reuse the same directory. In the code editor, enter the following code:

file = open('Dream.txt')
text = file.read()
text

If the text is displayed, the text data is fine and can be opened normally.
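
If the text comes out garbled instead, or the read fails with a UnicodeDecodeError, the file's encoding is a likely cause. A minimal sketch, assuming Dream.txt was saved as UTF-8 (an assumption, since the encoding is not stated here) and using a hypothetical helper name read_text:

```python
def read_text(path, encoding="utf-8"):
    """Read a whole text file and close it automatically afterwards."""
    with open(path, encoding=encoding) as f:
        return f.read()

# Usage, assuming Dream.txt is in the current directory and is UTF-8:
# text = read_text("Dream.txt")
```

If the file was saved with a different encoding, pass that name as the encoding argument instead.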

Word Segmentation

Between the second and third lines, insert the following code to segment the text:

import jieba  # import the jieba segmenter
text = ' '.join(jieba.cut(text))  # segment the Chinese text into space-separated words

The output should now show the words separated by spaces, which means the segmentation was successful.
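
jieba.cut returns a generator of tokens, and those tokens can include the spaces and line breaks of the original text. The following is a small sketch, my own addition to the steps here, that wraps the step in a hypothetical helper named segment and drops whitespace-only tokens. The cut parameter is also my addition, so the helper can be tried without jieba installed; by default it falls back to jieba.cut.

```python
def segment(text, cut=None):
    """Return text as space-separated tokens, skipping whitespace-only tokens."""
    if cut is None:
        import jieba  # imported here so the helper loads even without jieba
        cut = jieba.cut
    return " ".join(token for token in cut(text) if token.strip())

# Stand-in cutter for illustration only: one token per character.
# jieba.cut groups characters into real words instead.
print(segment("我有 梦想", cut=list))
```

With jieba installed, calling segment(text) on the contents of Dream.txt gives the same kind of space-separated string as the two lines above, minus the stray whitespace tokens.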

Word Cloud Generation

Comment out the last line of the code (text) so that its output does not interfere. Then continue typing in the editor:

from wordcloud import WordCloud
wordcloud = WordCloud().generate(text)

At this point, if there is no error and no output, does that mean the word cloud is finished?
Not quite. This time is different from the English case: because we want to output a Chinese word cloud, we need to
prepare the Simsun.ttf font file. Put it in the same directory as the code, and then enter the following code in the code editor:

from wordcloud import WordCloud
wordcloud = WordCloud(font_path="Simsun.ttf").generate(text)

There is still no output, but we are not far from success.
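
Since the font file is the new ingredient at this step, it can help to check that it exists before building the cloud. A hedged sketch: the helper name build_cloud is hypothetical, and it assumes the font file sits next to the notebook under the name used above.

```python
import os

def build_cloud(text, font_path="Simsun.ttf"):
    """Build a WordCloud from segmented text using the given font file."""
    # Fail early with a clear message if the font file is missing.
    if not os.path.isfile(font_path):
        raise FileNotFoundError(f"Font file not found: {font_path}")
    from wordcloud import WordCloud  # requires the wordcloud package
    return WordCloud(font_path=font_path).generate(text)

# Usage, assuming `text` holds the segmented text from the earlier step:
# wordcloud = build_cloud(text)
```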

Word Cloud Output

Enter the following code in the Code Editor:

%pylab inline
import matplotlib.pyplot as plt
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")

The word cloud is then displayed below the code. Please disregard the warning.
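
To sanity-check the picture, the largest words should roughly match the most frequent tokens in the segmented text. A simplified sketch of that count, using a hypothetical helper named top_words; WordCloud applies its own tokenization and filtering, so the numbers are only a rough guide:

```python
from collections import Counter

def top_words(segmented_text, n=10):
    """Return the n most frequent space-separated tokens with their counts."""
    return Counter(segmented_text.split()).most_common(n)

sample = "梦想 今天 梦想 自由 梦想 自由"  # made-up sample, not Dream.txt
print(top_words(sample, n=2))
```

Calling top_words(text) on the segmented contents of Dream.txt gives a list to compare against the picture.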

A simple Chinese word cloud is done.
