1. Analysis
Building a word cloud requires:
- Raw material: articles or other text content
- Word segmentation of the content
- Building the word cloud from the segmented content with a word cloud tool
- Saving the result as a picture
2. Main modules needed
- jieba: Chinese word segmentation
- wordcloud: building the word cloud
3. How the modules work
How wordcloud works:
- Text preprocessing
- Word frequency statistics
- Rendering the high-frequency words in color as a picture
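The first two steps above can be sketched in plain Python. This is a minimal illustration using only the standard library, not wordcloud's actual implementation; the function name `word_frequencies`, the tiny stopword set, and the lowercase normalization are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "is", "of", "and"}

def word_frequencies(text, stopwords=STOPWORDS):
    """Return a Counter of word frequencies after simple preprocessing."""
    # Text preprocessing: lowercase and split into runs of word characters.
    words = re.findall(r"\w+", text.lower())
    # Word frequency statistics: count every word that is not a stopword.
    return Counter(w for w in words if w not in stopwords)

freqs = word_frequencies("The cloud is a cloud of words and words and words")
# The most frequent words are the ones a word cloud would draw largest.
print(freqs.most_common(2))
```

The rendering step then maps each count to a font size and a color, which is the part the wordcloud module takes care of.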
How jieba works:
- Chinese word segmentation (with multiple modes)
4. English word cloud
For English text, segmentation and word cloud building need only the wordcloud module, because English words are already separated by spaces.
The specific implementation is as follows:

    from wordcloud import WordCloud

    string = ('Importance of relative word frequencies for font-size. '
              'With relative_scaling=0, only word-ranks are considered. '
              'With relative_scaling=1, a word that is twice as frequent will '
              'have twice the size. If you want to consider the word frequencies '
              'and not only their rank, relative_scaling around .5 often looks good.')
    font = r'C:\Windows\Fonts\FZSTK.TTF'
    wc = WordCloud(font_path=font,  # required for Chinese text, otherwise characters are displayed as boxes
                   background_color='white',
                   width=1000,
                   height=800,
                   ).generate(string)
    wc.to_file('Ss.png')  # save the picture
5. Chinese word segmentation
The specific implementation is as follows:

    import jieba

    cut = jieba.cut(text)  # text is the string or sentence to segment
    ' '.join(cut)  # join the separate words with spaces
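Joining with spaces matters because a word cloud tool has to find word boundaries in the text it receives. The sketch below uses a plain `\w+` pattern as a stand-in for that tokenizer (an assumption for illustration; wordcloud's own default pattern differs in detail), and a hand-written token list in place of real jieba output.

```python
import re

raw = "我来到北京清华大学"
# Hand-written stand-in for the result of jieba.cut(raw); real output may differ.
tokens = ["我", "来到", "北京", "清华大学"]
joined = " ".join(tokens)

# Without spaces the whole sentence is one run of word characters.
print(re.findall(r"\w+", raw))
# With spaces every segmented word becomes its own token.
print(re.findall(r"\w+", joined))
```

An unsegmented Chinese sentence would therefore be counted as one very long "word", which is why the text is segmented and re-joined before it is passed on.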
6. Chinese word cloud
A Chinese word cloud requires both the jieba and wordcloud modules.
The specific implementation is as follows:

    import jieba
    from wordcloud import WordCloud
    from PIL import Image
    import numpy as np

    font = 'Hwkt.ttf'
    content = open('Job Requirements.txt', 'r', encoding='utf-8').read()
    cut = jieba.cut(content)
    cut_content = ' '.join(cut)
    img = Image.open('22.png')  # the picture whose shape the cloud should take
    img_array = np.array(img)  # convert the picture to an array

    wc = WordCloud(
        background_color='white',
        mask=img_array,  # if omitted, the default rectangular picture is generated
        font_path=font  # Chinese text requires a Chinese font
    )
    wc.generate_from_text(cut_content)  # draw the picture
    wc.to_file('New.png')  # save the picture
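The mask does not have to come from an image file. As a sketch, assuming wordcloud's documented convention that pure white (255) mask pixels are left empty and all other pixels may hold words, a circular mask can be built directly with NumPy. The helper name `make_circle_mask` and the sizes are hypothetical.

```python
import numpy as np

def make_circle_mask(size=400, margin=20):
    """Return a size x size array: 0 inside a centered circle, 255 outside."""
    center = size // 2
    radius = center - margin
    y, x = np.ogrid[:size, :size]
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    # 255 (white) marks the area to leave blank; 0 marks the drawable area.
    return (255 * outside).astype(np.uint8)

mask = make_circle_mask()
print(mask.shape, mask[200, 200], mask[0, 0])
```

Passing such an array as the `mask` argument in place of an array loaded from a picture should give a round cloud instead of a rectangular one.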
7. Results
The English word cloud looks like this:
The Chinese word cloud looks like this:
Python word cloud (Chinese/English): a simple introductory tutorial for beginners