Word Cloud Python

Last updated: 2018-12-04 Source: Internet

Uploaded by: User

Word clouds are popular on many web pages.

Today I found a simple way to generate a word cloud with Python.

Here we use the pytagcloud package, which can be found with a Google search.

To install the package for Python 2.7, run the following from its source directory:

python setup.py install

Next, we find that this package also needs two other packages, pygame and jsonpickle.

Install these two packages the same way.
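To confirm that the packages installed correctly, a quick import check can help. This is a minimal sketch; `missing_packages` is a hypothetical helper written for this article, not part of pytagcloud.

```python
def missing_packages(names):
    """Return the names in `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)  # works on both Python 2 and Python 3
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_packages(["pytagcloud", "pygame", "jsonpickle"])` returns an empty list when all three packages are importable, and otherwise lists the ones that still need installing.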

Then we can generate the word cloud for a string as follows:

tags = make_tags(get_tag_counts(contenct), maxsize=120)
imagename = 'H:/Project/NextBuildData/Imageresult/' + venue['Venue_id'] + '.png'
print(imagename)
create_tag_image(tags, imagename, background=(0, 0, 0), fontname='Lobster')

Here, contenct is the input text (a str).

get_tag_counts counts how many times each word occurs in the string.
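To make the counting step concrete, here is a minimal standard-library sketch of the same idea. It is not pytagcloud's implementation; `count_words` and its `stop_words` parameter are hypothetical names used only for illustration.

```python
import re
from collections import Counter


def count_words(text, stop_words=frozenset()):
    """Return (word, count) pairs, most frequent first."""
    # Lowercase the text and keep runs of ASCII letters only.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common()
```

For example, `count_words("good food good coffee", stop_words={"food"})` yields `[("good", 2), ("coffee", 1)]`.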

make_tags builds the tag structure that is used to draw the word cloud.

In this function, maxsize is the maximum font size for a word; there is also a minsize parameter that sets the minimum font size.
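To illustrate what minsize and maxsize control, the sketch below maps word counts onto a font-size range. The linear mapping and the helper name `scale_font_sizes` are assumptions made for illustration; pytagcloud's own scaling may differ.

```python
def scale_font_sizes(word_counts, minsize=3, maxsize=120):
    """Map (word, count) pairs to (word, font_size) pairs.

    Assumed linear scaling: the rarest word gets minsize and the most
    frequent word gets maxsize.
    """
    if not word_counts:
        return []
    lowest = min(count for _, count in word_counts)
    highest = max(count for _, count in word_counts)
    span = highest - lowest
    sized = []
    for word, count in word_counts:
        if span == 0:
            size = maxsize  # all words are equally frequent
        else:
            size = minsize + (maxsize - minsize) * (count - lowest) / float(span)
        sized.append((word, int(round(size))))
    return sized
```

With `minsize=10` and `maxsize=110`, the counts 5, 1 and 3 map to sizes 110, 10 and 60.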

Finally, create_tag_image renders the word cloud from the tags.

imagename is the path of the output image.

background is the background color, given as an (R, G, B) tuple.

fontname is the font used for the words in the output image.
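The output path in the snippet above is built by string concatenation with a hard-coded drive. A slightly more portable sketch uses `os.path.join`; `build_image_path` and the directory name in the example are hypothetical and not part of pytagcloud.

```python
import os


def build_image_path(output_dir, venue_id, extension=".png"):
    """Join the output directory and the venue id into an image path."""
    return os.path.join(output_dir, str(venue_id) + extension)
```

The result, for instance `build_image_path("Imageresult", venue["Venue_id"])`, can be passed to create_tag_image as the output file name.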

My code is:

'''
Created on Mar 24, 2013
@author: Yang
'''
# Written for Python 2.7. nltk.tag.simplify and pymongo.Connection exist
# only in older releases of NLTK and pymongo.
import nltk
import enchant
from nltk.stem.lancaster import LancasterStemmer
from nltk.tag.simplify import simplify_wsj_tag
from pymongo import Connection
from pytagcloud import create_tag_image, make_tags
from pytagcloud.lang.counter import get_tag_counts

# Connect to the local MongoDB instance and pick the collections.
c = Connection('localhost', 27017)
db = c.FourS2
finf = db.FourInformation
fpho = db.FourPhotos
ftip = db.FourTips
VenueList = db.Venuelist

# Create the stemmer, dictionary and stop-word set once, outside the loop.
st = LancasterStemmer()
d = enchant.Dict("en_US")
stop_words = set(nltk.corpus.stopwords.words('english'))

count = 1
for venue in VenueList.find():
    try:
        # Skip the first 1991 venues.
        if count <= 1991:
            count = count + 1
            continue

        # Join all tips of this venue into one string.
        contenct = ''
        vneuetips = venue['Tips']
        for tip in vneuetips:
            contenct = contenct + ' ' + vneuetips[tip]['Tip']

        # Lowercase and stem every token.
        tokens = nltk.word_tokenize(contenct)
        tempword = [st.stem(i.lower()) for i in tokens]

        # Keep alphabetic English words longer than two characters
        # that are not stop words.
        contenct = ''
        for sample in tempword:
            # The POS tag is only needed by the optional noun filter below.
            tags = nltk.pos_tag(nltk.word_tokenize(sample))
            s = [(word, simplify_wsj_tag(tag)) for word, tag in tags]
            atrr = s[0][1]
            if (len(sample) > 2 and sample not in stop_words
                    and sample.isalpha() and d.check(sample)):
                # if atrr == 'N' or atrr == 'NP':
                contenct = contenct + ' ' + sample

        # Build the tags and render one image per venue.
        tags = make_tags(get_tag_counts(contenct), maxsize=120)
        imagename = 'H:/Project/NextBuildData/Imageresult/' + venue['Venue_id'] + '.png'
        print(imagename)
        create_tag_image(tags, imagename, background=(0, 0, 0), fontname='Lobster')
    except Exception as err:
        # Report the error and move on to the next venue.
        print(err)

The script reads the text data from MongoDB, then removes stop words and any tokens that are not purely alphabetic.
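The filtering step can be tried without MongoDB, NLTK or enchant. The sketch below is a simplified stand-in under stated assumptions: it uses a tiny illustrative stop-word set instead of the NLTK list, and it skips stemming and the dictionary check. `filter_tokens` and `STOP_WORDS` are hypothetical names.

```python
# A small illustrative stop-word set; the NLTK list is much longer.
STOP_WORDS = frozenset(["the", "and", "was", "for", "this", "with"])


def filter_tokens(text, stop_words=STOP_WORDS, min_length=3):
    """Keep alphabetic, non-stop-word tokens of at least min_length letters."""
    kept = []
    for token in text.lower().split():
        token = token.strip(".,!?;:\"'()")  # drop surrounding punctuation
        if len(token) >= min_length and token.isalpha() and token not in stop_words:
            kept.append(token)
    return " ".join(kept)
```

For the input `"The coffee was great, 5 stars and free wifi!"` this returns `"coffee great stars free wifi"`, which could then be passed to get_tag_counts.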

Feel free to use this code as a reference.

The demo image is as follows:
