MapReduce with MongoDB and Python


1 Installing and Using MongoDB

    a) Download MongoDB. Note that 32-bit builds are limited to around 2GB of data.

    b) Set up mongodb.config, then start the server from the command line: mongod.exe --config /path/to/your/mongodb.config

    c) Download pymongo; the test program later in this article is written in Python.

    See also: The Little MongoDB Book (pdf).

2 MapReduce

Map/reduce in MongoDB is useful for batch processing of data and aggregation operations. It is similar in spirit to using something like Hadoop with all input coming from a collection and output going to a collection. Often, in a situation where you would have used GROUP BY in SQL, map/reduce is the right tool in MongoDB.

See the introduction to MapReduce on the MongoDB website. The map/reduce flow is as follows (the original diagram is not reproduced here):
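As a rough illustration of that flow, here is a minimal in-memory sketch in plain Python. It does not use MongoDB at all; the helper run_map_reduce, the sample orders data, and the two callbacks are hypothetical names made up for this example. It only shows the three stages: map emits key/value pairs, the pairs are grouped by key, and reduce collapses each group into a single value, much like a GROUP BY with SUM in SQL.

```python
from collections import defaultdict


def run_map_reduce(documents, map_fn, reduce_fn):
    """In-memory imitation of the map -> group -> reduce flow (illustrative only)."""
    grouped = defaultdict(list)
    # Map phase: each document may emit any number of (key, value) pairs.
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Reduce phase: collapse the list of values for each key into one value.
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Hypothetical sample data: total order amount per customer.
orders = [
    {"customer": "ann", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "ann", "amount": 7},
]


def map_order(doc):
    # Emit one (customer, amount) pair per order.
    yield doc["customer"], doc["amount"]


def sum_amounts(key, values):
    # Add up all amounts emitted for one customer.
    return sum(values)


totals = run_map_reduce(orders, map_order, sum_amounts)
print(totals)  # {'ann': 17, 'bob': 5}
```

One difference worth keeping in mind: MongoDB may call the reduce function more than once for the same key, feeding earlier reduce results back in, so a real reduce function should return a value of the same shape as the values the map function emits.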

 

3 Example

We use word counting as the example. The input text is a speech by Obama, and the goal is to see how often each word is used in it.

 

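MongoDB executes the map and reduce functions as JavaScript, so both are written as JS scripts.

The map function is stored in wordMap.js and the reduce function in wordReduce.js; the listings from the original post are not reproduced here.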
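Since the JavaScript listings are missing, the following Python sketch shows the kind of logic a word-count map and reduce pair would implement. It is not the author's code, and the real functions passed to MongoDB must be JavaScript. The value shape {'count': n} is inferred from the client program, which reads result['value']['count']; splitting words with the regular expression \w+ and the sample docs list are assumptions made for illustration.

```python
import re
from collections import defaultdict


def word_map(doc):
    """Emit (word, {'count': 1}) for every word in the document's 'text' field."""
    for word in re.findall(r"\w+", doc["text"]):
        yield word, {"count": 1}


def word_reduce(key, values):
    """Sum the partial counts for one word, returning the same shape as emitted."""
    return {"count": sum(v["count"] for v in values)}


# Hypothetical stand-in for the lines stored in the texts collection.
docs = [{"text": "we the people"}, {"text": "the people of the nation"}]

# Group the emitted values by word, then reduce each group.
grouped = defaultdict(list)
for doc in docs:
    for word, value in word_map(doc):
        grouped[word].append(value)

counts = {word: word_reduce(word, values) for word, values in grouped.items()}
print(counts["the"]["count"])  # 3
```

Returning a dictionary with a count field from the reduce step, rather than a bare number, keeps the reduced value in the same shape as the emitted values, which matters if the reduce step is applied repeatedly to partial results.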

 

The client program is:

# Requires PyMongo 3.x: Collection.map_reduce was removed in PyMongo 4.0.
from pymongo import MongoClient
from bson.code import Code

# Open a connection to MongoDB (localhost, default port)
client = MongoClient()
db = client.test

# Remove any existing data
db.texts.delete_many({})

# Insert the data, one document per line of the speech
with open('2009-obama.txt') as f:
    for line in f:
        db.texts.insert_one({'text': line})

# Load the map and reduce functions (JavaScript source)
with open('wordMap.js') as f:
    map_fn = Code(f.read())
with open('wordReduce.js') as f:
    reduce_fn = Code(f.read())

# Run the map-reduce query; results are written to the "collection_name" collection
results = db.texts.map_reduce(map_fn, reduce_fn, "collection_name")

# Print each word with its count
for result in results.find():
    print(result['_id'], result['value']['count'])

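Running the program prints each word followed by its count.

The code for this article can be downloaded here.

See also: MapReduce with MongoDB and Python, and here.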
