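I. Problem description
[Use $unwind to unpack every element of the Array inside each Document, then use $group to count the elements by group, and finally use $sort to order the grouped results.]
Import the data from the images.json file into the MongoDB server: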
mongoimport --drop -d test -c images images.json
A sample of the documents looks like this:
> db.images.find()
{ "_id" : 3, "height" : 480, "width" : 640, "tags" : [ "kittens", "travel" ] }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : [ "cats", "sunrises", "kittens", "travel", "vacation", "work" ] }
{ "_id" : 0, "height" : 480, "width" : 640, "tags" : [ "dogs", "work" ] }
{ "_id" : 6, "height" : 480, "width" : 640, "tags" : [ "work" ] }
{ "_id" : 4, "height" : 480, "width" : 640, "tags" : [ "dogs", "sunrises", "kittens", "travel" ] }
{ "_id" : 5, "height" : 480, "width" : 640, "tags" : [ "dogs", "cats", "sunrises", "kittens", "work" ] }
{ "_id" : 7, "height" : 480, "width" : 640, "tags" : [ "dogs", "sunrises" ] }
{ "_id" : 8, "height" : 480, "width" : 640, "tags" : [ "dogs", "cats", "sunrises", "kittens", "travel" ] }
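The goal is to count how many times each element of the tags array appears across all documents. That is: how many times does "kittens" appear? How many times does "travel" appear? How many times does "dogs" appear? And so on.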
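II. Implementation steps
The task is implemented with MongoDB's aggregate operation.
① Use $unwind to break up the tags array. Each array element becomes its own document, and the result looks like this: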
> db.images.aggregate(
... [
... {$unwind:"$tags"}
... ])
{ "_id" : 3, "height" : 480, "width" : 640, "tags" : "kittens" }
{ "_id" : 3, "height" : 480, "width" : 640, "tags" : "travel" }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : "cats" }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : "sunrises" }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : "kittens" }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : "travel" }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : "vacation" }
{ "_id" : 1, "height" : 480, "width" : 640, "tags" : "work" }
{ "_id" : 0, "height" : 480, "width" : 640, "tags" : "dogs" }
{ "_id" : 0, "height" : 480, "width" : 640, "tags" : "work" }
{ "_id" : 6, "height" : 480, "width" : 640, "tags" : "work" }
{ "_id" : 4, "height" : 480, "width" : 640, "tags" : "dogs" }
{ "_id" : 4, "height" : 480, "width" : 640, "tags" : "sunrises" }
..........
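To make the effect of $unwind concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements the stage; the helper name unwind and the variable names are made up for illustration, and the sketch only handles fields that really are arrays (the real stage has additional rules for other values that are not modeled here).

```javascript
// Hypothetical helper: an in-memory stand-in for {$unwind: "$tags"}.
// For every element of doc[field] it emits a copy of the document
// in which the array is replaced by that single element.
function unwind(docs, field) {
  const rows = [];
  for (const doc of docs) {
    const values = doc[field];
    // Simplification: skip anything that is not an array.
    if (!Array.isArray(values)) continue;
    for (const value of values) {
      rows.push({ ...doc, [field]: value });
    }
  }
  return rows;
}

// Two documents taken from the sample shown earlier.
const sampleDocs = [
  { _id: 3, height: 480, width: 640, tags: ["kittens", "travel"] },
  { _id: 0, height: 480, width: 640, tags: ["dogs", "work"] },
];

const unwound = unwind(sampleDocs, "tags");
console.log(unwound);
```

Two documents with two tags each turn into four rows, and every row keeps the _id, height and width of the document it came from.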
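② Group the unwound tags with $group
In a $group stage, _id specifies the field to group on (the equivalent of the column in a GROUP BY). The result computed for each group is stored in the num_of_tag field; {$sum:1} adds 1 for every document that falls into the group, so num_of_tag ends up being the number of times the tag occurs.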
> db.images.aggregate(
... [
... {$unwind:"$tags"},
... {$group:{_id:"$tags",num_of_tag:{$sum:1}}}
... ]
... )
{ "_id" : "dogs", "num_of_tag" : 49921 }
{ "_id" : "work", "num_of_tag" : 50070 }
{ "_id" : "vacation", "num_of_tag" : 50036 }
{ "_id" : "travel", "num_of_tag" : 49977 }
{ "_id" : "kittens", "num_of_tag" : 49932 }
{ "_id" : "sunrises", "num_of_tag" : 49887 }
{ "_id" : "cats", "num_of_tag" : 49772 }
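The counting done by $group with {$sum:1} can be sketched in plain JavaScript as follows. This is only an illustration under simple assumptions: the helper name countBy is hypothetical, the input is a handful of already unwound rows written out by hand, and the output order simply follows first appearance (MongoDB makes no such promise for $group).

```javascript
// Hypothetical helper: an in-memory stand-in for
// {$group: {_id: "$tags", num_of_tag: {$sum: 1}}}.
function countBy(rows, field) {
  const counts = new Map();
  for (const row of rows) {
    const key = row[field];
    // {$sum: 1} adds 1 for each row that lands in the group.
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Shape the result like the $group output: one object per distinct key.
  return Array.from(counts, ([key, n]) => ({ _id: key, num_of_tag: n }));
}

// Rows as $unwind would produce them for the sample documents 3, 0 and 6.
const unwoundRows = [
  { _id: 3, tags: "kittens" },
  { _id: 3, tags: "travel" },
  { _id: 0, tags: "dogs" },
  { _id: 0, tags: "work" },
  { _id: 6, tags: "work" },
];

const groupedRows = countBy(unwoundRows, "tags");
console.log(groupedRows);
```

Here "work" appears in two of the rows, so its group gets num_of_tag 2, while the other three tags each get 1.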
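③ Use $project to drop the _id field, which is of no interest here (in effect this renames the _id field to tags). This step is optional.
In the $project stage, _id:0 removes the _id field; tags:"$_id" exposes the value of the _id field under the name tags; num_of_tag:1 keeps the num_of_tag field.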
> db.images.aggregate( [ {$unwind:"$tags"},{$group:{_id:"$tags",num_of_tag:{$sum:1}}},{$project:{_id:0,tags:"$_id",num_of_tag:1}} ])
{ "num_of_tag" : 49921, "tags" : "dogs" }
{ "num_of_tag" : 50070, "tags" : "work" }
{ "num_of_tag" : 50036, "tags" : "vacation" }
{ "num_of_tag" : 49977, "tags" : "travel" }
{ "num_of_tag" : 49932, "tags" : "kittens" }
{ "num_of_tag" : 49887, "tags" : "sunrises" }
{ "num_of_tag" : 49772, "tags" : "cats" }
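As a rough picture of what this $project stage does to each grouped document, the rename can be written as a plain JavaScript map. The variable names are made up for illustration; the two input objects are copied from the $group output above.

```javascript
// Two grouped documents, as produced by the $group stage.
const groupedDocs = [
  { _id: "dogs", num_of_tag: 49921 },
  { _id: "work", num_of_tag: 50070 },
];

// Stand-in for {$project: {_id: 0, tags: "$_id", num_of_tag: 1}}:
// keep num_of_tag, expose the old _id value as tags, and leave _id out.
const projectedDocs = groupedDocs.map(({ _id, num_of_tag }) => ({
  num_of_tag,
  tags: _id,
}));

console.log(projectedDocs);
```

Nothing is recomputed in this step; the same values are just presented under different field names.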
④ Use $sort to sort on the num_of_tag field (-1 means descending order)
> db.images.aggregate( [ {$unwind:"$tags"},{$group:{_id:"$tags",num_of_tag:{$sum:1}}},{$project:{_id:0,tags:"$_id",num_of_tag:1}},{$sort:{num_of_tag:-1}} ])
{ "num_of_tag" : 50070, "tags" : "work" }
{ "num_of_tag" : 50036, "tags" : "vacation" }
{ "num_of_tag" : 49977, "tags" : "travel" }
{ "num_of_tag" : 49932, "tags" : "kittens" }
{ "num_of_tag" : 49921, "tags" : "dogs" }
{ "num_of_tag" : 49887, "tags" : "sunrises" }
{ "num_of_tag" : 49772, "tags" : "cats" }
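For a quick sanity check without a MongoDB server, the four stages can be mimicked end to end in plain JavaScript on the eight sample documents from section I. This is only a sketch: the function name tagCounts is hypothetical, and the alphabetical tie-break is an assumption added here so the output is deterministic (the pipeline above sorts on num_of_tag only). The counts also differ from the shell output, which was run against the full imported collection rather than these eight documents.

```javascript
// The eight sample documents shown in section I.
const images = [
  { _id: 3, height: 480, width: 640, tags: ["kittens", "travel"] },
  { _id: 1, height: 480, width: 640, tags: ["cats", "sunrises", "kittens", "travel", "vacation", "work"] },
  { _id: 0, height: 480, width: 640, tags: ["dogs", "work"] },
  { _id: 6, height: 480, width: 640, tags: ["work"] },
  { _id: 4, height: 480, width: 640, tags: ["dogs", "sunrises", "kittens", "travel"] },
  { _id: 5, height: 480, width: 640, tags: ["dogs", "cats", "sunrises", "kittens", "work"] },
  { _id: 7, height: 480, width: 640, tags: ["dogs", "sunrises"] },
  { _id: 8, height: 480, width: 640, tags: ["dogs", "cats", "sunrises", "kittens", "travel"] },
];

// Hypothetical helper mimicking unwind -> group -> project -> sort in memory.
function tagCounts(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const tag of doc.tags) {                   // like $unwind
      counts.set(tag, (counts.get(tag) || 0) + 1);  // like $group with {$sum: 1}
    }
  }
  return Array.from(counts, ([tags, num_of_tag]) => ({ num_of_tag, tags })) // like $project
    // Like {$sort: {num_of_tag: -1}}; the tie-break on the tag name is an
    // assumption of this sketch, not part of the original pipeline.
    .sort((a, b) => b.num_of_tag - a.num_of_tag || a.tags.localeCompare(b.tags));
}

const sortedCounts = tagCounts(images);
console.log(sortedCounts);
```

On this small sample "dogs", "kittens" and "sunrises" each occur 5 times, "travel" and "work" 4 times, "cats" 3 times and "vacation" once.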
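III. Summary
This article came out of a homework assignment in the MongoDB University course M101 for Java Developers. With the help of Google search and the official MongoDB documentation, it is easy to build all kinds of combined queries in MongoDB.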
Original post: http://www.cnblogs.com/hapjin/p/7944404.html
MongoDB: Counting the Occurrences of Each Element in a Document's Array