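Overview
In a BSON document, MongoDB's basic unit of storage, a field value can be binary data, so files can be stored directly in MongoDB. There is one limit: a single BSON document cannot exceed 16 MB, so storing larger files requires GridFS.
Small-file storage and GridFS file storage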
Let's first look at an example of storing a small file in MongoDB.
Start by uploading a file with MongoDB's mongofiles tool:
D:\MongoDB\Server\3.2\bin>mongofiles.exe list
2017-03-06T13:41:03.283+0800 connected to: localhost
D:\MongoDB\Server\3.2\bin>mongofiles.exe put E:\deliveryTask.doc
2017-03-06T13:41:23.535+0800 connected to: localhost
added file: E:\deliveryTask.doc
D:\MongoDB\Server\3.2\bin>mongofiles.exe list
2017-03-06T13:41:30.114+0800 connected to: localhost
E:\deliveryTask.doc 2971
Check how the file was stored from the mongo shell:
> use test
switched to db test
> show collections
fs.chunks
fs.files
restaurants
user
> db.fs.files.find()
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }
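The upload succeeded.
This chapter mainly covers theory and operational practice, so it does not go into application code; concrete code, using Java as the example, is introduced in later chapters.
Now try uploading a file larger than 16 MB: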
D:\MongoDB\Server\3.2\bin>mongofiles.exe put E:\synch.rar
2017-03-06T14:33:11.028+0800 connected to: localhost
added file: E:\synch.rar
D:\MongoDB\Server\3.2\bin>mongofiles.exe list
2017-03-06T14:33:15.265+0800 connected to: localhost
E:\deliveryTask.doc 2971
E:\synch.rar 24183487
Check how the files were stored from the mongo shell:
> db.fs.files.find()
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T06:33:12.013Z"), "length" : 24183487, "md5" : "bbfe4d8579372aa0729726185997e908", "filename" : "E:\\synch.rar" }
That upload succeeded as well.
Now look at the chunks:
> db.fs.chunks.find({},{data:0})
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2c"), "files_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "n" : 0 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2d"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 0 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2e"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 1 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 2 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b30"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 3 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b31"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 4 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b32"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 5 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b33"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 6 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b34"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 7 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b35"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 8 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b36"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 9 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b37"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 10 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b38"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 11 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b39"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 12 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3a"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 13 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3c"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 15 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3b"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 14 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3e"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 17 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3d"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 16 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 18 }
Type "it" for more
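The large file has been split into many chunks. So why does uploading a file larger than 16 MB succeed? Because mongofiles stores files through GridFS, which spreads the content over many small chunk documents, so no single document has to hold the whole file.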
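To make the chunking concrete, here is a small arithmetic sketch in Python. It only uses the length and chunkSize values printed in the fs.files output above; the function name chunk_layout is made up for this illustration, and treating an empty file as zero chunks is an assumption of the sketch.

```python
from typing import Tuple


def chunk_layout(length: int, chunk_size: int) -> Tuple[int, int]:
    """Return (number of chunks, size of the last chunk) for a file of `length` bytes."""
    if length == 0:
        # Assumption of this sketch: an empty file needs no chunk documents.
        return 0, 0
    count = (length + chunk_size - 1) // chunk_size  # integer ceiling division
    last = length - (count - 1) * chunk_size         # bytes left for the final chunk
    return count, last


# Values copied from the fs.files documents shown above.
print(chunk_layout(2971, 261120))      # deliveryTask.doc -> (1, 2971)
print(chunk_layout(24183487, 261120))  # synch.rar        -> (93, 160447)
```

So synch.rar should occupy 93 chunk documents, numbered n = 0 to 92, which is consistent with the shell listing stopping at n = 18 and offering "it" for more.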
The following commands search for, download, and delete a file:
D:\MongoDB\Server\3.2\bin>mongofiles.exe search rar
2017-03-06T14:45:31.974+0800 connected to: localhost
E:\synch.rar 24183487
D:\MongoDB\Server\3.2\bin>mongofiles.exe --local D:\mongodb_download.rar get E:\synch.rar
2017-03-06T14:47:17.841+0800 connected to: localhost
finished writing to D:\mongodb_download.rar
D:\MongoDB\Server\3.2\bin>mongofiles.exe delete E:\synch.rar
2017-03-06T14:47:56.649+0800 connected to: localhost
successfully deleted all instances of 'E:\synch.rar' from GridFS
D:\MongoDB\Server\3.2\bin>mongofiles.exe list
2017-03-06T14:48:03.886+0800 connected to: localhost
E:\deliveryTask.doc 2971
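In fact, both the collection prefix (fs by default) and the chunk size can be customized. The default chunk size is 255 KB, which is the chunkSize of 261120 bytes in the output above.
Summary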
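So how do you decide which storage scheme to use in a real distributed file storage system? One approach is:
1. For any file a user uploads, check its size on the client.
2. If the file is smaller than 16 MB, store it directly in an ordinary MongoDB collection. The 16 MB limit applies to the whole document, so the file content has to leave room for the other fields.
3. If the file is 16 MB or larger, upload it to GridFS, which stores it in the fs.files and fs.chunks collections.
4. When a user downloads a file, look it up in the matching collection according to its recorded size.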
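The routing rule in the steps above can be sketched as a pure function. This is only an illustration: the names choose_backend and backend_for_download, the returned labels, and the 64 KB headroom reserved for the other fields of the document are all assumptions made here, not MongoDB settings.

```python
BSON_MAX_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit
HEADROOM_BYTES = 64 * 1024         # hypothetical margin for filename and other fields


def choose_backend(file_size: int) -> str:
    """Decide where a file of `file_size` bytes should be stored."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if file_size <= BSON_MAX_BYTES - HEADROOM_BYTES:
        return "collection"  # small enough to embed as a binary field
    return "gridfs"          # too large for one document, so chunk it


def backend_for_download(metadata: dict) -> str:
    """Step 4: route a download using the size recorded at upload time."""
    return choose_backend(metadata["length"])


print(choose_backend(2971))                        # collection
print(choose_backend(24183487))                    # gridfs
print(backend_for_download({"length": 24183487}))  # gridfs
```

Because the download path applies the same function to the stored size, the upload and download sides cannot disagree about where a file lives, provided the threshold never changes after files have been written.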
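In addition, the fs.chunks collection can be sharded. The indexed field files_id is a reasonable shard key ({ files_id: 1 }), because it keeps all the chunks of a given file on the same shard as far as possible. fs.files does not need to be sharded: it only holds file metadata, so it stays small. The GridFS chunk size (255 KB by default) can also be changed.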
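Note that GridFS is not well suited to small files: reading a file from GridFS takes two queries, first against fs.files and then against fs.chunks, after which the chunks are merged to produce the whole file.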
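The two-step read can be modelled in memory with a short Python sketch. It does not talk to MongoDB: plain lists stand in for fs.files and fs.chunks, the helper names are invented for this illustration, and only the field names (files_id, n, data, length, filename) mirror the documents shown earlier.

```python
import random


def split_into_chunks(files_id, data: bytes, chunk_size: int) -> list:
    """Build chunk documents shaped like those in fs.chunks."""
    return [
        {"files_id": files_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


def read_file(fs_files: list, fs_chunks: list, filename: str) -> bytes:
    """Model the two lookups a GridFS read performs."""
    # Query 1: find the metadata document by filename.
    meta = next(doc for doc in fs_files if doc["filename"] == filename)
    # Query 2: fetch that file's chunks and order them by sequence number n.
    parts = sorted(
        (c for c in fs_chunks if c["files_id"] == meta["_id"]),
        key=lambda c: c["n"],
    )
    data = b"".join(c["data"] for c in parts)
    if len(data) != meta["length"]:
        raise ValueError("reassembled size does not match recorded length")
    return data


payload = bytes(range(256)) * 10  # 2560 bytes of sample content
fs_files = [{"_id": 1, "filename": "demo.bin", "length": len(payload)}]
fs_chunks = split_into_chunks(1, payload, chunk_size=1000)
random.shuffle(fs_chunks)  # chunks need not come back in order

restored = read_file(fs_files, fs_chunks, "demo.bin")
print(restored == payload, len(fs_chunks))  # True 3
```

Sorting by n matters: in the shell listing earlier, chunks 15 and 14 (and 17 and 16) came back out of order, so a reader cannot rely on the order in which chunk documents are returned.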
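Also note that a GridFS file chunk is 255 KB by default, whereas a sharding chunk is 64 MB by default in the MongoDB 3.2 release used here. Do not confuse the two.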