MongoDB as a Distributed File Storage System

Overview

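A BSON document, MongoDB's basic unit of storage, can hold field values of binary type, so a file can be stored directly inside a MongoDB document. There is one limit: a single BSON document cannot exceed 16 MB, so storing anything larger requires GridFS.

Small-File Storage and GridFS File Storage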

Let's start with an example of storing a small file in MongoDB:

First, upload the file with MongoDB's mongofiles tool:

    D:\MongoDB\Server\3.2\bin>mongofiles.exe list
    2017-03-06T13:41:03.283+0800    connected to: localhost

    D:\MongoDB\Server\3.2\bin>mongofiles.exe put E:\deliveryTask.doc
    2017-03-06T13:41:23.535+0800    connected to: localhost
    added file: E:\deliveryTask.doc

    D:\MongoDB\Server\3.2\bin>mongofiles.exe list
    2017-03-06T13:41:30.114+0800    connected to: localhost
    E:\deliveryTask.doc     2971

Then check how the file was stored from the mongo shell:

    > use test
    switched to db test
    > show collections
    fs.chunks
    fs.files
    restaurants
    user
    > db.fs.files.find()
    { "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }

The file was uploaded successfully.

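This section deals mainly with theory and operations practice, so it does not go into application code written against a driver (that will be covered in later sections, using Java as the example language).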

Now try uploading a file larger than 16 MB:

    D:\MongoDB\Server\3.2\bin>mongofiles.exe put E:\synch.rar
    2017-03-06T14:33:11.028+0800    connected to: localhost
    added file: E:\synch.rar

    D:\MongoDB\Server\3.2\bin>mongofiles.exe list
    2017-03-06T14:33:15.265+0800    connected to: localhost
    E:\deliveryTask.doc     2971
    E:\synch.rar    24183487

Check how the file was stored from the mongo shell:

    > db.fs.files.find()
    { "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T06:33:12.013Z"), "length" : 24183487, "md5" : "bbfe4d8579372aa0729726185997e908", "filename" : "E:\\synch.rar" }

That upload succeeded as well.

Now look at the chunks:

    > db.fs.chunks.find({},{data:0})
    { "_id" : ObjectId("58bcf683afa0fa20bc854a2c"), "files_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "n" : 0 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b2d"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 0 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b2e"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 1 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b2f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 2 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b30"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 3 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b31"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 4 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b32"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 5 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b33"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 6 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b34"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 7 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b35"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 8 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b36"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 9 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b37"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 10 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b38"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 11 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b39"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 12 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b3a"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 13 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b3c"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 15 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b3b"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 14 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b3e"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 17 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b3d"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 16 }
    { "_id" : ObjectId("58bd02a7afa0fa21d4a14b3f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 18 }
    Type "it" for more

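The large file was split into many chunks. Why did uploading a file larger than 16 MB succeed? Because mongofiles stores files in GridFS: every file it uploads is split into chunks in fs.chunks, with its metadata kept in fs.files. That also applies to the small file above, which occupies a single chunk (n = 0).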
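As a rough check on the listing above, the number of chunks follows from the file length and the chunkSize reported in fs.files. The sketch below is plain arithmetic and does not talk to MongoDB; the function name `chunk_layout` is made up for illustration, and the treatment of an empty file as zero chunks is an assumption.

```python
CHUNK_SIZE = 261120  # chunkSize reported in fs.files above (255 * 1024 bytes)


def chunk_layout(length, chunk_size=CHUNK_SIZE):
    """Return (number_of_chunks, size_of_last_chunk) for a file of `length` bytes."""
    if length == 0:
        # Assumption: an empty file is stored without any chunk documents.
        return 0, 0
    count = (length + chunk_size - 1) // chunk_size  # integer ceiling division
    last = length - (count - 1) * chunk_size         # bytes left for the final chunk
    return count, last


print(chunk_layout(2971))      # deliveryTask.doc -> (1, 2971)
print(chunk_layout(24183487))  # synch.rar        -> (93, 160447)
```

Under these numbers synch.rar needs 93 chunks (n = 0 to 92), which is why the shell stops after the first batch and prints Type "it" for more.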

The following commands search for, download, and delete a file:

    D:\MongoDB\Server\3.2\bin>mongofiles.exe search rar
    2017-03-06T14:45:31.974+0800    connected to: localhost
    E:\synch.rar    24183487

    D:\MongoDB\Server\3.2\bin>mongofiles.exe --local D:\mongodb_download.rar get E:\synch.rar
    2017-03-06T14:47:17.841+0800    connected to: localhost
    finished writing to D:\mongodb_download.rar

    D:\MongoDB\Server\3.2\bin>mongofiles.exe delete E:\synch.rar
    2017-03-06T14:47:56.649+0800    connected to: localhost
    successfully deleted all instances of 'E:\synch.rar' from GridFS

    D:\MongoDB\Server\3.2\bin>mongofiles.exe list
    2017-03-06T14:48:03.886+0800    connected to: localhost
    E:\deliveryTask.doc     2971

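You can also customize the collection prefix (the default is fs) and the chunk size (the default is 255 KB, which matches the chunkSize of 261120 bytes shown above).

Summary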

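How do you decide which storage scheme to use in a real distributed file storage system? One approach is:
1. For every file a user uploads, check its size on the client.
2. If the file is smaller than 16 MB, store it directly in a regular MongoDB collection.
3. If the file is larger than 16 MB, upload it to GridFS, which stores it in the fs.files and fs.chunks collections.
4. When a user downloads a file, look it up in the matching collection according to its size attribute.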
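The size check in the steps above can be sketched as a small client-side function. This is only an illustration under stated assumptions: the names `choose_storage` and `METADATA_HEADROOM` are hypothetical, and the 1 KB of headroom reserved for the other fields of the document (filename, upload date and so on) is an arbitrary value, not something taken from MongoDB.

```python
BSON_MAX_DOCUMENT_SIZE = 16 * 1024 * 1024  # 16 MB limit on a single BSON document

# Hypothetical allowance for the document's other fields; tune for your schema.
METADATA_HEADROOM = 1024


def choose_storage(file_size):
    """Return "document" if the file fits in one BSON document, else "gridfs"."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if file_size + METADATA_HEADROOM <= BSON_MAX_DOCUMENT_SIZE:
        return "document"  # store the bytes in a binary field of a regular collection
    return "gridfs"        # store via fs.files and fs.chunks


print(choose_storage(2971))      # deliveryTask.doc
print(choose_storage(24183487))  # synch.rar
```

Recording the chosen scheme alongside the file's size when it is uploaded would let the download path in step 4 go straight to the right collection.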

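In addition, the fs.chunks collection can be sharded. A suitable shard key is the indexed field files_id, which keeps all the chunks of a given file on the same shard as far as possible. fs.files does not need to be sharded: it holds only file metadata, so it stays small. The default GridFS chunk size (255 KB) can also be changed.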

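Note that GridFS is not well suited to small files, because reading a file from GridFS takes two queries: one against fs.files, then one against fs.chunks, after which the chunks are merged to produce the whole file.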
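To make the two-step read concrete, here is a toy in-memory model of the two collections. It is not the GridFS implementation and does not use a MongoDB driver; the helpers `put` and `get` and the plain dict and list that stand in for fs.files and fs.chunks are assumptions made for illustration.

```python
def put(files, chunks, file_id, filename, data, chunk_size=261120):
    """Record metadata in `files` and split `data` into numbered chunks."""
    files[file_id] = {"filename": filename, "length": len(data), "chunkSize": chunk_size}
    for n, start in enumerate(range(0, len(data), chunk_size)):
        chunks.append({"files_id": file_id, "n": n, "data": data[start:start + chunk_size]})


def get(files, chunks, filename):
    """Read a file back: one lookup in `files`, one in `chunks`, then a merge."""
    # Query 1: find the metadata entry to learn the file's id.
    file_id = next(fid for fid, doc in files.items() if doc["filename"] == filename)
    # Query 2: collect that file's chunks and order them by sequence number n.
    parts = sorted((c for c in chunks if c["files_id"] == file_id), key=lambda c: c["n"])
    return b"".join(c["data"] for c in parts)


files, chunks = {}, []
put(files, chunks, 1, "a.txt", b"hello gridfs", chunk_size=4)
chunks.reverse()  # chunks may come back out of order, as in the listing (n 15 before 14)
print(get(files, chunks, "a.txt"))  # b'hello gridfs'
```

For a file of a few kilobytes the second lookup and the merge are pure overhead compared with reading one document that embeds the bytes, which is the reason for the warning above.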

One more point to note: a GridFS file chunk is 255 KB by default, whereas a sharding chunk is 64 MB by default. Do not confuse the two.
