MongoDB Distributed File Storage System

Source: Internet
Author: User
Tags: mongodb, rar
Overview

MongoDB's basic storage unit is the BSON document, and a field value can be of binary type. Thanks to this, we can store files directly in MongoDB. There is a limitation, however: a single BSON document cannot exceed 16MB. Files larger than that must be stored with GridFS. This section covers both approaches: storing small files in ordinary collections, and storing large files with GridFS.

Let's take a look at an example of storing a small file in MongoDB.

First, upload a file with the mongofiles tool that ships with MongoDB:

D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T13:41:03.283+0800    connected to: localhost

D:\mongodb\server\3.2\bin>mongofiles.exe put E:\deliveryTask.doc
2017-03-06T13:41:23.535+0800    connected to: localhost
added file: E:\deliveryTask.doc

D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T13:41:30.114+0800    connected to: localhost
E:\deliveryTask.doc     2971

View the stored file from the mongo shell:

> use test
switched to db test
> show collections
fs.chunks
fs.files
restaurants
user
> db.fs.files.find()
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }

You can see the file uploaded successfully.
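The md5 field stored in fs.files makes integrity checks straightforward: after downloading a file, recompute its digest locally and compare the two values. A minimal sketch (the helper name and the sample bytes are illustrative, not part of any MongoDB API):

```python
import hashlib

def verify_download(local_bytes: bytes, stored_md5: str) -> bool:
    """Compare the MD5 of downloaded bytes against the md5 field from fs.files."""
    return hashlib.md5(local_bytes).hexdigest() == stored_md5

# Example with synthetic data standing in for a downloaded file:
data = b"hello gridfs"
digest = hashlib.md5(data).hexdigest()
print(verify_download(data, digest))          # True
print(verify_download(data + b"!", digest))   # False (corrupted download)
```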

This chapter focuses on theory and operations practice, so it does not yet cover application code; concrete code examples, using Java, will be introduced in a later section.

Now try uploading a file larger than 16MB:

D:\mongodb\server\3.2\bin>mongofiles.exe put E:\synch.rar
2017-03-06T14:33:11.028+0800    connected to: localhost
added file: E:\synch.rar

D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T14:33:15.265+0800    connected to: localhost
E:\deliveryTask.doc     2971
E:\synch.rar    24183487

View the stored files from the mongo shell:

> db.fs.files.find()
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T06:33:12.013Z"), "length" : 24183487, "md5" : "bbfe4d8579372aa0729726185997e908", "filename" : "E:\\synch.rar" }

It worked, too.

View chunks:

> db.fs.chunks.find({}, {data:0})
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2c"), "files_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "n" : 0 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2d"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 0 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2e"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 1 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 2 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b30"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 3 }
...
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b3f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 18 }
Type "it" for more

You can see that the large file has been split into many chunks. Why did uploading a file larger than 16MB succeed? Because mongofiles stores files through GridFS, which spreads the file across many fs.chunks documents instead of putting it in a single BSON document.
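The chunk layout above follows from simple arithmetic: GridFS cuts the file into chunkSize-byte pieces (261120 bytes here) and numbers them with the field n. A rough sketch of the split (the helper is illustrative, not GridFS internals):

```python
import math

CHUNK_SIZE = 261120  # 255KB, the chunkSize value shown in fs.files

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (n, chunk_bytes) pairs the way GridFS numbers chunks."""
    for n in range(math.ceil(len(data) / chunk_size)):
        yield n, data[n * chunk_size:(n + 1) * chunk_size]

# The 24,183,487-byte rar file from the example needs 93 chunks:
print(math.ceil(24183487 / CHUNK_SIZE))  # 93
```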

The following are the query, download, and delete operations:

D:\mongodb\server\3.2\bin>mongofiles.exe search rar
2017-03-06t14:45:31.974+0800    connected to: localhost
E:\synch.rar    24183487

D:\mongodb\server\3.2\bin>mongofiles.exe --local D:\mongodb_download.rar get E:\synch.rar
2017-03-06T14:47:17.841+0800    connected to: localhost
finished writing to: D:\mongodb_download.rar

D:\mongodb\server\3.2\bin>mongofiles.exe delete E:\synch.rar
2017-03-06T14:47:56.649+0800    connected to: localhost
successfully deleted all instances of 'E:\synch.rar' from GridFS

D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T14:48:03.886+0800    connected to: localhost
E:\deliveryTask.doc     2971

We can also customize the collection prefix (fs by default) and the chunk size, which defaults to 255KB (261120 bytes, as the chunkSize field above shows).

Summary

How do you choose a storage scheme for a distributed file storage system in a real-world scenario? A reasonable approach:
1. For any file uploaded by a user, determine its size on the client;
2. If the file is smaller than 16MB, store it directly in an ordinary MongoDB collection;
3. If the file is 16MB or larger, upload it to GridFS, which saves it in the fs.files and fs.chunks collections;
4. When the user downloads a file, query the appropriate collection according to the file's size.
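The routing rule above can be sketched as a small helper; the 16MB figure is MongoDB's BSON document size limit (in practice you would leave headroom for the document's other fields):

```python
BSON_LIMIT = 16 * 1024 * 1024  # 16MB maximum BSON document size

def choose_storage(file_size: int) -> str:
    """Pick a storage scheme by file size, per the rules above."""
    return "normal_collection" if file_size < BSON_LIMIT else "gridfs"

print(choose_storage(2971))       # normal_collection (the .doc above)
print(choose_storage(24183487))   # gridfs (the .rar above)
```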

In addition, fs.chunks can be sharded. A good shard key is the indexed field { "files_id" : 1 }, which as far as possible keeps all the chunks of a given file on the same shard. fs.files does not need to be sharded: it holds only the files' metadata, so its data volume is small. The default GridFS chunk size (255KB) can also be changed.
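The point about the shard key can be illustrated without a live cluster: if fs.chunks is sharded on files_id alone, every chunk of a given file maps to the same shard, so a read touches one shard. A toy placement sketch (the shard count and the modulo placement are illustrative, not MongoDB's real balancer):

```python
def assign_shard(files_id: str, num_shards: int = 3) -> int:
    # Toy placement: any function of files_id alone sends every chunk
    # of one file to the same shard, which is the point of this key choice.
    return sum(files_id.encode()) % num_shards

# Five chunks of the same file, as in the fs.chunks listing above:
chunks = [{"files_id": "58bd02a7afa0fa21d4a14b2c", "n": n} for n in range(5)]
shards = {assign_shard(c["files_id"]) for c in chunks}
print(len(shards))  # 1 -- all chunks of the file land on one shard
```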

Note that GridFS is not well suited to small files, because reading a file from GridFS involves two query operations: first query the fs.files collection, then query the fs.chunks collection and merge the chunks to reconstruct the whole file.
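That two-step read path can be mimicked with in-memory stand-ins for the two collections (the dictionaries and the helper below are illustrative, not a real driver API):

```python
# Stand-ins for the two collections involved in a GridFS read.
fs_files = {"fileA": {"_id": "fid1", "filename": "fileA", "length": 7}}
fs_chunks = [
    {"files_id": "fid1", "n": 1, "data": b"fs!"},
    {"files_id": "fid1", "n": 0, "data": b"grid"},
]

def read_file(filename: str) -> bytes:
    # Query 1: look up the file's metadata in fs.files.
    meta = fs_files[filename]
    # Query 2: fetch its chunks from fs.chunks, order them by n, join them.
    parts = sorted((c for c in fs_chunks if c["files_id"] == meta["_id"]),
                   key=lambda c: c["n"])
    return b"".join(c["data"] for c in parts)

print(read_file("fileA"))  # b'gridfs!'
```

For a tiny file, both queries are overhead compared with one read of an ordinary document, which is why the size-based routing above is worthwhile.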

Another point to note: a GridFS file chunk defaults to 255KB, while a sharding chunk defaults to 64MB. Do not confuse the two.
