Overview
In MongoDB, the basic storage unit is the BSON document, and a field value can be of binary type. Based on this feature, we can store a file directly in MongoDB, but there is a limitation: a single BSON document cannot be larger than 16MB. Therefore, if you need to store a larger file, you need GridFS.
Small file storage and GridFS file storage
Let's take a look at an example of storing a small file in MongoDB.
First, upload a file with the mongofiles tool:
D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T13:41:03.283+0800 connected to: localhost
D:\mongodb\server\3.2\bin>mongofiles.exe put E:\deliveryTask.doc
2017-03-06T13:41:23.535+0800 connected to: localhost
added file: E:\deliveryTask.doc
D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T13:41:30.114+0800 connected to: localhost
E:\deliveryTask.doc 2971
View the stored file through the mongo shell:
> use test
switched to db test
> show collections
fs.chunks
fs.files
restaurants
user
> db.fs.files.find()
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }
You can see the file uploaded successfully.
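The fs.files document above stores an md5 field alongside the file metadata. As a minimal sketch of what that digest is, here is how the same lowercase hex MD5 can be computed with standard-library Java (the class and method names here are illustrative, not part of any MongoDB API):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.nio.charset.StandardCharsets;

public class GridFsMd5Demo {
    // Compute the lowercase hex MD5 digest of a byte array,
    // the same format GridFS records in the fs.files "md5" field.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // MD5 is always available in the JDK
        }
    }

    public static void main(String[] args) {
        // Known MD5 test vector from RFC 1321: MD5("abc")
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.UTF_8)));
        // prints 900150983cd24fb0d6963f7d28e17f72
    }
}
```

Clients can recompute this digest after a download and compare it with fs.files to verify file integrity.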
Because this chapter mainly covers theory and operations practice, it does not yet involve concrete code (concrete code, using Java as the example, will be introduced in a later section).
Now try uploading a file larger than 16MB:
D:\mongodb\server\3.2\bin>mongofiles.exe put E:\synch.rar
2017-03-06T14:33:11.028+0800 connected to: localhost
added file: E:\synch.rar
D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T14:33:15.265+0800 connected to: localhost
E:\deliveryTask.doc 2971
E:\synch.rar 24183487
View the stored files through the mongo shell:
> db.fs.files.find()
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T05:41:23.604Z"), "length" : 2971, "md5" : "5434b803306299fff57c8a54d3adf78b", "filename" : "E:\\deliveryTask.doc" }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "chunkSize" : 261120, "uploadDate" : ISODate("2017-03-06T06:33:12.013Z"), "length" : 24183487, "md5" : "bbfe4d8579372aa0729726185997e908", "filename" : "E:\\synch.rar" }
It worked, too.
View chunks:
> db.fs.chunks.find({}, {data: 0})
{ "_id" : ObjectId("58bcf683afa0fa20bc854a2c"), "files_id" : ObjectId("58bcf683afa0fa20bc854a2b"), "n" : 0 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2d"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 0 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2e"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 1 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b2f"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 2 }
{ "_id" : ObjectId("58bd02a7afa0fa21d4a14b30"), "files_id" : ObjectId("58bd02a7afa0fa21d4a14b2c"), "n" : 3 }
...
Type "it" for more
You can see that the large file was split into many chunks. The upload of a file larger than 16MB succeeded because mongofiles stores files through GridFS, which splits a file into chunks rather than storing it as a single BSON document.
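The number of chunks follows directly from the file length and the chunk size shown in the fs.files output above (261120 bytes). A minimal sketch of that arithmetic, with illustrative class and method names:

```java
public class ChunkCountDemo {
    // GridFS splits a file into ceil(length / chunkSize) chunks,
    // numbered n = 0, 1, 2, ... in fs.chunks.
    static long chunkCount(long lengthBytes, long chunkSizeBytes) {
        return (lengthBytes + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    public static void main(String[] args) {
        long chunkSize = 261120L; // 255KB, the chunkSize from fs.files above

        // The small .doc fits in a single chunk.
        System.out.println(chunkCount(2971L, chunkSize));     // prints 1

        // The 24183487-byte .rar needs many chunks.
        System.out.println(chunkCount(24183487L, chunkSize)); // prints 93
    }
}
```

This also explains why fs.chunks above contains one document with n : 0 for the .doc and a long run of documents for the .rar.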
The following shows the search, download, and delete operations:
D:\mongodb\server\3.2\bin>mongofiles.exe search rar
2017-03-06T14:45:31.974+0800 connected to: localhost
E:\synch.rar 24183487
D:\mongodb\server\3.2\bin>mongofiles.exe --local D:\mongodb_download.rar get E:\synch.rar
2017-03-06T14:47:17.841+0800 connected to: localhost
finished writing to D:\mongodb_download.rar
D:\mongodb\server\3.2\bin>mongofiles.exe delete E:\synch.rar
2017-03-06T14:47:56.649+0800 connected to: localhost
successfully deleted all instances of 'E:\synch.rar' from GridFS
D:\mongodb\server\3.2\bin>mongofiles.exe list
2017-03-06T14:48:03.886+0800 connected to: localhost
E:\deliveryTask.doc 2971
In fact, we can also customize the collection prefix (fs by default) and the chunk size (255KB, i.e. 261120 bytes, by default, which matches the chunkSize value in the output above).
Summary
So how do you decide which storage scheme to use for a distributed file storage system in a real-world scenario? A reasonable approach is as follows:
1. For any file uploaded by a user, determine its size on the client;
2. If the file is smaller than 16MB, store it directly in an ordinary MongoDB collection;
3. If the file is 16MB or larger, upload it to GridFS, which saves it in the fs.files and fs.chunks collections;
4. When the user downloads a file, query the appropriate collection according to the file's size.
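The routing decision in the steps above can be sketched as follows. This is a minimal illustration under the scheme described here; the class, enum, and method names are assumptions, and in practice you would keep some headroom below 16MB, since the document's metadata counts toward the BSON limit too:

```java
public class StorageRouter {
    // The 16MB BSON document size limit mentioned above.
    static final long MAX_BSON_BYTES = 16L * 1024 * 1024;

    enum Target { NORMAL_COLLECTION, GRIDFS }

    // Steps 1-3: route by file size, measured on the client.
    static Target route(long fileSizeBytes) {
        return fileSizeBytes < MAX_BSON_BYTES ? Target.NORMAL_COLLECTION : Target.GRIDFS;
    }

    public static void main(String[] args) {
        System.out.println(route(2971L));     // prints NORMAL_COLLECTION (the small .doc)
        System.out.println(route(24183487L)); // prints GRIDFS (the large .rar)
    }
}
```

Step 4 is the mirror image: on download, the stored size (or a flag saved with the file's metadata) tells the client which collection to read from.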
In addition, the fs.chunks collection can be sharded; a shard key built on the indexed field { "files_id" } is a good choice, since it keeps all the chunks of a single file on the same shard as far as possible. fs.files does not need to be sharded: it holds only the file metadata, so its data volume is small. You can also adjust the default chunk size (255KB) if needed.
Note that GridFS is not well suited to small-file storage, because reading a file from GridFS involves two query operations: first query the fs.files collection, then query fs.chunks and merge the chunks to reconstruct the whole file.
Another point to note: the GridFS chunk size is 255KB, while a sharding chunk defaults to 64MB; do not confuse the two.