GridFS Introduction
MongoDB stores documents in BSON format, which supports binary data types, so binary data can be stored directly in a MongoDB document. However, every document is limited in size, which rules out this approach for large files such as samples and videos. For these, MongoDB provides a specification, GridFS, for handling large files.
GridFS Implementation Principle
By default, GridFS uses two collections, fs.files and fs.chunks, to store a file: fs.files holds the file's information, and fs.chunks holds the file's data. A record in the fs.files collection looks like this:
    {
        "_id": ObjectId("4f4608844f9b855c6c35e298"),         // unique ID; can be a user-defined type
        "filename": "CPU.txt",                               // file name
        "length": 778,                                       // file length
        "chunkSize": 262144,                                 // chunk size
        "uploadDate": ISODate("2012-02-23T09:36:04.593Z"),   // upload time
        "md5": "e2c789b036cfb3b848ae39a24e795ca6",           // MD5 value of the file
        "contentType": "text/plain",                         // MIME type of the file
        "meta": null                                         // other file information; the "meta" key is absent by default, and users can set it to any BSON object
    }
The corresponding chunk (data block) in fs.chunks looks like this:
    {
        "_id": ObjectId("4f4608844f9b855c6c35e299"),        // chunk ID
        "files_id": ObjectId("4f4608844f9b855c6c35e298"),   // ID of the file; matches the _id of the record in fs.files, like a foreign key into the fs.files collection
        "n": 0,                                             // sequence number of this chunk; a file larger than chunkSize is split into multiple chunks
        "data": BinData(0, "QGV...")                        // binary data of the file; the actual content is omitted here
    }
The default chunk size is 256 KB (262144 bytes, the chunkSize value in the fs.files record; newer MongoDB releases default to 255 KB). When a file is stored in GridFS and it is larger than chunkSize, it is split into multiple chunks. The chunks are saved to fs.chunks first, and the file information is saved to fs.files last.
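The splitting step can be sketched in plain JavaScript (Node.js). This is only an illustration of the idea, not the driver's implementation: `splitIntoGridFsDocs` is a hypothetical helper, the IDs are simple strings instead of ObjectIds, and plain objects stand in for the fs.files and fs.chunks records.

```javascript
// Sketch only: plain objects stand in for fs.files / fs.chunks records.
const DEFAULT_CHUNK_SIZE = 262144; // 256 KB, the chunkSize used in this article

// Hypothetical helper: split `data` (a Buffer) into chunk records and
// build the matching file record. IDs are plain strings, not ObjectIds.
function splitIntoGridFsDocs(fileId, filename, data, chunkSize = DEFAULT_CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      _id: `${fileId}-chunk-${n}`,
      files_id: fileId, // "foreign key" pointing at the file record
      n,                // sequence number of the chunk
      data: data.subarray(offset, offset + chunkSize),
    });
  }
  // The file record is built last, mirroring the order described above.
  const file = { _id: fileId, filename, length: data.length, chunkSize, uploadDate: new Date() };
  return { file, chunks };
}

const small = splitIntoGridFsDocs("file-1", "CPU.txt", Buffer.alloc(778));
const large = splitIntoGridFsDocs("file-2", "video.bin", Buffer.alloc(600000));

console.log(small.chunks.length); // 1
console.log(large.chunks.length); // 3
```

A 778-byte file fits in a single chunk, while a 600000-byte file needs three: two full chunks of 262144 bytes and a final one of 75712 bytes.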
When reading a file, GridFS first finds the matching record in fs.files according to the query conditions and gets the value of its "_id". It then looks up all chunks in fs.chunks whose "files_id" equals that "_id", sorts them by "n", reads the "data" field of each chunk in sequence, and restores the original file.
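The read path can be sketched the same way. The arrays `fsFiles` and `fsChunks` below are in-memory stand-ins for the two collections with made-up sample data, and `readFile` is a hypothetical helper, so treat this as an outline of the lookup, sort, and concatenate steps rather than real driver code.

```javascript
// In-memory stand-ins for the two collections (hypothetical sample data).
const fsFiles = [{ _id: "file-1", filename: "hello.txt", length: 11, chunkSize: 4 }];
const fsChunks = [
  { files_id: "file-1", n: 2, data: Buffer.from("rld") },
  { files_id: "file-1", n: 0, data: Buffer.from("hell") },
  { files_id: "other", n: 0, data: Buffer.from("zzzz") }, // belongs to another file
  { files_id: "file-1", n: 1, data: Buffer.from("o wo") },
];

// Hypothetical helper: restore a file from its chunks.
function readFile(filename) {
  // 1. Find the file record and take its _id.
  const file = fsFiles.find((f) => f.filename === filename);
  if (!file) return null;

  // 2. Collect every chunk whose files_id matches, ordered by n.
  const parts = fsChunks
    .filter((c) => c.files_id === file._id)
    .sort((a, b) => a.n - b.n)
    .map((c) => c.data);

  // 3. Concatenate the chunk data and sanity-check the length.
  const restored = Buffer.concat(parts);
  if (restored.length !== file.length) {
    throw new Error("chunk data does not match the recorded file length");
  }
  return restored;
}

const restored = readFile("hello.txt");
console.log(restored.toString()); // hello world
```

The chunks are deliberately stored out of order here to show why sorting by "n" is needed before the data is joined.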
Note:
1. GridFS does not automatically deduplicate files with the same MD5. If you want only one copy of such files stored in GridFS, the application has to handle this itself. The MD5 value is calculated by the client.
2. When uploading a file, GridFS saves the file data to fs.chunks first and saves the file information to fs.files last. If an upload fails, garbage data may therefore be left in fs.chunks; it can be cleaned up periodically.
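Both notes can be illustrated with a small Node.js sketch. The helpers `md5Hex`, `findDuplicate`, and `findOrphanChunks` are hypothetical names, and the arrays are in-memory stand-ins for fs.files and fs.chunks; a real application would run equivalent queries against the collections.

```javascript
const crypto = require("node:crypto");

// Compute the MD5 of a Buffer on the client side, as a hex string.
function md5Hex(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Note 1: look for an existing file record with the same MD5 before storing.
function findDuplicate(files, data) {
  const digest = md5Hex(data);
  return files.find((f) => f.md5 === digest) || null;
}

// Note 2: chunks whose files_id has no record in fs.files are leftovers
// from failed uploads and are candidates for cleanup.
function findOrphanChunks(files, chunks) {
  const knownIds = new Set(files.map((f) => f._id));
  return chunks.filter((c) => !knownIds.has(c.files_id));
}

// Hypothetical sample data.
const files = [{ _id: "file-1", filename: "a.txt", md5: md5Hex(Buffer.from("same content")) }];
const chunks = [
  { _id: "c1", files_id: "file-1", n: 0 },
  { _id: "c2", files_id: "file-9", n: 0 }, // upload of file-9 never finished
];

const duplicate = findDuplicate(files, Buffer.from("same content"));
const fresh = findDuplicate(files, Buffer.from("new content"));
const orphans = findOrphanChunks(files, chunks);

console.log(duplicate ? duplicate._id : null); // file-1
console.log(fresh);                            // null
console.log(orphans.map((c) => c._id));        // [ 'c2' ]
```

An upload that is still in progress also has chunks without a file record, so a periodic cleanup job would probably want to skip recently written chunks rather than delete every orphan it finds.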