GridFS Introduction
MongoDB stores documents in BSON format, which supports binary data types, so small binary data can be saved directly in a MongoDB document. However, the size of a single document is limited (16 MB in current MongoDB versions), so large files such as pictures and videos cannot be stored this way. For these, MongoDB provides a specification called GridFS for handling large files.
GridFS Implementation Principle
By default, GridFS uses two collections, fs.files and fs.chunks, to store files: fs.files holds the file information, and fs.chunks holds the file's binary data. A record in the fs.files collection looks like this:
{
"_id": ObjectId("4f4608844f9b855c6c35e298"), // unique ID, can be a user-defined type
"filename": "CPU.txt", // file name
"length": 778, // file length in bytes
"chunkSize": 262144, // chunk size in bytes
"uploadDate": ISODate("2012-02-23T09:36:04.593Z"), // upload time
"md5": "e2c789b036cfb3b848ae39a24e795ca6", // MD5 value of the file
"contentType": "text/plain", // MIME type of the file
"metadata": null // other file information; by default this key is absent, and users can set it to any BSON object
}
Each file has one or more corresponding chunks in fs.chunks. A chunk record looks like this:
{
"_id": ObjectId("4f4608844f9b855c6c35e299"), // chunk ID
"files_id": ObjectId("4f4608844f9b855c6c35e298"), // ID of the file; matches the _id of a record in fs.files, like a foreign key into fs.files
"n": 0, // sequence number of this chunk within the file, starting at 0; a file larger than chunkSize is split into multiple chunks
"data": BinData(0, "QGV ...") // binary data of the file; the actual content is omitted here
}
The default chunk size is 256 KB. When a file is stored in GridFS, a file larger than chunkSize is split into multiple chunks; the chunks are saved to fs.chunks first, and the file information is written to fs.files last.
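The splitting step can be illustrated with a minimal Python sketch. It does not talk to MongoDB and is not the driver's actual code: plain dictionaries stand in for fs.chunks and fs.files records, and the helper names split_into_chunks and build_file_document, along with the "demo-id" identifier, are made up for this illustration.

```python
import hashlib

CHUNK_SIZE = 262144  # 256 KB, the chunkSize used in the example record


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Build chunk documents shaped like fs.chunks records (hypothetical helper)."""
    chunks = []
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        chunks.append({
            "files_id": file_id,                        # points back to the fs.files record
            "n": n,                                     # position of this chunk in the file
            "data": data[offset:offset + chunk_size],   # at most chunk_size bytes
        })
    return chunks


def build_file_document(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Build the file information document that is written last (hypothetical helper)."""
    return {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "md5": hashlib.md5(data).hexdigest(),  # computed on the client side
    }


# A payload slightly larger than two chunks is split into three chunks.
payload = b"x" * (CHUNK_SIZE * 2 + 10)
chunks = split_into_chunks("demo-id", payload)
file_doc = build_file_document("demo-id", "demo.bin", payload)
print(len(chunks), file_doc["length"])
```

A file smaller than chunkSize, such as the 778-byte CPU.txt above, ends up as a single chunk with "n" equal to 0.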
To read a file, GridFS first finds a matching record in fs.files according to the query conditions and gets its "_id" value. It then finds all chunks in fs.chunks whose "files_id" equals that "_id", sorts them by "n", and finally reads the "data" field of each chunk in order to restore the original file.
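The read path can be sketched in the same spirit. Again this is an assumption-laden illustration rather than the driver's implementation: two Python lists stand in for fs.files and fs.chunks, the query condition is reduced to a file name match, and read_file is a hypothetical helper.

```python
def read_file(files, chunks, filename):
    """Reassemble a file from in-memory stand-ins for fs.files and fs.chunks."""
    # Step 1: find the matching record in "fs.files" and take its _id.
    file_doc = next((f for f in files if f["filename"] == filename), None)
    if file_doc is None:
        raise FileNotFoundError(filename)

    # Step 2: collect every chunk whose files_id equals that _id, sorted by n.
    parts = sorted(
        (c for c in chunks if c["files_id"] == file_doc["_id"]),
        key=lambda c: c["n"],
    )

    # Step 3: concatenate the data fields in order to restore the file.
    data = b"".join(c["data"] for c in parts)
    if len(data) != file_doc["length"]:
        raise ValueError("chunk data does not match the recorded length")
    return data


# Chunks are deliberately stored out of order and mixed with another file's chunk.
files = [{"_id": 1, "filename": "CPU.txt", "length": 11}]
chunks = [
    {"files_id": 1, "n": 1, "data": b"world"},
    {"files_id": 2, "n": 0, "data": b"other file"},
    {"files_id": 1, "n": 0, "data": b"hello "},
]
content = read_file(files, chunks, "CPU.txt")
print(content)
```

Sorting by "n" is what makes the result independent of the order in which the chunks happen to be stored or returned.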
Note:
1. GridFS does not automatically deduplicate files with the same MD5. If you want files with the same MD5 to be stored only once in GridFS, the application has to handle that itself. The MD5 value is calculated by the client.
2. Because GridFS saves the file data to fs.chunks first and only writes the file information to fs.files at the end, a failed upload can leave garbage data in fs.chunks. This garbage data can be cleaned up periodically.
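Both notes can be illustrated with a small Python sketch over in-memory stand-ins for the two collections. It is only one possible approach, not something GridFS provides: find_by_md5 and find_orphan_chunk_ids are hypothetical helpers, and the sample records are invented for the example.

```python
import hashlib


def find_by_md5(files, data):
    """Return an existing file record with the same MD5 as data, or None."""
    digest = hashlib.md5(data).hexdigest()  # MD5 is computed by the client
    return next((f for f in files if f.get("md5") == digest), None)


def find_orphan_chunk_ids(files, chunks):
    """Return the _id of every chunk whose files_id has no record in files."""
    known_file_ids = {f["_id"] for f in files}
    return [c["_id"] for c in chunks if c["files_id"] not in known_file_ids]


# Sample data: one complete file, plus a chunk left behind by a failed upload.
files = [{"_id": 1, "filename": "a.txt", "md5": hashlib.md5(b"hello").hexdigest()}]
chunks = [
    {"_id": 10, "files_id": 1, "n": 0, "data": b"hello"},
    {"_id": 11, "files_id": 99, "n": 0, "data": b"left over"},
]

duplicate = find_by_md5(files, b"hello")        # same content already stored
orphans = find_orphan_chunk_ids(files, chunks)  # chunks with no file record
print(duplicate["filename"], orphans)
```

In a real cleanup job it would be prudent to skip chunks that belong to uploads still in progress, for example by only removing chunks older than some threshold, since their fs.files record simply has not been written yet.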