MongoDB Tutorial Series (8): GridFS Storage in Detail

Source: Internet
Author: User

Introduction to GridFS

MongoDB stores documents in BSON format, which supports binary data types, so binary data can be saved directly in a MongoDB document. However, the size of each document is limited (16 MB in current versions), so large files such as pictures and videos cannot be stored this way. For these, MongoDB provides GridFS, a specification for storing large files.

How GridFS Works

By default, GridFS uses two collections, fs.files and fs.chunks, to store files: fs.files holds the information about each file, and fs.chunks holds the file's data. A record in the fs.files collection looks like this:

{
"_id": ObjectId("4f4608844f9b855c6c35e298"), // unique ID; can be a user-defined type
"filename": "CPU.txt", // file name
"length": 778, // file length in bytes
"chunkSize": 262144, // chunk size in bytes
"uploadDate": ISODate("2012-02-23T09:36:04.593Z"), // upload time
"md5": "e2c789b036cfb3b848ae39a24e795ca6", // MD5 hash of the file
"contentType": "text/plain", // MIME type of the file
"metadata": null // additional file information; this key is absent by default, and users can set it to any BSON object
}

The corresponding chunk in fs.chunks looks like this:

{
"_id": ObjectId("4f4608844f9b855c6c35e299"), // chunk ID
"files_id": ObjectId("4f4608844f9b855c6c35e298"), // ID of the file this chunk belongs to; matches "_id" in fs.files, like a foreign key
"n": 0, // sequence number of this chunk within the file, starting at 0; a file larger than chunkSize is split into multiple chunks
"data": BinData(0, "QGV ...") // binary file data; the actual content is omitted here
}

The default chunk size here is 256 KB (current drivers default to 255 KB). When a file is stored in GridFS and is larger than chunkSize, it is split into multiple chunks. The chunks are saved to fs.chunks first, and the file information is saved to fs.files last.
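
The write path described above can be sketched in plain Python. This is only an illustration under stated assumptions, not a driver implementation: two in-memory lists stand in for the fs.files and fs.chunks collections, a simple counter stands in for ObjectId generation, and `put_file` is a hypothetical helper name. In practice a driver (for example the gridfs module in PyMongo) performs these steps for you.

```python
import hashlib
from datetime import datetime, timezone
from itertools import count

CHUNK_SIZE = 262144  # 256 KB, the chunkSize shown in the sample fs.files record

_next_id = count(1)  # hypothetical stand-in for ObjectId generation


def put_file(fs_files, fs_chunks, filename, data, chunk_size=CHUNK_SIZE):
    """Store `data` as chunk records first, then one file record."""
    file_id = next(_next_id)

    # Step 1: write the chunks. Chunk n covers bytes
    # [n * chunk_size, (n + 1) * chunk_size) of the file.
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        fs_chunks.append({
            "_id": next(_next_id),
            "files_id": file_id,
            "n": n,
            "data": data[offset:offset + chunk_size],
        })

    # Step 2: write the file record last, once all chunks are in place.
    fs_files.append({
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "md5": hashlib.md5(data).hexdigest(),
    })
    return file_id


if __name__ == "__main__":
    files, chunks = [], []
    put_file(files, chunks, "CPU.txt", b"x" * 778)  # smaller than one chunk
    put_file(files, chunks, "big.bin", b"y" * (2 * CHUNK_SIZE + 10))
    for doc in chunks:
        print(doc["files_id"], doc["n"], len(doc["data"]))
```

With these inputs the 778-byte file produces a single chunk, while the second file produces three chunks, the last of which holds only the remaining 10 bytes.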

To read a file, GridFS first finds a matching record in fs.files according to the query conditions and gets its "_id" value. It then finds all chunks in fs.chunks whose "files_id" equals that "_id", sorts them by "n", and finally reads the "data" field of each chunk in order to restore the original file.
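
The read path can be sketched the same way. Again this is a minimal illustration, assuming in-memory lists in place of the real collections; `get_file` and the sample records are hypothetical, and the length check at the end is an extra safeguard added for the example rather than something the text above describes.

```python
def get_file(fs_files, fs_chunks, filename):
    """Return the bytes of the first file named `filename`, or None if absent."""
    # Step 1: find a matching record in fs.files and take its _id.
    file_doc = next((f for f in fs_files if f["filename"] == filename), None)
    if file_doc is None:
        return None

    # Step 2: collect every chunk whose files_id equals that _id.
    chunks = [c for c in fs_chunks if c["files_id"] == file_doc["_id"]]

    # Step 3: sort by n and concatenate the data fields in order.
    chunks.sort(key=lambda c: c["n"])
    data = b"".join(c["data"] for c in chunks)

    # Sanity check against the length recorded in fs.files.
    if len(data) != file_doc["length"]:
        raise ValueError("chunk data does not match the recorded file length")
    return data


# Hypothetical sample data: a 10-byte file stored with a 4-byte chunk size,
# with the chunks deliberately listed out of order.
fs_files = [{"_id": 1, "filename": "demo.txt", "length": 10, "chunkSize": 4}]
fs_chunks = [
    {"_id": 12, "files_id": 1, "n": 2, "data": b"89"},
    {"_id": 10, "files_id": 1, "n": 0, "data": b"0123"},
    {"_id": 11, "files_id": 1, "n": 1, "data": b"4567"},
]

print(get_file(fs_files, fs_chunks, "demo.txt"))  # b'0123456789'
```

Sorting by "n" is what makes the result independent of the order in which the chunks happen to be returned.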

Note:
     1. GridFS does not automatically deduplicate files that have the same MD5. If you want files with identical MD5 values to be stored only once in GridFS, the application must handle this itself. The MD5 value is calculated by the client.
     2. Because GridFS saves the file data to fs.chunks first and the file information to fs.files last, a failed upload can leave garbage data in fs.chunks. This garbage data can be cleaned up periodically.
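
Both notes can be illustrated with a short sketch, assuming in-memory lists in place of fs.files and fs.chunks. The helper names `find_duplicate` and `remove_orphan_chunks` are hypothetical, and comparing the length in addition to the MD5 is an extra precaution chosen for this example.

```python
import hashlib


def find_duplicate(fs_files, data):
    """Return the _id of a stored file whose md5 and length match `data`, else None."""
    digest = hashlib.md5(data).hexdigest()
    for f in fs_files:
        if f["md5"] == digest and f["length"] == len(data):
            return f["_id"]
    return None


def remove_orphan_chunks(fs_files, fs_chunks):
    """Drop chunks whose files_id has no record in fs.files; return how many."""
    known_ids = {f["_id"] for f in fs_files}
    kept = [c for c in fs_chunks if c["files_id"] in known_ids]
    removed = len(fs_chunks) - len(kept)
    fs_chunks[:] = kept  # modify the caller's list in place
    return removed


# Hypothetical sample data: file 1 is complete; the chunk for file 2 was left
# behind by an upload that failed before its fs.files record was written.
payload = b"hello gridfs"
fs_files = [{"_id": 1, "length": len(payload),
             "md5": hashlib.md5(payload).hexdigest()}]
fs_chunks = [
    {"_id": 10, "files_id": 1, "n": 0, "data": payload},
    {"_id": 11, "files_id": 2, "n": 0, "data": b"partial upload"},
]

print(find_duplicate(fs_files, payload))          # 1
print(find_duplicate(fs_files, b"new file"))      # None
print(remove_orphan_chunks(fs_files, fs_chunks))  # 1
```

On a live system, an upload that is still in progress also has chunks without a file record yet, so a real cleanup job would probably need to skip recently written chunks rather than delete every unmatched one.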
