MongoDB GridFS Best Practice Overview

Last Update:2015-02-07 Source: Internet

Author: User


By chszs. Please credit the source when reprinting. Blog home: http://blog.csdn.net/chszs

GridFS is a simple file system abstraction built on top of a MongoDB database. If you are familiar with Amazon S3, GridFS will feel similar. Why would a NoSQL database like MongoDB provide such a file-layer abstraction?

I. Reasons for using GridFS

The reasons are as follows:

1) Storing user-generated file content. Most web apps allow users to upload files. With a relational database, these user-generated files are usually kept on the file system, separate from the database. This raises some problems. How do you copy a file to every server that needs it? When a file is deleted, how do you delete all of its copies? How do you secure the files and prepare for disaster recovery? GridFS solves these problems well: your database backup also backs up your files, and thanks to MongoDB's own replication, every replica in the MongoDB cluster holds a copy of your files. Deleting a file is as simple as deleting an object in the database.

2) Accessing part of a file. When a file is uploaded to GridFS, it is split into 256 KB chunks that are stored separately. To read a range of bytes, only the chunks covering that range need to be loaded into memory, not the entire file. This is useful when reading or editing large media files.

3) Storing files larger than 16 MB in MongoDB. MongoDB limits a single document to 16 MB, so files larger than that should be stored with GridFS.

4) Overcoming file system limitations. If you need to store a very large number of files, you have to consider the limits of the file system itself, such as the limit on the number of files in a directory. With GridFS you no longer have to worry about this. Combined with MongoDB sharding, GridFS lets you distribute files across multiple servers without adding operational complexity.
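
The range-read benefit in point 2 comes down to simple arithmetic. The sketch below is not driver code; it only shows, assuming the 256 KB chunk size mentioned above, which chunk numbers a byte range touches. The function name `chunks_for_range` is made up for this illustration.

```python
CHUNK_SIZE = 256 * 1024  # 262144 bytes, the chunk size quoted in this article


def chunks_for_range(start, end, chunk_size=CHUNK_SIZE):
    """Return the chunk numbers that hold bytes start..end-1 of a file.

    `start` is inclusive and `end` is exclusive, like a Python slice.
    """
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    if start == end:
        return []  # an empty range touches no chunk
    first = start // chunk_size
    last = (end - 1) // chunk_size
    return list(range(first, last + 1))


# Reading 1000 bytes starting at offset 600000 only needs chunk 2,
# no matter how large the whole file is.
needed = chunks_for_range(600_000, 601_000)
print(needed)
```

A range that straddles a chunk boundary, such as `chunks_for_range(262_143, 262_145)`, needs two chunks, which is still far less than loading a large file in full.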

II. GridFS in depth

GridFS uses two collections to store data:

> show collections;
fs.chunks
fs.files
system.indexes
>

The fs.files collection holds the metadata for each file, while the fs.chunks collection stores the actual file contents, split into 256 KB chunks. If the collection is sharded, the chunks are distributed across multiple servers, which may give better performance than a file system.

> db.fs.files.findOne();
{
    "_id" : ObjectId("530cf1bf96038f5cb6df5f39"),
    "filename" : "./conn.log",
    "chunkSize" : 262144,
    "uploadDate" : ISODate("2014-02-25T19:40:47.321Z"),
    "md5" : "6515e95f8bb161f6435b130a0e587ccd",
    "length" : 1644981
}
>
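
To make the relationship between the two collections concrete, here is a minimal in-memory sketch that splits a byte string into one fs.files-style dictionary and several fs.chunks-style dictionaries. It does not talk to MongoDB and leaves out fields such as uploadDate; the function names and the `"demo-id"` identifier are assumptions made for this illustration.

```python
import hashlib

CHUNK_SIZE = 262144  # same value as the chunkSize field in the output above


def split_into_documents(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Build one fs.files-style dict and a list of fs.chunks-style dicts."""
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {
        "_id": file_id,
        "filename": filename,
        "chunkSize": chunk_size,
        "length": len(data),
        "md5": hashlib.md5(data).hexdigest(),
    }
    return file_doc, chunks


def reassemble(chunks):
    """Join the chunk payloads in order of n to get the original bytes back."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


# A dummy payload with the same length as the file in the output above.
payload = bytes(1644981)
file_doc, chunks = split_into_documents("demo-id", "./conn.log", payload)

print(len(chunks))                    # 7
print(len(chunks[-1]["data"]))        # 72117, the last chunk is partial
print(reassemble(chunks) == payload)  # True
```

With a length of 1644981 bytes and a chunk size of 262144 bytes, the file occupies six full chunks plus one partial chunk.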

MongoDB also creates a compound index on files_id and the chunk number (n) so that the chunks of a file can be accessed quickly:

> db.fs.chunks.getIndexes();
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "files.fs.chunks",
        "name" : "_id_"
    },
    {
        "v" : 1,
        "key" : {
            "files_id" : 1,
            "n" : 1
        },
        "ns" : "files.fs.chunks",
        "name" : "files_id_1_n_1"
    }
]
>
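
As a rough analogy for why an index on files_id and n helps, the sketch below models the index as a sorted list of (files_id, n) keys and uses binary search to find the chunks of one file, already in chunk order. This is only an illustration built on Python's `bisect` module, not a description of how MongoDB implements its indexes; the sample keys and the function name are invented.

```python
import bisect

# A sorted list of (files_id, n) pairs stands in for the index files_id_1_n_1.
index = sorted([
    ("fileB", 0), ("fileA", 2), ("fileA", 0),
    ("fileC", 0), ("fileA", 1), ("fileB", 1),
])


def chunk_keys_for(files_id, first_n=0, last_n=None):
    """Return the keys of one file with first_n <= n <= last_n, ordered by n."""
    lo = bisect.bisect_left(index, (files_id, first_n))
    if last_n is None:
        # All remaining chunks of this file: stop before the next files_id.
        hi = bisect.bisect_left(index, (files_id, float("inf")))
    else:
        hi = bisect.bisect_right(index, (files_id, last_n))
    return index[lo:hi]


print(chunk_keys_for("fileA"))        # all chunks of fileA, in order
print(chunk_keys_for("fileA", 1, 2))  # only chunks 1 and 2
```

Because the keys are sorted by files_id first and n second, both a whole-file read and a partial read become one contiguous slice of the index.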

III. GridFS examples

MongoDB ships with a built-in tool, mongofiles, that lets you try out GridFS in practice. To use GridFS from an application, see the documentation for your driver. In the commands below, the values for the -h, -u and -p options are not shown.

Put
# mongofiles -h -u -p --db files put /conn.log
connected to: 127.0.0.1
added file: { _id: ObjectId('530cf1009710ca8fd47d7d5d'), filename: "./conn.log", chunkSize: 262144, uploadDate: new Date(1393357057021), md5: "6515e95f8bb161f6435b130a0e587ccd", length: 1644981 }
done!
Get
# mongofiles -h -u -p --db files get /conn.log
connected to: 127.0.0.1
done write to: ./conn.log
List
# mongofiles -h -u -p list
connected to: 127.0.0.1
/conn.log 1644981
Delete
# mongofiles -h -u -p --db files delete /conn.log
connected to: 127.0.0.1
done!

IV. GridFS modules

If you want to serve GridFS files stored in MongoDB directly through a web server or a file system, you can use the following GridFS plug-ins:

1) gridfs-fuse: exposes GridFS files directly as a file system

2) gridfs-nginx: lets Nginx serve GridFS files directly

V. Limitations of GridFS

GridFS is not perfect; it has some limitations:

1) Working set. Serving GridFS files alongside your database content can significantly churn MongoDB's in-memory working set. If you don't want GridFS files to affect your working set, you can store them on a separate MongoDB server.

2) Performance. Serving files from GridFS is slower than serving them locally from a web server or file system. The performance you give up is the price of the management advantages.

3) Atomic updates. GridFS does not provide a way to update a file atomically. If you need this, you have to maintain multiple versions of the file and select the correct one.
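
One way to read the advice on atomic updates is to never overwrite a stored file: each update is written as a new version, and readers pick the newest version whose upload has finished. The sketch below models that idea with plain dictionaries. The `complete` flag, the file name and the helper names are assumptions made for this illustration, not GridFS features.

```python
from datetime import datetime, timezone

files = []  # stands in for the fs.files collection


def add_version(filename, upload_date, complete):
    """Record a new version instead of modifying the existing file."""
    files.append(
        {"filename": filename, "uploadDate": upload_date, "complete": complete}
    )


def latest_complete(filename):
    """Return the newest fully uploaded version, or None if there is none."""
    candidates = [
        f for f in files if f["filename"] == filename and f["complete"]
    ]
    return max(candidates, key=lambda f: f["uploadDate"], default=None)


add_version("report.pdf", datetime(2014, 2, 25, 19, 0, tzinfo=timezone.utc), True)
add_version("report.pdf", datetime(2014, 2, 25, 20, 0, tzinfo=timezone.utc), True)
# A third upload is still in progress, so readers must not see it yet.
add_version("report.pdf", datetime(2014, 2, 25, 21, 0, tzinfo=timezone.utc), False)

current = latest_complete("report.pdf")
print(current["uploadDate"].hour)  # 20
```

Readers always see a whole file, either the old version or the new one, and older versions can be deleted once nothing refers to them.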

http://blog.csdn.net/chszs/article/details/20123327


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
