MongoDB GridFS Best Practice Overview

Author: chszs. Please credit the source when reprinting. Blog home: http://blog.csdn.net/chszs

GridFS is a simple file system abstraction on top of a MongoDB database. If you are familiar with Amazon S3, GridFS will feel similar. Why would a NoSQL database like MongoDB provide such a file-layer abstraction?

I. Reasons for using GridFS

The reasons are as follows:

1) Storing user-generated file content. Most web apps allow users to upload files. With a relational database, user-generated files are usually kept on the file system, separate from the database. This raises several problems. How do you copy a file to every server that needs it? When a file is deleted, how do you delete all of its copies? How do you secure the files and prepare for disaster recovery? GridFS solves these problems well: your database backup also backs up your files, and thanks to MongoDB's replication, every member of a replica set holds a copy of the files. Deleting a file is as simple as deleting an object in the database.

2) Accessing part of a file's contents. When a file is uploaded to GridFS, it is split into 256 KB chunks that are stored separately (256 KB was the default when this was written; newer MongoDB releases default to 255 KB). To read a range of bytes, only the chunks covering that range have to be loaded into memory, not the entire file. This is useful when reading or editing large media files.

3) Storing files larger than 16 MB in MongoDB. MongoDB limits a single document to 16 MB, so if your files exceed 16 MB you should use GridFS.

4) Overcoming file system limitations. If you need to store a very large number of files, you have to consider the limits of the file system itself, such as the number of files allowed in a directory. With GridFS you no longer have to worry about this. Combined with MongoDB sharding, GridFS lets you distribute files across multiple servers without adding operational complexity.
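
The partial-read idea in point 2 comes down to simple arithmetic on byte offsets. The following is a minimal sketch, assuming the 256 KB chunk size mentioned above; `chunksForRange` and `chunkCount` are hypothetical helper names for illustration, not part of MongoDB or any driver.

```javascript
// Sketch: which chunks cover a byte range, and how many chunks a file needs.
// CHUNK_SIZE follows the 256 KB figure used in this article (an assumption;
// the real value is stored per file in the chunkSize field).
const CHUNK_SIZE = 256 * 1024; // 262144 bytes

// start is inclusive, end is exclusive; both are byte offsets into the file.
function chunksForRange(start, end, chunkSize = CHUNK_SIZE) {
  if (!Number.isInteger(start) || !Number.isInteger(end) || start < 0 || end <= start) {
    throw new RangeError("expected integers with 0 <= start < end");
  }
  const first = Math.floor(start / chunkSize);
  const last = Math.floor((end - 1) / chunkSize);
  const indexes = [];
  for (let n = first; n <= last; n++) {
    indexes.push(n);
  }
  return indexes;
}

// Number of chunks needed to hold a file of the given length in bytes.
function chunkCount(length, chunkSize = CHUNK_SIZE) {
  return Math.ceil(length / chunkSize);
}

console.log(chunksForRange(300000, 600000)); // [ 1, 2 ]
console.log(chunkCount(1644981)); // 7
```

Reading bytes 300000 to 599999 therefore touches only chunks 1 and 2 of a file that may be far larger.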

II. GridFS in depth

GridFS uses two collections to store data:

    > show collections
    fs.chunks
    fs.files
    system.indexes
    >

The fs.files collection holds the metadata for each file, while the fs.chunks collection stores the actual file contents, split into 256 KB chunks. If the collection is sharded, the chunks are distributed across multiple servers, which may give better performance than a file system.

    > db.fs.files.findOne()
    {
        "_id" : ObjectId("530cf1bf96038f5cb6df5f39"),
        "filename" : "./conn.log",
        "chunkSize" : 262144,
        "uploadDate" : ISODate("2014-02-25T19:40:47.321Z"),
        "md5" : "6515e95f8bb161f6435b130a0e587ccd",
        "length" : 1644981
    }
    >
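
To make the relationship between the two collections concrete, here is a rough sketch of how one file could map to a single fs.files-style document plus several fs.chunks-style documents. `splitIntoChunks` and the `"demo-id"` identifier are hypothetical; a real driver assigns ObjectId values, computes the remaining metadata, and writes to MongoDB, none of which is shown here.

```javascript
// Sketch: model one file as a metadata document plus ordered chunk documents.
// Field names follow the fs.files / fs.chunks documents shown in this article.
function splitIntoChunks(filesId, filename, data, chunkSize = 262144) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: filesId, // points back to the fs.files document
      n: n,              // position of this chunk within the file
      data: data.subarray(offset, offset + chunkSize),
    });
  }
  const file = {
    _id: filesId,
    filename: filename,
    chunkSize: chunkSize,
    uploadDate: new Date(),
    length: data.length,
  };
  return { file, chunks };
}

// A zero-filled payload with the same length as the example file above.
const payload = new Uint8Array(1644981);
const { file, chunks } = splitIntoChunks("demo-id", "./conn.log", payload);
console.log(file.length, chunks.length, chunks[chunks.length - 1].data.length);
// prints: 1644981 7 72117
```

The last chunk is shorter than chunkSize because the file length is not a multiple of 262144.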

MongoDB also creates a compound index on files_id and the chunk number n so that the chunks of a file can be located quickly:

    > db.fs.chunks.getIndexes()
    [
        {
            "v" : 1,
            "key" : {
                "_id" : 1
            },
            "ns" : "files.fs.chunks",
            "name" : "_id_"
        },
        {
            "v" : 1,
            "key" : {
                "files_id" : 1,
                "n" : 1
            },
            "ns" : "files.fs.chunks",
            "name" : "files_id_1_n_1"
        }
    ]
    >
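
Reading a file back means fetching the chunks for one files_id in ascending n order, which is the access pattern this index serves. The sketch below imitates that lookup with plain arrays standing in for the fs.chunks collection; `reassemble` is a hypothetical helper, not a driver function.

```javascript
// Sketch: select one file's chunks, order them by n, and concatenate the data.
function reassemble(allChunks, filesId) {
  const parts = allChunks
    .filter((chunk) => chunk.files_id === filesId)
    .sort((a, b) => a.n - b.n);
  const total = parts.reduce((sum, chunk) => sum + chunk.data.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of parts) {
    out.set(chunk.data, offset);
    offset += chunk.data.length;
  }
  return out;
}

// Chunks deliberately stored out of order and mixed with another file's chunk.
const stored = [
  { files_id: "a", n: 1, data: Uint8Array.from([3, 4]) },
  { files_id: "b", n: 0, data: Uint8Array.from([9]) },
  { files_id: "a", n: 0, data: Uint8Array.from([1, 2]) },
  { files_id: "a", n: 2, data: Uint8Array.from([5]) },
];
console.log(Array.from(reassemble(stored, "a"))); // [ 1, 2, 3, 4, 5 ]
```

In MongoDB itself the filter and sort would be a single indexed query on fs.chunks rather than an in-memory scan.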

III. GridFS examples

MongoDB ships with a built-in tool, mongofiles, that lets you try GridFS in practice. To use GridFS from an application, see the documentation for your driver.

The values for -h (host), -u (user name), and -p (password) are omitted in the commands below.

Put

    # mongofiles -h -u -p --db files put ./conn.log
    connected to: 127.0.0.1
    added file: { _id: ObjectId('530cf1009710ca8fd47d7d5d'), filename: "./conn.log", chunkSize: 262144, uploadDate: new Date(1393357057021), md5: "6515e95f8bb161f6435b130a0e587ccd", length: 1644981 }
    done!

Get

    # mongofiles -h -u -p --db files get ./conn.log
    connected to: 127.0.0.1
    done write to: ./conn.log

List

    # mongofiles -h -u -p list
    connected to: 127.0.0.1
    ./conn.log 1644981

Delete

    # mongofiles -h -u -p --db files delete ./conn.log
    connected to: 127.0.0.1
    done!

IV. GridFS modules

If you want to serve GridFS files stored in MongoDB directly to a web server or a file system, you can use the following GridFS plug-ins:

1) gridfs-fuse: exposes GridFS files directly as a file system.

2) gridfs-nginx: lets Nginx serve GridFS files directly.

V. Limitations of GridFS

GridFS is not perfect; it has some limitations:

1) Working set. Serving GridFS files alongside regular database content can significantly churn MongoDB's in-memory working set. If you don't want GridFS files to affect your working set, store them on a separate MongoDB server.

2) Performance. Serving files from GridFS is slower than serving them locally from a web server or file system. That performance loss is the price of the management advantages.

3) Atomic updates. GridFS does not provide a way to update a file atomically. If you need this, maintain multiple versions of the file and select the correct one.
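
The multiple-versions workaround in point 3 can be sketched as follows, assuming each upload of the same filename creates a new fs.files document and the newest uploadDate identifies the current version. `latestVersion` is a hypothetical helper, and the sample documents and dates (other than the first uploadDate, taken from the example above) are made up for illustration.

```javascript
// Sketch: pick the current version of a file among several fs.files-style
// documents that share a filename. Older versions stay readable until deleted.
function latestVersion(fileDocs, filename) {
  const versions = fileDocs.filter((doc) => doc.filename === filename);
  if (versions.length === 0) {
    return null;
  }
  // The document with the most recent uploadDate wins.
  return versions.reduce((newest, doc) =>
    doc.uploadDate > newest.uploadDate ? doc : newest
  );
}

const fileDocs = [
  { _id: 1, filename: "./conn.log", uploadDate: new Date("2014-02-25T19:40:47.321Z") },
  { _id: 2, filename: "./conn.log", uploadDate: new Date("2014-02-26T08:00:00.000Z") },
  { _id: 3, filename: "./other.log", uploadDate: new Date("2014-02-27T08:00:00.000Z") },
];
console.log(latestVersion(fileDocs, "./conn.log")._id); // 2
```

Because a new version only becomes visible once its fs.files document exists, readers keep seeing the previous complete version while an upload is in progress.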

http://blog.csdn.net/chszs/article/details/20123327
