Note: This article is based mainly on the book "MongoDB Combat". I want to learn MongoDB from the book and deepen my understanding, so I am writing it up on my own blog as a record. The final chapter will build a Java EE web application on top of a MongoDB database.
1. Introduction
GridFS is a specification for storing large files in a MongoDB database. All officially supported drivers implement the GridFS specification.
1.1. Why use GridFS
The size of a BSON document in MongoDB is limited (16 MB per document), so the GridFS specification provides a transparent mechanism for splitting a large file into many smaller documents. This lets us store large file objects efficiently, which is especially useful for very large files such as videos and high-definition images.
1.2. How large files are stored
To achieve this, the specification defines a standard way of chunking files. Each file has one metadata document in the files collection and one or more chunk documents in the chunks collection. In most cases you do not need to know the details of the specification. Instead, you can focus on the GridFS API in the driver for your language, or on how to use the mongofiles tool.
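The chunking idea described above can be sketched in a few lines of Python. This is only an illustration of the layout, not the code a driver actually runs: the function name `split_into_documents`, the placeholder `_id` value, and the sample payload are all assumptions made for the example (a real driver generates an ObjectId and writes the documents to MongoDB).

```python
import hashlib
from datetime import datetime, timezone

CHUNK_SIZE = 261120  # chunk size reported by mongofiles later in this article


def split_into_documents(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Return (files_doc, chunk_docs) that mimic the GridFS layout."""
    # One metadata document per file.
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "md5": hashlib.md5(data).hexdigest(),
        "length": len(data),
    }
    # One chunk document per slice of at most chunk_size bytes, numbered from 0.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


# Hypothetical 600000-byte payload, just to show the split.
files_doc, chunk_docs = split_into_documents("demo-id", "video.bin", b"x" * 600000)
print(files_doc["length"], [len(c["data"]) for c in chunk_docs])
# prints: 600000 [261120, 261120, 77760]
```

Only the last chunk is shorter than the chunk size; every chunk points back to its file through files_id.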
1.3. Language support
GridFS is supported by the drivers for Java, Perl, PHP, Python, Ruby, and other programming languages, each of which provides an API for it.
1.4. Brief introduction
GridFS uses two collections to store data:
- files: contains the metadata documents
- chunks: contains the binary chunks, plus some other related information
So that several GridFS stores can live in a single database, both the files collection and the chunks collection carry a prefix. By default the prefix is fs, so a default GridFS store uses the namespaces fs.files and fs.chunks. The drivers for the various languages allow this prefix to be changed, so you could, for example, set up a separate GridFS namespace for storing photos, whose collections would be photos.files and photos.chunks. Let's look at some actual examples below.
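As a small illustration of the naming rule, the helper below derives the two collection names from a prefix. The function name `gridfs_collections` is made up for this sketch and does not come from any driver; in PyMongo, for instance, the prefix is passed as the `collection` argument of `gridfs.GridFS(db, collection="photos")`.

```python
def gridfs_collections(prefix="fs"):
    """Return the (files, chunks) collection names for a GridFS prefix."""
    return f"{prefix}.files", f"{prefix}.chunks"


print(gridfs_collections())          # ('fs.files', 'fs.chunks')
print(gridfs_collections("photos"))  # ('photos.files', 'photos.chunks')
```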
1.5. Command line tools
mongofiles is a tool for working with GridFS from the command line. For example, to store the file /usr/local/xuz/test.html in the database:
[[email protected] bin]# ./mongofiles put /usr/local/xuz/test.html
connected to: 127.0.0.1
added file: { _id: ObjectId('54a8d33846d47e7bbe9a847a'), filename: "/usr/local/xuz/test.html", chunkSize: 261120, uploadDate: new Date(1420350265089), md5: "aead353cb437d4d29d61f05bb548b191", length: 31 }
done!
To see which GridFS files are in the database, pass the list command to mongofiles:
[[email protected] bin]# ./mongofiles list
connected to: 127.0.0.1
/usr/local/xuz/test.html	31
Next, open the mongo shell and check whether any new collections have appeared.
[[email protected] bin]# ./mongo
MongoDB shell version: 2.6.6
connecting to: test
> show collections
c1
c2
c3
c4
fs.chunks    <-- the fs.chunks collection mentioned above
fs.files     <-- the fs.files collection mentioned above
system.indexes
system.js
xuz
Let's look at what is in fs.files:
> db.fs.files.find();
{ "_id" : ObjectId("54a8d33846d47e7bbe9a847a"), "filename" : "/usr/local/xuz/test.html", "chunkSize" : 261120, "uploadDate" : ISODate("2015-01-04T05:44:25.089Z"), "md5" : "aead353cb437d4d29d61f05bb548b191", "length" : 31 }
Field descriptions:
- filename: the name of the stored file
- chunkSize: the size of each chunk, in bytes
- uploadDate: the time the file was stored
- md5: the MD5 checksum of the file
- length: the file size, in bytes
So fs.files stores the basic metadata for each file.
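Two of these fields can be checked with a little arithmetic. The sketch below uses the values from the fs.files document shown above; the helper names `chunk_count` and `upload_date` are made up for this example. It computes how many chunks a file of a given length needs, and converts the millisecond timestamp printed by mongofiles into a UTC date.

```python
from datetime import datetime, timedelta, timezone


def chunk_count(length, chunk_size):
    """Number of chunks needed for `length` bytes (integer ceiling division)."""
    return (length + chunk_size - 1) // chunk_size


def upload_date(ms_since_epoch):
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=ms_since_epoch)


# Values taken from the mongofiles output and the fs.files document above.
print(chunk_count(31, 261120))                 # 1
print(upload_date(1420350265089).isoformat())  # 2015-01-04T05:44:25.089000+00:00
```

A 31-byte file fits in a single chunk, and the converted timestamp agrees with the ISODate stored in uploadDate.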
Now let's look at what is in fs.chunks:
> db.fs.chunks.find();
{ "_id" : ObjectId("54a8d339deaed25af579df57"), "files_id" : ObjectId("54a8d33846d47e7bbe9a847a"), "n" : 0, "data" : BinData(0,"c2rzzhnkcnnkc2rzzapzzhnkcwpzzhnkcwpzzhnkcg==") }
The files_id field refers to the _id of the matching document in fs.files. One of the more important fields is n, which is the sequence number of the chunk, starting from 0. So fs.chunks stores the actual file content.
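To show why n matters, here is a minimal sketch of how a file could be put back together from its chunk documents: sort by n, concatenate the data, and compare the result with the length and md5 recorded in the metadata. The function name `reassemble` and the sample chunks are assumptions for illustration; real drivers do this for you.

```python
import hashlib


def reassemble(chunk_docs, files_doc):
    """Rebuild file content from chunk documents and verify it."""
    ordered = sorted(chunk_docs, key=lambda c: c["n"])
    # The sequence numbers must be exactly 0, 1, 2, ... with no gaps or repeats.
    if [c["n"] for c in ordered] != list(range(len(ordered))):
        raise ValueError("missing or duplicate chunk")
    data = b"".join(c["data"] for c in ordered)
    if len(data) != files_doc["length"]:
        raise ValueError("length mismatch")
    if hashlib.md5(data).hexdigest() != files_doc["md5"]:
        raise ValueError("md5 mismatch")
    return data


# Hypothetical chunks, deliberately out of order.
payload = b"hello gridfs"
files_doc = {"length": len(payload), "md5": hashlib.md5(payload).hexdigest()}
chunks = [
    {"n": 2, "data": b"idfs"},
    {"n": 0, "data": b"hello"},
    {"n": 1, "data": b" gr"},
]
print(reassemble(chunks, files_doc))  # b'hello gridfs'
```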
Now that we can put a file in, there should be a way to get it back out. Let's retrieve the file.
[[email protected] bin]# cd /usr/local/xuz
[[email protected] xuz]# ls -l
total 4
-rw-r--r--. 1 root root 4 13:43 test.html
[[email protected] xuz]# rm -rf test.html    # delete the file
[[email protected] xuz]# ll
total 0
[[email protected] bin]# ./mongofiles get /usr/local/xuz/test.html
connected to: 127.0.0.1
done write to: /usr/local/xuz/test.html
[[email protected] bin]# md5sum /usr/local/xuz/test.html    # check the MD5; it matches the value stored in the database
aead353cb437d4d29d61f05bb548b191  /usr/local/xuz/test.html
Finally, check that the file was written back to disk.
[[email protected] bin]# cd /usr/local/xuz
[[email protected] xuz]# ll
total 4
-rw-r--r--. 1 root root 4 13:57 test.html    <-- file successfully retrieved
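The md5sum check in the session above can also be done from a program. The sketch below is one possible way to do it in Python, reading the file in blocks so that large files do not have to fit in memory; the function name `file_md5` and the temporary demo file are assumptions for the example.

```python
import hashlib
import os
import tempfile


def file_md5(path, block_size=65536):
    """Return the hex MD5 digest of a file, read block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Demo with a throwaway file; in practice, compare against the md5 field in fs.files.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example content")
    demo_path = tmp.name
stored_md5 = hashlib.md5(b"example content").hexdigest()
matches = file_md5(demo_path) == stored_md5
os.remove(demo_path)
print(matches)  # True
```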
1.6. Index
db.fs.chunks.ensureIndex({files_id: 1, n: 1}, {unique: true});
With this unique index, a chunk can be retrieved efficiently by its files_id and n values. Note that chunks can still be fetched with findOne; for example, to get the first chunk of a file:
db.fs.chunks.findOne({files_id: myfileid, n: 0});
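What the unique index on (files_id, n) guarantees can be illustrated with an in-memory stand-in. This is not how MongoDB implements indexes; the class `ChunkIndex` and its methods are hypothetical and only model the two properties that matter here: no two chunks share the same (files_id, n) pair, and a chunk can be looked up directly by that pair.

```python
class ChunkIndex:
    """In-memory stand-in for a unique index on (files_id, n)."""

    def __init__(self):
        self._by_key = {}

    def insert(self, chunk):
        key = (chunk["files_id"], chunk["n"])
        # A unique index rejects a second document with the same key.
        if key in self._by_key:
            raise KeyError(f"duplicate chunk {key}")
        self._by_key[key] = chunk

    def find_one(self, files_id, n):
        # Direct lookup by the compound key; returns None if absent.
        return self._by_key.get((files_id, n))


index = ChunkIndex()
index.insert({"files_id": "file-a", "n": 0, "data": b"first"})
index.insert({"files_id": "file-a", "n": 1, "data": b"second"})
first = index.find_one("file-a", 0)
print(first["data"])  # b'first'
```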
Part Two: Applications. Chapter 6: MongoDB GridFS