Basic ideas
Using MongoDB as a pure in-memory database, that is, without letting MongoDB save any data to disk, has been attracting more and more interest. This usage is very useful for applications such as:
- A write-heavy cache in front of a slower RDBMS
- Embedded systems
- PCI compliant systems where no data should be persisted
- Unit testing, where a lightweight database is needed and its data can be wiped easily
If all this were possible, it would be quite elegant: we could make clever use of MongoDB's query/retrieval capabilities without ever touching the disk. As you probably know, disk I/O (especially random I/O) is the system bottleneck in 99% of cases, and disk access is unavoidable as soon as you write data.
MongoDB has a very cool design decision: it uses memory-mapped files to handle read and write requests against data in disk files. In other words, MongoDB does not treat RAM and disk differently; it simply treats the data files as one huge array, accesses the data byte by byte, and leaves the rest to the operating system (OS). This design decision is what allows MongoDB to run in RAM without any modification.
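You can actually observe this on Linux: the data files show up as mappings in the mongod process's address space. A quick sketch, assuming a mongod is already running with its data files under /data/db (adjust the path to your setup):
# grep /data/db /proc/$(pidof mongod)/maps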
Implementation methods
All of this is achieved by using a special type of file system called tmpfs. On Linux it looks like a regular file system (FS), but it lives entirely in RAM (unless it grows larger than RAM, in which case it can be swapped out, which turns out to be useful!). My server has 32GB of RAM; let's create a 16GB tmpfs:
# mkdir /ramdata
# mount -t tmpfs -o size=16000m tmpfs /ramdata/
# df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/xvde1       5905712 4973924    871792  86% /
none            15344936       0  15344936   0% /dev/shm
tmpfs           16384000       0  16384000   0% /ramdata
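Note that a mount created this way disappears on reboot. If you want the tmpfs recreated automatically at boot, an entry along these lines in /etc/fstab should do it (a sketch, using the same size as the mount command above):
tmpfs /ramdata tmpfs size=16000m 0 0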
The next step is to start MongoDB with appropriate settings. To reduce the amount of wasted RAM, set smallfiles and noprealloc to true. Since everything is RAM-based, doing so does not degrade performance at all. Using a journal makes no sense here, so set nojournal to true as well.
dbpath = /ramdata
nojournal = true
smallfiles = true
noprealloc = true
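Equivalently, the same settings can be passed as command-line flags when starting mongod (these flags exist in the MongoDB 2.x series used here):
# mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc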
After starting MongoDB, you will find that it runs perfectly well, and the files show up in the file system as expected:
# mongo
MongoDB shell version: 2.3.2
connecting to: test
> db.test.insert({a: 1})
> db.test.find()
{ "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 }
# ls -l /ramdata/
total 65684
-rw-------. 1 root 16777216 Apr 15:52 local.0
-rw-------. 1 root 16777216 Apr 15:52 local.ns
-rwxr-xr-x. 1 root        5 Apr 15:52 mongod.lock
-rw-------. 1 root 16777216 Apr 15:52 test.0
-rw-------. 1 root 16777216 Apr 15:52 test.ns
drwxr-xr-x. 2 root Apr 15:52 _tmp
Now let's add some data and verify that everything behaves normally. We first create a 1KB document, then insert it into MongoDB 4 million times:
> str = ""
> AAA = "aaaaaaaaaa"
Aaaaaaaaaa
> for (var i = 0; i < ++i) {str + AAA;}
> for (var i = 0; i < 4000000 ++i) {Db.foo.insert ({a:math.random (), s:str});
> Db.foo.stats ()
{
    "ns" : "test.foo",
    "count" : 4000000,
    "size" : 4544000160,
    "avgObjSize" : 1136.00004,
    "storageSize" : 5030768544,
    "numExtents" : 26,
    "nindexes" : 1,
    "lastExtentSize" : 536600560,
    "paddingFactor" : 1,
    "systemFlags" : 1,
    "userFlags" : 0,
    "totalIndexSize" : 129794000,
    "indexSizes" : {
        "_id_" : 129794000
    },
    "ok" : 1
}
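As a quick sanity check on those numbers, avgObjSize times count should give back size (modulo floating-point rounding), which you can verify right in the shell:
> var s = db.foo.stats()
> s.avgObjSize * s.count   // ≈ 4544000160, matching "size"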
As you can see, the average document size is 1136 bytes and the data takes up 5GB in total. The index on _id is 130MB. Now we need to verify something very important: is the data duplicated in RAM, kept once inside MongoDB and once in the file system? Remember that MongoDB does not cache any data inside its own process; its data is cached only in the file system cache. So let's drop the file system cache and see what is still in RAM:
# echo 3 > /proc/sys/vm/drop_caches
# free
             total       used       free     shared    buffers     cached
Mem:      30689876    6292780   24397096          0       1044    5817368
-/+ buffers/cache:     474368   30215508
Swap:            0          0          0
As you can see, of the 6.3GB of RAM in use, 5.8GB sits in the file system cache (the cached column). Why is there still 5.8GB of file system cache even after dropping all caches? The reason is that Linux is smart: it does not keep duplicate pages for tmpfs and the page cache. That's great! It means you have only one copy of the data in RAM. Let's access every document and verify that RAM usage does not change:
> db.foo.find().itcount()
4000000
# free
             total       used       free     shared    buffers     cached
Mem:      30689876    6327988   24361888          0       1324    5818012
-/+ buffers/cache:     508652   30181224
Swap:            0          0          0
# ls -l /ramdata/
total 5808780
-rw-------. 1 root  16777216 Apr 15:52 local.0
-rw-------. 1 root  16777216 Apr 15:52 local.ns
-rwxr-xr-x. 1 root         5 Apr 15:52 mongod.lock
-rw-------. 1 root  16777216 Apr 16:00 test.0
-rw-------. 1 root  33554432 Apr 16:00 test.1
-rw-------. 1 root 536608768 Apr 16:02 test.10
-rw-------. 1 root 536608768 Apr 16:03 test.11
-rw-------. 1 root 536608768 Apr 16:03 test.12
-rw-------. 1 root 536608768 Apr 16:04 test.13
-rw-------. 1 root 536608768 Apr 16:04 test.14
-rw-------. 1 root  67108864 Apr 16:00 test.2
-rw-------. 1 root 134217728 Apr 16:00 test.3
-rw-------. 1 root 268435456 Apr 16:00 test.4
-rw-------. 1 root 536608768 Apr 16:01 test.5
-rw-------. 1 root 536608768 Apr 16:01 test.6
-rw-------. 1 root 536608768 Apr 16:04 test.7
-rw-------. 1 root 536608768 Apr 16:03 test.8
-rw-------. 1 root 536608768 Apr 16:02 test.9
-rw-------. 1 root  16777216 Apr 15:52 test.ns
drwxr-xr-x. 2 root Apr 16:04 _tmp
# df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/xvde1       5905712 4973960    871756  86% /
none            15344936       0  15344936   0% /dev/shm
tmpfs           16384000 5808780  10575220  36% /ramdata
Sure enough :)
What about replication?
Since data in RAM is lost when the server restarts, you will probably want to use replication. A standard replica set gives you automatic failover as well as improved read capacity. If a server is rebooted, it can rebuild its data by performing an initial sync (resync) from another server in the same replica set. Even with large amounts of data and indexes this process is fast enough, because indexing happens in RAM :)
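As a minimal sketch (the host names are placeholders, and each mongod would also need to be started with --replSet inmem), initiating such a replica set from the mongo shell looks like this:
> rs.initiate({
    _id: "inmem",
    members: [
        { _id: 0, host: "host1:27017" },
        { _id: 1, host: "host2:27017" },
        { _id: 2, host: "host3:27017" }
    ]
})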
It is important to know that writes go through a special collection called the oplog, which lives in the local database. By default its size is 5% of the total data size. In my case the oplog would take up 5% of 16GB, i.e. 800MB. When in doubt, it is safer to pick a fixed oplog size with the oplogSize option: if a secondary is down for longer than the oplog window covers, it has to do a full resync. To set the oplog size to 1GB, add this to the configuration file:
oplogSize = 1000
What about sharding?
What if you want all of MongoDB's query capabilities in a really large service? You can combine this technique with sharding to build a large, scalable in-memory database. The config servers (which hold the chunk distribution) should still be disk-based, though: their activity is small, and rebuilding a cluster from scratch is no fun.
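Purely as a sketch (database and collection names come from the example above, and whether a is a good shard key depends entirely on your data), enabling sharding from a mongos would look like this:
> sh.enableSharding("test")
> sh.shardCollection("test.foo", { a: 1 })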
Caveats
RAM is a scarce resource, and in this case you definitely want the entire data set to fit in RAM. Although tmpfs can swap to disk, the performance drop would be significant. To make the most of your RAM, you should:
- Normalize storage buckets with the usePowerOf2Sizes option
- Run the compact command periodically, or resync nodes from time to time (both of these are sketched below)
- Keep the schema design fairly normalized (to avoid many large documents)
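For reference, here is roughly what the first two items look like in the mongo shell, using the foo collection from our example (note that compact blocks operations on the database while it runs, so schedule it carefully):
> db.runCommand({ collMod: "foo", usePowerOf2Sizes: true })
> db.runCommand({ compact: "foo" })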
Conclusion
There you have it: you can now use MongoDB as an in-memory database, with all of its features! As for performance, it should be quite impressive: in my tests with a single thread/core I reached about 20k writes per second, and adding cores should multiply that write throughput accordingly.
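If you want to reproduce a rough single-thread benchmark of your own, here is a minimal sketch in the mongo shell (it reuses the 1KB string str built earlier; bench is just a throwaway collection, and your numbers will of course differ):
> var t0 = new Date()
> for (var i = 0; i < 100000; ++i) { db.bench.insert({a: Math.random(), s: str}); }
> print(100000 / ((new Date() - t0) / 1000) + " inserts/sec")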