How to use MongoDB as a pure in-memory database (Redis style)
The basic idea
There is growing interest in using MongoDB as an in-memory database, that is, running it without letting it save any data to disk. This setup is very practical for applications such as: a write-intensive cache in front of a slow RDBMS; embedded systems; PCI-compliant systems where no data may be persisted; and unit tests, which need a lightweight database whose data can be purged easily.
If all this could be done, it would be elegant: we could exploit MongoDB's query and retrieval capabilities without touching the disk. As you may know, disk IO (especially random IO) is the system bottleneck in 99% of cases, and disk operations are unavoidable if you write data.
MongoDB made a very cool design decision: it uses memory-mapped files to handle read and write requests for the data files on disk. This means that MongoDB does not distinguish between RAM and disk; it treats each file as a huge array and accesses the data at byte offsets, leaving the operating system (OS) to handle the rest. This is the design decision that allows MongoDB to run in RAM without any modification.
Implementation
All this is done with a special type of file system called tmpfs. In Linux it looks like a regular file system (FS), except that it lives entirely in RAM (unless it is larger than RAM, in which case it can swap, which can be useful). My server has 32GB of RAM, so let's create a 16GB tmpfs:
# mkdir /ramdata
# mount -t tmpfs -o size=16000M tmpfs /ramdata/
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             5905712   4973924    871792  86% /
none                  15344936         0  15344936   0% /dev/shm
tmpfs                 16384000         0  16384000   0% /ramdata
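Before pointing MongoDB at the directory, it can be worth confirming that it really is backed by tmpfs. The following is a minimal Python sketch, not part of the original setup: the helper names are hypothetical and SAMPLE_MOUNTS is illustrative text in the /proc/mounts layout rather than output captured from this server. It also shows why a size=16000M option appears as 16384000 1K-blocks in df.

```python
# Sketch: check that a mount point is tmpfs and predict df's 1K-block count.
# SAMPLE_MOUNTS is illustrative, written in the /proc/mounts column layout:
# device, mount point, fs type, options, dump, pass.
SAMPLE_MOUNTS = """\
/dev/xvde1 / ext4 rw,relatime 0 0
none /dev/shm tmpfs rw 0 0
tmpfs /ramdata tmpfs rw,relatime,size=16384000k 0 0
"""


def find_mount(mounts_text, mount_point):
    """Return (device, fstype, options) for mount_point, or None if absent."""
    wanted = mount_point.rstrip("/") or "/"
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, target, fstype, options = fields[:4]
        if (target.rstrip("/") or "/") == wanted:
            # A later entry for the same target shadows an earlier one.
            found = (device, fstype, options.split(","))
    return found


def is_tmpfs(mounts_text, mount_point):
    """True if mount_point is mounted and its file system type is tmpfs."""
    entry = find_mount(mounts_text, mount_point)
    return entry is not None and entry[1] == "tmpfs"


def size_option_to_1k_blocks(size_mib):
    """size=<n>M is in MiB; df reports 1 KiB blocks, so multiply by 1024."""
    return size_mib * 1024


print(is_tmpfs(SAMPLE_MOUNTS, "/ramdata"))   # True
print(size_option_to_1k_blocks(16000))       # 16384000
```

On a live Linux system the first argument would be the contents of /proc/mounts instead of the sample text.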
The next step is to start MongoDB with the appropriate settings. To reduce the amount of wasted RAM, set smallfiles and noprealloc to true. Since everything is in RAM, doing so does not degrade performance at all. A journal makes no sense in this setup, so set nojournal to true as well.
dbpath = /ramdata
nojournal = true
smallfiles = true
noprealloc = true
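The same settings can also be passed on the mongod command line. Below is a small Python sketch that turns a key = value config like the one above into an argument list. The function name is hypothetical, and it assumes every key has a command-line flag of the same name. That holds for these four options in the 2.x series used here, as far as I know; newer releases that no longer ship the memory-mapped storage engine should not be expected to accept them.

```python
# Sketch: translate an old-style "key = value" mongod config into CLI flags.
# Assumption: each config key maps to "--<key>", as dbpath, nojournal,
# smallfiles and noprealloc do in the MongoDB 2.x series.
CONFIG = """\
dbpath = /ramdata
nojournal = true
smallfiles = true
noprealloc = true
"""


def config_to_args(config_text, binary="mongod"):
    """Build an argument list; boolean true becomes a bare flag."""
    args = [binary]
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blanks
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value.lower() == "true":
            args.append("--" + key)
        elif value.lower() == "false":
            continue                           # a disabled flag is omitted
        else:
            args += ["--" + key, value]
    return args


print(" ".join(config_to_args(CONFIG)))
# mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc
```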
After MongoDB starts, you will find that it runs just fine and that the files in the file system appear as expected:
# mongo
MongoDB shell version: 2.3.2
connecting to: test
> db.test.insert({a:1})
> db.test.find()
{ "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 }
# ls -l /ramdata/
total 65684
-rw-------. 1 root root 16777216 Apr 15:52 local.0
-rw-------. 1 root root 16777216 Apr 15:52 local.ns
-rwxr-xr-x. 1 root root        5 Apr 15:52 mongod.lock
-rw-------. 1 root root 16777216 Apr 15:52 test.0
-rw-------. 1 root root 16777216 Apr 15:52 test.ns
drwxr-xr-x. 2 root Apr 15:52 _tmp
Now let's add some data and verify that everything behaves normally. We first build a 1KB document and then insert it into MongoDB 4 million times:
> str = ""
> aaa = "aaaaaaaaaa"
aaaaaaaaaa
> for (var i = 0; i < 100; ++i) { str += aaa; }
> for (var i = 0; i < 4000000; ++i) { db.foo.insert({a: Math.random(), s: str}); }
> db.foo.stats()
{
        "ns" : "test.foo",
        "count" : 4000000,
        "size" : 4544000160,
        "avgObjSize" : 1136.00004,
        "storageSize" : 5030768544,
        "numExtents" : num,
        "nindexes" : 1,
        "lastExtentSize" : 536600560,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 129794000,
        "indexSizes" : {
                "_id_" : 129794000
        },
        "ok" : 1
}
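The figures reported by stats() can be cross-checked with a little arithmetic. This Python sketch copies the numbers from the output above and derives the rounded sizes. It uses decimal units (1GB = 10^9 bytes), which is an assumption about how the rounded figures in this article were obtained.

```python
# Sketch: sanity-check the stats() output using values copied from it.
STATS = {
    "count": 4000000,
    "size": 4544000160,
    "avgObjSize": 1136.00004,
    "storageSize": 5030768544,
    "totalIndexSize": 129794000,
}


def summarize(stats):
    """Derive per-document and total sizes in decimal units."""
    avg = stats["size"] / stats["count"]
    return {
        "avg_doc_bytes": avg,
        # Bytes per document beyond the 1000-character string itself.
        "overhead_bytes": avg - 1000,
        "data_gb": stats["size"] / 1e9,
        "storage_gb": stats["storageSize"] / 1e9,
        "index_mb": stats["totalIndexSize"] / 1e6,
    }


summary = summarize(STATS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The average computed from size and count matches avgObjSize, storage comes to about 5.03GB, and the _id index to about 130MB.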
As you can see, the average document size is 1136 bytes and the data takes up about 5GB of storage. The index on _id takes about 130MB. Now we need to verify one very important thing: whether the data is duplicated in RAM, stored once inside MongoDB and once in the file system. Remember that MongoDB does not cache any data within its own process; its data is cached only in the file system cache. Let's clear the file system cache and see what is left in RAM:
# echo 3 >/proc/sys/vm/drop_caches
# free
             total       used       free     shared    buffers     cached
Mem:      30689876    6292780   24397096          0       1044    5817368
-/+ buffers/cache:     474368   30215508
Swap:            0          0          0
As you can see, of the 6.3GB of RAM in use, 5.8GB is file system cache (the cached column). Why is there still 5.8GB of file system cache even after all caches were dropped? The reason is that Linux is smart enough not to keep duplicate pages in tmpfs and in its cache. That's great: it means there is only one copy of your data in RAM. Let's read through all the documents and verify that RAM usage does not change:
> db.foo.find().itcount()
4000000
# free
             total       used       free     shared    buffers     cached
Mem:      30689876    6327988   24361888          0       1324    5818012
-/+ buffers/cache:     508652   30181224
Swap:            0          0          0
# ls -l /ramdata/
total 5808780
-rw-------. 1 root root  16777216 Apr 15:52 local.0
-rw-------. 1 root root  16777216 Apr 15:52 local.ns
-rwxr-xr-x. 1 root root         5 Apr 15:52 mongod.lock
-rw-------. 1 root root  16777216 Apr 16:00 test.0
-rw-------. 1 root root  33554432 Apr 16:00 test.1
-rw-------. 1 root root 536608768 Apr 16:02 test.10
-rw-------. 1 root root 536608768 Apr 16:03 test.11
-rw-------. 1 root root 536608768 Apr 16:03 test.12
-rw-------. 1 root root 536608768 Apr 16:04 test.13
-rw-------. 1 root root 536608768 Apr 16:04 test.14
-rw-------. 1 root root  67108864 Apr 16:00 test.2
-rw-------. 1 root root 134217728 Apr 16:00 test.3
-rw-------. 1 root root 268435456 Apr 16:00 test.4
-rw-------. 1 root root 536608768 Apr 16:01 test.5
-rw-------. 1 root root 536608768 Apr 16:01 test.6
-rw-------. 1 root root 536608768 Apr 16:04 test.7
-rw-------. 1 root root 536608768 Apr 16:03 test.8
-rw-------. 1 root root 536608768 Apr 16:02 test.9
-rw-------. 1 root root  16777216 Apr 15:52 test.ns
drwxr-xr-x. 2 root Apr 16:04 _tmp
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             5905712   4973960    871756  86% /
none                  15344936         0  15344936   0% /dev/shm
tmpfs                 16384000   5808780  10575220  36% /ramdata
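The two free snapshots and the df output can be compared directly. The Python sketch below uses the numbers as printed (all in KB) and relies on the fact that Linux counts tmpfs pages in the cached column. The variable and function names are only illustrative.

```python
# Sketch: compare memory use before and after scanning all documents.
# All values are in KB, copied from the free and df output in this article.
FREE_BEFORE = {"used": 6292780, "cached": 5817368}
FREE_AFTER = {"used": 6327988, "cached": 5818012}
TMPFS_USED = 5808780  # "Used" column for /ramdata in df


def delta(before, after):
    """Per-column growth between two free snapshots."""
    return {key: after[key] - before[key] for key in before}


growth = delta(FREE_BEFORE, FREE_AFTER)
not_tmpfs = FREE_AFTER["cached"] - TMPFS_USED

print("cached grew by", growth["cached"], "KB")   # 644
print("used grew by", growth["used"], "KB")       # 35208
print("cache outside tmpfs:", not_tmpfs, "KB")    # 9232
```

Reading roughly 4.5GB of documents grew the cache by well under 1MB, and nearly all of the cached figure is the tmpfs content itself, which is consistent with a single copy of the data in RAM.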
That confirms it :)
Replication
Since the data in RAM is lost when the server restarts, you will probably want to use replication. A standard replica set gives you automatic failover as well as more read capacity. If a server reboots, it rebuilds its data by reading it from another server in the same replica set (resync). Even with large amounts of data and indexes this should be fast enough, because all the operations happen in RAM :)
It is important to note that writes also go to a special collection called the oplog, which lives in the local database. By default its size is 5% of the total volume. In my case the oplog would take 5% of 16GB, or 800MB. If in doubt, it is safer to choose a fixed oplog size with the oplogSize option. If a secondary server stays down for longer than the oplog covers, it has to be resynced. To set the size to 1GB, use:
oplogSize = 1000
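The sizing arithmetic can be sketched in Python as follows. The first function reproduces the 5% figure. The second is a rough estimate of how long an oplog of a given size lasts, under the loud assumption that each oplog entry is about as large as the average document (1136 bytes) and that writes arrive at the 20K per second measured in this article. Real oplog entries differ in size, so treat the result as an order of magnitude only.

```python
# Sketch: default oplog size and a rough oplog time window.
# Assumptions (not from MongoDB itself): one oplog entry per write, each
# about as large as the average document; oplogSize is in MiB.

def default_oplog_mb(volume_mb, percent=5):
    """Oplog size as a percentage of the volume, in MB."""
    return volume_mb * percent // 100


def oplog_window_seconds(oplog_mb, writes_per_sec, bytes_per_entry):
    """Seconds of write history that fit in the oplog at a steady rate."""
    oplog_bytes = oplog_mb * 1024 * 1024
    return oplog_bytes / (writes_per_sec * bytes_per_entry)


print(default_oplog_mb(16000))                          # 800
print(round(oplog_window_seconds(1000, 20000, 1136)))   # 46
```

At a sustained full write rate a 1GB oplog would cover well under a minute of history, so a secondary that is down longer than that would need a resync. This is one reason to size the oplog deliberately.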
Sharding
Now that you have all of MongoDB's query capabilities, what if you want to build a large service on it? You can use sharding to implement a large, scalable in-memory database. The config servers (which store the chunk distribution) should still be disk-based, though: their activity is small, and rebuilding a cluster from scratch is no fun.
What to watch for
RAM is a scarce resource, and in this setup you want the entire data set to fit in RAM. Although tmpfs can fall back on swapping, the performance drop would be dramatic. To make the most of your RAM, you should consider: using the usePowerOf2Sizes option to normalize the storage buckets; running the compact command periodically, or resyncing the node; and using a fairly normalized schema design (to avoid large documents).
Conclusion
Sweet, you can now use MongoDB as an in-memory database and still work with all of its features. Performance should be quite impressive: in my test with a single thread/core I reached 20K writes per second, and the write rate should grow in proportion to the number of cores.
Translated article: http://www.oschina.net/translate/how-to-use-mongodb-as-a-pure-in-memory-db-redis-style
Original article: http://edgystuff.tumblr.com/post/49304254688/how-to-use-mongodb-as-a-pure-in-memory-db-redis-style