Use MongoDB as a pure in-memory database (Redis style)

Basic Ideas

Using MongoDB as an in-memory database (in-memory DB), that is, not letting MongoDB save any data to disk, has been attracting more and more interest. This usage is very practical for applications such as:

  • a write-intensive cache in front of a slower RDBMS
  • embedded systems
  • PCI-compliant systems where no data should be persisted
  • unit testing, where the database should be lightweight and its data easy to purge

If all this could be done, it would be quite elegant: we could leverage MongoDB's query/retrieval capabilities without involving any disk operations. As you probably know, in 99% of cases disk I/O (especially random I/O) is the system bottleneck, and disk operations are unavoidable as soon as you write data.

One very cool design decision in MongoDB is that it uses memory-mapped files to handle read and write requests against data in disk files. This means MongoDB does not treat RAM and disk differently: it treats a file as one huge array, accesses the data byte by byte, and leaves the rest to the operating system (OS). It is this design decision that allows MongoDB to run entirely in RAM without any modification.

Implementation Methods

All of this is done by using a special type of filesystem called tmpfs. It looks like a regular filesystem (FS) under Linux, except that it lives entirely in RAM (unless it grows larger than RAM, in which case it can be swapped out, which can be useful). My server has 32GB of RAM, so let's create a 16GB tmpfs:

# mkdir /ramdata
# mount -t tmpfs -o size=16000m tmpfs /ramdata/
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             5905712   4973924    871792  86% /
none                  15344936         0  15344936   0% /dev/shm
tmpfs                 16384000         0  16384000   0% /ramdata
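
As a side note (not part of the original walkthrough), if you want this mount to be recreated automatically after a reboot, a minimal /etc/fstab entry along the following lines should do it; the path and size simply mirror this example:

tmpfs   /ramdata   tmpfs   size=16000m   0 0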

The next step is to start MongoDB with the appropriate settings. To reduce the amount of wasted RAM, set smallfiles and noprealloc to true. Since everything is RAM-based, this does not degrade performance at all. Using a journal makes no sense here, so set nojournal to true as well.

dbpath = /ramdata
nojournal = true
smallfiles = true
noprealloc = true
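
If you prefer command-line flags over a config file, the equivalent invocation for a 2.x-era mongod should look roughly like this (a sketch, not taken from the original article):

mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc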

After MongoDB starts, you will find that it runs just fine, and the files show up in the filesystem as expected:

# mongo
MongoDB shell version: 2.3.2
connecting to: test
> db.test.insert({a:1})
> db.test.find()
{ "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 }

# ls -l /ramdata/
total 65684
-rw-------. 1 root root 16777216 Apr 15:52 local.0
-rw-------. 1 root root 16777216 Apr 15:52 local.ns
-rwxr-xr-x. 1 root root        5 Apr 15:52 mongod.lock
-rw-------. 1 root root 16777216 Apr 15:52 test.0
-rw-------. 1 root root 16777216 Apr 15:52 test.ns
drwxr-xr-x. 2 root root          Apr 15:52 _tmp
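
To see the memory-mapped design from the previous section in action, you can list the mappings of the running mongod process (pmap and pidof are standard Linux tools; this check is my addition, not from the original article). The data files under /ramdata should show up mapped directly into the process's address space:

# pmap -x $(pidof mongod) | grep ramdata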

Now let's add some data and verify that everything behaves normally. We first create a 1KB document and then insert it into MongoDB 4 million times:

> str = ""

> aaa = "aaaaaaaaaa"
aaaaaaaaaa
> for (var i = 0; i < 100; ++i) { str += aaa; }

> for (var i = 0; i < 4000000; ++i) { db.foo.insert({ a: Math.random(), s: str }); }
> db.foo.stats()
{
        "ns" : "test.foo",
        "count" : 4000000,
        "size" : 4544000160,
        "avgObjSize" : 1136.00004,
        "storageSize" : 5030768544,
        "numExtents" : num,
        "nindexes" : 1,
        "lastExtentSize" : 536600560,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 129794000,
        "indexSizes" : {
                "_id_" : 129794000
        },
        "ok" : 1
}

As you can see, the average document size is 1136 bytes and the data takes up a total of about 5GB. The index on _id is about 130MB. Now we need to verify something very important: that the data is not duplicated in RAM, stored once by MongoDB and once more by the filesystem cache. Remember that MongoDB does not cache any data inside its own process; its data is cached only in the filesystem cache. Let's clear the filesystem cache and see what is left in RAM:
# echo 3 > /proc/sys/vm/drop_caches
# free
             total       used       free     shared    buffers     cached
Mem:      30689876    6292780   24397096          0       1044    5817368
-/+ buffers/cache:     474368   30215508
Swap:            0          0          0

As you can see, of the 6.3GB of RAM in use, 5.8GB is attributed to the filesystem cache (the "cached" column). Why is there still 5.8GB of filesystem cache even after dropping all caches? The reason is that Linux is smart and does not keep duplicate copies of the data in tmpfs and in the cache. Great! That means you have only one copy of the data in RAM. Let's touch all the documents and verify that RAM usage does not change:

> db.foo.find().itcount()
4000000

# free
             total       used       free     shared    buffers     cached
Mem:      30689876    6327988   24361888          0       1324    5818012
-/+ buffers/cache:     508652   30181224
Swap:            0          0          0

# ls -l /ramdata/
total 5808780
-rw-------. 1 root root  16777216 Apr 15:52 local.0
-rw-------. 1 root root  16777216 Apr 15:52 local.ns
-rwxr-xr-x. 1 root root         5 Apr 15:52 mongod.lock
-rw-------. 1 root root  16777216 Apr 16:00 test.0
-rw-------. 1 root root  33554432 Apr 16:00 test.1
-rw-------. 1 root root 536608768 Apr 16:02 test.10
-rw-------. 1 root root 536608768 Apr 16:03 test.11
-rw-------. 1 root root 536608768 Apr 16:03 test.12
-rw-------. 1 root root 536608768 Apr 16:04 test.13
-rw-------. 1 root root 536608768 Apr 16:04 test.14
-rw-------. 1 root root  67108864 Apr 16:00 test.2
-rw-------. 1 root root 134217728 Apr 16:00 test.3
-rw-------. 1 root root 268435456 Apr 16:00 test.4
-rw-------. 1 root root 536608768 Apr 16:01 test.5
-rw-------. 1 root root 536608768 Apr 16:01 test.6
-rw-------. 1 root root 536608768 Apr 16:04 test.7
-rw-------. 1 root root 536608768 Apr 16:03 test.8
-rw-------. 1 root root 536608768 Apr 16:02 test.9
-rw-------. 1 root root  16777216 Apr 15:52 test.ns
drwxr-xr-x. 2 root root           Apr 16:04 _tmp

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             5905712   4973960    871756  86% /
none                  15344936         0  15344936   0% /dev/shm
tmpfs                 16384000   5808780  10575220  36% /ramdata

Sure enough, RAM usage did not change. :)

Replication

Since the data in RAM is lost when the server restarts, you will probably want to use replication. A standard replica set gives you automatic failover as well as increased read capacity. If a server is rebooted, it can rebuild its data by resyncing from another server in the same replica set. Even with large amounts of data and indexes, this process should be fast enough, since all the indexing work happens in RAM. :)
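
As a rough sketch of that setup (the replica set name, hostnames and ports below are illustrative assumptions, not from the original article): each node mounts its own tmpfs and starts mongod with a replica set name, and the set is then initiated from a mongo shell:

mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc --replSet ramset

> rs.initiate({ _id: "ramset", members: [
      { _id: 0, host: "node1:27017" },
      { _id: 1, host: "node2:27017" } ] })

Note that automatic failover needs a majority of voting members, so in practice you would add a third member or an arbiter.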

An important point is that writes go to a special collection called the oplog, which lives in the local database. By default its size is 5% of the total space; in my case the oplog would take 5% of 16GB, i.e. 800MB. When in doubt, it is safer to choose a fixed oplog size using the oplogSize option: if a secondary is down for longer than the oplog can cover, it has to be resynced from scratch. To set the size to 1GB, for example:

oplogSize = 1000
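
You can check the oplog size that was actually allocated, and the time window it currently covers, with a standard helper from the mongo shell (the output will of course depend on your own data):

> db.printReplicationInfo()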
Sharding

Now that you have all of MongoDB's query capabilities, how do you use it to build a large service? You can use sharding freely to build a large, scalable in-memory store. The config servers (which hold the chunk distribution) should still be disk-based, though: their activity is small, and it would be no fun to have to rebuild the cluster from scratch.
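
A minimal sketch of such a layout (ports, hostnames and paths below are illustrative assumptions, not from the original): a config server on disk, RAM-backed shards, and a mongos router in front:

# config server keeps its data on disk
mongod --configsvr --dbpath /data/configdb --port 27019

# each shard is a RAM-backed mongod (or, better, a replica set)
mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc --port 27018

# mongos routes queries and knows about the config server
mongos --configdb confighost:27019

# from a mongo shell connected to mongos
> sh.addShard("shardhost:27018")
> sh.enableSharding("test")
> sh.shardCollection("test.foo", { _id: 1 })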

Caveats

RAM is a scarce resource, and in this case you really want the entire dataset to fit in RAM. Although tmpfs can swap to disk, the performance drop would be significant. To make the most of the RAM, you should consider the following (shell examples for the first two points follow below):

  • use the usePowerOf2Sizes option to normalize the storage allocation
  • run the compact command periodically, or resync the node from time to time
  • design the schema to be fairly normalized (to avoid a large number of large documents)
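
From the mongo shell, the first two points look roughly like this (the collection name foo is just this article's running example):

> db.runCommand({ collMod: "foo", usePowerOf2Sizes: true })
> db.runCommand({ compact: "foo" })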

Conclusion

There you go: you can now use MongoDB as an in-memory database, with all of its features available. Performance should be quite impressive: testing with a single thread/core, I could reach about 20k writes per second, and adding cores should increase write throughput roughly proportionally.
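
If you want to reproduce a rough figure like that, a minimal mongo-shell timing loop along these lines (the collection name and document count are arbitrary; this is my addition, not the author's benchmark) gives a ballpark number:

> var n = 100000, t0 = new Date();
> for (var i = 0; i < n; ++i) { db.bench.insert({ a: Math.random(), s: str }); }
> db.getLastError();                      // wait for the last write to be acknowledged
> print(n / ((new Date() - t0) / 1000) + " inserts/sec");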

Translation source: http://www.oschina.net/translate/how-to-use-mongodb-as-a-pure-in-memory-db-redis-style

Original article: http://edgystuff.tumblr.com/post/49304254688/how-to-use-mongodb-as-a-pure-in-memory-db-redis-style
