This approach is extremely useful for several kinds of applications:
- A write-heavy, high-speed cache in front of a slower RDBMS
- Embedded systems
- PCI-compliant systems where no data should be persisted
- Unit testing, which needs a lightweight database whose data can be cleared easily
If all this can be achieved, it is truly elegant: we get MongoDB's full query/retrieval capabilities without any disk operations at all. As you probably know, in 99% of cases disk I/O (especially random I/O) is the system bottleneck, and disk operations cannot be avoided if you want to write data.
MongoDB has a very cool design decision: it uses memory-mapped files to handle read and write requests against data in disk files. In other words, MongoDB does not treat RAM and disk differently; it simply views a file as a huge array and accesses the data byte by byte, leaving everything else to the operating system (OS)! This design decision is what allows MongoDB to run entirely in RAM without any modification.
Implementation Method
All of this is achieved with a special type of file system called tmpfs. On Linux it looks like a conventional file system (FS), but it lives entirely in RAM (unless its size exceeds the available RAM, in which case it can also swap to disk, which is very useful!). My server has 32 GB of RAM; let's create a 16 GB tmpfs:
- # mkdir /ramdata
- # mount -t tmpfs -o size=16000M tmpfs /ramdata/
- # df
- Filesystem 1K-blocks Used Available Use% Mounted on
- /dev/xvde1 5905712 4973924 871792 86% /
- none 15344936 0 15344936 0% /dev/shm
- tmpfs 16384000 0 16384000 0% /ramdata
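Note that a tmpfs mount does not survive a reboot. If you want the mount recreated automatically at boot, an /etc/fstab entry along these lines would do it (a sketch; adjust the size to your machine):

```shell
# /etc/fstab entry (one line) -- recreates the 16 GB tmpfs at boot
tmpfs  /ramdata  tmpfs  size=16000M  0  0
```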
Next, start MongoDB with appropriate settings. To reduce the amount of wasted RAM, set smallfiles and noprealloc to true. Since the storage is now RAM-based, this does not hurt performance at all. Journaling is pointless here, so set nojournal to true.
- dbpath=/ramdata
- nojournal = true
- smallFiles = true
- noprealloc = true
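Equivalently (assuming you prefer command-line flags to a config file), mongod can be started directly; the flags below mirror the settings above:

```shell
# Start mongod against the tmpfs mount (flags mirror the config above)
mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc
```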
After MongoDB starts, you will find that it runs perfectly well, and the files show up in the file system as expected:
- # mongo
- MongoDB shell version: 2.3.2
- connecting to: test
- > db.test.insert({a:1})
- > db.test.find()
- { "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 }
- # ls -l /ramdata/
- total 65684
- -rw-------. 1 root root 16777216 Apr 30 15:52 local.0
- -rw-------. 1 root root 16777216 Apr 30 15:52 local.ns
- -rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock
- -rw-------. 1 root root 16777216 Apr 30 15:52 test.0
- -rw-------. 1 root root 16777216 Apr 30 15:52 test.ns
- drwxr-xr-x. 2 root root 40 Apr 30 15:52 _tmp
Now let's add some data to verify that everything works normally. Create a 1 KB document and insert it into MongoDB 4 million times:
- > str = ""
- > aaa = "aaaaaaaaaa"
- aaaaaaaaaa
- > for (var i = 0; i < 100; ++i) { str += aaa; }
- > for (var i = 0; i < 4000000; ++i) { db.foo.insert({a: Math.random(), s: str});}
- > db.foo.stats()
- {
- "ns" : "test.foo",
- "count" : 4000000,
- "size" : 4544000160,
- "avgObjSize" : 1136.00004,
- "storageSize" : 5030768544,
- "numExtents" : 26,
- "nindexes" : 1,
- "lastExtentSize" : 536600560,
- "paddingFactor" : 1,
- "systemFlags" : 1,
- "userFlags" : 0,
- "totalIndexSize" : 129794000,
- "indexSizes" : {
- "_id_" : 129794000
- },
- "ok" : 1
- }
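As a quick sanity check on the numbers above (shell arithmetic, with the values copied from the stats output):

```shell
# 4M documents at an average of 1,136 bytes each should give ~4.5 GB of data
docs=4000000
avg_bytes=1136
echo $((docs * avg_bytes))   # 4544000000 bytes, matching "size" above up to rounding
```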
As you can see, the average document size is 1,136 bytes and the data occupies about 5 GB of storage in total; the index on _id is about 130 MB. Now we need to verify that the data is not duplicated in RAM: is one copy kept by MongoDB and another by the file system? Recall that MongoDB does not cache any data within its own process; its data is cached only in the file system cache. So let's drop the file system cache and see what is left in RAM:
- # echo 3 > /proc/sys/vm/drop_caches
- # free
- total used free shared buffers cached
- Mem: 30689876 6292780 24397096 0 1044 5817368
- -/+ buffers/cache: 474368 30215508
- Swap: 0 0 0
As you can see, of the roughly 6 GB of RAM in use, about 5.8 GB sits in the file system cache (the "cached" column). Why is there still almost 6 GB of file system cache even though all caches were just dropped? The reason is that Linux is smart: it does not keep duplicate copies of the data in tmpfs and the page cache. Great! This means you have exactly one copy of the data in RAM. Next, let's touch all the documents and verify that RAM usage does not change:
- > db.foo.find().itcount()
- 4000000
- # free
- total used free shared buffers cached
- Mem: 30689876 6327988 24361888 0 1324 5818012
- -/+ buffers/cache: 508652 30181224
- Swap: 0 0 0
- # ls -l /ramdata/
- total 5808780
- -rw-------. 1 root root 16777216 Apr 30 15:52 local.0
- -rw-------. 1 root root 16777216 Apr 30 15:52 local.ns
- -rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock
- -rw-------. 1 root root 16777216 Apr 30 16:00 test.0
- -rw-------. 1 root root 33554432 Apr 30 16:00 test.1
- -rw-------. 1 root root 536608768 Apr 30 16:02 test.10
- -rw-------. 1 root root 536608768 Apr 30 16:03 test.11
- -rw-------. 1 root root 536608768 Apr 30 16:03 test.12
- -rw-------. 1 root root 536608768 Apr 30 16:04 test.13
- -rw-------. 1 root root 536608768 Apr 30 16:04 test.14
- -rw-------. 1 root root 67108864 Apr 30 16:00 test.2
- -rw-------. 1 root root 134217728 Apr 30 16:00 test.3
- -rw-------. 1 root root 268435456 Apr 30 16:00 test.4
- -rw-------. 1 root root 536608768 Apr 30 16:01 test.5
- -rw-------. 1 root root 536608768 Apr 30 16:01 test.6
- -rw-------. 1 root root 536608768 Apr 30 16:04 test.7
- -rw-------. 1 root root 536608768 Apr 30 16:03 test.8
- -rw-------. 1 root root 536608768 Apr 30 16:02 test.9
- -rw-------. 1 root root 16777216 Apr 30 15:52 test.ns
- drwxr-xr-x. 2 root root 40 Apr 30 16:04 _tmp
- # df
- Filesystem 1K-blocks Used Available Use% Mounted on
- /dev/xvde1 5905712 4973960 871756 86% /
- none 15344936 0 15344936 0% /dev/shm
- tmpfs 16384000 5808780 10575220 36% /ramdata
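As a further sanity check (shell arithmetic, using the figures from free and df above): the page-cache figure and the tmpfs usage should be nearly identical, since there is only one copy of the data in RAM:

```shell
cached_kb=5818012   # "cached" column from free above
tmpfs_kb=5808780    # tmpfs "Used" column from df above
echo $(( (cached_kb - tmpfs_kb) / 1024 ))   # difference in MB; prints 9
```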
Sure enough! :)
Replication?
Since the data in RAM is lost when the server restarts, you will probably want to use replication. A standard replica set gives you automatic failover and increased read capacity. If a server restarts, it can resync its data by reading from another member of the same replica set. Even with large amounts of data and indexes this process is fast enough, because the index operations all happen in RAM :)
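As a sketch of what that could look like (the host names node1/node2 and the set name rs0 are placeholders, not from the original setup): start each mongod with a replSet name, then initiate the set from one node:

```shell
# On each node (each with its dbpath on its own tmpfs mount):
mongod --dbpath /ramdata --nojournal --smallfiles --noprealloc --replSet rs0
# Then, from one node, initiate the replica set:
mongo --eval 'rs.initiate({_id: "rs0", members: [
    {_id: 0, host: "node1:27017"},
    {_id: 1, host: "node2:27017"}]})'
```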
It is important to know that write operations are also recorded in a special capped collection called the oplog, which lives in the local database and by default takes 5% of the total partition size. Here that is 5% of 16 GB, i.e. about 800 MB. If in doubt, you can pick a fixed size for the oplog with the oplogSize option. If a secondary is down for longer than the oplog covers, it must be resynced from scratch. To set the size to 1 GB:
- oplogSize = 1000
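The default size is easy to check with shell arithmetic (5% of the 16,000 MB tmpfs):

```shell
echo $((16000 * 5 / 100))   # prints 800 (MB) -- the default oplog size here
```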
Sharding?
Since all of MongoDB's query features are available, how do you use it to implement a large service? You can use sharding freely to build a large, scalable in-memory database. Keep the config servers disk-based, though: their activity is small, and it is no fun to rebuild a cluster from scratch.
Notes
RAM is a scarce resource, and in this case you must make sure the entire data set fits in RAM. Although tmpfs can swap to disk, the performance drop would be significant. To make the most of the RAM, consider the following:
- Use usePowerOf2Sizes to normalize storage allocation
- Run the compact command periodically (or resync the node)
- Keep the schema design fairly normalized (to avoid significant document growth)
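For the first two points, the corresponding commands look like this (a sketch, assuming a collection named foo in the test database; run from the system shell):

```shell
# Allocate records in power-of-2 sizes, so freed space is easily reusable
mongo test --eval 'printjson(db.runCommand({collMod: "foo", usePowerOf2Sizes: true}))'
# Defragment the collection (note: compact blocks the database while it runs)
mongo test --eval 'printjson(db.runCommand({compact: "foo"}))'
```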
Conclusion
There you have it: you can now use MongoDB as an in-memory database, with all of its features! Performance should be quite impressive: testing with a single thread/core, I was able to reach about 20k writes per second, and the write rate should scale with the number of cores.