[Repost] Testing MongoDB micro-sharding performance with YCSB

Source: Internet
Author: User
Tags: benchmark, data structures, mongodb driver, mongodb server, system log

MongoDB's database-level lock

MongoDB is currently the most popular NoSQL database. Its natural document data model, flexible schema, and easy-to-use horizontal scaling have won over many developers. But nothing is perfect, and MongoDB has its weaknesses; its database-level lock, for example, is a performance bottleneck that people often complain about. In short, every write operation on a database must first acquire a single mutex that belongs to that database. This sounds bad, but a write holds the lock only for the instant in which the in-memory data is updated, so each write lock is usually held for a time on the order of nanoseconds. Because of this, the database-level lock affects real-world applications far less than people fear.

In a small number of scenarios with extremely high write concurrency, however, the database-level lock can become a bottleneck. You can observe this through the DB Lock% metric in MongoDB's MMS monitoring (or in the mongostat command-line output). As a rule of thumb, if DB Lock% stays above 70-80%, the lock is considered saturated. How can this be solved?

Option One: Sharding

If you have enough hardware resources, this is MongoDB's standard answer. Sharding is the ultimate way to solve most performance bottlenecks.

Option Two: Splitting into multiple databases

This is a very effective alternative. The idea is to split your data across several databases and implement a routing switch in the application's data access layer so that each read and write is directed to the appropriate database. A good example is a census database: you can create a separate database for each province, so that 31 databases together form one large logical database. This is not always feasible, though. For example, if you need many queries that sort across the whole data set, merging the results from multiple databases becomes cumbersome or even impossible.
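
The routing switch described above can be sketched in a few lines of Python. This is only an illustration: the `census_<province>` naming scheme, the helper names, and the three sample provinces are hypothetical, and a real application would hand the returned database name to its MongoDB driver.

```python
# Hypothetical routing helper: one database per province.
# PROVINCES is a small sample for illustration, not the full list of 31.
PROVINCES = ["beijing", "shanghai", "guangdong"]


def database_for(province):
    """Return the name of the database that holds this province's records."""
    key = province.strip().lower()
    if key not in PROVINCES:
        raise ValueError("unknown province: %r" % province)
    return "census_" + key


def databases_for_global_query():
    """A query over the whole country has to visit every database;
    the caller must then merge (and re-sort) the partial results itself."""
    return ["census_" + p for p in PROVINCES]


if __name__ == "__main__":
    print(database_for("Beijing"))
    print(databases_for_global_query())
```

The second helper shows where the approach gets awkward: any query that spans all provinces turns into one query per database plus a merge step in the application.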

Option Three: Waiting

MongoDB 2.8 is about to be released. The biggest change in 2.8 is that the database-level lock is replaced by a document-level lock, so performance problems caused by the database-level lock should improve significantly.

Option Four: Micro-sharding

Micro-sharding means using MongoDB's sharding technology, but with several or all of the shard mongod processes running on the same server (physical or virtual). Because of the database-level lock, and because MongoDB does not make good use of multicore CPUs, micro-sharding can be a good performance tuning technique when all of the following conditions hold:
1) The server has a multicore CPU (4, 8, or more cores)
2) The server has not yet hit an I/O bottleneck
3) There is enough memory to hold the hot data (no frequent page faults)

In this article we run some performance tests to see how much micro-sharding improves performance.

The YCSB performance testing tool

Before starting the tests, I'd like to take a moment to introduce YCSB. The reason is that I often see developers and DBAs use very simple client tools for high-concurrency insert or read tests. MongoDB is a high-performance database, and with proper tuning its throughput can reach tens of thousands of operations per second. If the client code is crude, or even single-threaded, the first bottleneck in the test is the client itself, not the server. Choosing an efficient client is therefore an important first step in a good performance test.

YCSB is a tool developed by Yahoo specifically for benchmarking new-generation databases. Its full name is Yahoo! Cloud Serving Benchmark. Yahoo developed it in the hope of having a standard tool for measuring the performance of different databases. YCSB includes many optimizations for client performance, such as using raw byte arrays as its data type to reduce the time spent creating and converting data objects. Some features of YCSB:

* Support for common database operations: insert, update, delete, and read
* Multithreading support. YCSB is implemented in Java and has good multithreading support
* Flexible scenario files. The test scenario is specified through parameters, such as 100% insert, or 50% read and 50% write
* Request distribution modes: random (uniform), Zipfian (a small portion of the data receives most of the requests), and latest
* Extensibility: you can modify or extend YCSB's functionality by extending its workload class
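
To make the difference between the uniform and Zipfian request distributions concrete, here is a small Python sketch. It does not use YCSB's own generator; it simply gives the item with popularity rank r a weight of 1 / r**theta, and theta = 0.99 is an assumed skew value chosen for illustration.

```python
# Zipfian popularity: the item with rank r is requested with weight 1 / r**theta.
# theta = 0.99 is an assumption made for this illustration.

def zipfian_weights(n_items, theta=0.99):
    """Normalised request probabilities, most popular item first."""
    raw = [1.0 / (rank ** theta) for rank in range(1, n_items + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def share_of_top(weights, fraction):
    """Fraction of all requests that hit the most popular `fraction` of items."""
    top_n = max(1, int(len(weights) * fraction))
    return sum(weights[:top_n])


if __name__ == "__main__":
    zipf = zipfian_weights(1000)
    uniform = [1.0 / 1000] * 1000
    print("zipfian: top 10%% of items get %.0f%% of requests" % (100 * share_of_top(zipf, 0.10)))
    print("uniform: top 10%% of items get %.0f%% of requests" % (100 * share_of_top(uniform, 0.10)))
```

With a skewed distribution most requests hit a small hot set, which is friendlier to the server's memory than the uniform distribution used in this article's scenario files.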

Installing YCSB

Because YCSB itself carries a heavy workload, it is generally recommended to deploy it on a separate machine, preferably with 4-8 CPU cores and at least 8 GB of memory. The network between YCSB and the database server must be at least gigabit, preferably 10-gigabit.

* Install JDK 1.7
* Download the precompiled YCSB build that includes the MongoDB driver: http://pan.baidu.com/s/1o6iFcfS
* Unzip it
* Go to the YCSB directory and run (a local MongoDB instance must be listening on port 27017):
./bin/ycsb run mongodb -P workloads/workloada

* If YCSB runs, the installation was successful

You can also pull the source with Git and compile it yourself. This requires the JDK and Maven. The GitHub address is https://github.com/achille/YCSB. For build and installation instructions, see https://github.com/achille/ycsb/tree/master/mongodb

YCSB scenario files

To test different scenarios with YCSB, you only need different scenario files. YCSB automatically generates the corresponding client requests based on the properties in the scenario file. Our tests use the following scenarios:

Scenario S1: 100% insert. Used to load the test data
Scenario S2: Write-heavy. 90% update, 10% read
Scenario S3: Mixed read/write. 65% read, 25% insert, 10% update
Scenario S4: Read-heavy. 90% read, 10% insert and update
Scenario S5: 100% read

Here are the contents of scenario file S2:

recordcount=5000000
operationcount=100000000
workload=com.yahoo.ycsb.workloads.CoreWorkload

readallfields=true

readproportion=0.1
updateproportion=0.9
scanproportion=0
insertproportion=0

requestdistribution=uniform

insertorder=hashed

fieldlength=250
fieldcount=8

mongodb.url=mongodb://192.168.1.2:27017
mongodb.writeConcern=acknowledged
threadcount=32

Some notes:

* The test data consists of 5 million documents (recordcount)
* Each document is approximately 2 KB (fieldlength x fieldcount). The total data size is about 10 GB, plus about 600 MB of indexes
* The URL of the MongoDB database is 192.168.1.2:27017
* The MongoDB write concern (mongodb.writeConcern) is acknowledged
* The number of threads is 32 (threadcount)
* Document insertion order: hashed/random (insertorder)
* Update operations: 90% (0.9)
* Read operations: 10% (0.1)
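
The document and data set sizes quoted in these notes follow directly from the scenario file. A short Python check, counting only the field payload and ignoring field names, BSON overhead, and indexes:

```python
# Values taken from scenario file S2.
RECORD_COUNT = 5000000
FIELD_COUNT = 8
FIELD_LENGTH = 250  # bytes per field


def document_bytes(field_count, field_length):
    """Payload size of one document, ignoring field names and BSON overhead."""
    return field_count * field_length


def dataset_bytes(record_count, field_count, field_length):
    """Payload size of the whole data set, without indexes."""
    return record_count * document_bytes(field_count, field_length)


if __name__ == "__main__":
    doc = document_bytes(FIELD_COUNT, FIELD_LENGTH)
    total = dataset_bytes(RECORD_COUNT, FIELD_COUNT, FIELD_LENGTH)
    print("document payload: %d bytes" % doc)
    print("dataset payload:  %.1f GB" % (total / 1e9))
```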

Download all the scenario files (S1-S5) and unzip them into the YCSB directory created above: http://pan.baidu.com/s/1voJAA

MongoDB Configuration

The tests were run on AWS virtual machines. The server configuration was as follows:

* OS: Amazon Linux (broadly similar to CentOS)
* CPU: 8 vCPUs
* RAM: 30 GB
* Storage: 160 GB SSD
* Journal: 25 GB EBS with PIOPS
* Log: 10 GB EBS with IOPS
* MongoDB: 2.6.0
* Readahead: 32

A few notes:
MongoDB's data, recovery log (journal), and system log (log) were placed on 3 separate storage volumes. This is a common optimization that keeps log writes from interfering with the I/O of flushing data to disk. In addition, the server's readahead setting was changed to the recommended value of 32. For more on readahead, see my other blog post: http://mongoing.com/tj/linux-tuning

Single-server baseline test

Before testing micro-sharding, we first need to establish the best performance of a single mongod. Start the target MongoDB server, log in, and drop the ycsb database (if it already exists):

# mongo

> use ycsb
> db.dropDatabase()

Scenario S1: Data insertion

Next, start YCSB. Go to the YCSB directory and run the following command (make sure the scenario files s1, s2, s3, s4, and s5 are already in the current directory):

./bin/ycsb load mongodb -P s1 -s

If it works, YCSB prints the current status every 10 seconds, including the throughput in operations per second and the average latency. For example:

Loading workload...
Starting test.
 0 sec: 0 operations;
mongo connection created with localhost:27017/ycsb
 10 sec: 67169 operations; 7002.16 current ops/sec; [INSERT AverageLatency(us)=4546.87]
 20 sec: 151295 operations; 7909.24 current ops/sec; [INSERT AverageLatency(us)=3920.9]
 30 sec: 223663 operations; 7235.35 current ops/sec; [INSERT AverageLatency(us)=4422.63]

While the test is running, you can monitor MongoDB's real-time metrics with mongostat (or, better, MMS) and check that they broadly agree with what YCSB reports.
After the run finishes, you will see output similar to the following:

[OVERALL], RunTime(ms), 687134.0
[OVERALL], Throughput(ops/sec), 7295.168457372555
...
[INSERT], Operations, 5000000
[INSERT], AverageLatency(us), 4509.1105768
[INSERT], MinLatency(us), 126
[INSERT], MaxLatency(us), 3738063
[INSERT], 95thPercentileLatency(ms), 10
[INSERT], 99thPercentileLatency(ms), 37
[INSERT], Return=0, 5000000
...

This output tells us that inserting 5 million records took 687 seconds, with an average throughput of 7,295 operations per second and an average latency of 4.5 ms. Note that these numbers say nothing about MongoDB's performance in general. Any difference in the environment, in the size of the inserted documents, or in the number of indexes will produce very different results. The values serve only as the baseline for this test, for comparison with the micro-sharding results.
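
When the same test is repeated many times, it helps to pull these figures out of the summary automatically. Below is a minimal Python sketch, assuming summary lines of the form `[SECTION], Metric, value`; the function name `parse_summary` and the embedded sample are illustrative only.

```python
import re

# Matches lines such as "[OVERALL], Throughput(ops/sec), 7295.16".
SUMMARY_LINE = re.compile(
    r"^\[(?P<section>[A-Z]+)\],\s*(?P<metric>[^,]+),\s*"
    r"(?P<value>-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
)


def parse_summary(text):
    """Return {(section, metric): float} for every summary line found in text."""
    results = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            key = (match.group("section"), match.group("metric").strip())
            results[key] = float(match.group("value"))
    return results


SAMPLE = """\
[OVERALL], RunTime(ms), 687134.0
[OVERALL], Throughput(ops/sec), 7295.168457372555
...
[INSERT], Operations, 5000000
[INSERT], AverageLatency(us), 4509.1105768
"""

if __name__ == "__main__":
    summary = parse_summary(SAMPLE)
    print("throughput (ops/sec):", summary[("OVERALL", "Throughput(ops/sec)")])
    print("average latency (ms):", summary[("INSERT", "AverageLatency(us)")] / 1000)
```

Lines that do not fit the pattern, such as the `...` placeholders, are skipped rather than treated as errors.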

On the MongoDB side, pay special attention to the page faults, network, and DB Lock% metrics reported by mongostat or MMS. If your network is 1 Gb/s and mongostat reports traffic of around 100 MB/s, the network is essentially saturated, because 1 Gb/s of bandwidth is only about 128 MB/s of transfer rate (1024 Mb/s divided by 8). In my test, network in stayed at 14-15 MB/s, which is consistent with the throughput multiplied by the document size (7,300 x 2 KB).
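
The consistency check between throughput and network traffic is simple arithmetic. A small Python sketch, using the article's convention of 1 Gb = 1024 Mb and treating 1 MB as 1000 KB for the payload estimate:

```python
def network_in_mb_per_s(ops_per_sec, doc_kb):
    """Approximate inbound payload rate in MB/s (1 MB = 1000 KB here)."""
    return ops_per_sec * doc_kb / 1000.0


def link_capacity_mb_per_s(gigabits_per_sec):
    """Raw link capacity in MB/s, using 1 Gb = 1024 Mb and 8 bits per byte."""
    return gigabits_per_sec * 1024 / 8.0


if __name__ == "__main__":
    used = network_in_mb_per_s(7300, 2)   # about 7,300 inserts/sec of 2 KB documents
    capacity = link_capacity_mb_per_s(1)  # 1 Gb/s link
    print("%.1f MB/s used of %.0f MB/s (%.0f%% of the link)"
          % (used, capacity, 100 * used / capacity))
```

At this rate the insert test uses only a small part of a gigabit link, so the network is not the limiting factor here.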

To find a good number of client threads, I repeated the same run several times, changing the threadcount value in the scenario file each time. Throughput peaked at about 30 threads, and adding more threads brought no further gain. That is why threadcount in my scenario files is set to 32.

Now that the database holds 5 million test documents, we can test the other scenarios. Note: the first parameter of ycsb is the test phase. The step above imported data, so the first parameter was "load". Once the data is imported, the following steps execute the workload, so the first parameter is "run".

Scenario S2: Write-heavy

Command:

./bin/ycsb run mongodb -P s2 -s

Results:
...
[OVERALL], Throughput(ops/sec), 12102.2928384723

Scenario S3: Mixed read/write (65% read)

Command:

./bin/ycsb run mongodb -P s3 -s

Results:
...
[OVERALL], Throughput(ops/sec), 15982.39239483840

Scenario S4: Read-heavy

Command:

./bin/ycsb run mongodb -P s4 -s

Results:
...
[OVERALL], Throughput(ops/sec), 19102.39099223948

Scenario S5: 100% read

Command:

./bin/ycsb run mongodb -P s5 -s

Results:
...
[OVERALL], Throughput(ops/sec), 49020.29394020022

Micro-sharding test

We now have the single-server performance for all 5 scenarios. Next, we can test how micro-sharding performs in each scenario with different numbers of micro-shards.

First, stop the MongoDB instance on the single server.

Next, we build a sharded cluster. Let me recommend a very handy MongoDB tool here: mtools https://github.com/rueckstiess/mtools
mtools is a collection of MongoDB-related tools. One of them, mlaunch, makes it effortless to create replica sets or sharded clusters on a single machine.

Install mtools (requires Python and one of Python's package managers, pip or easy_install):

# pip install mtools
or
# easy_install mtools

Then create a new directory and build a micro-sharded cluster inside it:

# mkdir shard2
# cd shard2
# mlaunch --sharded 2 --single

This command creates 4 processes on the same machine:

* 1 mongos on port 27017
* 1 config server mongod on port 27020
* 2 shard mongod processes on ports 27018 and 27019

These four processes make up a micro-sharded cluster with two shards. Note that although the sharded cluster now exists, all data would still go to a single shard, called the primary shard. For MongoDB to distribute data across shards, you must explicitly enable sharding for the database and for the collection:

# mongo
mongos> sh.enableSharding("ycsb")
{ "ok" : 1 }
mongos> sh.shardCollection("ycsb.usertable", { _id: "hashed" })
{ "collectionsharded" : "ycsb.usertable", "ok" : 1 }

The two commands above enable sharding for the "ycsb" database and for the "usertable" collection in it, respectively. When you enable sharding on a collection, you must also specify the shard key. Here {_id: "hashed"} makes the hash of the _id field the shard key. A hashed shard key suits write-heavy scenarios because it spreads writes evenly across the shards.
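
The effect of hashing on write distribution can be illustrated with a toy Python model. This is not MongoDB's actual placement algorithm (MongoDB assigns ranges of hash values to shards in chunks); the sketch simply hashes each key with MD5 and takes the result modulo the shard count, and the `user<N>` key format is made up for the example.

```python
import hashlib
from collections import Counter


def shard_for(key, n_shards):
    """Toy placement: hash the key and take the result modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def distribution(keys, n_shards):
    """Count how many keys land on each shard."""
    return Counter(shard_for(k, n_shards) for k in keys)


if __name__ == "__main__":
    # Sequential ids: without hashing, neighbouring keys would all hit one range.
    keys = ["user%d" % i for i in range(10000)]
    for n in (2, 4):
        print(n, "shards:", sorted(distribution(keys, n).items()))
```

Even though the keys are generated in order, the hashed values scatter them roughly evenly, which is why inserts do not pile up on a single shard.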

Next, run the 5 scenarios in order and collect the test results (note the first parameter of ycsb):

./bin/ycsb load mongodb -P s1 -s
./bin/ycsb run mongodb -P s2 -s
./bin/ycsb run mongodb -P s3 -s
./bin/ycsb run mongodb -P s4 -s
./bin/ycsb run mongodb -P s5 -s

After testing, turn off the cluster with the following command:

# mlaunch stop
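
Because the same five-step sequence is repeated for every cluster size, it can be convenient to generate the commands from a script. A minimal Python sketch follows; the helper names are hypothetical, and the script only builds and prints the command lines rather than executing them.

```python
def ycsb_command(phase, scenario, ycsb_bin="./bin/ycsb"):
    """Build the argument list for one YCSB invocation against MongoDB."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    return [ycsb_bin, phase, "mongodb", "-P", scenario, "-s"]


def build_plan(scenarios=("s1", "s2", "s3", "s4", "s5")):
    """The first scenario loads the data; every later one runs against it."""
    return [
        ycsb_command("load" if index == 0 else "run", name)
        for index, name in enumerate(scenarios)
    ]


if __name__ == "__main__":
    for argv in build_plan():
        # To execute instead of print: subprocess.run(argv, check=True)
        print(" ".join(argv))
```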

In the same way, you can create separate directories for micro-sharded clusters with 4, 6, and 8 shards and repeat the 5 scenarios for each. Here are all the test results:

Conclusion

From the test results we can draw the following conclusions:

* In suitable scenarios, micro-sharding can significantly improve MongoDB's throughput
* Micro-sharding does not help in read-only scenarios
* Micro-sharding helps most in mixed read/write scenarios (also the most common in practice): 275%
* The gains saturate at 6 micro-shards; adding more shards brings no significant improvement. This number may differ in your environment

