Testing MongoDB Shard cluster performance using YCSB

Source: Internet
Author: User

1. Test Tools

This test uses YCSB (Yahoo! Cloud Serving Benchmark) as the client-side test tool. YCSB is an open-source benchmark from Yahoo for measuring the performance of a variety of NoSQL databases. Project address: https://github.com/brianfrankcooper/YCSB. The project's mongodb directory contains detailed installation and testing instructions.

YCSB supports the common NoSQL read and write operations, such as insert, update, delete, and read. It can run multiple client threads to increase the load the client generates. Scenarios are easy to customize, for example 95% insert and 5% read, or 90% read, 5% update, and 5% insert. The way requests are distributed over the data can also be chosen: uniform, Zipfian (roughly 20% of the data receives 80% of the requests), or latest (the most recent data is requested most often).
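To make the difference between the uniform and Zipfian request distributions concrete, the following minimal Python sketch computes what share of requests the hottest 20% of records would receive under each. The exponent 0.99 and the 1,000-record key space are illustrative assumptions, not values taken from this test, and the sketch does not reproduce YCSB's own generator.

```python
# Sketch: how skewed a Zipfian request distribution is compared with uniform.
# ZIPF_EXPONENT and the 1,000-record key space are illustrative assumptions.

ZIPF_EXPONENT = 0.99


def zipf_weights(num_records, exponent=ZIPF_EXPONENT):
    """Return normalized access probabilities for records ranked 1..num_records."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, num_records + 1)]
    total = sum(raw)
    return [weight / total for weight in raw]


def share_of_hottest(weights, fraction):
    """Return the share of requests that hit the hottest `fraction` of records."""
    hot_count = max(1, int(len(weights) * fraction))
    return sum(sorted(weights, reverse=True)[:hot_count])


zipf = zipf_weights(1000)
uniform = [1.0 / 1000] * 1000

zipf_share = share_of_hottest(zipf, 0.20)
uniform_share = share_of_hottest(uniform, 0.20)
print(f"zipfian: hottest 20% of records receive {zipf_share:.0%} of requests")
print(f"uniform: hottest 20% of records receive {uniform_share:.0%} of requests")
```

Under these assumptions the Zipfian share comes out well above the uniform 20%, which is why a Zipfian workload exercises caching far more than a uniform one.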


2. Test steps

1. Select the number of client threads. When testing with YCSB, choose a suitable number of threads; otherwise the bottleneck may be the client rather than the database. After comparing several settings, YCSB reached its maximum throughput at about 100 threads.
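One way to turn such a comparison into a rule is sketched below: take the smallest thread count whose throughput is within a few percent of the best run. The function name, the 5% tolerance, and the measurements are hypothetical; the test itself only reports that about 100 threads gave the best result.

```python
def pick_thread_count(throughput_by_threads, tolerance=0.05):
    """Return the smallest thread count whose throughput is within
    `tolerance` of the best throughput observed."""
    if not throughput_by_threads:
        raise ValueError("no measurements given")
    best = max(throughput_by_threads.values())
    for threads in sorted(throughput_by_threads):
        if throughput_by_threads[threads] >= best * (1 - tolerance):
            return threads


# Hypothetical measurements (threads -> ops/sec), for illustration only.
measurements = {10: 2100, 50: 6200, 100: 7500, 200: 7550, 400: 7400}
chosen = pick_thread_count(measurements)
print(f"use {chosen} client threads")
```

Preferring the smallest qualifying thread count keeps client CPU and memory pressure low while still saturating the database.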


2. Define the test scenarios. The scenarios for this test are as follows:

workloada: write-heavy, 90% insert, 5% read, 5% update.
workloadb: read-heavy, 95% read, 5% update.
workloadc: read-only, 100% read.
workloadd: read-heavy, 95% read, 5% insert.
workloadf: read-heavy, 50% read, 50% read-modify-write of the same record.
workloadg: read-heavy, 60% read, 20% read, 20% update.
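Scenarios like these are normally expressed as YCSB workload property files. The sketch below renders one from operation proportions and checks that they sum to 1. The property names are those of YCSB's CoreWorkload; the workload class name (it differs between YCSB releases), the operationcount, and the request distribution used here are assumptions, since the original workload files are not included in this article.

```python
# Assumed class name for older YCSB releases; newer ones use a different package.
CORE_WORKLOAD_CLASS = "com.yahoo.ycsb.workloads.CoreWorkload"

PROPORTION_KEYS = (
    "readproportion",
    "updateproportion",
    "insertproportion",
    "scanproportion",
    "readmodifywriteproportion",
)


def render_workload(recordcount, operationcount, distribution, **proportions):
    """Render the text of a workload property file from operation proportions."""
    unknown = set(proportions) - set(PROPORTION_KEYS)
    if unknown:
        raise ValueError(f"unknown proportion keys: {sorted(unknown)}")
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("operation proportions must sum to 1")
    lines = [
        f"recordcount={recordcount}",
        f"operationcount={operationcount}",
        f"workload={CORE_WORKLOAD_CLASS}",
        f"requestdistribution={distribution}",
    ]
    for key in PROPORTION_KEYS:
        lines.append(f"{key}={proportions.get(key, 0)}")
    return "\n".join(lines) + "\n"


# The write-heavy workloada of this test: 90% insert, 5% read, 5% update.
# operationcount and "zipfian" are placeholders, not values from the article.
workloada_text = render_workload(
    60_000_000, 1_000_000, "zipfian",
    insertproportion=0.90, readproportion=0.05, updateproportion=0.05,
)
print(workloada_text)
```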



3. Test each scenario with different numbers of records. Each test has two phases:


1) Load. Load the data. The command is:

./bin/ycsb load mongodb -threads 100 -s -P workloads/workloada -p mongodb.url=mongodb://mongos:28000/ycsb?w=0 > OutputLoad.txt

When the load command is executed, the recordcount parameter is the one that takes effect; for example, recordcount=60000000 loads 60 million records. recordcount does not determine the amount of work when the run command is executed. In the URL, mongos stands for the IP address of the mongos instance in the cluster.


2) Run. After the data has been loaded, run the test for each scenario.

For example, for the workloada scenario, whose file is located in the workloads directory:

./bin/ycsb run mongodb -threads 100 -s -P workloads/workloada -p mongodb.url=mongodb://mongos:28000/ycsb?w=0 > OutputRun.txt

Before each load, delete the data left over from the previous test, including the data on every shard, on the config servers, and on mongos.
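The load and run commands above differ only in the phase argument, so a test script can build both from one helper. The following is a minimal sketch; the function name and its defaults are hypothetical and simply mirror the flags used in this article (100 threads, port 28000, database ycsb, write concern w=0).

```python
import shlex


def ycsb_command(phase, workload, mongos_host, port=28000, threads=100,
                 database="ycsb", write_concern=0):
    """Build the argument list for the load or run phase of a YCSB MongoDB test."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    url = f"mongodb://{mongos_host}:{port}/{database}?w={write_concern}"
    return [
        "./bin/ycsb", phase, "mongodb",
        "-threads", str(threads),
        "-s",                          # print status while the phase runs
        "-P", f"workloads/{workload}",
        "-p", f"mongodb.url={url}",
    ]


load_cmd = ycsb_command("load", "workloada", "mongos")
run_cmd = ycsb_command("run", "workloada", "mongos")
print(shlex.join(load_cmd))
print(shlex.join(run_cmd))
```

The lists can be passed to subprocess.run with stdout redirected to a file such as OutputLoad.txt or OutputRun.txt.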


3. Test system architecture

The three config server instances of the cluster are deployed on the configs server. YCSB, the mongos instance, shard1, and shard2 are each deployed on a server of their own. shard1 and shard2 are standalone MongoDB instances, not replica sets. The MongoDB version is 2.6.


4. Server configuration

Server   OS            CPU                                               RAM
YCSB     Ubuntu 14.04  Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz, 4 cores  1G
mongos                 Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 1 core      8G
Shards   Red Hat 4.4   Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 1 core      26d
Configs  Ubuntu 14.04  Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz, 1 core   1G

5. Test results

Table 1: throughput (ops/sec) of each scenario with one shard at 1 million, 10 million, 60 million, and 100 million records.

Table 2: throughput (ops/sec) of each scenario with two shards at 1 million, 10 million, and 60 million records.

Data volume (million) Workloada Workloadb Workloadc Workloadd Workloadf Workloadg
1 4878 7536 7885 2131 5986
10 4343 7442 6996 2164 6119
60 1669 7242 7847 7810 2577 6054
100 157 7333 6796 7766 2082 4389

Table 1


Data volume (million) Workloada Workloadb Workloadc Workloadd Workloadf Workloadg
1 6462 7416 7518 7633 2622 6777
10 5826 8198 7664 2093 7376
60 5662 7707 7546 2181 6540







Table 2
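YCSB ends each phase with summary lines such as `[OVERALL], Throughput(ops/sec), ...`. Assuming the throughput values in the tables were read from those lines in the redirected output files, a small parser like the sketch below could collect them. The sample text and its numbers are made up for illustration and are not results from this test.

```python
def parse_ycsb_summary(text):
    """Return {(section, metric): value} for summary lines of the form
    '[SECTION], Metric, value'; any other line is ignored."""
    metrics = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue
        try:
            value = float(parts[2])
        except ValueError:
            continue
        metrics[(parts[0].strip("[]"), parts[1])] = value
    return metrics


# Made-up sample in the shape of a YCSB summary, for illustration only.
SAMPLE_OUTPUT = """\
some unrelated status line
[OVERALL], RunTime(ms), 100000.0
[OVERALL], Throughput(ops/sec), 5000.0
[READ], Operations, 475000
"""

summary = parse_ycsb_summary(SAMPLE_OUTPUT)
print(summary[("OVERALL", "Throughput(ops/sec)")])
```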


6. Test results Analysis

Figure 1: with one shard, the throughput of each scenario as the number of records changes.

Figure 2: with two shards, the throughput of each scenario as the number of records changes.

Figure 3: write-heavy scenario (workloada), throughput for different numbers of shards as the number of records changes.

Figure 4: read-heavy scenario (workloadb), throughput for different numbers of shards as the number of records changes.

Figure 1


Figure 2


Figure 3


Figure 4

As Figure 1 shows, in workloada MongoDB's write performance is the first to hit a bottleneck: it drops quickly as the number of records grows, while read performance changes very little.

Figures 3 and 4 show that once MongoDB hits a write bottleneck, adding a shard greatly increases write performance and slightly increases read performance.
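The size of the write improvement can be read directly from the workloada columns of Table 1 and Table 2. The short Python sketch below computes the two-shard to one-shard throughput ratio at each data volume the two tables share; the variable and function names are only illustrative.

```python
# workloada throughput (ops/sec) keyed by data volume in millions of records,
# copied from Table 1 (one shard) and Table 2 (two shards).
ONE_SHARD = {1: 4878, 10: 4343, 60: 1669}
TWO_SHARDS = {1: 6462, 10: 5826, 60: 5662}


def speedup(before, after):
    """Return after/before for every data volume present in both tables."""
    return {volume: after[volume] / before[volume]
            for volume in sorted(before) if volume in after}


ratios = speedup(ONE_SHARD, TWO_SHARDS)
for volume, ratio in ratios.items():
    print(f"{volume:>3} million records: {ratio:.2f}x")
```

The ratio is largest at 60 million records, the point where the single shard had already slowed down sharply.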

The tests did not reach MongoDB's read performance bottleneck, either because the data volume was not large enough or because YCSB itself became the bottleneck.

7. Conclusion

1. MongoDB's read performance is very high, which makes it suitable for read-heavy scenarios.

2. Adding shards can greatly increase the write performance of a MongoDB cluster and partly increase its read performance.

3. Advantages of MongoDB compared with relational databases:

      • Document-based database with JSON-style document storage: clear structure, no ORM needed.

      • Replication and high availability, with easy expansion.

      • Automatic sharding.

      • A document-based query language that provides a reasonable level of query capability.

      • Any property can be indexed.

So for scenarios without complex queries, MongoDB can serve as an alternative to MySQL and improve the read and write capacity of the database. Among big data scenarios, content management and delivery platforms, user data management centers, log platforms, and similar systems are well suited to MongoDB.

