MongoDB Performance Test and Python test code

Source: Internet
Author: User

Recently I took part in a company project that needs to respond quickly to large-scale queries on an online platform. The estimated total data volume is about 2-3 billion records. Database concurrency is about 1500 per second now and about 3000 per second after one year. After a difficult choice between Redis and MongoDB, I decided to use MongoDB, mainly for its horizontal scalability and for Map/Reduce on GridFS. The estimated number of concurrent queries per second during peak hours after the project launches is between [range missing in source].
In fact, I personally prefer Redis: its concurrent query capability and its speed, which surpasses memcached, are very exciting. However, its persistence and cluster scalability do not fit our business needs, so in the end I chose MongoDB.
The code and results of the MongoDB test follow. Although the company uses CentOS, I am a FreeBSD supporter, so I ran the tests on both FreeBSD and CentOS.
The database write program was copied from the Internet; the query program I wrote myself.
Database write program
#!/usr/bin/env python
# Written for Python 2 and the legacy pymongo Connection class
# (pymongo 3.0 replaced Connection with MongoClient).

from pymongo import Connection
import time, datetime

# Local mongod on the default port (assumed; adjust for your server)
connection = Connection('127.0.0.1', 27017)
db = connection['hawaii']

# Time recorder: prints how long the wrapped function took
def func_time(func):
    def _wrapper(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        print func.__name__, 'run:', time.time() - start
    return _wrapper

@func_time
def insert(num):
    posts = db.userinfo
    for x in range(num):
        post = {"_id": str(x),
                "author": str(x) + "Mike",
                "text": "My first blog post!",
                "tags": ["mongodb", "python", "pymongo"],
                "date": datetime.datetime.utcnow()}
        posts.insert(post)

if __name__ == "__main__":
    # Set the loop to 5 million iterations
    num = 5000000
    insert(num)
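The func_time recorder above prints the elapsed time but throws away whatever the wrapped function returns. A minimal Python 3 sketch of the same idea follows, assuming functools.wraps and time.perf_counter are acceptable substitutes for the original approach. The timings dictionary and the busy_sum workload are hypothetical names added for illustration; they are not part of the original scripts.

```python
import functools
import time

# Hypothetical store for elapsed times, keyed by function name.
timings = {}

def func_time(func):
    """Time one call of func, print the elapsed seconds, and pass the result through."""
    @functools.wraps(func)
    def _wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        timings[func.__name__] = elapsed
        print(func.__name__, "run:", elapsed)
        return result
    return _wrapper

@func_time
def busy_sum(num):
    # Stand-in workload so the sketch runs without a database.
    return sum(range(num))

if __name__ == "__main__":
    busy_sum(100000)
```

Keeping the elapsed time in a dictionary as well as printing it makes it easier to compare runs afterwards, for example insert against delete.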
Query program
#!/usr/bin/env python
# Written for Python 2 and the legacy pymongo Connection class.

from pymongo import Connection
import time, datetime
import random

# Local mongod on the default port (assumed; adjust for your server)
connection = Connection('127.0.0.1', 27017)
db = connection['hawaii']

# Time recorder: prints how long the wrapped function took
def func_time(func):
    def _wrapper(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        print func.__name__, 'run:', time.time() - start
    return _wrapper

# @func_time
def randy():
    rand = random.randint(1, 5000000)
    return rand

@func_time
def mread(num):
    find = db.userinfo
    for i in range(num):
        rand = randy()
        # Random number query
        find.find({"author": str(rand) + "Mike"})

if __name__ == "__main__":
    # Set the loop to 1 million iterations
    num = 1000000
    mread(num)
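One caveat about the query loop above: in pymongo, find() returns a cursor object, and as the pymongo documentation describes it, the query is sent to the server only once the cursor is iterated. A loop that calls find() and discards the cursor may therefore be timing cursor construction rather than document retrieval. The sketch below shows one way to force each lookup to complete, by calling find_one() and counting the hits. The names timed_reads and FakeCollection are hypothetical. FakeCollection is an in-memory stand-in so the example runs without a server; the assumption is that a real pymongo collection such as db.userinfo would be passed in its place.

```python
import random
import time

class FakeCollection:
    """In-memory stand-in (hypothetical) that mimics find_one() on the 'author' field."""
    def __init__(self, num):
        self._by_author = {
            str(x) + "Mike": {"_id": str(x), "author": str(x) + "Mike"}
            for x in range(num)
        }

    def find_one(self, query):
        return self._by_author.get(query["author"])

def timed_reads(collection, num_queries, num_docs, seed=None):
    """Run num_queries random lookups and return (hits, elapsed_seconds)."""
    rng = random.Random(seed)
    hits = 0
    start = time.perf_counter()
    for _ in range(num_queries):
        # The write program stores ids 0 .. num_docs - 1.
        rand = rng.randint(0, num_docs - 1)
        doc = collection.find_one({"author": str(rand) + "Mike"})
        if doc is not None:
            hits += 1
    return hits, time.perf_counter() - start

if __name__ == "__main__":
    hits, elapsed = timed_reads(FakeCollection(1000), 10000, 1000, seed=1)
    print("hits:", hits, "run:", elapsed)
```

Against a real server, a lookup on author would probably also need an index on that field (for example through the collection's create_index method) to avoid scanning the collection on every query. That is an assumption about the deployment, not something the original test states.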
Delete program
#!/usr/bin/env python
# Written for Python 2 and the legacy pymongo Connection class.

from pymongo import Connection
import time, datetime

# Local mongod on the default port (assumed; adjust for your server)
connection = Connection('127.0.0.1', 27017)
db = connection['hawaii']

# Time recorder: prints how long the wrapped function took
def func_time(func):
    def _wrapper(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        print func.__name__, 'run:', time.time() - start
    return _wrapper

@func_time
def remove():
    posts = db.userinfo
    print 'count before remove:', posts.count()
    # An empty filter removes every document in the collection
    posts.remove({})
    print 'count after remove:', posts.count()

if __name__ == "__main__":
    remove()

Test results

System    Insert 5 million   Random query 1 million   Delete 5 million   CPU usage
CentOS    394 s              28 s                     224 s              25-30%
FreeBSD   431 s              18 s                     278 s              20-22%
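To set the table against the per-second targets mentioned at the start, the elapsed times can be converted to operations per second. The sketch below only does that arithmetic on the figures in the table. The ops_per_second helper and the layout of the results dictionary are illustrative names, not part of the original test scripts.

```python
def ops_per_second(operations, seconds):
    """Average throughput for a run of `operations` completed in `seconds`."""
    return operations / seconds

# (operation count, elapsed seconds) taken from the result table.
results = {
    "CentOS": {
        "insert": (5000000, 394),
        "query": (1000000, 28),
        "delete": (5000000, 224),
    },
    "FreeBSD": {
        "insert": (5000000, 431),
        "query": (1000000, 18),
        "delete": (5000000, 278),
    },
}

if __name__ == "__main__":
    for system, runs in results.items():
        for name, (count, secs) in runs.items():
            print(system, name, round(ops_per_second(count, secs)), "ops/s")
```

These are averages over a whole run from a single client process, so they describe what one script achieved rather than the ceiling of the server.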


CentOS won on insertion and deletion; FreeBSD, helped by UFS2, won on reads. Since the machine will be used as a query server, fast reads are an advantage, but I am not the boss and the decision is not mine, so we will end up on CentOS anyway.
Throughout the tests I monitored both systems with mongostat, and the operation rates were similar on the two. I also tested inserting and querying at the same time, with similar results: the combined rate was 15000-25000 per second. The performance is still very good.
However, insert performance does drop noticeably at large data volumes. On CentOS I tested inserting 50 million records, which took nearly 2 hours (more than 6300 seconds). The insertion rate was almost 50% slower than in the 5 million test. The query speed, however, was almost the same.
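The size of the slowdown can be checked from the two CentOS insert runs. The sketch below works through the arithmetic, taking 6300 seconds as a lower bound for the 50 million run (the text says it took more than that) and 394 seconds for the 5 million run from the result table. The function name insert_rate is illustrative.

```python
def insert_rate(records, seconds):
    """Average inserts per second over a whole run."""
    return records / seconds

small = insert_rate(5000000, 394)     # 5 million run from the result table
large = insert_rate(50000000, 6300)   # 50 million run, using the 6300 s lower bound

# Fractional drop in throughput, and growth in the time spent per record.
rate_drop = 1 - large / small
time_per_record_growth = small / large - 1

if __name__ == "__main__":
    print("rate drop: %.0f%%" % (rate_drop * 100))
    print("extra time per record: %.0f%%" % (time_per_record_growth * 100))
```

With these inputs the throughput falls by roughly 37% and each record takes roughly 60% longer, so a figure of almost 50% sits between the two measures. Because 6300 seconds is only a lower bound, the real slowdown would be somewhat larger.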
The test results are provided as a reference for beginners.
However, this test is not fair: the FreeBSD machine has the weaker configuration.
CentOS: 16 GB memory, two Xeon 5606 CPUs (8 cores in total), Dell branded machine.
FreeBSD: 8 GB memory, one 4-core Xeon 5506, unbranded 1U server.
In the same environment, I think FreeBSD would perform better.
