Preparing for your first MongoDB deployment: capacity planning and monitoring

Source: Internet
Author: User

If you have finished developing your new MongoDB application and are ready to deploy it to production, you and your operations team need to discuss some key questions:

What is the best deployment practice?

What key metrics do we need to monitor to ensure that the application meets the service levels it needs?

How can I determine when to add a shard?

What tools are available to back up and restore the database?

How can I secure access to all the new real-time big data?

This article covers hardware selection, scaling, high availability (HA), and monitoring. Before getting into the details, let's first address one of the most common questions:

What is the difference between deploying MongoDB and deploying an RDBMS?

You will find that MongoDB, as a document database, shares many concepts, operations, policies, and processes with the relational databases you already know. Processes and best practices for monitoring, indexing, tuning, and backup carry over to MongoDB. If you want to start training on your own, MongoDB University offers free online courses for developers and DBAs.

System performance and capacity planning are two important topics that any deployment needs to address, whether it uses an RDBMS or a NoSQL database. As part of planning, you should establish baselines for data volume, system load, performance (throughput and latency), and capacity utilization. These baselines should reflect the workloads you expect the database to handle in the production environment, and they should be revised periodically as user numbers, application functionality, performance SLAs, or other factors change.

Baselines help you recognize when the system is operating as designed and when problems that may affect the user experience or other critical system factors begin to emerge.

The key deployment elements, including hardware, scaling, and HA, are discussed below, along with what you should monitor to maintain the best system performance.

Know your working set

When you plan your hardware budget for a MongoDB deployment, RAM should be at or near the top of the list.

MongoDB makes extensive use of RAM to deliver low-latency database operations. In MongoDB, all data is read and manipulated through memory-mapped files. Reading data from memory is measured in nanoseconds, while reading from disk is measured in milliseconds, so reading from memory is roughly 100,000 times faster than reading from disk.

The set of data and indexes accessed most frequently during normal operation is called the working set, which ideally should fit in RAM. The working set may be a small part of the entire database, such as the application data associated with the most recent events or the most popular products.

A page fault occurs when MongoDB attempts to access data that has not been loaded into RAM. If there is free memory, the operating system locates the pages on disk and loads them directly into memory. However, if there is no free memory, the operating system must first write a page in memory out to disk and then read the requested page into memory. This process is slower than accessing data that is already in memory.
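
The arithmetic behind that ratio can be sketched as follows. The two latency figures are illustrative assumptions (about 100 ns for a memory read and about 10 ms for a random read from a spinning disk), not measurements, and `average_read_ns` is a hypothetical helper.

```python
# Illustrative latencies only; real values depend on the hardware.
MEMORY_READ_NS = 100          # assumed: about 100 ns per read from RAM
DISK_READ_NS = 10_000_000     # assumed: about 10 ms per random disk read


def average_read_ns(hit_ratio: float) -> float:
    """Average read latency when `hit_ratio` of reads are served from RAM."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * MEMORY_READ_NS + (1.0 - hit_ratio) * DISK_READ_NS


# How many times slower a disk read is than a memory read.
speed_ratio = DISK_READ_NS // MEMORY_READ_NS

# Average latency when every read hits RAM, and when 1% of reads go to disk.
all_in_ram = average_read_ns(1.0)
one_percent_misses = average_read_ns(0.99)
```

Under these assumed numbers, sending even 1% of reads to disk makes the average read roughly a thousand times slower than when everything is served from RAM, which is why fitting the working set in memory matters so much.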

Some operations may inadvertently evict a large part of the working set from memory, which can seriously hurt performance. For example, a query that scans every document in a database larger than the server's RAM causes those documents to be read into memory and the working set to be evicted. Defining appropriate indexes for your queries during the schema design phase greatly reduces this risk. MongoDB's explain operation reports the query plan and which indexes a query uses.

The MongoDB serverStatus command includes a useful output: the workingSet document, which provides an estimated size of the working set for a MongoDB instance. The operations team can track the number of pages the instance accessed over a given period, along with the elapsed time between the oldest and newest documents in the working set. By tracking these metrics, you can detect when the working set is approaching the current RAM limit and take action proactively to keep the system scalable.

MongoDB Management Service (MMS) and mongostat can help users monitor memory usage; both are discussed in detail below.
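
As a rough sketch of how explain output can be checked for a full collection scan: the code below assumes the layout used by recent server versions, where the plan sits under `queryPlanner.winningPlan` and each stage has a `stage` field with optional `inputStage` or `inputStages` children. Older releases report plans differently, so treat the layout as an assumption. The helper name and both sample plans are hypothetical and hand-written, not captured from a real server.

```python
def uses_collection_scan(plan: dict) -> bool:
    """Return True if any stage in the given plan tree is a COLLSCAN."""
    if plan.get("stage") == "COLLSCAN":
        return True
    children = []
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    children.extend(plan.get("inputStages", []))
    return any(uses_collection_scan(child) for child in children)


# Hand-written sample: a query with no usable index scans the collection.
unindexed_explain = {
    "queryPlanner": {"winningPlan": {"stage": "COLLSCAN", "direction": "forward"}}
}

# Hand-written sample: a query served by an index scan followed by a fetch.
indexed_explain = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "status_1"},
        }
    }
}

unindexed_scans = uses_collection_scan(unindexed_explain["queryPlanner"]["winningPlan"])
indexed_scans = uses_collection_scan(indexed_explain["queryPlanner"]["winningPlan"])
```

A check like this could run against the queries an application issues most often, so that a missing index is caught before it starts pushing the working set out of memory.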
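
A minimal sketch of how the working set estimate might be compared with installed RAM. It assumes the working set document exposes a `pagesInMemory` count, as it did in the server releases that shipped this estimator, and that a page is 4096 bytes. The sample document, the 8 GiB RAM figure, and the 80% warning ratio are all made up for illustration.

```python
PAGE_SIZE_BYTES = 4096  # assumed OS page size


def working_set_bytes(working_set: dict) -> int:
    """Estimate the working set size in bytes from a page count."""
    return working_set["pagesInMemory"] * PAGE_SIZE_BYTES


def near_ram_limit(working_set: dict, ram_bytes: int, warn_ratio: float = 0.8) -> bool:
    """Return True when the estimate reaches `warn_ratio` of installed RAM."""
    return working_set_bytes(working_set) >= warn_ratio * ram_bytes


# Hand-written sample, not output captured from a real server.
sample_working_set = {"pagesInMemory": 1_500_000, "overSeconds": 900}
ram_bytes = 8 * 1024 ** 3  # hypothetical server with 8 GiB of RAM

estimate = working_set_bytes(sample_working_set)
warn = near_ram_limit(sample_working_set, ram_bytes)
```

Recording the estimate at regular intervals gives a trend line, which is more useful than a single reading for deciding when to add RAM or scale out.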

Storage and disk I/O

MongoDB does not require shared storage (for example, a storage area network). MongoDB can use locally attached storage and solid-state drives (SSD).

Most disk access patterns in MongoDB are not sequential, so customers can gain significant performance benefits by using SSDs. We have observed good results and strong performance with both SATA SSDs and PCIe SSDs. Because of MongoDB's access pattern, commodity SATA spinning drives perform comparably to higher-cost spinning drives, so the budget is better spent on more RAM or on SSDs than on more expensive spinning drives.

While data files benefit from SSDs, MongoDB's journal files are a good candidate for fast conventional disks because their writes are highly sequential.

Most MongoDB deployments should use RAID-10. RAID-5 and RAID-6 do not provide sufficient performance. RAID-0 provides good write performance, but has limited read performance and insufficient fault tolerance. MongoDB can provide strong data availability through replica sets (discussed below), which users should consider together with RAID and other factors to meet their availability SLA.

Although a MongoDB system should be designed so that its working set fits in memory, disk I/O is still a key performance consideration. MongoDB periodically flushes writes to disk and commits them to the journal, so the underlying disk subsystem can become overwhelmed under heavy load. The iostat command can be used to reveal high disk utilization and excessive write queues.
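
One way such a check could be automated is sketched below. The header and device row are an abbreviated, made-up sample in the style of extended iostat output; real column sets differ between sysstat versions, which is why the code accepts either `aqu-sz` or `avgqu-sz` for the queue length. Both thresholds are hypothetical and should be tuned against your own baseline.

```python
def parse_device_line(header: str, row: str) -> dict:
    """Pair the column names from the header with the values in a device row."""
    names = header.split()
    values = row.split()
    stats = {"device": values[0]}
    for name, value in zip(names[1:], values[1:]):
        stats[name] = float(value)
    return stats


def looks_saturated(stats: dict, util_limit: float = 90.0, queue_limit: float = 8.0) -> bool:
    """Flag a device whose utilization or queue length exceeds the limits."""
    queue = stats.get("aqu-sz", stats.get("avgqu-sz", 0.0))
    return stats.get("%util", 0.0) >= util_limit or queue >= queue_limit


# Abbreviated, hand-written sample; not captured from a real system.
HEADER = "Device r/s w/s rkB/s wkB/s aqu-sz %util"
BUSY_ROW = "sda 12.0 850.0 96.0 54400.0 14.2 98.7"
IDLE_ROW = "sdb 3.0 20.0 24.0 160.0 0.1 4.5"

busy = parse_device_line(HEADER, BUSY_ROW)
idle = parse_device_line(HEADER, IDLE_ROW)
```

Here `looks_saturated(busy)` would flag the first device and leave the second alone.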

CPU selection: clock speed or cores?

MongoDB performance is usually not CPU-bound. Because MongoDB rarely encounters workloads that can take advantage of a large number of cores, servers with faster clock speeds are preferable to servers with more cores but slower clock speeds.

Regardless of the system, measuring CPU utilization is very important. If you observe high CPU utilization without other symptoms such as disk saturation or page faults, something unusual may be happening in the system. For example, a MapReduce job stuck in an infinite loop, or a query that sorts and filters a large number of documents in the working set without a good index, can cause a spike in CPU utilization without causing disk problems or page faults. The tools used to monitor CPU utilization are described below.

Scaling the database: when and how?

MongoDB scales horizontally through a technique called sharding. Sharding distributes data across multiple physical partitions, called shards. It allows MongoDB deployments to overcome the hardware limitations of a single server, including RAM and disk I/O bottlenecks, without adding complexity to the application.
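
The reasoning in the paragraph above can be written down as a small triage rule. This is only a sketch: the function name, the returned labels, and all three thresholds are hypothetical, and the inputs are assumed to come from whatever monitoring tool you already use.

```python
def triage_cpu(cpu_pct: float, disk_util_pct: float, page_faults_per_sec: float,
               cpu_high: float = 85.0, disk_high: float = 90.0,
               faults_high: float = 100.0) -> str:
    """Classify a high-CPU reading by the symptoms that accompany it."""
    if cpu_pct < cpu_high:
        return "cpu-normal"
    if disk_util_pct >= disk_high or page_faults_per_sec >= faults_high:
        # High CPU together with disk or memory pressure points at I/O first.
        return "io-or-memory-pressure"
    # High CPU alone: look for a runaway job or an unindexed sort or filter.
    return "suspect-query-or-job"


quiet = triage_cpu(cpu_pct=30.0, disk_util_pct=20.0, page_faults_per_sec=2.0)
io_bound = triage_cpu(cpu_pct=95.0, disk_util_pct=97.0, page_faults_per_sec=5.0)
cpu_only = triage_cpu(cpu_pct=95.0, disk_util_pct=15.0, page_faults_per_sec=1.0)
```

The third case is the one the text warns about: the CPU is busy while disks and memory look healthy, so the next step is to inspect running operations and query plans.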
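
The idea of spreading data across shards can be illustrated with a toy router. This is deliberately simplified and is not how MongoDB routes documents: MongoDB partitions data by shard key, while this sketch just hashes a key and takes it modulo the shard count. The function name, the three-shard setup, and the `user<N>` keys are all hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number using a stable hash (toy example)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


SHARD_COUNT = 3  # hypothetical cluster size

# Count how many of 1,000 made-up keys land on each shard.
distribution = Counter(shard_for(f"user{i}", SHARD_COUNT) for i in range(1000))
```

Each shard ends up holding only a share of the keys, so each server needs RAM and disk throughput for only its share of the working set.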
