Analysis on the hardware bottleneck of MySQL database performance optimization _ MySQL-mysql tutorial

Source: Internet
Author: User
Tags mysql tutorial
Hardware bottleneck analysis of MySQL database performance optimization bitsCN.com

In the past communication with many people, we found that when talking about the hardware-based database performance bottleneck analysis, it is often misunderstood that the use of more powerful hosts or storage to replace existing devices.

I personally think that there may be a very large misunderstanding. When we talk about hardware-based optimization, we should not just divide the hardware used by the database into two parts: host and storage, but further break down the hardware, at least it should be decomposed into the following categories:

Host

1. CPU: only the computing speed can be determined. even the computing speed depends on the bus bandwidth between the memory and the memory speed.

2. memory: the size determines the amount of data that can be cached and the access speed of hotspot data.

3. Disk:

3.1: determines how much data you can store.

3.2 rotation speed: determines the delay time of each IO request, that is, the IOPS and MBPS we often call.

3.3 disks: the number of disks is determined.

3.4 type

3.4.1 machinery: SAS or SATA or FC?

3.4.2 SSD: disk or PCI card?

4. RAID card:

4.1 cache: the cache size has a significant impact on the data writing speed, and the use of policies also directly affects the IO efficiency.

4.2 Battery: the battery charge/discharge policy affects transient IO fluctuations.

5. Others, such as bus bandwidth, determine the data transmission efficiency between the CPU and memory, which is often less concerned, but may also cause bottlenecks.

Storage

1. memory: storage devices also have memory to store hotspot data accessed by front-end hosts. The memory size also determines the access speed of hotspot data.

2. Disk: similar to the host disk.

3. line/loop bandwidth: The Loop bandwidth must be able to match the disk bandwidth, at least not less than the capacity that the disk can output. Otherwise, you will want to be congested on the high-speed toll station and wait for the vehicle to pass.

Network

1. latency: the latency varies with network devices. For OLTP devices, the smaller the latency, the better.

2. throughput: for a database cluster, the network throughput between nodes may directly determine the processing capacity of the cluster.

3. iops.

The processing capability provided by the hardware must be determined by the multiple aspects listed above (here only the main part, and there may be others). a bottleneck occurs in any aspect, this can lead to poor overall performance, that is, the principle of barrel.

In the past, performance bottlenecks are most likely to occur in the following aspects:

I/O resource bottlenecks

When I/O resource bottlenecks occur, the main cause is that the iowait on the server is very high, the usr accounts for a relatively small proportion, the system response is slow, and the database usually has a large number of execution state sessions.

When we encounter IO resource bottlenecks, we can use the following hardware-level optimization solutions:

1. increase memory to increase the amount of data that can be cached: whether this solution works depends on the total amount of hotspot data in the system. after all, the memory cost is relatively high, and the amount of memory that a single device can manage is limited.

2. improve the I/O capability of the underlying storage device: as described earlier in this article, the improvement of the underlying storage capacity also depends on multiple aspects, including the capacity of a single disk and the number of disks, it is also limited by the bandwidth between the storage itself and the storage and the host. Therefore, we need to consider these three factors while optimizing the underlying storage capabilities, so as to balance the overall analysis with local data.

CPU resource bottlenecks

When CPU resources encounter bottlenecks, the main performance is that usr accounts for a high proportion of server CPU utilization, while iowait is very small. Most of these problems occur when the data volume is not too large and there is enough memory to cache the data. It is also the database performance bottleneck facing most small and medium-sized websites.

When there is a CPU resource bottleneck, it may be caused by two reasons:

1. rely too much on the database for logical operations: in this case, the best optimization method is to migrate the operations from the database end to the application end as much as possible, reducing the computing workload of the database host. After all, the cost of resizing stateful system devices (databases) is much higher than that of stateless system devices (applications ). Of course, if you have to solve the problem from the database hardware, you can only replace the old host by increasing the number of CPU devices (if supported) or by using a host with higher CPU capabilities.

2. database logic IO is too large: In this case, what you can do from the hardware point of view is to improve the CPU processing capability. Either increase the number of CPUs (if supported) or change to a host with more powerful CPU. However, before that, we recommend that you first try to optimize it from the application perspective to see if it can minimize unnecessary requests or reduce the data volume of each request. At the same time, we optimize and adjust the Schema structure and index from the database perspective, so as to minimize the amount of data to be retrieved for a request, so as to reduce the logic I/O.

Network resource bottlenecks

In general, the network interaction between applications and databases requires a lot of resources, so the bottleneck in this environment may not be great. However, in a distributed cluster environment, the network environment between database nodes is often referred to as a system bottleneck.

In common scenarios such as MySQL Cluster or Oracle RAC, the advantages and disadvantages of the data exchange network environment between nodes may directly affect the overall processing capability of the system, because there will be a large amount of data exchange between nodes, they all rely on network transmission.

In such a scenario, a cheaper solution is to replace the commonly used Gigabit switch with a 10-ge switch to improve network processing capabilities and reduce network latency. However, this solution mainly improves the throughput. the latency improvement may not necessarily meet certain very high requirements. At this time, we should consider a more expensive but more efficient solution: replace the common switch with Infiniband to greatly reduce the data exchange delay caused by the network.

The above only analyzes the bottlenecks of major types of hardware resources and provides possible solutions. You are welcome to discuss them together, or include the "hot" SSD. I may write another article about SSD later. I have been familiar with SSD in China and officially applied it to the core product environment, but there may be fewer people than I did.

BitsCN.com

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.