Details about database paging, memory, and I/O latency

I wrote an article on AIX tuning a few years ago. Now that AIX 7 is available, it is worth revisiting the basic tuning that should be performed on AIX systems. Several Technology Levels (TLs) have been released since then, and some of the recommendations have changed. In this article, I provide tuning information for tunables in AIX 5.3, 6.1, and 7.

I focus on I/O, memory, and the network. By default, AIX 6 and 7 do a good job of memory tuning, and only a few minor adjustments are needed; AIX 5.3, however, requires more tuning in this area. Figure 1 shows the different tunables and their default settings; the fourth column lists the recommended values for these settings on the three latest TL versions.

Figure 1. Tunables and their default settings

Keep in mind that the new default values for the memory tunables are set automatically on a fresh install of AIX 6 or 7. If the system is migrated from AIX 5.3, however, any tunables set in AIX 5.3 are carried over by the migration. Before migrating, record all modified tunables (keep a copy of /etc/tunables/nextboot) and restore them to their defaults. After the migration, check nextboot and make sure it is empty. Now let's discuss the tunables that need to be changed for AIX 6 or 7.
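As a minimal sketch of those pre-migration steps (the backup file name is just an example; the paths and commands are standard AIX):

# Save a copy of the current tunables before the migration
cp /etc/tunables/nextboot /etc/tunables/nextboot.aix53
# Reset all tunables to their default values at the next reboot
tundefault -r
# After the migration, confirm nextboot carries no stale settings
cat /etc/tunables/nextboot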

Paging space

We recommend configuring multiple paging spaces of the same size on different hard disks (hdisks). All paging spaces should be mirrored or placed on a RAID (1 or 5) storage area network (SAN). Unless the database requires it, paging space generally does not need to be twice the memory size. I used to run large Oracle databases on AIX systems with 250 GB of memory and three 24 GB paging spaces. The key is to use concurrent I/O (CIO) and similar technologies to avoid paging; the paging space is there as insurance in case paging does occur.

By default, AIX creates one paging space (hd6) in rootvg, and it is too small. If rootvg is mirrored, that paging space is mirrored as well. I usually add additional paging spaces on several dedicated logical unit numbers (LUNs) from the SAN. Do not add paging space to the internal disk (or SAN LUN) that holds the current rootvg paging space: configuring multiple paging spaces on the same hdisk slows paging down.

When you build a Virtual I/O Server (VIOS), two paging spaces are configured automatically, both on hdisk0: hd6 at 512 MB and paging00 at 1,024 MB. I always deactivate and delete paging00 and grow hd6 to 4,096 MB. As mentioned above, two paging spaces on the same hdisk is a bad configuration.
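A minimal sketch of that VIOS cleanup, assuming a 128 MB physical partition size (so 28 additional partitions take hd6 from 512 MB to 4,096 MB; adjust the count to your partition size):

# Deactivate and remove the second paging space on hdisk0
swapoff /dev/paging00
rmps paging00
# Grow hd6 by 28 logical partitions (28 x 128 MB = 3,584 MB)
chps -s 28 hd6
# Verify the result
lsps -a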

Page stealing method

In AIX 5.3, page_steal_method defaults to 0. This setting affects how the least recently used daemon (LRUD) scans for pages to free. Setting lru_file_repage=0 strongly advises the LRUD not to steal executable code pages and to always try to steal file system (persistent) pages instead. Stealing a persistent page is much cheaper than stealing a working storage page, because the latter causes a page to be swapped out and later swapped back in. Assume a system with 100 GB of memory and five memory pools: the memory is divided into five pools of about 20 GB, and each LRUD handles roughly 20 GB (this is a greatly simplified description). Based on the numclient value in Figure 2, we can assume that about 45% of memory, roughly 45 GB, is used for the file system; the other 55 GB is working storage.

Figure 2. vmstat -v output

With page_steal_method set to 0, the LRUDs have to scan all the memory pages they control when looking for free pages, even though they will probably free only persistent pages. With page_steal_method=1, the LRUDs use a list-based page management scheme instead: they divide memory into a list of persistent pages and a list of working storage pages, and when searching for pages that can be freed from the file system cache, they search only the persistent-page list. For the example in Figure 2, this should more than double the speed of scanning for freeable pages, which reduces overhead. The scan and free rates appear in the output of "vmstat -I 2 2".
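On AIX 5.3, where these are ordinary tunables, a hedged sketch of applying both settings might look like this (page_steal_method only takes effect after a reboot):

# Advise the LRUD to prefer persistent (file system) pages
vmo -p -o lru_file_repage=0
# Switch to list-based page scanning (takes effect at the next reboot)
vmo -r -o page_steal_method=1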

Memory and I/O buffers

Several commands are useful when working out the best memory settings, especially vmstat -v. Figure 2 shows partial output of vmstat -v.

There are two types of pages in memory: persistent pages (associated with file systems) and working storage or dynamic pages (which include executable code and its work areas). If a persistent page is stolen, it does not need to be paged out unless it was modified (in which case it is written back to the file system). If a working storage page is stolen, it must first be written to the paging data set and then read back from it the next time it is needed; this is a very expensive operation.

Setting minperm%=3 and lru_file_repage=0 strongly advises the LRUD to always try to steal persistent pages whenever the file system is using more than 3% of memory. The LRUD ignores the maximum settings in most cases, unless they are used to limit how much memory the file systems can use. maxperm% covers all persistent pages, including the Journaled File System (JFS), Network File System (NFS), Veritas File System (VxFS), and Enhanced Journaled File System (JFS2). maxclient% is a subset that covers only NFS and JFS2 file systems. maxperm% is a soft limit, while maxclient% is a hard limit (and cannot exceed maxperm%). Because new file systems are usually JFS2, both maximum settings should be kept at 90% to avoid unexpectedly restricting the memory that file systems can use.
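A minimal sketch of those settings, using the values recommended above (on AIX 6 and 7 some of these are restricted tunables, so vmo asks for confirmation, and the defaults are usually already correct):

# Steal persistent pages once the file system uses more than 3% of memory,
# and keep the file system caching limits at 90%
vmo -p -o minperm%=3 -o maxperm%=90 -o maxclient%=90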

The vmstat -v output contains several metrics that help determine which values to adjust. In Figure 2, numperm and numclient are the same, both 45.1%. This means NFS and/or JFS2 file systems are using 45.1% of memory. If this were a database system, I would check whether CIO is being used, because it eliminates double caching of pages, reducing both memory and CPU usage.

When an I/O request is built, the Logical Volume Manager (LVM) requests a pbuf, a pinned memory buffer that holds the I/O request at the LVM layer. The I/O is then placed in another pinned memory buffer called an fsbuf. There are three fsbuf types: file system fsbufs (used by the JFS file system), client fsbufs (used by NFS and VxFS), and external pager fsbufs (used by the JFS2 file system). In addition, psbufs are the pinned memory buffers used for I/O requests to the paging space.

The values shown by the vmstat -v command in Figure 2 are averages accumulated since boot. Because a server may not have been rebooted for a long time, you must take two snapshots several hours apart to check whether these counters are still growing. In this case they were growing rapidly, so tuning was needed.
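A minimal sketch of that snapshot comparison (the file names and the four-hour interval are just examples):

# Capture two vmstat -v snapshots a few hours apart and compare the counters;
# steadily growing blocked-I/O numbers indicate a live problem, not old history
vmstat -v > /tmp/vmstat_v.before
sleep 14400    # four hours
vmstat -v > /tmp/vmstat_v.after
diff /tmp/vmstat_v.before /tmp/vmstat_v.after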

I/O latency

Several signs of I/O latency appear in the vmstat -v output. I/O latency affects both performance and memory. The following sections describe how to identify the causes of blocked I/O and some common ways to address them.

"1468217 pending disk I/Os blocked with no pbuf" clearly indicates that one or more outstanding disk I/Os were blocked while trying to obtain a pinned memory buffer (specifically a pbuf). This points to queuing at the LVM layer: because AIX could not get a buffer in which to store the I/O request information, the request was delayed. The lvmo command shown below resolves this.

Figure 3. lvmo -a output

Figure 3 shows output of the lvmo -a command, which indicates that datavg is short of pbufs (see pervg_blocked_io_count). This should be corrected only for the volume groups actually affected, because pbufs are pinned memory buffers and setting them too large wastes memory:

lvmo -v datavg -o pv_pbuf_count=2048

In general, I check the current setting first; if it is 512 or 1,024, I double it whenever it needs to be increased.
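A quick sketch of that check, with datavg standing in for whichever volume group the lvmo output flagged:

# Display the current pbuf settings and blocked-I/O counter for the volume group
lvmo -v datavg -a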

"11173706 paging space I/Os blocked with no psbuf". The lvmo command also indicated pbuf problems in rootvg. Looking at the vmstat -v output, we see that a large number of paging space I/O requests were blocked because no psbuf could be obtained. psbufs hold I/O requests at the virtual memory manager (VMM) layer, and running out of them hurts performance badly. It also means paging is occurring, which is itself a problem. The best solution is to stop the paging; another option is to increase the paging space.
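To see how heavily the paging spaces are actually used before choosing between those two fixes, something like this helps (the volume group, disk name, and size in the mkps line are hypothetical):

# Check current paging space utilization
lsps -a
# If more space is truly needed: add 32 logical partitions of paging
# on a dedicated SAN LUN, active now and at every restart
mkps -a -n -s 32 datavg hdisk4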

"39943187 file system I/Os blocked with no fsbuf". By default, the system provides only 196 fsbufs for the JFS file system. Before JFS2, this limit had to be raised substantially (often to 2,048) to ensure that JFS I/O requests were not blocked at the file system layer for lack of fsbufs. Even on a system with no JFS, you sometimes see around 2,000 blocked I/Os; the number above, however, indicates that a large amount of JFS I/O is being blocked. In that case I would tune JFS and plan a migration to JFS2, where CIO and other technologies can be used for databases. In AIX 6 and 7, numfsbufs is now a restricted parameter of the ioo command.
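A hedged sketch of raising that limit; the new value takes effect only for file systems mounted after the change, so a remount is needed, and on AIX 6 and 7 ioo warns because numfsbufs is restricted:

# Raise the JFS fsbuf limit, then remount the affected JFS file systems
ioo -p -o numfsbufs=2048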

"238 client file system I/Os blocked with no fsbuf". NFS and VxFS are both client file systems, and this line counts I/Os blocked at the file system layer for lack of client fsbufs. Resolving this requires further investigation to determine whether the problem lies with VxFS or with NFS.

"1996487 external pager file system I/Os blocked with no fsbuf". JFS2 is an external pager client file system, and this line counts I/Os blocked at the file system layer for lack of pinned fsbufs. Formerly, adjusting j2_nBufferPerPagerDevice fixed this; now, use the ioo command to increase j2_dynamicBufferPreallocation. The default value is 16; when trying to reduce this kind of blocking, I usually increase it slowly (try 32).
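A minimal sketch of that JFS2 adjustment:

# Increase JFS2 dynamic buffer preallocation from the default of 16
ioo -p -o j2_dynamicBufferPreallocation=32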

Other memory tunables

The minfree and maxfree tunables in the vmo command also affect paging. They are now set per memory pool, and you can use one of several commands to find out how many memory pools there are. Depending on the version or TL, use vmo -a, vmo -a -F (for 6.1), or vmstat -v to get this information.

If these values are set too large, you may see a very high value in the "fre" column of vmstat output while the system is still paging. The defaults are 960 and 1,088; depending on other settings, I usually use 1,000 and 1,200. The correct calculation of minfree and maxfree depends on the j2_maxPageReadAhead setting, the number of logical CPUs, and the number of memory pools.

Assume vmstat shows 64 logical CPUs (lcpus) and 10 memory pools, with the current defaults minfree=960 and maxfree=1088, and j2_maxPageReadahead=128. The calculation goes as follows:

minfree = max(960, (120 * lcpus) / mempools)

maxfree = minfree + ((max(maxpgahead, j2_maxPageReadahead) * lcpus) / mempools)

With lcpus = 64, mempools = 10, and j2_maxPageReadahead = 128:

minfree = max(960, (120 * 64) / 10) = max(960, 768) = 960

maxfree = 960 + ((max(8, 128) * 64) / 10) = 960 + 819.2, or about 1780

I would round that up to 2,048. maxfree should be recalculated whenever j2_maxPageReadahead is changed. Here, I keep minfree at 960 but increase maxfree to 2,048.
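A minimal sketch of applying those values, checking the memory pool count first (the grep pattern assumes the "memory pools" line present in recent vmstat -v output):

# Confirm the number of memory pools used in the calculation
vmstat -v | grep "memory pools"
# Apply the recalculated values persistently
vmo -p -o minfree=960 -o maxfree=2048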
