Detailed paging, memory, and I/O latency in the database

Source: Internet
Author: User

A few years ago I wrote an article on AIX tuning. Now that AIX 7 has appeared, it is worth re-examining the basic tuning that needs to be done on AIX systems. Many technology levels (TLs) have been released since then, and some recommendations may have changed. In this article, I provide AIX tuning information for the tunables in AIX 5.3, 6.1, and 7.

I focus on I/O, memory, and the network. By default, AIX 6 and 7 do a pretty good job with memory tuning, and only a few minor tweaks are needed. AIX 5.3, however, requires more tuning in this area. Figure 1 shows the different tunables and their default settings. The last column gives the recommended value for each setting at the latest TL of each of these three versions.

Figure 1. Tunables and their default settings


Remember one important point: when you perform a fresh installation of AIX 6 or 7, the new default values for the memory tunables are set automatically. If you migrate from AIX 5.3, every tunable that was set in AIX 5.3 is carried over by the migration. Before performing a migration, it is recommended that you record all the tunables that have been modified (take a copy of /etc/tunables/nextboot) and then revert the tunables to their default values. After the migration, check nextboot and make sure nothing is in it. Now, let us discuss the tunables that need to be modified for AIX 6 or 7.

Paging space

Best practice is to configure multiple paging spaces of the same size, each on a different, less busy hard disk (hdisk). All paging space should be mirrored or placed on RAID (1 or 5) storage on a Storage Area Network (SAN). Paging space generally does not need to be twice the amount of memory unless the database requires it. I once ran a large Oracle database on AIX with gigabytes of RAM and three gigabytes of paging space. The key is to use technology such as concurrent I/O (CIO) to avoid paging, and to provide paging space only as a safety net in case paging does occur.

By default, AIX creates one paging space (hd6) in rootvg, and it is too small. If rootvg is mirrored, the paging space is mirrored as well. I usually add extra paging space using several custom-sized logical unit numbers (LUNs) from the SAN. Do not add paging space to the internal disk (or SAN LUN) on which the current rootvg paging space resides. Configuring multiple paging spaces on the same hdisk slows paging down.

When you build a Virtual I/O Server (VIOS), two paging spaces are configured automatically, both on hdisk0. One is hd6 and the other is paging00, which is 1,024 MB. I always deactivate and remove paging00, then increase hd6 to 4,096 MB. As mentioned earlier, it is not a good idea to configure two paging spaces on the same hdisk.

Page Steal method

In the default setting for AIX 5.3, page_steal_method is set to 0. This affects how the least recently used daemons (LRUDs) scan for pages to free. Setting lru_file_repage=0 strongly recommends to the LRUDs that they not steal executable code pages and always try to steal file system (persistent) pages instead. Stealing a persistent page is much less costly than stealing a working storage page, because the latter causes the page to be paged out and later paged back in. On a system with five memory pools, memory is divided among the five pools and each LRUD handles roughly one pool's share (this is a very simplified description). Based on the numclient value in Figure 2, approximately 45% of memory is used for the file system, and the rest is working storage.

Figure 2. vmstat -v output


If you set page_steal_method=0, the LRUDs have to scan all of the memory pages they control when looking for free pages, even though it is likely that only persistent pages will be freed. If you set page_steal_method=1, the LRUDs use a list-based page management scheme instead. This means that each LRUD divides memory into a list of persistent pages and a list of working storage pages. When the LRUDs search for pages that can be freed from the file system cache, they search only the list of persistent pages. For the example in Figure 2, this should more than halve the number of pages scanned for each page freed, which reduces overhead. The scan rate and free rate are visible in the output of "vmstat -I 2 2".

Memory and I/O buffers

There are several commands that are useful when exploring the best memory settings, especially vmstat -v. Figure 2 shows partial output from vmstat -v.

There are two types of pages in memory: persistent pages (associated with a file system) and working storage, or dynamic, pages that contain executable code and its work area. If you steal a persistent page, you do not need to page it out unless the page has been modified (in which case it is written back to the file system). If you steal a working storage page, it must first be written to paging space and then read back from paging space the next time it is needed. This is a very expensive operation.

Setting minperm%=3 and lru_file_repage=0 strongly recommends to the LRUDs that they always try to steal persistent pages whenever the file systems are using more than 3% of memory. The LRUDs ignore the maximum settings in most cases, unless you want to limit the amount of memory that the file systems can use. maxperm% refers to all persistent pages, including the journaled file system (JFS), the Network File System (NFS), the Veritas File System (VxFS), and the enhanced journaled file system (JFS2). maxclient% is a subset of these and includes only NFS and JFS2 file systems. maxperm% is a soft limit; maxclient% is a hard limit (and cannot exceed maxperm%). Because new file systems are usually JFS2, you should keep the maximum settings at 90% to avoid accidentally restricting the amount of memory used by the file systems.

In the vmstat -v output, there are several metrics that help you determine which values to adjust. In Figure 2, you can see that numperm and numclient are the same, both 45.1%. This means that the NFS and/or JFS2 file systems are using 45.1% of memory. If this is a database system, I would check whether CIO is being used, because it eliminates double storage and double handling of pages, which can reduce both memory and CPU usage.

When an I/O request is built, the Logical Volume Manager (LVM) requests a pbuf, a pinned memory buffer that holds the I/O request in the LVM layer. The I/O is then placed in another pinned memory buffer called an fsbuf. There are three types of fsbuf: file system fsbufs (used by JFS file systems), client fsbufs (used by NFS and VxFS), and external pager fsbufs (used by JFS2 file systems). In addition, there are psbufs, which are pinned memory buffers used for I/O requests to paging space.

In Figure 2, the vmstat -v command displays values that have accumulated since boot. Because the server may not have been rebooted for a long time, be sure to take two snapshots a few hours apart to check whether these values are changing. Here, they are growing fast and tuning is needed.
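The two-snapshot comparison can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the first snapshot reuses counts quoted in this article, and the second snapshot is invented to show the arithmetic. It assumes each counter appears in vmstat -v as a count followed by a "... blocked with no ..." label.

```python
import re

# Matches counter lines such as "1468217 pending disk I/Os blocked with no pbuf".
BLOCKED_LINE = re.compile(r"^\s*(\d+)\s+(.+ blocked with no \w+)\s*$")


def parse_blocked_counters(vmstat_v_text):
    """Return {label: count} for every 'blocked with no ...' line in the text."""
    counters = {}
    for line in vmstat_v_text.splitlines():
        match = BLOCKED_LINE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def counter_growth(earlier_text, later_text):
    """Return how much each blocked-I/O counter grew between two snapshots."""
    earlier = parse_blocked_counters(earlier_text)
    later = parse_blocked_counters(later_text)
    return {label: later[label] - earlier.get(label, 0) for label in later}


# First snapshot: counts quoted in this article.
SNAPSHOT_1 = """\
1468217 pending disk I/Os blocked with no pbuf
11173706 paging space I/Os blocked with no psbuf
238 client filesystem I/Os blocked with no fsbuf
"""

# Second snapshot: made-up values, a few hours later.
SNAPSHOT_2 = """\
1468300 pending disk I/Os blocked with no pbuf
11173706 paging space I/Os blocked with no psbuf
238 client filesystem I/Os blocked with no fsbuf
"""

growth = counter_growth(SNAPSHOT_1, SNAPSHOT_2)
for label, delta in growth.items():
    print(f"{delta:>8}  {label}")
```

A counter whose growth is zero reflects a problem that happened at some point since boot but is not happening now; a counter that keeps climbing between snapshots is the one worth tuning.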

I/O Latency

In the vmstat -v output, there are several indications of I/O delays. I/O latency can affect both performance and memory. Here are some common ways to identify the causes of blocked I/O and to resolve the problems.

1468217 pending disk I/Os blocked with no pbuf. This line clearly shows that one or more pending disk I/Os were blocked while trying to obtain a pinned memory buffer, specifically a pbuf. This indicates queuing at the LVM layer. Because AIX cannot obtain a buffer to store the information for the I/O request, the request is delayed. Use the lvmo command, as follows, to resolve this problem.

Figure 3. lvmo -a output


Figure 3 shows the output of the lvmo -a command, which indicates that datavg does not have enough pbufs (see pervg_blocked_io_count). This should be corrected only for the volume group that is affected, because pbufs are pinned memory buffers and there is no point in oversizing them:

lvmo -v datavg -o pv_pbuf_count=2048

Normally, I check the current setting and, if it is 512 or 1024, double it as needed.

11173706 paging space I/Os blocked with no psbuf. The lvmo command also indicates a problem with the pbufs in rootvg. If you look at the vmstat -v output, you will find that a large number of paging space I/O requests are blocked because no psbuf could be obtained. Psbufs hold I/O requests at the Virtual Memory Manager (VMM) layer, and a lack of psbufs can severely affect performance. It also indicates that paging is taking place, which is a problem in itself. The best solution is to stop whatever is causing the paging. Another option is to add more paging space.

39943187 file system I/Os blocked with no fsbuf. By default, the system provides only 196 fsbufs for JFS file systems. Before JFS2, this limit needed to be raised significantly (often to 2048) to ensure that JFS I/O requests were not blocked at the file system layer for lack of fsbufs. Even when there is no JFS in the system, the number of blocked I/O requests sometimes reaches 2,000 or so. The figure above, however, indicates a large amount of blocked JFS I/O in the system. In that case, I tune JFS and plan a move to JFS2, where technologies such as CIO can be used for the database. In AIX 6 and 7, numfsbufs is now a restricted parameter in the ioo command.

238 client file system I/Os blocked with no fsbuf. Both NFS and VxFS are client file systems, and this line reports the number of I/Os blocked at the file system layer for lack of client fsbufs. To resolve this issue, further investigation is needed to find out whether it is a VxFS problem or an NFS problem.

1996487 external pager file system I/Os blocked with no fsbuf. JFS2 is an external pager client file system, and this line reports the number of I/Os blocked at the file system layer for lack of pinned memory fsbufs. In the past, adjusting j2_nBufferPerPagerDevice could resolve this issue. The JFS2 buffers are now corrected by using the ioo command to raise j2_dynamicBufferPreallocation. The default setting is 16, and when trying to reduce this kind of blocked I/O I usually increase it slowly (try 32).
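The five counters above and the first thing to look at for each can be collected into a small lookup. The following Python sketch is only a summary aid under stated assumptions: the function name and the wording of the remedies are mine, condensed from the text above, and it accepts both the "file system" and "filesystem" spellings of the labels.

```python
# Remedies condensed from this article; the wording is a paraphrase, not tool output.
REMEDIES = {
    "pending disk i/os blocked with no pbuf":
        "raise pv_pbuf_count with lvmo for the affected volume group",
    "paging space i/os blocked with no psbuf":
        "stop whatever is causing paging, or add paging space",
    "filesystem i/os blocked with no fsbuf":
        "tune JFS fsbufs (numfsbufs) and plan a move to JFS2",
    "client filesystem i/os blocked with no fsbuf":
        "investigate whether NFS or VxFS is responsible",
    "external pager filesystem i/os blocked with no fsbuf":
        "raise j2_dynamicBufferPreallocation with ioo",
}


def suggest_remedy(vmstat_line):
    """Map one 'blocked with no ...' line to the remedy discussed in the text.

    Returns None when the line is not one of the five known counters.
    """
    parts = vmstat_line.strip().split(None, 1)
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    # Normalize case, a trailing period, and the two spellings of "filesystem".
    label = parts[1].lower().rstrip(". ").replace("file system", "filesystem")
    return REMEDIES.get(label)


for line in (
    "1468217 pending disk I/Os blocked with no pbuf",
    "1996487 External Pager file system I/Os blocked with no fsbuf",
    "45.1 numperm percentage",
):
    print(line, "->", suggest_remedy(line))
```

The exact-label lookup matters here: the three fsbuf counters differ only in their prefix, so a substring match on "fsbuf" would send all of them to the same remedy.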

Other memory tunables

The minfree and maxfree tunables in the vmo command also affect paging. They are now set per memory pool, and you can use one of several commands to find out how many memory pools there are. Depending on the version or TL, this information can be obtained using vmo -a, vmo -a -F (for 6.1), or vmstat -v.

If these values are set too large, you may see high values in the "fre" column of the vmstat output while paging is nevertheless taking place. The default settings are 960 and 1,088; depending on the other settings, I usually use 1,000 and 1,200. The correct calculation for minfree and maxfree depends on the j2_maxPageReadAhead setting, the number of logical CPUs, and the number of memory pools.

Here, suppose vmstat shows 64 logical CPUs (lcpu) and 10 memory pools. The system is currently at the default values, that is, minfree=960 and maxfree=1088, and j2_maxPageReadAhead=128. For these settings, the calculation is as follows:

minfree = max(960, (120 * lcpu) / mempools)

maxfree = minfree + (max(maxpgahead, j2_maxPageReadAhead) * lcpu) / mempools

With lcpu = 64, mempools = 10, maxpgahead = 8, and j2_maxPageReadAhead = 128:

minfree = max(960, (120 * 64) / 10) = max(960, 768) = 960

maxfree = 960 + (max(8, 128) * 64) / 10 = 960 + 819.2 = 1779.2, rounded up to 1780

I would probably round the result up to 2,048. maxfree should be recalculated whenever j2_maxPageReadAhead is modified. Here, I keep minfree at 960 but raise maxfree to 2,048.
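The calculation above is easy to repeat whenever the read-ahead setting, the CPU count, or the number of memory pools changes. Here is a minimal Python sketch of it. The function and parameter names are my own, and rounding each quotient up to a whole page is an assumption chosen to match the worked example.

```python
def ceil_div(numerator, denominator):
    """Integer division that rounds up instead of down."""
    return -(-numerator // denominator)


def free_list_targets(lcpu, mempools, maxpgahead, j2_max_page_read_ahead, floor=960):
    """Return (minfree, maxfree) using the formulas from this article.

    floor is the default minfree of 960; quotients are rounded up (assumption).
    """
    minfree = max(floor, ceil_div(120 * lcpu, mempools))
    read_ahead = max(maxpgahead, j2_max_page_read_ahead)
    maxfree = minfree + ceil_div(read_ahead * lcpu, mempools)
    return minfree, maxfree


# Values from the worked example: 64 logical CPUs, 10 memory pools.
minfree, maxfree = free_list_targets(
    lcpu=64, mempools=10, maxpgahead=8, j2_max_page_read_ahead=128
)
print(minfree, maxfree)  # prints "960 1780"
```

The result is a lower bound for maxfree; rounding it further up to a convenient value such as 2,048 remains a judgment call.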
