Introduction
A properly sized Enhanced Journaled File System (JFS2) inode cache is critical to the performance and stability of an IBM AIX system. Typically, users control the maximum memory usage of the inode cache by tuning j2_inodeCacheSize. The inode cache size can also change as the result of a dynamic reconfiguration (DR) memory operation. In AIX 6.1 (above 6100-04) and AIX 7.1, there is a hidden side effect: after a j2_inodeCacheSize tuning operation or a dynamic logical partition (DLPAR) memory operation, the maximum size of the heap (pile) behind each inode cache class can only be lowered. This article demonstrates how this side effect can exhaust the inode cache and introduces some ways to deal with this type of problem.
JFS2 Inode Cache
The inode is the basic building block of JFS2. Each inode has a 512-byte data structure on disk. When an inode is in use in memory, JFS2 tracks more than just the on-disk fields. The in-core inode (the on-disk portion plus the working portion) is currently about 1 KB. The AIX kernel caches all of this data to improve performance.
To reduce contention between processors, the inode cache is split into multiple cache classes. The AIX kernel creates two cache classes per processor plus one additional cache class, so a total of n * 2 + 1 cache classes (where n is the number of processors) are created when the system initializes. The icacheclass and icache structures are defined in /usr/include/j2/j2_inode.h:
typedef struct icacheclass {
    mutexlock_t cc_lock;
    int32 cc_ninode;                  /* # of inode in cachelist */
    CDLL_HEADER(inode) cc_cachelist;  /* cachelist header */
    struct pile *cc_pile;             /* inode pile */
    boolean_t pilefull;               /* pile is full */
} icacheclass_t;

struct icache {
    int32 ninode;                     /* # of in-memory inode */
    int16 ncacheclass;                /* # of cacheclass */
    struct icacheclass *cachetable;
    int32 ninodepercacheclass;        /* # of inode per cacheclass */
    int32 nhashclass;                 /* # of hashclass - 1 */
    int32 nnewhashclass;              /* # of hashclass - 1 */
    int32 ninodeperhashclass;         /* # of inode per hashclass */
    struct ihashclass **hashtable;
    int32 npagespercacheclass;        /* # of pile pages per cacheclass */
    int32 nmaxinode;                  /* ninode at initialization time */
};
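The rule for the number of cache classes can be written down directly. The following is a minimal sketch in C; the function name icache_class_count is hypothetical and is not part of the AIX headers.

```c
#include <stdint.h>

/* Hypothetical helper (not an AIX API): number of inode cache classes
 * created at initialization, following the rule described in the text:
 * two cache classes per processor, plus one more. */
int16_t icache_class_count(int16_t nprocessors)
{
    return (int16_t)(nprocessors * 2 + 1);
}
```

For example, icache_class_count(8) gives 17 cache classes for a system with eight processors.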
Each cache class has its own heap (pile) from which its inodes are allocated:
struct pile {
    eye_catch_t pile_eyec;      /* 8: pile eye-catcher */
    uint32_t flags;             /* 4: guarded by pile_lock */
    uint16_t obj_size;          /* 2: opaque object size */
    uint16_t align;             /* 2: object align (offset mask) */
    uint16_t slab_size;         /* 2: alloc slab size in pages */
    ...
    uint64_t max_total_pages;   /* 8: max total pages, ideally */
    uint64_t min_total_pages;
    uint64_t cur_total_pages;   /* 8: real world value */
    ...
};
The maximum number of pages for a pile is configurable. The max_total_pages field determines how many inodes can be allocated before the pile is full and JFS2 starts reclaiming inodes from the cache list. A pile can be forced to shrink, and it can also be enlarged. This is what happens during memory DR and j2_inodeCacheSize tuning.
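The role of max_total_pages can be illustrated with a toy model. This sketch is not the kernel implementation, which the article does not show; struct pile_model and both functions are hypothetical and keep only the three fields discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct pile (hypothetical, illustration only). */
struct pile_model {
    uint16_t slab_size;        /* pages added per growth step */
    uint64_t max_total_pages;  /* configured ceiling */
    uint64_t cur_total_pages;  /* pages currently backing the pile */
};

/* Try to grow the pile by one slab. Returns false when the ceiling would
 * be exceeded; in that case the caller would have to reclaim an inode
 * from the cache list instead of allocating a new one. */
bool pile_model_grow(struct pile_model *p)
{
    if (p->cur_total_pages + p->slab_size > p->max_total_pages)
        return false;
    p->cur_total_pages += p->slab_size;
    return true;
}

/* Change the ceiling, as a tuning or DR operation would. Returns how many
 * pages are now above the ceiling and would have to be released. */
uint64_t pile_model_set_max(struct pile_model *p, uint64_t new_max)
{
    p->max_total_pages = new_max;
    return p->cur_total_pages > new_max ? p->cur_total_pages - new_max : 0;
}
```

In this model, lowering the ceiling below cur_total_pages forces a shrink, while raising it lets allocation resume. The side effect described in this article corresponds to the case where the raise never takes effect.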
Tuning j2_inodeCacheSize
The inode cache size can be adjusted by using the ioo command to change the j2_inodeCacheSize tunable. The value defaults to 400 in AIX 6.1 and to 200 in AIX 7.1. It does not state the cache size directly; it is only a scaling factor. Together with the size of main memory, it determines the maximum amount of memory the inode cache may use. The current formula is:
(inode cache memory) = (system memory) * (j2_inodeCacheSize) / 4000
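As a worked example of the formula, the sketch below computes the cache ceiling in bytes and a rough inode count. The function names are hypothetical, and the count assumes about 1 KB per in-core inode, as stated earlier in this article.

```c
#include <stdint.h>

/* inode cache memory = system memory * j2_inodeCacheSize / 4000
 * (hypothetical helper; multiplying first is safe for realistic
 * memory sizes because the product stays well within 64 bits). */
uint64_t inode_cache_bytes(uint64_t system_mem_bytes, uint32_t j2_inodecachesize)
{
    return system_mem_bytes * j2_inodecachesize / 4000;
}

/* Rough upper bound on cached inodes, ASSUMING 1 KB per in-core inode. */
uint64_t inode_cache_max_inodes(uint64_t cache_bytes)
{
    return cache_bytes / 1024;
}
```

With a value of 400 the cache may use up to 10% of system memory, and with 200 up to 5%. For a system with 8 GiB of memory and a value of 400, inode_cache_bytes returns 858993459 bytes, which corresponds to about 838860 inodes under the 1 KB assumption.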
Run the following command to display the current value of j2_inodeCacheSize:
# ioo -a | grep j2_inodeCacheSize
j2_inodeCacheSize = 400
You can use the kdb command to obtain detailed information about the inode cache:
(0)> i2 -c
icache:
ninode: 0xb3306 (733958)
nmaxinode: 0xb3306 (733958)
ncacheclass:
nhashclass: 0xFFFF (65535)
nnewhashclass: 0xFFFF (65535)
cachetable: 0xf10001003b4fc000
hashtable: 0xf10001003b54b000
Cache table:
CLASS LOCK INODES CACHELIST.HEAD PILE FULL
0 0 f10001003fd92080 f10001003b502300 0
1 0 273 f10001003d4a2880 f10001003b502600 0
... 0 f10001003ff17880 f10001003b503400 0
(0)> dw icache
icache+000000: 000b3306 00110000 F1000100 3b4fc000  ..3.........;O..
icache+000010: 0000a8a6 0000FFFF 0000FFFF 00000010  ................
icache+000020: f1000100 3b54b000 000029d5 000b3306  ....;T....)...3.
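The raw dump can be mapped back onto the fields of struct icache. The sketch below does that for the twelve words shown above. The names icache_view, decode_icache and icache_dump are hypothetical, and the layout assumes a 64-bit kernel with natural alignment, so that the int16 ncacheclass is followed by two bytes of padding before the 8-byte cachetable pointer.

```c
#include <stdint.h>

/* Decoded view of struct icache (hypothetical, illustration only). */
struct icache_view {
    int32_t  ninode;
    int16_t  ncacheclass;
    uint64_t cachetable;
    int32_t  ninodepercacheclass;
    int32_t  nhashclass;
    int32_t  nnewhashclass;
    int32_t  ninodeperhashclass;
    uint64_t hashtable;
    int32_t  npagespercacheclass;
    int32_t  nmaxinode;
};

/* Decode twelve 32-bit words in the order printed by "dw icache". */
struct icache_view decode_icache(const uint32_t w[12])
{
    struct icache_view v;
    v.ninode              = (int32_t)w[0];
    v.ncacheclass         = (int16_t)(w[1] >> 16); /* upper half; rest is padding */
    v.cachetable          = ((uint64_t)w[2] << 32) | w[3];
    v.ninodepercacheclass = (int32_t)w[4];
    v.nhashclass          = (int32_t)w[5];
    v.nnewhashclass       = (int32_t)w[6];
    v.ninodeperhashclass  = (int32_t)w[7];
    v.hashtable           = ((uint64_t)w[8] << 32) | w[9];
    v.npagespercacheclass = (int32_t)w[10];
    v.nmaxinode           = (int32_t)w[11];
    return v;
}

/* The words from the dump above. */
const uint32_t icache_dump[12] = {
    0x000b3306, 0x00110000, 0xf1000100, 0x3b4fc000,
    0x0000a8a6, 0x0000ffff, 0x0000ffff, 0x00000010,
    0xf1000100, 0x3b54b000, 0x000029d5, 0x000b3306,
};
```

Under these assumptions, decoding icache_dump gives ninode = 733958, ncacheclass = 17, ninodepercacheclass = 43174 and npagespercacheclass = 10709, and the two pointers match the cachetable and hashtable addresses printed by kdb. The numbers are self-consistent: 17 * 43174 = 733958, and, assuming 4 KB pages, 10709 pages for 43174 inodes works out to roughly 1016 bytes per inode, in line with the 1 KB in-core inode size mentioned earlier.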