Recently, during testing, we found that while running MapReduce data analysis, CPU sys time occasionally rose to 50% or even higher, causing severe jitter.
The node's disk I/O throughput is fairly high, reaching 150 MB per second. Most of the operations are the TaskTracker reading DFS data from the local node. By default, each read system call brings 4 KB of data from the hard disk into kernel space, and the kernel then copies the data into the application's address space. This is where the kernel spends most of its time.
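To get a feel for why 4 KB reads are costly at this throughput, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: it assumes every read call transfers exactly 4 KB and that 150 MB/s is sustained. The function name `reads_per_second` and the 64 KB comparison value are made up for illustration.

```python
def reads_per_second(throughput_mb_per_s: float, read_size_kb: float) -> float:
    """Estimate how many read calls per second are needed to sustain a throughput."""
    throughput_kb_per_s = throughput_mb_per_s * 1024  # MB -> KB
    return throughput_kb_per_s / read_size_kb


if __name__ == "__main__":
    # Figures from the text: 150 MB/s throughput, 4 KB per read.
    print(reads_per_second(150, 4))   # 38400.0
    # A hypothetical larger read size, for comparison only.
    print(reads_per_second(150, 64))  # 2400.0
```

Under these assumptions the node issues tens of thousands of read calls per second, each with its own kernel-to-user copy.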
After extensive monitoring of many aspects of the system, we found that the moments when CPU sys suddenly rose coincided with the file system cache releasing memory. The pattern is very regular, especially when the operating system's available memory is relatively tight.
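The text does not say which monitoring tools were used. As a minimal sketch of how such a correlation could be observed on Linux, the code below samples the aggregate `cpu` line of `/proc/stat` and the `Cached` and `MemFree` fields of `/proc/meminfo`. All function names here are hypothetical, and the sampling interval is an arbitrary choice.

```python
import time


def parse_cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def sys_percent(before, after):
    """Percentage of elapsed CPU time spent in kernel mode between two samples."""
    # Field order: user, nice, system, idle, iowait, irq, softirq, steal, ...
    # Guest time is already included in user, so only the first eight are summed.
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    return 100.0 * deltas[2] / total if total else 0.0


def parse_meminfo_kb(meminfo_text, key):
    """Return the value in kB of one /proc/meminfo field, e.g. 'Cached'."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name == key:
            return int(rest.split()[0])
    raise KeyError(key)


def watch(samples=10, interval_s=1.0):
    """Print CPU sys percentage next to cache and free memory (Linux only)."""
    for _ in range(samples):
        with open("/proc/stat") as f:
            before = parse_cpu_times(f.read())
        time.sleep(interval_s)
        with open("/proc/stat") as f:
            after = parse_cpu_times(f.read())
        with open("/proc/meminfo") as f:
            meminfo = f.read()
        print(f"sys={sys_percent(before, after):5.1f}%  "
              f"Cached={parse_meminfo_kb(meminfo, 'Cached')} kB  "
              f"MemFree={parse_meminfo_kb(meminfo, 'MemFree')} kB")
```

Calling `watch()` during a job would print one line per interval. A drop in `Cached` that lines up with a jump in the sys percentage would match the pattern described above.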
Therefore, we infer that when the file system cache releases space, the kernel has to run a complex reclaim algorithm and swap some infrequently used pages out to the hard disk. Because a MapReduce job performs a large amount of sequential I/O, it consumes file system cache extremely quickly, so cache space is also released frequently. As a check, we dropped the VM cache before a run: while the MapReduce job was running and the cache was still growing, with no cache pages being evicted, CPU sys consumption stayed low and stable, which further supports this explanation.
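The text does not state how the cache was dropped for this check. One common way on Linux is to write to `/proc/sys/vm/drop_caches`, and the sketch below assumes that interface. The function name `drop_caches` and its `path` parameter are hypothetical, and writing to the real file requires root privileges.

```python
import os

# Linux kernel interface for dropping clean caches; writing to it needs root.
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def drop_caches(mode=3, path=DROP_CACHES_PATH):
    """Ask the kernel to drop clean caches.

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    if hasattr(os, "sync"):
        # Flush dirty pages first so that more of the cache becomes droppable.
        os.sync()
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Running `drop_caches()` as root just before starting a job gives the cache room to grow without reclaim, which is the condition under which CPU sys was observed to stay low.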
The file system cache will be optimized later.