Finding Memory Leaks in Linux Kernel Modules (2)

Source: Internet
Author: User

In a previous blog post, <<Linux kernel module, a way to find out about memory leaks>>, I introduced a method for finding kernel memory leaks. A few months later, a customer complained: after about 5 days of using the product, their SUSE server ran out of memory and crashed. Oh my God, it can't be, it runs fine on my machine! (A saying programmers commonly use, hehe.) So let's take a look at how this hard-pressed blogger confirmed the problem and tracked it down ....

I. Identifying the problem

The first step is to confirm that the problem is related to the product's kernel module. According to the customer's description, the memory leak does not occur when our product is stopped, so the problem is related to our product. But is it in the user-space programs or in the kernel module? In the kernel dump the customer provided, slab occupies 3.6 GB, so in all likelihood the leak is in the product's kernel module.

++++++++++++++++++++++++++++++
crash> kmem -i
              PAGES        TOTAL      PERCENTAGE
 TOTAL MEM   981585       3.7 GB         ----
      FREE    24987      97.6 MB    2% of TOTAL MEM
      USED   956598       3.6 GB   97% of TOTAL MEM
    SHARED                184 KB    0% of TOTAL MEM
   BUFFERS                144 KB    0% of TOTAL MEM
    CACHED                    KB    0% of TOTAL MEM
      SLAB   941424       3.6 GB   95% of TOTAL MEM

TOTAL SWAP  1048575         4 GB         ----
 SWAP USED      527       2.1 MB    0% of TOTAL SWAP
 SWAP FREE  1048048         4 GB   99% of TOTAL SWAP
++++++++++++++++++++++++++++++


But as a programmer who had just confidently said "it runs fine on my machine", I was about to be proven wrong the hard way!!! Before that happens, let me shamelessly introduce our kernel module: it is called KHM (Kernel Hook Module), it is open source, it hooks Linux file operations, and it passes the file information to user space for file scanning.

I wrote a script that constantly copies files to simulate a large number of I/O operations, which constantly triggers the hook functions in the product's kernel module. Before recording memory usage at the start of the test, I first cleared the system caches so that the readings would be comparable.

I then recorded the memory usage, mainly the free memory and the memory used by slab:

+++++++++++++++++++++++++++++++++++++++++++++
suse11x64-001:~ # cat /proc/meminfo
MemTotal:        1989340 kB
MemFree:         1495368 kB
......
Slab:              37752 kB
......
+++++++++++++++++++++++++++++++++++++++++++++
Then I waited 3 days (it happened to be over the weekend) and used the same method to check the current free memory and slab memory. About 300 MB of memory had been consumed in those 3 days, which roughly matches the growth in slab memory. That gives a leak rate of about 4.2 MB/hour (300 MB over 72 hours). In other words, without a script simulating heavy I/O the leak rate would be even smaller, which makes the leak really hard to notice. Now that the problem was confirmed, it was time to analyze the memory leak.
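The before/after comparison above can be sketched in C. This is only a minimal illustration: the helper names `meminfo_kb` and `leak_rate_mb_per_hour` are made up for this example, and the parser works on text that has already been read from /proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the value in kB of a /proc/meminfo field such as "Slab",
 * or -1 if the field is not present in the given text. */
static long meminfo_kb(const char *text, const char *field)
{
    size_t len = strlen(field);
    const char *line = text;

    while (line && *line) {
        /* A match is the field name at the start of a line, then ':'. */
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            long kb;
            if (sscanf(line + len + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return -1;
}

/* Average growth in MB per hour between two readings taken `hours` apart. */
static double leak_rate_mb_per_hour(long before_kb, long after_kb, double hours)
{
    assert(hours > 0.0);
    return (double)(after_kb - before_kb) / 1024.0 / hours;
}
```

With the Slab value parsed from two snapshots, a growth of 300 MB over 72 hours passed to `leak_rate_mb_per_hour` gives roughly 4.2 MB/hour, which matches the estimate above.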


II. Problem analysis

Before analyzing the problem, I checked which slab caches in the customer-supplied kernel dump took up too much memory: sock_inode_cache took up about 1.8 GB of memory and dentry about 700 MB.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
crash> kmem -s
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL   SLABS  SSIZE
......
ffff880138431300 sock_inode_cache         640    2842421   2842524  473754     4k
......
ffff880138c00e00 dentry                   192    3769490   3769880  188494     4k
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
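The per-cache figures can be cross-checked from the SLABS and SSIZE columns, since each slab occupies SSIZE bytes. A small sketch of that arithmetic (the function names are made up for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes occupied by a slab cache: number of slabs times the slab size. */
static uint64_t slab_cache_bytes(uint64_t slabs, uint64_t ssize_kb)
{
    return slabs * ssize_kb * 1024u;
}

/* Convert bytes to whole mebibytes, rounding down. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes / (1024u * 1024u);
}
```

For sock_inode_cache, 473754 slabs of 4k come to 1850 MiB (about 1.8 GB); for dentry, 188494 slabs of 4k come to 736 MiB (about 700 MB). Together these two caches account for most of the 3.6 GB of slab memory.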

sock_inode_cache holds the kernel's socket inode structures, and dentry is the kernel's data structure for a file or directory. If, like me, you are not particularly proficient in the Linux kernel, the primary suspect is dentry. Our kernel module does access the dentry of files, so how could that cause a memory leak? At this point there were two hypotheses:

(1) Memory is requested with kmalloc or a similar API and then never released;

(2) After a dentry is referenced and accessed, its reference count is not released, for example dget is called without a corresponding call to dput.

Case (1) was ruled out by code review. Case (2) was also checked, and it turned out that after accessing the dentry, dput was called to drop the reference count once. This problem troubled me deeply, and for a week I was reluctant to even look at it, but it had to be solved somehow. I also considered other approaches, such as kmemleak. But that would require recompiling the whole SUSE kernel source, and it would not necessarily pinpoint the cause of the leak. Given that our product's kernel module is not very large, I finally decided to do another code review. After two and a half days, the hard work paid off and I finally found the root cause!


III. Root cause

The program executes as follows:

(1) Given the file's fd, get the file object, get the path object from the file object, and record the address of the path object in a path pointer, ppath (the path object holds the dentry and vfsmount member pointers).

(2) The program needs to access the dentry and vfsmount, so it calls path_get(ppath) to increment their reference counts by one.

(3) Call the system's original close to close the file.

(4) After a series of operations, call path_put(ppath) to decrement the reference counts of the dentry and vfsmount by one.

The problem lies in steps (3) and (4). Suppose only one process opens and accesses the file and then closes it. The close enters our product's hook function, which in step (3) calls the system's original close, and the kernel then goes through the following call chain: close -> filp_close -> fput -> __fput. In fput, __fput is called only when the decrement takes the file object's reference count to zero, that is, when the count was 1 before the call.

void fput(struct file *file)
{
        if (atomic_long_dec_and_test(&file->f_count))
                __fput(file);
}

In general, the file object has a reference count of 1 at this point (an exception is a process that opens a file and then forks), so __fput is called. Note that __fput sets the file's dentry and mnt pointers to NULL.
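The dec-and-test check in fput, including the fork exception, can be modeled in user space with C11 atomics. This is only a toy model: `toy_file`, `toy_dec_and_test` and `toy_fput` are made-up names, and the `torn_down` flag stands in for the call to __fput.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct file; only the reference count is modeled. */
struct toy_file {
    atomic_long f_count;
    bool torn_down;         /* set when the model's __fput() would run */
};

/* Model of atomic_long_dec_and_test(): decrement, report whether we hit 0. */
static bool toy_dec_and_test(atomic_long *v)
{
    /* atomic_fetch_sub returns the value before the subtraction. */
    return atomic_fetch_sub(v, 1) == 1;
}

static void toy_fput(struct toy_file *f)
{
    if (toy_dec_and_test(&f->f_count))
        f->torn_down = true;    /* the real kernel calls __fput(file) here */
}
```

If the count starts at 2, as after a fork, the first `toy_fput` only decrements it, and the teardown happens on the second call. With a single opener the count starts at 1 and the first call already tears the file down.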

static void __fput(struct file *file)
{
        struct dentry *dentry = file->f_path.dentry;
        struct vfsmount *mnt = file->f_path.mnt;
        struct inode *inode = dentry->d_inode;

        might_sleep();

        fsnotify_close(file);
        /*
         * The function eventpoll_release() should be the first called
         * in the file cleanup chain.
         */
        eventpoll_release(file);
        locks_remove_flock(file);

        if (unlikely(file->f_flags & FASYNC)) {
                if (file->f_op && file->f_op->fasync)
                        file->f_op->fasync(-1, file, 0);
        }
        if (file->f_op && file->f_op->release)
                file->f_op->release(inode, file);
        security_file_free(file);
        ima_file_free(file);
        if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
                     !(file->f_mode & FMODE_PATH))) {
                cdev_put(inode->i_cdev);
        }
        fops_put(file->f_op);
        put_pid(file->f_owner.pid);
        file_sb_list_del(file);
        if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
                i_readcount_dec(inode);
        if (file->f_mode & FMODE_WRITE)
                drop_file_write_access(file);
        file->f_path.dentry = NULL;
        file->f_path.mnt = NULL;
        file_free(file);
        dput(dentry);
        mntput(mnt);
}

Because step (2) added 1 to the reference counts of the dentry and mnt (so the counts are now 2), the dput and mntput calls in __fput only decrement each count by one, and the memory is not released. In theory, it should be released in step (4), once our own operations are done. But!!! When path_put(ppath) is called in step (4), the original close has already set the dentry and mnt pointers in the path to NULL. So step (4) does not decrement the reference counts of the dentry and mnt, their memory is never released, and the kernel leaks memory.
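Steps (1) to (4) and the resulting leak can be reproduced with a small user-space model. Everything prefixed `toy_` is a made-up stand-in that only tracks reference counts, and `hook_close_fixed` is one possible fix that I am assuming for illustration (keeping a private copy of the path instead of a pointer into the file object), not necessarily the change that went into the product.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; only reference counts are modeled. */
struct toy_dentry   { int count; };
struct toy_vfsmount { int count; };
struct toy_path     { struct toy_dentry *dentry; struct toy_vfsmount *mnt; };
struct toy_file     { struct toy_path f_path; };

/* Like the kernel's dput()/mntput(), a NULL argument is silently ignored. */
static void toy_dput(struct toy_dentry *d)     { if (d) d->count--; }
static void toy_mntput(struct toy_vfsmount *m) { if (m) m->count--; }

static void toy_path_get(const struct toy_path *p)
{
    p->mnt->count++;
    p->dentry->count++;
}

static void toy_path_put(const struct toy_path *p)
{
    toy_dput(p->dentry);
    toy_mntput(p->mnt);
}

/* Model of the tail of __fput(): clear f_path, then drop the file's refs. */
static void toy_fput_final(struct toy_file *f)
{
    struct toy_dentry *d = f->f_path.dentry;
    struct toy_vfsmount *m = f->f_path.mnt;

    f->f_path.dentry = NULL;
    f->f_path.mnt = NULL;
    toy_dput(d);
    toy_mntput(m);
}

/* Leaky hook: keeps a POINTER into the file, as in steps (1) to (4). */
static void hook_close_leaky(struct toy_file *f)
{
    struct toy_path *ppath = &f->f_path;   /* step (1) */
    toy_path_get(ppath);                   /* step (2): counts become 2 */
    toy_fput_final(f);                     /* step (3): original close */
    toy_path_put(ppath);                   /* step (4): sees NULLs, no-op */
}

/* Possible fix (an assumption): keep a COPY of the path instead. */
static void hook_close_fixed(struct toy_file *f)
{
    struct toy_path path = f->f_path;      /* private copy of both pointers */
    toy_path_get(&path);
    toy_fput_final(f);
    toy_path_put(&path);                   /* copy still valid: counts drop */
}
```

Starting from counts of 1, the leaky hook leaves both counts stuck at 1, so the objects are never freed, while the copy-based variant brings them back to 0. In the real kernel the situation is worse than in this model, because __fput also calls file_free(file), so a pointer into the file object should not be used after the original close at all.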



Although KHM is open source, to avoid any appearance of promoting it I have not explained its workflow in detail in this post, which might have made the description in part III clearer. If you find any problems or errors, you are welcome to point them out and discuss them. ^_^








