Locating an intermittent vrrpd hang with the strace tool

Source: Internet
Author: User

This is only a working memo that summarizes the job.

While the image is being flashed, here is a short summary of the bugs I ran into.

A project used the NAT and VRRP functions of an L3 switch, but during field testing the system occasionally hung on write file, though not every time. Delivery was imminent, so I added debugging output and ran the configuration script repeatedly to locate the bug.

I. At first I suspected that communication between the vtysh and vrrpd processes was blocked (the symptom was that the system hung).

(1) The suspicion arose because the enable command also hung while the configuration script was running.

(2) I printed debug information at the key points of command transmission between vtysh and vrrpd (note: start vrrpd without the -d daemon option and put it in the background with & instead). The trace showed that vtysh really did send the 'enable' command, but the vrrpd process never received it. The command communication framework is shared code and could basically be ruled out, so the bug narrowed down to the vrrpd process itself.

(3) I added code to the Ctrl+C signal handler in vtysh that drops into a Linux shell. When vrrpd made the system hang (the symptom being that the console blocked), I could enter the shell and run top. top showed that the CPU usage of the vrrpd process was high, with sys at 92.0%. I therefore concluded that vrrpd had made a system call that was blocking inside the kernel, so vrrpd was stuck and could no longer receive commands from vtysh.

(4) The next step was to determine which system call was blocking vrrpd, so I started the vrrpd process under strace.

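There are many articles online about how to use strace. One of them is reproduced as Annex I at the end of this article, in case the original is deleted.

Command used: strace -aef -p /usr/sbin/vrrpd -o /tmp/vrrpdstrace.log

(This is the command as recorded. Note that the -p option of strace expects a PID, so to start vrrpd under strace the program path is normally given as the last argument instead.)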

After reproducing the vrrpd hang, I looked at the vrrpdstrace.log file and analyzed the system calls in it. The process was blocked in an ioctl system call. Following the arguments of that ioctl into the kernel, I finally traced the hang to the SDK interface that deletes MAC table entries (by printing before and after the interface call).
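The log analysis described above can be partly automated. The following is a minimal sketch, not the tooling used in this investigation: the helper names strace_call_name and strace_call_finished are made up for this example, and it assumes the usual strace line layout of an optional PID column followed by name(args) = result. A call that never returned has no ") = result" tail in the log, which is how a blocked ioctl would show up.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Skip the optional PID column that strace -f writes before each call. */
static const char *skip_pid(const char *line)
{
    const char *p = line;
    const char *q;

    while (isspace((unsigned char)*p))
        p++;
    q = p;
    while (isdigit((unsigned char)*q))
        q++;
    if (q > p && isspace((unsigned char)*q)) {  /* digits then space: a PID */
        while (isspace((unsigned char)*q))
            q++;
        return q;
    }
    return p;
}

/* Copy the system call name (the identifier before '(') into name.
 * Returns 1 on success, 0 if the line does not look like a call. */
int strace_call_name(const char *line, char *name, size_t n)
{
    const char *p = skip_pid(line);
    const char *paren = strchr(p, '(');
    size_t len, i;

    if (paren == NULL || paren == p || n == 0)
        return 0;
    len = (size_t)(paren - p);
    if (len >= n)
        return 0;
    for (i = 0; i < len; i++)
        if (!isalnum((unsigned char)p[i]) && p[i] != '_')
            return 0;
    memcpy(name, p, len);
    name[len] = '\0';
    return 1;
}

/* Returns 1 if the call on this line returned, that is, if some ')' is
 * followed by whitespace and then "= ". Returns 0 for a call that is
 * still in progress (no result was ever written). */
int strace_call_finished(const char *line)
{
    const char *p;

    for (p = strchr(line, ')'); p != NULL; p = strchr(p + 1, ')')) {
        const char *q = p + 1;

        while (*q == ' ' || *q == '\t')
            q++;
        if (q > p + 1 && q[0] == '=' && q[1] == ' ')
            return 1;
    }
    return 0;
}
```

Feeding the last line of the log to these two helpers gives the name of the call the process was stuck in, provided strace_call_finished reports 0 for it.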

Because tracing any deeper from here was so hard, this part of the bug hunt ended here ...

II. Tracing the Broadcom SDK interface that deletes MAC table entries

(1) The process was tortuous and painful. I did not know where to add prints, and the prints I did add flooded the screen. It was already clear, though, that some kernel function had failed to release a lock and that this caused the blocking, so the next step was to track that lock. The low-level lock behind the interface (I am not sure what kind of lock it is) has no keyword to search for, unlike the semid of a user-space semaphore. Eventually I found a macro that distinguishes MEM types, called L2XM, and used it as the keyword, which reduced the screen flooding somewhat. I then looked for the places in the kernel that take the lock for L2XM and, after running the script repeatedly, confirmed that the lock is probably also taken in the _soc_l2x_thread thread.

(2) Finally, I determined that between lock(L2XM) and unlock(L2XM) in _soc_l2x_thread the thread blocks on another lock. The unlock is therefore never executed and nothing else releases the lock. With that, the problem was basically solved.

III. The kmalloc allocation type can cause a kernel panic.

(1) The kernel packet receive path, bnet_rx_deferred, allocates memory with kmalloc while processing a packet, but with the GFP_KERNEL type. By coincidence, the packet processing also leaks memory, so after a long run (or when packets are deliberately flooded) the system runs short of memory. When memory is short, a kmalloc request cannot be satisfied immediately, and because a GFP_KERNEL allocation may sleep (it is not atomic), the kernel schedules the caller out while it waits. This is where the problem arises: the receive interrupt handler is switched out inside kmalloc and can never be switched back in. In other words, the kernel treats interrupt-time packet processing as atomic, and a non-atomic kmalloc(GFP_KERNEL) inside it causes a panic.

(2) So two problems were found at the same time: the memory leak and the use of GFP_KERNEL.

(3) There are also many articles online about the kmalloc allocation types. One of them is reproduced as Appendix II at the end of this article.
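To make point (1) concrete, here is a hedged kernel-side fragment. It is not the bnet_rx_deferred source: the function name rx_copy_packet and its signature are hypothetical, and the fragment builds only inside a kernel tree. It assumes, as described above, that the receive path may run in interrupt context and therefore must not sleep.

```c
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/string.h>

/*
 * Hypothetical receive-path helper (name and signature invented for
 * illustration). It may be called from interrupt context, so the
 * allocation must not sleep.
 */
static int rx_copy_packet(const void *pkt, size_t len, void **out)
{
	void *buf;

	/* GFP_ATOMIC never sleeps; under memory pressure it fails instead. */
	buf = kmalloc(len, GFP_ATOMIC);
	if (!buf)
		return -ENOMEM;	/* drop the packet rather than block */

	memcpy(buf, pkt, len);
	*out = buf;		/* the consumer must kfree() this buffer */
	return 0;
}
```

The two problems from point (2) appear here as two separate obligations: the allocation flag has to match the calling context, and every buffer handed out has to reach a matching kfree().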

Complete.

Annex I: Use of strace.

The strace command explained
strace is a powerful tool that displays all system calls made by a user-space program.
strace displays the arguments of these calls and their return values in symbolic form. It gets its information from the kernel and does not require the kernel to be built in any special way.
Here are a few commonly used options.
1. -f -F: tells strace to also trace the processes created by fork and vfork
2. -o xxx.txt: writes the output to a file
3. -e execve: records only the execve system call
---------------------------------------------------
A process that will not start, software that suddenly runs slowly, a program that dies with a "Segmentation fault": problems like these give every Unix user a headache.
This article uses three practical cases to show how the three common debugging tools truss, strace, and ltrace can quickly diagnose such "incurable diseases" of software.
  
  
truss and strace trace the system calls a process makes and the signals it receives, while ltrace traces the library functions a process calls. truss is an early debugging program developed for System V R4, and most Unix systems, including AIX and FreeBSD, ship with it.
strace was originally written for SunOS, and ltrace first appeared in GNU/Debian Linux.
Both tools have since been ported to most Unix systems. Most Linux distributions ship strace and ltrace, and FreeBSD can install them through ports.
  
You can debug a newly started program from the command line, and you can also attach truss, strace, or ltrace to an existing PID to debug a program that is already running. The three tools are used in much the same way. The following three command-line options are common to all of them and are the most frequently used:

-f: traces the child processes of the current process in addition to the process itself.
-o file: writes the output to the file instead of to standard error (stderr).
-p pid: attaches to the running process with that PID. This option is commonly used to debug background processes.

Most debugging tasks can be done with these three options. Here are a few command-line examples:
truss -o ls.truss ls -al: traces the run of ls -al and writes the output to the file ls.truss.
strace -f -o vim.strace vim: traces vim and its child processes and writes the output to the file vim.strace.
ltrace -p 234: traces an already running process whose PID is 234.
  
The three tools produce output in similar formats. Taking strace as an example:

brk(0) = 0x8062aa8
brk(0x8063000) = 0x8063000
mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0x92f) = 0x40016000

Each line is one system call. To the left of the equals sign are the name of the system call and its arguments, and to the right is the return value. truss, strace, and ltrace all work in a similar way, using the ptrace system call to trace the running process. The detailed principle is beyond the scope of this article; interested readers can consult the source code of the tools.
The following three cases show how to use these debugging tools to diagnose "incurable diseases" of software:
  
Case one: clint crashes with a segmentation fault

Operating system: FreeBSD-5.2.1-release
clint is a static source code analysis tool for C++. After installing it through ports, run it:

# clint foo.cpp
Segmentation fault (core dumped)
Running into a "Segmentation fault" on a Unix system is just as annoying as an "Illegal operation" dialog box popping up in MS Windows. OK, let's use truss to take the pulse of clint:
  
# truss -f -o clint.truss clint
Segmentation fault (core dumped)
# tail clint.truss
739: read(0x6,0x806f000,0x1000) = 4096 (0x1000)
739: fstat(6,0xbfbfe4d0) = 0 (0x0)
739: fcntl(0x6,0x3,0x0) = 4 (0x4)
739: fcntl(0x6,0x4,0x0) = 0 (0x0)
739: close(6) = 0 (0x0)
739: stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'
SIGNAL 11
SIGNAL 11
Process stopped because of: 16
process exit, rval = 139
We use truss to trace the system calls made by clint and write the results to the file clint.truss, then use tail to view the last few lines.
Note the last system call that clint made (fifth line from the bottom): stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'. This is the problem: clint cannot find the directory "/root/.clint/plugins", which leads to the segmentation fault. The fix is simple: mkdir -p /root/.clint/plugins. Running clint again still produces a "Segmentation fault", however. Tracing with truss once more shows that clint also needs the directory "/root/.clint/plugins/python". Once that directory is created, clint finally runs normally.
  
Case two: vim starts up noticeably slowly

Operating system: FreeBSD-5.2.1-release
The vim version is 6.2.154. After vim is started from the command line, it takes nearly half a minute to reach the editing screen, with no error output at all. I checked .vimrc and all the vim scripts and found no misconfiguration, and I could not find a solution to a similar problem online. Do we really have to hack the source code? No need; truss is enough to find the problem:

# truss -f -D -o vim.truss vim

The -D option adds a relative timestamp before each line of output, that is, the time elapsed since the previous recorded event, which is effectively the time each system call took. (In FreeBSD truss, lowercase -d instead prints the time elapsed since the trace started.) We only need to notice which system calls take a long time. Viewing the output file vim.truss with less quickly turns up the suspect:
  
735: 0.000021511 socket(0x2,0x1,0x0) = 4 (0x4)
735: 0.000014248 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
735: 0.000013688 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
735: 0.000203657 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
735: 0.000017042 close(4) = 0 (0x0)
735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
735: 0.000019556 socket(0x2,0x1,0x0) = 4 (0x4)
735: 0.000013409 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
735: 0.000013130 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
735: 0.000272102 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
735: 0.000015924 close(4) = 0 (0x0)
735: 1.009338338 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
  
vim tries to connect to port 6000 of the host 10.57.18.27 (the connect() on the fourth line). When the connection fails, it sleeps for one second and then retries (the nanosleep() on the sixth line). This fragment repeats more than 10 times in the output, and each repetition takes a little over a second, which is why vim is so slow to start. You will certainly wonder: "Why would vim connect to port 6000 of another computer for no reason?" Good question. Think about which service uses port 6000. Yes, it is the X server. It seems that vim is trying to direct its output to a remote X server, so the shell must define the DISPLAY variable. A look at .cshrc shows that there is indeed such a line: setenv DISPLAY ${REMOTEHOST}:0. Comment it out and log in again, and the problem is solved.
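Picking out the slow calls from timestamped truss output like the listing above is easy to script. The sketch below is illustrative only: the function names truss_line_seconds and truss_line_is_slow are made up here, and it assumes each line starts with a PID, a colon, and a seconds value, as in that listing.

```c
#include <stdio.h>

/* Parse one "PID: seconds call(...)" line of timestamped truss output.
 * Returns the seconds field, or -1.0 if the line has no such prefix
 * (for example a "SIGNAL 11" line). */
double truss_line_seconds(const char *line)
{
    int pid;
    double secs;

    /* The space in the format matches zero or more whitespace characters. */
    if (sscanf(line, "%d: %lf", &pid, &secs) != 2)
        return -1.0;
    return secs;
}

/* Returns 1 if the line records at least threshold seconds, else 0. */
int truss_line_is_slow(const char *line, double threshold)
{
    return truss_line_seconds(line) >= threshold;
}
```

With a threshold of 0.5 seconds, only the nanosleep lines of the listing would be reported, which points straight at the retry loop.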
  
  
Case three: Using debugging tools to understand how software works

Operating system: Red Hat Linux 9.0
Tracing software in real time with debugging tools is not only an effective way to diagnose its "incurable diseases"; it also helps us see how the software fits together, that is, to grasp quickly how it runs and how it works, which makes it a useful aid when studying source code. The following example shows how tracing someone else's software with strace can "trigger inspiration" and solve a problem in your own development.
As you know, every file opened within a process has a unique file descriptor (fd) that corresponds to it. While developing a piece of software I ran into the following problem:
Given an fd, how can I get the full path of the file behind it? Neither Linux, FreeBSD, nor any other Unix system provides such an API, so what can be done? Let's approach it from a different angle: is there any software under Unix that can find out which files a process has open? With enough experience, lsof comes to mind quickly. It can show both which files a process has open and which process has a given file open. OK, let's write a small program to experiment with lsof and see how it finds the files that a process has opened.
  
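/* testlsof.c */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(void)
{
    /* Open the file /tmp/foo; O_CREAT requires a mode argument. */
    open("/tmp/foo", O_CREAT | O_RDONLY, 0644);
    sleep(1200); /* Sleep for 1200 seconds so the process can be inspected. */
    return 0;
}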
  
Put testlsof in the background; its PID is 3125. The command lsof -p 3125 shows which files process 3125 has open. We trace the lsof run with strace and save the output in lsof.strace:

# gcc testlsof.c -o testlsof
# ./testlsof &
[1] 3125
# strace -o lsof.strace lsof -p 3125
  
We searched the output file lsof.strace with "/tmp/foo" as the keyword, with only one result:
  
  
# grep '/tmp/foo' lsof.strace
readlink("/proc/3125/fd/3", "/tmp/foo", 4096) = 8

So lsof cleverly uses the /proc/nnnn/fd/ directory (where nnnn is the PID). For every process, the Linux kernel creates a directory under /proc/ named after its PID to hold information about the process, and the fd subdirectory of that directory holds the descriptors of all files the process has open. We are getting close to the goal. OK, let's look in /proc/3125/fd/:
  
# cd /proc/3125/fd/
# ls -l
total 0
lrwx------ 1 root root 5 09:50 0 -> /dev/pts/0
lrwx------ 1 root root 5 09:50 1 -> /dev/pts/0
lrwx------ 1 root root 5 09:50 2 -> /dev/pts/0
lr-x------ 1 root root 5 09:50 3 -> /tmp/foo
# readlink /proc/3125/fd/3
/tmp/foo

The answer is now obvious: every entry in the /proc/nnnn/fd/ directory is a symbolic link named after an fd, and it points to the file the process opened as that fd. We only need the readlink() system call to obtain the file behind a given fd. The code is as follows:
  
  
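#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
/* Store the path behind fd in pathname (at most n bytes, NUL-terminated).
   Returns the length of the path, or -1 on error. */
int get_pathname_from_fd(int fd, char pathname[], int n)
{
    char buf[1024];
    pid_t pid;
    ssize_t len;
    if (n <= 0)
        return -1;
    pid = getpid();
    snprintf(buf, sizeof(buf), "/proc/%i/fd/%i", (int)pid, fd);
    len = readlink(buf, pathname, n - 1); /* readlink() does not add a NUL */
    if (len >= 0)
        pathname[len] = '\0';
    return (int)len;
}
int main(void)
{
    int fd;
    char pathname[4096];
    fd = open("/tmp/foo", O_CREAT | O_RDONLY, 0644);
    if (fd < 0 || get_pathname_from_fd(fd, pathname, sizeof(pathname)) < 0)
        return 1;
    printf("fd=%d; pathname=%s\n", fd, pathname);
    return 0;
}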
  
For security reasons, FreeBSD 5 and later do not mount the proc file system automatically by default. To use truss or strace there, you must mount the proc file system manually: mount -t procfs proc /proc. Alternatively, add a line to /etc/fstab:

proc /proc procfs rw 0 0

Appendix II: kmalloc allocation types

kmalloc allocates memory in much the same way as malloc. Unless it blocks, it runs very quickly, and it does not zero the memory it returns.

The flags parameter

#include <linux/slab.h>

void *kmalloc(size_t size, int flags);

The first parameter is the size of the block to allocate. The second parameter is the allocation flags, which control the behavior of kmalloc in a number of ways.

The most commonly used flag is GFP_KERNEL. It means that the allocation (which internally always ends up calling get_free_pages to do the actual allocation; that is the origin of the GFP prefix) is performed on behalf of a process running in kernel space. Using GFP_KERNEL allows kmalloc to put the current process to sleep and wait when free memory is insufficient. A function that allocates memory with GFP_KERNEL must therefore be reentrant. When the code runs outside process context, for example in interrupt handlers, tasklets, and kernel timers, the current process must not be put to sleep, and the driver should use GFP_ATOMIC.

GFP_ATOMIC

Used to allocate memory from interrupt handlers and other code outside process context. Never sleeps.

GFP_KERNEL

Normal allocation of kernel memory. May sleep.

GFP_USER

Used to allocate memory for user-space pages. May sleep.

GFP_HIGHUSER

Like GFP_USER, but allocates from high memory, if there is any. High memory is described in the next subsection.

GFP_NOIO

GFP_NOFS

These flags function like GFP_KERNEL, but they add restrictions on what the kernel can do to satisfy the request. A GFP_NOFS allocation is not allowed to perform any file system calls, while GFP_NOIO disallows the initiation of any I/O at all. They are used primarily in file system and virtual memory code, where an allocation may be allowed to sleep but recursive file system calls would be a bad idea.

The allocation flags listed above can be ORed with any of the following flags, which change how the allocation is carried out:

__GFP_DMA

This flag requests that the allocation come from a memory zone capable of DMA. The exact meaning is platform-dependent and is explained in the following section.

__GFP_HIGHMEM

This flag indicates that the allocated memory may be located in high memory.

__GFP_COLD

Normally the memory allocator tries to return "cache warm" pages, that is, pages likely to be found in the processor cache. This flag instead requests a "cold" page, one that has not been used for some time. It is useful for allocating pages for DMA reads, where presence in the processor cache is of no use. For a full discussion of how to allocate DMA buffers, see the section "Direct Memory Access".

__GFP_NOWARN

This rarely used flag prevents the kernel from issuing a warning (with printk) when an allocation cannot be satisfied.

__GFP_HIGH

This flag marks a high-priority request, which is allowed to consume even the last pages of memory that the kernel reserves for emergencies.

__GFP_REPEAT

__GFP_NOFAIL

__GFP_NORETRY

These flags modify how the allocator behaves when it has difficulty satisfying an allocation. __GFP_REPEAT means "try harder" by repeating the attempt, but the allocation may still fail. The __GFP_NOFAIL flag tells the allocator never to fail; it works as hard as needed to satisfy the request. Use of __GFP_NOFAIL is strongly discouraged; there is probably never a good reason to use it in a device driver. Finally, __GFP_NORETRY tells the allocator to give up immediately if the requested memory is not available.

2. Memory zones

The use of __GFP_DMA and __GFP_HIGHMEM is platform-dependent. Linux divides memory into three zones: DMA-capable memory, normal memory, and high memory. On the x86 platform the DMA zone for ISA devices is the first (lowest) 16 MB of memory; PCI devices have no such limit.

The mechanism behind memory zones is implemented in mm/page_alloc.c, while the zones are initialized in platform-specific files, usually mm/init.c under the arch directory tree.

Linux handles memory allocation by creating a set of pools of memory objects of fixed sizes. An allocation request is handled by going to a pool that holds sufficiently large objects and handing an entire memory chunk back to the requester. One thing driver developers should keep in mind is that the kernel can allocate only certain predefined, fixed-size byte arrays.

If you ask for an arbitrary amount of memory, you are likely to get slightly more than you asked for, up to twice as much. Programmers should also remember that the smallest allocation kmalloc can handle is 32 or 64 bytes, depending on the page size used by the system's architecture. There is also an upper limit on the size of the memory chunks that kmalloc can allocate. That limit varies with the architecture and the kernel configuration options. If your code is to be fully portable, it cannot count on being able to allocate anything larger than 128 KB (the limit also given later in this appendix). If you need more than a few kilobytes, however, there are better ways than kmalloc to obtain the memory.
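The "up to twice as much" figure follows from the size classes. As a rough illustration, the sketch below models the pools as pure powers of two starting at a minimum size. That is an assumption made for this example (real kernels also provide some in-between sizes such as 96 and 192 bytes), and size_class is a made-up helper, not a kernel function.

```c
#include <stddef.h>

/* Simplified model: round a request up to the next power-of-two size
 * class, starting from a smallest class of min_size bytes (32 or 64 in
 * the text). The second loop condition only guards against overflow for
 * absurdly large requests. */
size_t size_class(size_t request, size_t min_size)
{
    size_t cls = min_size;

    while (cls < request && cls <= (size_t)-1 / 2)
        cls *= 2;
    return cls;
}
```

In this model a 100-byte request lands in the 128-byte class, while a 129-byte request lands in the 256-byte class and wastes almost half of it, which is the worst case the text describes.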

To allocate memory dynamically in a device driver or kernel module, use kmalloc or vmalloc rather than malloc, or request pages directly with get_free_pages. Free the memory with kfree, vfree, or free_pages respectively. The kmalloc function returns a virtual address (a linear address). What is special about kmalloc is that the memory it allocates is physically contiguous, which matters for devices that perform DMA. Memory allocated with vmalloc is contiguous only in linear address space; its physical addresses are not necessarily contiguous, so it cannot be used directly for DMA.

Note that kmalloc can allocate at most 128 KB minus 16 bytes; the 16 bytes are taken up by the page descriptor structure. For kmalloc usage, see the KHG.

The memory-mapped I/O ports, registers, or RAM of hardware devices (such as video memory) typically occupy address space above 0xF0000000. A driver cannot access this space directly. It must first remap it with the kernel function vremap and then use the address that the function returns.

In addition, many hardware devices need a large block of contiguous memory for DMA transfers. This memory must stay resident and cannot be swapped out to a file. kmalloc, however, can allocate at most 32 x PAGE_SIZE of memory. PAGE_SIZE is generally 4 KB, so the limit is 128 KB.
