Analyzing a process hang

Source: Internet
Author: User


Over the past two days, several people using our data-verification tool reported that the process hung. At first I had no idea why, so I simply looked at the process stacks. Although the cause turned out to be simple, anything that can hang the process is not a small problem. Briefly, the tool consists of two parts: dbchk and dbchk_inner. dbchk is written in Python and handles concurrency control, while dbchk_inner is written in C and performs the actual verification work. You verify data by running the dbchk command. The process relationship looks like this:

$ pstree 18649
dbchk─┬─sh───dbchk_inner───2*[{scandiff}]
      └─{drcchk}

Back to the problem itself: I used a test case to reproduce the hang and checked the stacks of dbchk and dbchk_inner. The information is as follows.

Stack of the dbchk process (pid 18649):

$ pstack 18649
Thread 2 (Thread 0x7f4343fff700 (LWP 18658)):
#0  0x000000425f80f09d in waitpid () from /lib64/libpthread.so.0
#1  0x00000000000000ff8a in ?? () from /usr/lib64/libpython2.6.so.1.0
#2  0x000000000018de706 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#3  0x000000000018e0797 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0

 

Stack of the dbchk_inner process (pid 18660):

$ pstack 18660
#0  0x0000000000f4da3dd in write () from /lib64/libc.so.6
#1  0x0000000000f470fd3 in _IO_new_file_write () from /lib64/libc.so.6
#2  0x0000000000f470e9a in _IO_new_file_xsputn () from /lib64/libc.so.6
#3  0x0000000000f46705d in fwrite () from /lib64/libc.so.6
#4  0x000000000000004136f0 in seconds::run(unsigned int) ()

We can see that the parent process dbchk is stuck in waitpid(). That is easy to understand: it is waiting for the child process dbchk_inner to finish. The child dbchk_inner, however, is stuck in fwrite, which is a bit strange. Why would a write block? The first thing that comes to mind is insufficient disk space, but a quick check shows plenty of space left. What else can make a write block? Another possibility is that the buffer it writes into is full and cannot accept more data.

With that in mind, let's look back at the dbchk code:

pio = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).wait()

You can see that the program calls wait() on the Popen object, which explains why the parent is stuck: the child has not finished. Note the Popen arguments: stdout and stderr are redirected to subprocess.PIPE, i.e. to pipes between the parent and child processes. The child is stuck in its write because the PIPE buffer is full. Why is it full? On one hand, the child produces a lot of output; on the other hand, no process ever reads from the other end of the pipe, so the buffer never drains. The default PIPE buffer size here is 4096 bytes; ulimit -a reports the pipe size as 8 blocks of 512 bytes, i.e. 8 * 512 = 4096 bytes, and this value is defined in a Linux kernel header file and cannot be changed without recompiling the kernel.

$ ulimit -a
core file size          (blocks, -c) 0
......
pipe size            (512 bytes, -p) 8
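
To sanity-check this reasoning, the hang is easy to reproduce outside dbchk. The snippet below is my own minimal sketch, not the original tool; the command string is just a stand-in that prints more than one pipe buffer's worth of data.

import subprocess

# The child prints far more than the pipe buffer can hold, and the parent
# calls wait() without ever reading the pipe. Once the pipe is full the
# child blocks in write(), so wait() never returns: the same hang as dbchk.
command = "python -c 'print(\"x\" * 1000000)'"   # stand-in for dbchk_inner

p = subprocess.Popen(command, shell=True,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
p.wait()   # hangs here, just like dbchk in waitpid()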

 

Okay, the problem is found: the full PIPE buffer is the culprit. How can it be solved?

1. Do not redirect stdout and stderr to pipes; let the child write directly to the terminal or to a file.

2. Limit the amount of data the program writes to the pipe.
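
A third option, not listed above but standard with subprocess, is to keep the pipes and have the parent drain them while it waits; that is what Popen.communicate() does. A minimal sketch, using the same stand-in command as the reproduction above:

import subprocess

# communicate() reads stdout/stderr to EOF (draining the pipes) and then
# reaps the child, so the child can never block on a full pipe.
command = "python -c 'print(\"x\" * 1000000)'"   # stand-in for dbchk_inner

p = subprocess.Popen(command, shell=True,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
out, err = p.communicate()
print(p.returncode, len(out))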

Pipes are widely used for inter-process communication (IPC), most visibly in shell commands. For example:

ps aux | grep mysqld

The command above lists information about the mysqld process; here ps and grep communicate through a pipe. Pipes have the following characteristics:

1. A pipe is half-duplex: data flows in only one direction. Here the output of ps is the input of grep.

2. A pipe can only be used between a parent and its child, or between siblings. Here ps and grep are both children of the shell (bash/pdksh/ash/dash), so the two are siblings.

3. To the processes at both ends, a pipe looks like a file, but it exists only in memory.

4. The writer keeps appending data to the tail of the pipe, while the reader keeps consuming data from the head.
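
To make characteristics 1 and 2 concrete, here is a sketch (mine, not from the original article) that builds the same ps aux | grep mysqld pipeline by hand with subprocess; the shell sets up essentially the same parent/sibling pipe:

import subprocess

# Hand-built "ps aux | grep mysqld": one pipe, data flows only from ps (the
# writer) to grep (the reader), and both are children of this Python process.
ps = subprocess.Popen(["ps", "aux"], stdout=subprocess.PIPE)
grep = subprocess.Popen(["grep", "mysqld"],
                        stdin=ps.stdout,        # grep reads what ps writes
                        stdout=subprocess.PIPE)
ps.stdout.close()   # leave grep holding the only read end of the pipe
out, _ = grep.communicate()
print(out)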

At this point you may have a question: if the writer keeps writing and the reader keeps reading, when do the processes at either end of the pipe ever finish? The example command above, after all, finishes almost immediately. How does that work? There are two basic rules for pipes:

1. If the write end has been closed, then once all remaining data has been read, read() returns 0 to indicate end of file.

2. If a process writes to a pipe whose read end has been closed, it receives the SIGPIPE signal.

In this example, when ps finishes writing, its end of the pipe is closed automatically; grep's read() then returns 0, and grep exits.
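
Rule 1 is easy to see in a small sketch of my own using os.pipe() and fork(); it is not from the original article, but it shows the end-of-file behaviour described above:

import os

# Once the write end is closed and the data has been drained, read() on the
# pipe returns 0 bytes (end of file), and the reader can exit cleanly.
r, w = os.pipe()
pid = os.fork()
if pid == 0:                        # child plays the "ps" role: write, then exit
    os.close(r)
    os.write(w, b"mysqld 1234\n")
    os.close(w)                     # closing the write end is what produces EOF
    os._exit(0)
else:                               # parent plays the "grep" role: read until EOF
    os.close(w)
    while True:
        chunk = os.read(r, 4096)
        if not chunk:               # 0 bytes read: the write end is closed
            break
        print(chunk)
    os.close(r)
    os.waitpid(pid, 0)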

 

Reference:

"Advanced Programming in the UNIX Environment"

http://blog.chinaunix.net/uid-26833883-id-3227144.html


How do you hang (suspend) a process from the command line in UNIX?

Assume the process ID is pid.

kill -SIGSTOP pid    suspends the process
kill -SIGCONT pid    resumes the suspended process
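
The same thing can be done programmatically; a small sketch (the sleep child is just a stand-in for whatever pid you want to suspend):

import os
import signal
import subprocess
import time

# Suspend and resume a process from Python, mirroring the kill commands above.
child = subprocess.Popen(["sleep", "60"])   # stand-in for the target pid

os.kill(child.pid, signal.SIGSTOP)          # same as: kill -SIGSTOP pid
time.sleep(1)                               # the child is frozen here
os.kill(child.pid, signal.SIGCONT)          # same as: kill -SIGCONT pid

child.terminate()
child.wait()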

Q&A: tools for analyzing processes

Q: There are many process and port tools out there. Why should I use IceSword?
A: 1. Most so-called process tools are written with the Windows Toolhlp32 or psapi APIs, or with the ZwQuerySystemInformation system call (the first two end up using that call anyway). Any API hook, let alone a kernel-level backdoor, can defeat them easily. A very small number of tools walk the kernel thread-scheduling structures to enumerate processes, but that approach needs hard-coded offsets, which differ between system versions and may even change with a patch, so the tool has to be updated constantly, and methods to evade that kind of search have also been published. IceSword's kernel-mode process enumeration is, at present, unique; it was designed with possible kernel-backdoor hiding techniques in mind and can detect all hidden processes.
2. Most tools also use Toolhlp32 or psapi to obtain the process image path. The former calls the RtlDebug *** functions to inject a remote thread into the target; the latter uses the debug API to read the target process's memory. Either way it boils down to enumerating the PEB, so by modifying the PEB a process can easily throw these tools off. IceSword's kernel-mode approach shows the original full path, and still shows it even if the path is switched to something else while the process is running.
3. The same applies to a process's DLL modules as in point 2: tools that rely on the PEB are easily spoofed, while IceSword is not fooled (a very small number of systems are not supported, in which case PEB enumeration is still used).
4. IceSword's process killing is powerful and convenient (and of course dangerous). You can kill several selected processes at once. To be precise, three processes are excluded: the idle process, the System process, and csrss; everything else can be killed easily. Of course, killing some processes (such as winlogon) will crash the system.
5. There are indeed many port tools available online, but there are also many published ways to hide ports, and those tricks do not work against IceSword. I originally wanted to include a firewall-style live monitor, but did not want to make the tool too bloated. "Ports" here means ports belonging to the Windows IPv4 TCP/IP stack; third-party stacks and the IPv6 stack are not listed.
6. That's enough for now...

Q: Windows already provides a reasonably powerful and convenient services manager. What does IceSword's service view do better?
A: The interface is not especially polished because I was somewhat lazy about it, but IceSword's service view is mainly meant for spotting trojan services, and for that it is very convenient. As an example, here is how to hunt for one common kind of trojan: svchost hosts a number of shared-process services, and some trojans exist as DLLs loaded into it, so how do you find them? First look at the process list and notice that there are several svchost instances; remember their PIDs, then go to the service view and find the service entries corresponding to those PIDs. Use the registry view to check each service's DLL file path (take the name in the first column of the service entry, then look up the sub-key with that name under the services key in the registry). Comparing against the services you normally expect makes abnormal entries easy to spot. The remaining work is to stop the service or kill the process, delete the file, and restore the registry; of course, you need some general knowledge of Windows services to do this.

Q: What kind of trojan or backdoor hides its process, registry entries, and files? How can I find it with IceSword?
A: hxdef, for example, which has been popular recently and is open source (so variants are easy to produce), is exactly this kind of backdoor. You can try various tools, such as ** experts, ** masters, and ** kxing, and see whether they can show its process, registry entries, services, and files. With IceSword it is very easy: the hxdef100 process shows up in red in the process view, its service entries show up in red in the service view, and you can likewise find its entries in the registry and file views. If the trojan has an active reverse connection, you can also see it in the port view. To remove it, first get the backdoor's full path from the process view, kill the process, delete the backdoor's directory, and delete the corresponding service entries from the registry... this is only a brief outline; please learn to use IceSword effectively on your own.

Q: What is a "kernel module"?
A: These are the PE modules loaded into system address space, mainly drivers (*.sys), which are generally...
