Transferred from: https://www.ibm.com/developerworks/cn/linux/l-task-killable/index.html
New sleeping state allows TASK_UNINTERRUPTIBLE to wake up on fatal signals
Linux® kernel 2.6.25 introduced a new process state, TASK_KILLABLE, for putting processes to sleep. It offers an alternative to TASK_UNINTERRUPTIBLE, which is efficient but can leave a process unkillable, and to TASK_INTERRUPTIBLE, which is easy to wake but safer. TASK_KILLABLE grew out of a problem raised in 2002, when the OpenAFS file system driver was found waiting for an event in an interruptible state after blocking all signals. The new state lets a task sleep as it would in TASK_UNINTERRUPTIBLE while still responding to fatal signals. In this article, the author describes the new state and uses examples from kernel 2.6.26 and the earlier 2.6.18 to discuss the related changes in the Linux kernel and the new APIs that result from them.
Avinesh Kumar, System software engineer, EMC
October 20, 2008
Like files, processes are a fundamental element of any UNIX® operating system. A process is a dynamic entity that executes the instructions of an executable file. Besides executing instructions, a process sometimes also manages open files, processor context, address space, and program-related data. The Linux kernel keeps complete information about a process in a process descriptor, whose structure is defined as struct task_struct. You can see the various fields of struct task_struct in the Linux kernel source file include/linux/sched.h.
About process states
During its life cycle, a process goes through a series of mutually exclusive states. The kernel keeps the state information of the process in the state field of struct task_struct. Figure 1 shows the transitions between process states.
Figure 1. Process state transitions
Let's look at the various process states:
TASK_RUNNING: The process is either running on a CPU or waiting in a run queue to be scheduled.
TASK_INTERRUPTIBLE: The process is sleeping, waiting for some event to occur. The process can be interrupted by signals. On receiving a signal or being woken by an explicit wake-up call, the process moves to TASK_RUNNING.
TASK_UNINTERRUPTIBLE: This state is similar to TASK_INTERRUPTIBLE, except that the process does not handle signals. Interrupting a process in this state may be undesirable because it may be in the middle of an important task. When the event it is waiting for occurs, the process is woken by an explicit wake-up call.
TASK_STOPPED: Execution of the process has stopped; it is not running and cannot be run. The process enters this state on receiving signals such as SIGSTOP and SIGTSTP. Once it receives SIGCONT, the process becomes runnable again.
TASK_TRACED: A process enters this state while it is being monitored by another process, such as a debugger.
EXIT_ZOMBIE: The process has terminated and is waiting for its parent to collect statistics about it.
EXIT_DEAD: The final state (as the name suggests). A process enters this state when it is being removed from the system, after its parent has collected all the statistics by calling wait4() or waitpid().
For more information about process state transitions, see The Design of the UNIX Operating System in the Resources section.
As mentioned earlier, both TASK_UNINTERRUPTIBLE and TASK_INTERRUPTIBLE are sleep states. Now, let's see how the kernel puts a process to sleep.
Sleeping in the kernel
The Linux kernel provides two ways to put a process into sleep state.
The normal way to put a process to sleep is to set its state to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE and then call the scheduler function schedule(). This removes the process from the CPU run queue. If the process sleeps in interruptible mode (its state is set to TASK_INTERRUPTIBLE), it can be woken either by an explicit wake-up call (wake_up_process()) or by a signal that needs to be handled.
However, if the process sleeps in uninterruptible mode (its state is set to TASK_UNINTERRUPTIBLE), it can be woken only by an explicit wake-up call. Interruptible sleep is recommended over uninterruptible sleep unless there is really no alternative (for example, during device I/O, when handling signals is difficult).
When a task sleeping in interruptible mode receives a signal, it must handle the signal (unless the signal is masked), abandon the work it was doing (which requires cleanup code), and return -EINTR to user space. Checking these return codes and taking appropriate action is, again, the programmer's job. As a result, a lazy programmer may prefer to put processes to sleep in uninterruptible mode, because signals do not wake such tasks. Be aware, however, that the wake-up call to an uninterruptible process may for some reason never arrive. The process then hangs, and the only remedy is to restart the system. On the one hand, you have to take care of the details, because neglecting them introduces bugs on both the kernel side and the user side. On the other hand, you can end up with immortal processes that are blocked and cannot be killed.
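From user space, the -EINTR return described above appears as a system call that fails with errno set to EINTR. The following is a minimal user-space sketch, not kernel code. It blocks in read() on an empty pipe (an interruptible sleep) and lets a timer signal cut the sleep short. The function name errno_after_interrupted_read, the 50 ms timer period, and the choice of a pipe and SIGALRM are illustrative assumptions, not taken from the article.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGALRM handled instead of fatal. */
static void on_alarm(int signo) { (void)signo; }

/*
 * Sleeps in read() on an empty pipe until a timer signal interrupts it.
 * Returns the errno reported by read() (EINTR is expected), 0 if read()
 * unexpectedly succeeded, or -1 if the setup failed.
 */
int errno_after_interrupted_read(void)
{
    int fds[2];
    char byte;
    int seen_errno = 0;
    struct sigaction sa, old_sa;
    struct itimerval timer;

    if (pipe(fds) != 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                    /* no SA_RESTART: read() must fail */
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -1;

    /* Periodic timer, so a tick that fires before read() starts is not fatal. */
    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_usec = 50000;
    timer.it_interval.tv_usec = 50000;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -1;

    /* The write end stays open, so read() blocks instead of returning EOF. */
    if (read(fds[0], &byte, 1) == -1)
        seen_errno = errno;

    /* Disarm the timer and restore the previous signal disposition. */
    memset(&timer, 0, sizeof(timer));
    setitimer(ITIMER_REAL, &timer, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);
    return seen_errno;
}
```

A caller that ignores this failure would treat an interrupted read as real data, which is the kind of bug the paragraph above warns about.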
Now a new way of sleeping has been implemented in the kernel!
New sleeping state: TASK_KILLABLE
Linux kernel 2.6.25 introduced a new process sleeping state, TASK_KILLABLE. A process in this killable sleeping state behaves as it would in TASK_UNINTERRUPTIBLE, except that it can respond to fatal signals. Listing 1 compares the process states of kernel 2.6.18 and kernel 2.6.26 (defined in include/linux/sched.h):
Listing 1. Comparison between the 2.6.18 and 2.6.26 process states
Linux Kernel 2.6.18                     Linux Kernel 2.6.26
=================================       ===================================
#define TASK_RUNNING            0       #define TASK_RUNNING            0
#define TASK_INTERRUPTIBLE      1       #define TASK_INTERRUPTIBLE      1
#define TASK_UNINTERRUPTIBLE    2       #define TASK_UNINTERRUPTIBLE    2
#define TASK_STOPPED            4       #define __TASK_STOPPED          4
#define TASK_TRACED             8       #define __TASK_TRACED           8
/* in tsk->exit_state */                /* in tsk->exit_state */
#define EXIT_ZOMBIE             16      #define EXIT_ZOMBIE             16
#define EXIT_DEAD               32      #define EXIT_DEAD               32
/* in tsk->state again */               /* in tsk->state again */
#define TASK_NONINTERACTIVE     64      #define TASK_DEAD               64
                                        #define TASK_WAKEKILL           128
Note that the states TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE are unchanged. TASK_WAKEKILL is the bit that wakes a process when it receives a fatal signal.
Listing 2 shows how the states TASK_STOPPED and TASK_TRACED have changed, along with the definition of TASK_KILLABLE:
Listing 2. New state definitions in kernel 2.6.26
#define TASK_KILLABLE   (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_STOPPED    (TASK_WAKEKILL | __TASK_STOPPED)
#define TASK_TRACED     (TASK_WAKEKILL | __TASK_TRACED)
In other words, TASK_UNINTERRUPTIBLE + TASK_WAKEKILL = TASK_KILLABLE.
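Because the states are bit flags, the composition can be checked with ordinary bitwise arithmetic. The sketch below is a simplified user-space model, not kernel code. The helper signal_wakes_task is hypothetical and only approximates the wake-up decision: an ordinary signal wakes tasks that have TASK_INTERRUPTIBLE set, while a fatal signal also wakes tasks that have TASK_WAKEKILL set. The value 128 for TASK_WAKEKILL is assumed from the 2.6.26 header.

```c
#include <stdbool.h>

/* State bits as defined for kernel 2.6.26 in include/linux/sched.h. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define __TASK_STOPPED        4
#define __TASK_TRACED         8
#define TASK_WAKEKILL         128   /* assumed value, see the text above */

/* Composite states built from the bits above. */
#define TASK_KILLABLE  (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_STOPPED   (TASK_WAKEKILL | __TASK_STOPPED)
#define TASK_TRACED    (TASK_WAKEKILL | __TASK_TRACED)

/*
 * Hypothetical model of the wake-up decision when a signal arrives:
 * build the mask of states the signal is allowed to wake, then test
 * whether the sleeping task's state has any of those bits set.
 */
bool signal_wakes_task(long state, bool fatal)
{
    long mask = TASK_INTERRUPTIBLE;     /* any signal wakes these sleepers */

    if (fatal)
        mask |= TASK_WAKEKILL;          /* fatal signals also wake these */

    return (state & mask) != 0;
}
```

Under this model, a TASK_KILLABLE sleeper ignores an ordinary signal but wakes for a fatal one, and a plain TASK_UNINTERRUPTIBLE sleeper wakes for neither.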
New kernel APIs that use TASK_KILLABLE
A little about completions
The completion mechanism is useful when you want to put a task to sleep and have it woken when some event completes. It provides a simple, race-free synchronization mechanism. The routine wait_for_completion(struct completion *comp) puts the calling task into an uninterruptible sleep unless the completion has already occurred. The task must then be woken through complete(struct completion *comp) or complete_all(struct completion *comp).
Apart from wait_for_completion_killable(), the other wait routines are:
wait_for_completion_timeout()
wait_for_completion_interruptible()
wait_for_completion_interruptible_timeout()
For the definition of the completion structure, see include/linux/completion.h.
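To show why a completion is race-free, here is a user-space analogy built on POSIX threads. It is a sketch of the idea only and does not reproduce the kernel implementation. All names ending in _model, plus struct job, worker, and run_job, are hypothetical. The key point is the done counter: a completion signalled before the waiter arrives is remembered and not lost.

```c
#include <pthread.h>

/* User-space stand-in for a completion: a counter guarded by a lock. */
struct completion_model {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    unsigned int    done;   /* number of completions not yet consumed */
};

void init_completion_model(struct completion_model *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Sleep until at least one completion is recorded, then consume it. */
void wait_for_completion_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}

/* Record one completion and wake one waiter, if any. */
void complete_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

struct job {
    struct completion_model finished;
    int result;
};

/* Worker thread: produce a result, then signal completion. */
static void *worker(void *arg)
{
    struct job *j = arg;

    j->result = 42;
    complete_model(&j->finished);
    return NULL;
}

/* Start the worker, sleep until it completes, and return its result. */
int run_job(void)
{
    struct job j;
    pthread_t tid;

    init_completion_model(&j.finished);
    j.result = 0;
    if (pthread_create(&tid, NULL, worker, &j) != 0)
        return -1;
    wait_for_completion_model(&j.finished);
    pthread_join(tid, NULL);
    return j.result;
}
```

The wait in this model cannot be cut short by a signal, which corresponds to the uninterruptible wait_for_completion(). The killable variant adds an exit path for fatal signals.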
Let's look at the new functions that use this new state.
int wait_event_killable(wait_queue_head_t queue, condition);
This macro is defined in include/linux/wait.h. It puts the calling process into a killable sleep on queue until condition evaluates to true.
long schedule_timeout_killable(signed long timeout);
This function is defined in kernel/timer.c. It sets the state of the current task to TASK_KILLABLE and calls schedule_timeout(), which makes the calling task sleep for timeout jiffies. (On UNIX systems, a jiffy is basically the time between two consecutive clock ticks.)
int wait_for_completion_killable(struct completion *comp);
This routine is defined in kernel/sched.c and is used to wait killably for the completion of an event. If no fatal signal is pending, it calls schedule_timeout() with a timeout of MAX_SCHEDULE_TIMEOUT jiffies (defined to be equal to LONG_MAX).
int mutex_lock_killable(struct mutex *lock);
Defined in kernel/mutex.c, this routine acquires a mutex lock. If the lock is not available and the task receives a fatal signal while waiting for it, the task is removed from the list of waiters on the mutex so that it can handle the signal.
int down_killable(struct semaphore *sem);
Defined in kernel/semaphore.c, this routine acquires the semaphore sem. If the semaphore is not available, the task is put to sleep. If a fatal signal is then delivered to it, the task is removed from the list of waiters so that it can respond to the signal. The other two ways to acquire a semaphore are the routines down() and down_interruptible(). The down() function is now deprecated. Use down_killable() or down_interruptible() instead.
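The killable routines above share a calling pattern: the caller must check the return value and back out if a fatal signal ended the wait. The sketch below illustrates that pattern in user space. struct fake_mutex, fake_mutex_lock_killable, fake_mutex_unlock, and update_counter are hypothetical stand-ins so that the code compiles outside the kernel. The return convention (0 on success, -EINTR when a fatal signal interrupts the wait) is an assumption about how mutex_lock_killable() reports the outcome.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in; this is not the kernel's struct mutex. */
struct fake_mutex {
    bool held;                  /* lock is currently owned by someone else */
    bool fatal_signal_pending;  /* a fatal signal arrives while waiting */
};

/*
 * Stand-in that mimics the assumed return convention of a killable lock:
 * 0 once the lock is acquired, -EINTR if a fatal signal ends the wait.
 */
static int fake_mutex_lock_killable(struct fake_mutex *m)
{
    if (m->held && m->fatal_signal_pending)
        return -EINTR;          /* waiter gives up without the lock */
    m->held = true;             /* models eventually getting the lock */
    return 0;
}

static void fake_mutex_unlock(struct fake_mutex *m)
{
    m->held = false;
}

/*
 * Caller pattern: if the lock was not taken, return the error at once
 * and leave the protected data untouched.
 */
int update_counter(struct fake_mutex *m, int *counter)
{
    int err = fake_mutex_lock_killable(m);

    if (err)
        return err;             /* task is being killed: do not touch *counter */

    (*counter)++;
    fake_mutex_unlock(m);
    return 0;
}
```

Skipping the check would let the task modify shared data without holding the lock, so the error path is part of using these routines correctly.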
Changes in the NFS client code
The NFS client code also uses this new process state. Listing 3 shows how the nfs_wait_event macro differs between Linux kernels 2.6.18 and 2.6.26.
Listing 3. Changes to nfs_wait_event due to TASK_KILLABLE
Linux Kernel 2.6.18
==========================================
#define nfs_wait_event(clnt, wq, condition)                     \
({                                                              \
    int __retval = 0;                                           \
    if (clnt->cl_intr) {                                        \
        sigset_t oldmask;                                       \
        rpc_clnt_sigmask(clnt, &oldmask);                       \
        __retval = wait_event_interruptible(wq, condition);     \
        rpc_clnt_sigunmask(clnt, &oldmask);                     \
    } else                                                      \
        wait_event(wq, condition);                              \
    __retval;                                                   \
})

Linux Kernel 2.6.26
==========================================
#define nfs_wait_event(clnt, wq, condition)                     \
({                                                              \
    int __retval = wait_event_killable(wq, condition);          \
    __retval;                                                   \
})
Listing 4 shows the definition of the nfs_direct_wait() function in Linux kernels 2.6.18 and 2.6.26.
Listing 4. Changes to nfs_direct_wait() due to TASK_KILLABLE
Linux Kernel 2.6.18
=================================
static ssize_t nfs_direct_wait(struct nfs_direct_req *dreq)
{
    ssize_t result = -EIOCBQUEUED;

    /* Async requests don't wait here */
    if (dreq->iocb)
        goto out;

    result = wait_for_completion_interruptible(&dreq->completion);

    if (!result)
        result = dreq->error;
    if (!result)
        result = dreq->count;

out:
    kref_put(&dreq->kref, nfs_direct_req_release);
    return (ssize_t) result;
}

Linux Kernel 2.6.26
=================================
static ssize_t nfs_direct_wait(struct nfs_direct_req *dreq)
{
    ssize_t result = -EIOCBQUEUED;

    /* Async requests don't wait here */
    if (dreq->iocb)
        goto out;

    result = wait_for_completion_killable(&dreq->completion);

    if (!result)
        result = dreq->error;
    if (!result)
        result = dreq->count;

out:
    return (ssize_t) result;
}
See the Linux Kernel Mailing List entry in the Resources section to learn more about the changes that bring this new feature into the NFS client.
Previously, the NFS mount option intr helped with NFS client processes that hung while waiting for some event, but it allowed interruption by all signals, not just fatal ones (as TASK_KILLABLE does).
Conclusion
Although this feature is an improvement over the existing options (after all, it is one more way to deal with hung processes), it will take a while before it is in general use. Remember: unless it is really necessary to prohibit every interruption other than an explicit wake-up call (through the traditional TASK_UNINTERRUPTIBLE), use the new TASK_KILLABLE.
Resources
Learn
- See the original English version of this article on the developerWorks global site.
- The TASK_KILLABLE process state grew out of an issue raised by David Howells in 2002. He found that the OpenAFS file system driver waited for an event in an interruptible state after blocking all signals, when it should really have waited in the TASK_UNINTERRUPTIBLE state.
- Jonathan Corbet's discussion of TASK_KILLABLE (LWN.net, July 2008) is a very useful introduction.
- "Kernel Korner: Sleeping in the Kernel" (Linux Journal, July 2005) explains sleeping in the Linux kernel.
- Chapter 6 of The Design of the UNIX Operating System (Maurice J. Bach, Prentice Hall, 1986) covers process state transitions in detail.
- The "NFS killable tasks request comments on patch" entry on the Linux Kernel Mailing List shows more of the changes in which the NFS client uses the new TASK_KILLABLE functions in 2.6.26.
- Read "Anatomy of the Linux kernel" (developerWorks, June 2007) for an overview of how the kernel is structured.
TASK_KILLABLE: New process state in Linux