On writeback: pdflush.c

Source: Internet
Author: User


The pdflush kernel thread pool is a process-context work environment that Linux creates for writing back filesystem data. Its implementation is compact: fewer than 250 lines of code.

1 /*
2  * mm/pdflush.c - worker threads for writing back filesystem data
3  *
4  * Copyright (C) 2002, Linus Torvalds.
5  *
6  * 09Apr2002 akpm@zip.com.au
7  *           Initial version
8  * 29Feb2004 kaos@sgi.com
9  *           Move worker thread creation to kthread to avoid chewing
10  *           up stack space with nested calls to kernel_thread.
11  */

The file header mainly contains copyright information and the major change records (changelog). The change by kaos@sgi.com handed the creation of the kernel worker threads over to kthread, mainly to avoid using up stack space through nested calls to kernel_thread. The effect of this change can also be seen in the output of ps:

root 5 1 5 0 1 21:31 ? 00:00:00 [kthread]
root 114 5 114 0 1 ? 00:00:00 [pdflush]
root 115 5 115 0 1 ? 00:00:00 [pdflush]

The parent of every pdflush kernel thread is the kthread process (PID 5).

12
13 #include <linux/sched.h>
14 #include <linux/list.h>
15 #include <linux/signal.h>
16 #include <linux/spinlock.h>
17 #include <linux/gfp.h>
18 #include <linux/init.h>
19 #include <linux/module.h>
20 #include <linux/fs.h> // Needed by writeback.h
21 #include <linux/writeback.h> // Prototypes pdflush_operation()
22 #include <linux/kthread.h>
23 #include <linux/cpuset.h>
24
25

This part includes the header files. One thing about it is not ideal: although C++-style line comments have been adopted into C, seeing them in kernel code still feels uncomfortable. Maybe I am being too picky and need to keep up with the times.

26 /*
27  * Minimum and maximum number of pdflush instances
28  */
29 #define MIN_PDFLUSH_THREADS 2
30 #define MAX_PDFLUSH_THREADS 8
31
32 static void start_one_pdflush_thread(void);
33
34

Lines 29 and 30 define the minimum and maximum number of pdflush kernel thread instances, 2 and 8 respectively. The minimum keeps operation latency low; the maximum prevents an excessive number of threads from degrading system performance. The maximum is not strictly enforced, however. We will come back to this when analyzing the race conditions.

35 /*
36  * The pdflush threads are worker threads for writing back dirty data.
37  * Ideally, we'd like one thread per active disk spindle.  But the disk
38  * topology is very hard to divine at this level.   Instead, we take
39  * care in various places to prevent more than one pdflush thread from
40  * performing writeback against a single filesystem.  pdflush threads
41  * have the PF_FLUSHER flag set in current->flags to aid in this.
42  */
43

The comment above briefly explains the pdflush thread pool. Roughly: "The pdflush threads are worker threads that write dirty data back. Ideally there would be one thread per active disk spindle, but the disk topology is very hard to determine at this level. Instead, care is taken in various places to prevent more than one pdflush thread from performing writeback against a single filesystem. The PF_FLUSHER flag that pdflush threads set in current->flags helps with this."

Kernel developers evidently care a great deal about efficiency and think the problem through thoroughly. They are just as careful about layering: they do not step even half a pace across the boundary between layers.

43
44 /*
45  * All the pdflush threads.  Protected by pdflush_lock
46  */
47 static LIST_HEAD(pdflush_list);
48 static DEFINE_SPINLOCK(pdflush_lock);
49
50 /*
51  * The count of currently-running pdflush threads.  Protected
52  * by pdflush_lock.
53  *
54  * Readable by sysctl, but not writable.  Published to userspace at
55  * /proc/sys/vm/nr_pdflush_threads.
56  */
57 int nr_pdflush_threads = 0;
58
59 /*
60  * The time at which the pdflush thread pool last went empty
61  */
62 static unsigned long last_empty_jifs;
63

This part defines the necessary global variables. To avoid polluting the kernel namespace, variables that do not need to be exported are declared static, which limits their scope to this compilation unit (the current file, pdflush.c). All idle pdflush threads are linked on the doubly linked list pdflush_list. nr_pdflush_threads counts the current pdflush threads (both active and idle), and last_empty_jifs records the jiffies value at which the thread pool last went empty (that is, no idle thread was available). Every operation on the pool that needs mutual exclusion is protected by the spinlock pdflush_lock.

64 /*
65  * The pdflush thread.
66  *
67  * Thread pool management algorithm:
68  *
69  * - The minimum and maximum number of pdflush instances are bound
70  *   by MIN_PDFLUSH_THREADS and MAX_PDFLUSH_THREADS.
71  *
72  * - If there have been no idle pdflush instances for 1 second, create
73  *   a new one.
74  *
75  * - If the least-recently-went-to-sleep pdflush thread has been asleep
76  *   for more than one second, terminate a thread.
77  */
78

That is a long comment, and I hope you are not tired of reading by now. I originally only meant to discuss the race conditions and did not expect to write this much. The comment describes the thread pool management algorithm:

  1. The number of pdflush thread instances stays between MIN_PDFLUSH_THREADS and MAX_PDFLUSH_THREADS.
  2. If the thread pool has had no idle thread for 1 second, a new thread is created.
  3. If the thread that went to sleep earliest has been asleep for more than 1 second, one thread instance is terminated.
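The three rules can be written as two small predicates. This is a minimal user-space sketch, not the kernel code: struct pool_state, should_create_thread(), should_exit_thread() and TICKS_PER_SECOND (standing in for HZ) are hypothetical names introduced for illustration, and the tick fields play the role of jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MIN_THREADS 2              /* mirrors MIN_PDFLUSH_THREADS */
#define MAX_THREADS 8              /* mirrors MAX_PDFLUSH_THREADS */
#define TICKS_PER_SECOND 100UL     /* hypothetical stand-in for HZ */

/* Snapshot of the pool; all names here are illustrative assumptions. */
struct pool_state {
    unsigned long now;             /* stand-in for jiffies */
    unsigned long last_empty;      /* when the idle list last went empty */
    unsigned long oldest_sleep;    /* when the longest-idle thread fell asleep */
    int nr_threads;                /* active plus idle threads */
    bool idle_list_empty;
};

/* Rule 2, capped by rule 1: no idle thread for over a second, room to grow. */
bool should_create_thread(const struct pool_state *s)
{
    assert(s != NULL);
    return s->idle_list_empty &&
           s->now - s->last_empty > 1 * TICKS_PER_SECOND &&
           s->nr_threads < MAX_THREADS;
}

/* Rule 3, floored by rule 1: the longest sleeper has idled for over a second. */
bool should_exit_thread(const struct pool_state *s)
{
    assert(s != NULL);
    if (s->idle_list_empty)
        return false;
    if (s->nr_threads <= MIN_THREADS)
        return false;
    return s->now - s->oldest_sleep > 1 * TICKS_PER_SECOND;
}
```

The unsigned subtraction keeps the comparison valid even if the tick counter wraps around, which is the same idiom the kernel code uses with jiffies.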

79 /*
80  * A structure for passing work to a pdflush thread.  Also for passing
81  * state information between pdflush threads.  Protected by pdflush_lock.
82  */
83 struct pdflush_work {
84     struct task_struct *who;    /* The thread */
85     void (*fn)(unsigned long);  /* A callback function */
86     unsigned long arg0;         /* An argument to the callback */
87     struct list_head list;      /* On pdflush_list, when idle */
88     unsigned long when_i_went_to_sleep;
89 };
90

This defines the per-thread node data structure. It is concise and needs no further explanation.

We have now gone through all the basic data structures and variables. Next, we start the analysis from the module_init entry point:

232 static int __init pdflush_init(void)
233 {
234     int i;
235
236     for (i = 0; i < MIN_PDFLUSH_THREADS; i++)
237         start_one_pdflush_thread();
238     return 0;
239 }
240
241 module_init(pdflush_init);

This creates MIN_PDFLUSH_THREADS pdflush thread instances. Note that only module_init() is defined here; module_exit() is not. The implication is that even if this code were built as a kernel module, it could be loaded but not removed. See the implementation of sys_delete_module:

File: kernel/module.c

609     /* If it has an init func, it must have an exit func to unload */
610     if ((mod->init != NULL && mod->exit == NULL)
611         || mod->unsafe) {
612         forced = try_force(flags);
613         if (!forced) {
614             /* This module can't be removed */
615             ret = -EBUSY;
616             goto out;
617         }
618     }

498 #ifdef CONFIG_MODULE_FORCE_UNLOAD
499 static inline int try_force(unsigned int flags)
500 {
501     int ret = (flags & O_TRUNC);
502     if (ret)
503         add_taint(TAINT_FORCED_MODULE);
504     return ret;
505 }
506 #else
507 static inline int try_force(unsigned int flags)
508 {
509     return 0;
510 }
511 #endif /* CONFIG_MODULE_FORCE_UNLOAD */

So a module like this cannot be unloaded unless forced module unloading was selected when the kernel was built (note: this option is dangerous, do not try it). Back to pdflush:

227 static void start_one_pdflush_thread(void)
228 {
229     kthread_run(pdflush, NULL, "pdflush");
230 }
231

With the help of the kthread thread, kthread_run creates a pdflush kernel thread instance:

164 /*
165  * Of course, my_work wants to be just a local in __pdflush().  It is
166  * separated out in this manner to hopefully prevent the compiler from
167  * performing unfortunate optimisations against the auto variables.  Because
168  * these are visible to other tasks and CPUs.  (No problem has actually
169  * been observed.  This is just paranoia).
170  */

This comment is interesting. To keep the compiler from optimizing the local variable my_work into registers, the processing is split into two functions: pdflush() declares my_work and passes its address to __pdflush(). Compared with dynamically allocated memory, a local variable is better in both space usage and time efficiency.

171 static int pdflush(void *dummy)
172 {
173     struct pdflush_work my_work;
174     cpumask_t cpus_allowed;
175
176     /*
177      * pdflush can spend a lot of time doing encryption via dm-crypt.  We
178      * don't want to do that at keventd's priority.
179      */
180     set_user_nice(current, 0);

Adjust the priority (nice value 0) to improve the overall responsiveness of the system.

181
182     /*
183      * Some configs put our parent kthread in a limited cpuset,
184      * which kthread() overrides, forcing cpus_allowed == CPU_MASK_ALL.
185      * Our needs are more modest - cut back to our cpusets cpus_allowed.
186      * This is needed as pdflush's are dynamically created and destroyed.
187      * The boottime pdflush's are easily placed w/o these 2 lines.
188      */
189     cpus_allowed = cpuset_cpus_allowed(current);
190     set_cpus_allowed(current, cpus_allowed);

Set the mask of CPUs on which this thread is allowed to run.

191
192     return __pdflush(&my_work);
193 }

91 static int __pdflush(struct pdflush_work *my_work)
92 {
93     current->flags |= PF_FLUSHER;
94     my_work->fn = NULL;
95     my_work->who = current;
96     INIT_LIST_HEAD(&my_work->list);

Perform the initialization.

97
98     spin_lock_irq(&pdflush_lock);

nr_pdflush_threads and pdflush_list are about to be modified, so the lock must be held. To avoid trouble (a pdflush task may be submitted from hard interrupt context), hard interrupts are disabled at the same time.

99     nr_pdflush_threads++;

Increment nr_pdflush_threads, because there is now one more pdflush kernel thread instance.

100     for ( ; ; ) {
101         struct pdflush_work *pdf;
102
103         set_current_state(TASK_INTERRUPTIBLE);
104         list_move(&my_work->list, &pdflush_list);
105         my_work->when_i_went_to_sleep = jiffies;
106         spin_unlock_irq(&pdflush_lock);
107
108         schedule();

The thread adds itself to the idle thread list pdflush_list, then gives up the CPU and waits to be woken up.

109         if (try_to_freeze()) {
110             spin_lock_irq(&pdflush_lock);
111             continue;
112         }

If the current thread had to be frozen, retake the lock and go around the loop again.

113
114         spin_lock_irq(&pdflush_lock);
115         if (!list_empty(&my_work->list)) {
116             printk("pdflush: bogus wakeup!\n");
117             my_work->fn = NULL;
118             continue;
119         }
120         if (my_work->fn == NULL) {
121             printk("pdflush: NULL work function\n");
122             continue;
123         }
124         spin_unlock_irq(&pdflush_lock);

The code above handles unexpected wakeups.

125
126         (*my_work->fn)(my_work->arg0);
127

Run the task function with arg0 as its argument.

128         /*
129          * Thread creation: For how long have there been zero
130          * available threads?
131          */
132         if (jiffies - last_empty_jifs > 1 * HZ) {
133             /* unlocked list_empty() test is OK here */
134             if (list_empty(&pdflush_list)) {
135                 /* unlocked test is OK here */
136                 if (nr_pdflush_threads < MAX_PDFLUSH_THREADS)
137                     start_one_pdflush_thread();
138             }
139         }

If pdflush_list went empty more than one second ago, is still empty, and the number of threads may still grow, start a new pdflush thread instance.

140
141         spin_lock_irq(&pdflush_lock);
142         my_work->fn = NULL;
143
144         /*
145          * Thread destruction: For how long has the sleepiest
146          * thread slept?
147          */
148         if (list_empty(&pdflush_list))
149             continue;

If pdflush_list is still empty, continue the loop.

150         if (nr_pdflush_threads <= MIN_PDFLUSH_THREADS)
151             continue;

If the number of threads is not greater than the minimum, continue the loop.

152         pdf = list_entry(pdflush_list.prev, struct pdflush_work, list);
153         if (jiffies - pdf->when_i_went_to_sleep > 1 * HZ) {
154             /* Limit exit rate */
155             pdf->when_i_went_to_sleep = jiffies;
156             break;                  /* exeunt */
157         }

If the last kernel thread on pdflush_list has been asleep for more than 1 second, the system is probably idle, so the current thread exits. Why the last one? Because the list is used as a stack, the element at the bottom of the stack must be the oldest one.

158     }
159     nr_pdflush_threads--;
160     spin_unlock_irq(&pdflush_lock);
161     return 0;

Decrement nr_pdflush_threads and exit this thread.

162 }
163
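The stack-like use of the idle list can be sketched in user space. The list helpers below are a simplified re-implementation written for this illustration, not the kernel's list.h, and struct idle_worker, list_push_front(), most_recent_sleeper() and oldest_sleeper() are hypothetical names. Every worker is pushed at the head when it goes idle, so the head is the most recent sleeper and the tail (head->prev) is the one that has slept longest.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, written for this sketch. */
struct list_node {
    struct list_node *next, *prev;
};

void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

/* Insert right after the head, as list_move() does in __pdflush(). */
void list_push_front(struct list_node *node, struct list_node *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

void list_remove(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node->prev = node;
}

/* Hypothetical per-worker record, loosely modelled on struct pdflush_work. */
struct idle_worker {
    int id;
    unsigned long went_to_sleep;
    struct list_node link;
};

#define WORKER_OF(ptr) \
    ((struct idle_worker *)((char *)(ptr) - offsetof(struct idle_worker, link)))

/* Head of the list: the worker a dispatcher would wake first. */
struct idle_worker *most_recent_sleeper(struct list_node *head)
{
    assert(head->next != head);     /* list must not be empty */
    return WORKER_OF(head->next);
}

/* Tail of the list: the worker examined by the exit check. */
struct idle_worker *oldest_sleeper(struct list_node *head)
{
    assert(head->prev != head);     /* list must not be empty */
    return WORKER_OF(head->prev);
}
```

Because insertion and removal for dispatch both happen at the head, the tail can be read in constant time without scanning timestamps.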

Is something missing? Yes: it looks as if the SIGCHLD signal is never handled. In fact, every thread created through kthread cleans up after itself; the parent does not need to wait for it, and no zombie process is left behind. See:

File: kernel/workqueue.c

200     /* SIG_IGN makes children autoreap: see do_notify_parent(). */
201     sa.sa.sa_handler = SIG_IGN;
202     sa.sa.sa_flags = 0;
203     siginitset(&sa.sa.sa_mask, sigmask(SIGCHLD));
204     do_sigaction(SIGCHLD, &sa, (struct k_sigaction *)0);

In addition, the sigaction manual page describes the consequences of ignoring SIGCHLD in detail:

POSIX.1-1990 disallowed setting the action for SIGCHLD to SIG_IGN.
POSIX.1-2001 allows this possibility, so that ignoring SIGCHLD can
be used to prevent the creation of zombies (see wait(2)).
Nevertheless, the historical BSD and System V behaviours for ignoring
SIGCHLD differ, so that the only completely portable method of
ensuring that terminated children do not become zombies is to catch
the SIGCHLD signal and perform a wait(2) or similar.

The Linux kernel clearly follows the newer POSIX standard, which also gives us a simple way to avoid zombie processes. Note, however, that this method is not portable.
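The same effect can be observed from user space. The sketch below assumes a Linux system that follows the POSIX.1-2001 behaviour quoted above; child_is_autoreaped() is a hypothetical helper written for this illustration. It ignores SIGCHLD, forks a child that exits immediately, and checks that waitpid() finds nothing to collect.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the terminated child was reaped automatically (waitpid()
 * fails with ECHILD), 0 if waitpid() could still collect it, and -1 if
 * the experiment could not be set up.
 */
int child_is_autoreaped(void)
{
    struct sigaction ignore, previous;
    pid_t pid, waited;
    int saved_errno;

    ignore.sa_handler = SIG_IGN;
    ignore.sa_flags = 0;
    sigemptyset(&ignore.sa_mask);
    if (sigaction(SIGCHLD, &ignore, &previous) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        sigaction(SIGCHLD, &previous, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);                   /* child: terminate immediately */

    /* With SIGCHLD ignored, this blocks until the child is gone. */
    waited = waitpid(pid, NULL, 0);
    saved_errno = errno;

    /* Restore the caller's original SIGCHLD disposition. */
    sigaction(SIGCHLD, &previous, NULL);

    return (waited == -1 && saved_errno == ECHILD) ? 1 : 0;
}
```

On systems with the older BSD or System V semantics the result may differ, which is exactly the portability caveat in the manual page.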

Now turn back and look at __pdflush() again. This time we are interested in race conditions:

135                 /* unlocked test is OK here */
136                 if (nr_pdflush_threads < MAX_PDFLUSH_THREADS)
137                     start_one_pdflush_thread();

Testing the thread count without holding the lock does not corrupt any data. However, if several threads test nr_pdflush_threads in parallel, they may all conclude that there is still room to grow and each call start_one_pdflush_thread() to create a new pdflush thread instance. The number of threads can then exceed MAX_PDFLUSH_THREADS; in the worst case it may reach twice that value.
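One way to close this window is to make the test and the increment a single step, so that a creator reserves a slot before it starts a thread. In the kernel this could be done while holding pdflush_lock; the user-space sketch below swaps in C11 atomics instead and is not the kernel's code. try_reserve_worker_slot(), release_worker_slot(), current_workers() and MAX_WORKERS are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WORKERS 8               /* mirrors MAX_PDFLUSH_THREADS */

/* Number of reserved worker slots; zero-initialised at program start. */
static atomic_int nr_workers;

/*
 * Atomically claim one slot if the cap has not been reached. The
 * compare-and-swap only succeeds if nobody changed the counter between
 * our read and our write, so two callers can never both take the last
 * slot.
 */
bool try_reserve_worker_slot(void)
{
    int seen = atomic_load(&nr_workers);

    while (seen < MAX_WORKERS) {
        if (atomic_compare_exchange_weak(&nr_workers, &seen, seen + 1))
            return true;
        /* On failure, seen holds the fresh value; re-check the cap. */
    }
    return false;
}

/* Give a slot back, for example when a worker exits. */
void release_worker_slot(void)
{
    atomic_fetch_sub(&nr_workers, 1);
}

int current_workers(void)
{
    return atomic_load(&nr_workers);
}
```

A caller would start a thread only after try_reserve_worker_slot() returns true, and release the slot if thread creation fails.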

Now look at the following lines:

152         pdf = list_entry(pdflush_list.prev, struct pdflush_work, list);
153         if (jiffies - pdf->when_i_went_to_sleep > 1 * HZ) {
154             /* Limit exit rate */
155             pdf->when_i_went_to_sleep = jiffies;
156             break;                  /* exeunt */
157         }

Consider a sudden burst of requests whose handlers all finish at about the same time. None of the threads satisfies the test on line 153 when it finishes, so they all go back to sleep. The exit test only runs after a thread has completed a task, so if no new request arrives during the next n seconds, the maximum number of pdflush kernel threads stays alive for those n seconds. This violates rule 3 of the original design.
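The scenario can be reproduced with a toy model. This is a simplified user-space sketch under stated assumptions, not the kernel logic itself: each worker runs the exit test exactly once, at the tick when it finishes its task, no new requests arrive afterwards, and the finish ticks are given in ascending order. workers_left_after_burst(), MIN_WORKERS and TICKS_PER_SECOND are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_WORKERS 2               /* mirrors MIN_PDFLUSH_THREADS */
#define TICKS_PER_SECOND 100UL      /* hypothetical stand-in for HZ */

/*
 * finish_ticks[i] is the tick at which worker i finishes its task
 * (ascending order). Returns how many workers are still alive once all
 * of them have finished and gone idle.
 */
int workers_left_after_burst(const unsigned long *finish_ticks, int busy)
{
    int nr_threads = busy;
    int nr_idle = 0;
    unsigned long oldest_sleep = 0; /* sleep time of the idle-list tail */

    assert(finish_ticks != NULL && busy >= 0);
    for (int i = 0; i < busy; i++) {
        unsigned long now = finish_ticks[i];

        /* Same shape as the test on lines 148-157 of the listing. */
        if (nr_idle > 0 && nr_threads > MIN_WORKERS &&
            now - oldest_sleep > 1 * TICKS_PER_SECOND) {
            oldest_sleep = now;     /* limit exit rate */
            nr_threads--;           /* this worker exits */
            continue;
        }
        if (nr_idle == 0)
            oldest_sleep = now;     /* first sleeper becomes the tail */
        nr_idle++;                  /* this worker goes back to sleep */
    }
    /* Nothing wakes the sleepers up, so the count cannot shrink later. */
    return nr_threads;
}
```

When all workers finish on the same tick the model keeps every one of them, whereas finish times spread more than a second apart let the pool shrink back toward the minimum.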

195 /*
196  * Attempt to wake up a pdflush thread, and get it to do some work for you.
197  * Returns zero if it indeed managed to find a worker thread, and passed your
198  * payload to it.
199  */
200 int pdflush_operation(void (*fn)(unsigned long), unsigned long arg0)
201 {
202     unsigned long flags;
203     int ret = 0;
204
205     if (fn == NULL)
206         BUG();      /* Hard to diagnose if it's deferred */
207
208     spin_lock_irqsave(&pdflush_lock, flags);
209     if (list_empty(&pdflush_list)) {
210         spin_unlock_irqrestore(&pdflush_lock, flags);
211         ret = -1;
212     } else {
213         struct pdflush_work *pdf;
214
215         pdf = list_entry(pdflush_list.next, struct pdflush_work, list);
216         list_del_init(&pdf->list);
217         if (list_empty(&pdflush_list))
218             last_empty_jifs = jiffies;
219         pdf->fn = fn;
220         pdf->arg0 = arg0;
221         wake_up_process(pdf->who);
222         spin_unlock_irqrestore(&pdflush_lock, flags);
223     }
224     return ret;
225 }
226

The function above hands a task to a pdflush thread. If an idle thread is currently available, the task is assigned to it and the thread is woken up to run it; otherwise the function returns -1.
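The hand-off can be modelled in a few lines of single-threaded user-space C. This is only a sketch of the control flow: there is no locking and no real wakeup, a NULL callback is rejected with -1 where the kernel calls BUG(), and struct worker, worker_goes_idle(), dispatch(), worker_runs() and fake_writeout() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical worker record, loosely modelled on struct pdflush_work. */
struct worker {
    void (*fn)(unsigned long);      /* task callback, NULL when no task */
    unsigned long arg0;             /* argument for the callback */
    struct worker *next_idle;       /* link on the idle stack */
};

static struct worker *idle_top;     /* most recently idled worker */

/* A worker with nothing to do pushes itself onto the idle stack. */
void worker_goes_idle(struct worker *w)
{
    w->fn = NULL;
    w->next_idle = idle_top;
    idle_top = w;
}

/* Returns 0 if an idle worker took the task, -1 otherwise. */
int dispatch(void (*fn)(unsigned long), unsigned long arg0)
{
    struct worker *w = idle_top;

    if (fn == NULL || w == NULL)
        return -1;
    idle_top = w->next_idle;        /* pop the most recent sleeper */
    w->next_idle = NULL;
    w->fn = fn;
    w->arg0 = arg0;
    return 0;
}

/* What a woken worker does: run the task once, then clear it. */
void worker_runs(struct worker *w)
{
    if (w->fn != NULL) {
        w->fn(w->arg0);
        w->fn = NULL;
    }
}

/* Example task: pretend to write back nr_pages pages. */
unsigned long pages_written;

void fake_writeout(unsigned long nr_pages)
{
    pages_written += nr_pages;
}
```

A caller that gets -1 back has to decide what to do with the task itself, because nothing was queued on its behalf.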

Summary:

Kernel programming demands careful thought; a little carelessness can lead to accidents. However short the code is, you must stay cautious. The pdflush thread pool implementation has the two race conditions described above. They do not cause very serious consequences, but they do violate the design requirements, so it cannot be called a good implementation.

Note:

This article uses "kernel thread", "thread" and "process" interchangeably; all of them mean "kernel thread". This is not really wrong: "thread" is short for "kernel thread", and kernel threads are essentially a group of "processes" that share the kernel data space, so in some contexts the terms can be swapped without any major problem.
