Linux Kernel timer mechanism

In Linux kernel 2.4, the static timer mechanism of earlier kernel versions has been removed, leaving only dynamic timers; correspondingly, the timer_bh() function no longer calls run_old_timers() to run the old-style static timers. The terms "dynamic timer" and "static timer" describe how the kernel manages its timers, not the timers themselves: a dynamic timer is one that can be created and destroyed at run time, so the kernel's timer queues change dynamically, and there is no essential difference in the timer structure itself. Given the limited capability of the static timer mechanism, it was removed completely in Linux kernel 2.4.
7.6.1 Description of a timer in the Linux kernel
Linux defines the data structure timer_list in the include/linux/timer.h header file to describe a kernel timer:
struct timer_list {
        struct list_head list;
        unsigned long expires;
        unsigned long data;
        void (*function)(unsigned long);
};
The meaning of each data member is as follows:
(1) list: a doubly-linked list element, used to link multiple timers into a doubly-linked circular queue.
(2) expires: the time at which the timer expires, expressed as a tick count (that is, the number of clock ticks since system boot). When the expires value of a timer is less than or equal to the jiffies variable, the timer is said to have timed out or expired. After a timer is initialized, the expires field is usually set to the current value of jiffies plus a time interval (measured in clock ticks).
(3) function: a pointer to an executable function. When the timer expires, the kernel executes the function specified by function, passing the data field as its argument.
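
As an illustration, here is a minimal sketch of how driver code typically fills in these fields; the callback my_timeout() and the one-second interval are hypothetical, while init_timer() and add_timer() are the 2.4 interfaces discussed later in this section:

#include <linux/timer.h>
#include <linux/kernel.h>         /* printk() */
#include <linux/sched.h>          /* jiffies, HZ */

static void my_timeout(unsigned long data)     /* hypothetical callback */
{
        printk("timer fired, data = %lu\n", data);
}

static struct timer_list my_timer;

static void arm_my_timer(void)
{
        init_timer(&my_timer);                 /* clear the list pointers */
        my_timer.expires = jiffies + HZ;       /* one second from now */
        my_timer.data = 42;                    /* passed to my_timeout() */
        my_timer.function = my_timeout;
        add_timer(&my_timer);                  /* queue it on the timer lists */
}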
The kernel function init_timer() initializes a timer. In fact, this initialization function only sets the list member of the structure to empty (include/linux/timer.h):
static inline void init_timer(struct timer_list *timer)
{
        timer->list.next = timer->list.prev = NULL;
}
A timer awaiting execution is normally linked into one of the doubly-linked circular queues; in that case we say the timer is in the pending state. The timer_pending() function can therefore judge whether a timer is pending by checking whether its list member is non-NULL, as shown below

(include/linux/timer.h):
static inline int timer_pending(const struct timer_list *timer)
{
        return timer->list.next != NULL;
}
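
As a small usage sketch (reusing the hypothetical my_timer from above), a caller can test a timer before re-arming it:

/* Arm my_timer only if it is not already queued. */
if (!timer_pending(&my_timer)) {
        my_timer.expires = jiffies + HZ;
        add_timer(&my_timer);
}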
Time comparison operations
In timer applications, we often need to compare two time values to determine whether a timer has timed out. For this, the Linux kernel defines four time comparison macros in the timer.h header file. Here, saying that time a is after time b means that a is later than b (the _eq variants also allow equality). Because the comparisons are done with signed subtraction, these macros give the correct answer even when the jiffies counter wraps around. Linux strongly recommends using the following four macros (include/linux/timer.h):
#define time_after(a,b)     ((long)(b) - (long)(a) < 0)
#define time_before(a,b)    time_after(b,a)
#define time_after_eq(a,b)  ((long)(a) - (long)(b) >= 0)
#define time_before_eq(a,b) time_after_eq(b,a)
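
For example, a kernel-side polling loop might use time_after() to bound a wait; in this sketch device_ready() and the two-second budget are hypothetical:

#include <linux/sched.h>          /* jiffies, HZ, schedule() */
#include <linux/errno.h>

static int wait_for_device(void)
{
        unsigned long deadline = jiffies + 2 * HZ;   /* two seconds from now */

        while (!device_ready()) {                    /* device_ready() is hypothetical */
                if (time_after(jiffies, deadline))
                        return -ETIMEDOUT;           /* budget exhausted */
                schedule();                          /* yield the CPU meanwhile */
        }
        return 0;
}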
7.6.2 Principle of the dynamic kernel timer mechanism
How does Linux give its kernel timer mechanism the ability to scale dynamically? The key lies in the concept of the "timer vector". A timer vector is a doubly-linked circular queue of timers (each element a timer_list structure) that all expire at the same time, that is, every timer_list in the queue has the same expires value. A timer vector can therefore be represented by a single list head.
Obviously, the difference between a timer's expires member and the jiffies variable determines how far in the future the timer will expire. On a 32-bit system, this difference can be as large as 0xffffffff. If the basic definition of the timer vector were applied literally, the kernel would have to maintain up to 0xffffffff list heads, one per possible expiry time, which is clearly unrealistic.
On the other hand, from the kernel's own perspective, the timers it cares about are neither those that have already expired and been executed (those can be discarded) nor those that will expire far in the future, but those that have just expired or are about to expire soon (note: the interval is measured in clock ticks).

Based on the above considerations, suppose a timer will expire after interval clock ticks, where interval = expires - jiffies. Linux implements its dynamic kernel timer mechanism with the following idea. Timers with 0 ≤ interval ≤ 255 are organized according to the basic semantics of the timer vector: the kernel cares most about the timers that will expire within the next 255 ticks, so they are organized into 256 timer vectors by their respective expires values. Timers with 256 ≤ interval ≤ 0xffffffff are not yet urgent, so the kernel organizes them according to an extended, "loose" timer vector semantics: in a loose timer vector, the expires values of the timers in one queue may differ from each other.
The specific organization scheme can be divided into two parts:
(1) For the 256 timer vectors whose interval value lies in [0, 255], the kernel organizes them as follows: the 256 timer vectors are put together to form an array of timer vectors, which is part of the data structure timer_vec_root. This data structure is defined in the kernel/timer.c file, as the following code segment shows:
/*
 * Event timer code
 */
#define TVN_BITS 6
#define TVR_BITS 8
#define TVN_SIZE (1 << TVN_BITS)
#define TVR_SIZE (1 << TVR_BITS)
#define TVN_MASK (TVN_SIZE - 1)
#define TVR_MASK (TVR_SIZE - 1)

struct timer_vec {
        int index;
        struct list_head vec[TVN_SIZE];
};

struct timer_vec_root {
        int index;
        struct list_head vec[TVR_SIZE];
};

static struct timer_vec tv5;
static struct timer_vec tv4;
static struct timer_vec tv3;
static struct timer_vec tv2;
static struct timer_vec_root tv1;

static struct timer_vec * const tvecs[] = {
        (struct timer_vec *)&tv1, &tv2, &tv3, &tv4, &tv5
};

#define NOOF_TVECS (sizeof(tvecs) / sizeof(tvecs[0]))
Based on the data structure timer_vec_root, Linux defines the global variable tv1 to represent the first 256 timer vectors that the kernel cares about. In this way, when the kernel checks for expired timers, it only needs to scan one vector of the 256-element array tv1.vec[] at a time. The index field of tv1 specifies which timer vector in tv1.vec[] is currently being scanned, that is, it is the array index; its initial value is 0 and its maximum value is 255 (it is incremented modulo 256). The index field is incremented by 1 on every clock tick. Obviously, the timer vector tv1.vec[index] specified by the index field contains all the dynamic timers that expire on the current tick, while tv1.vec[index + k] contains all the dynamic timers that will expire k ticks later. When index wraps back to 0, the kernel has scanned all 256 timer vectors in tv1; at that point timers from the vectors organized with the loose timer vector semantics must be migrated into tv1.

(2) For timers whose interval lies in [0x100, 0xffffffff], the urgency varies with the interval value: obviously, the smaller the interval, the more urgent the timer. They are therefore treated differently when organized into loose timer vectors. In general, the smaller the interval, the less loose the timer vector (that is, the less the expires values of the timers in the vector differ from one another); the larger the interval, the looser the timer vector (that is, the more the expires values may differ).
The kernel stipulates that all timers satisfying 0x100 ≤ interval ≤ 0x3fff for which the expression (interval >> 8) yields the same value are organized into the same loose timer vector. Therefore, organizing all timers satisfying 0x100 ≤ interval ≤ 0x3fff requires 2^6 = 64 loose timer vectors. For convenience, these 64 loose timer vectors are likewise put together into an array, which forms part of the data structure timer_vec. Based on timer_vec, Linux defines the global variable tv2 to represent these 64 loose timer vectors, as shown in the preceding code snippet.
Similarly, timers satisfying 0x4000 ≤ interval ≤ 0xfffff are placed in the same loose timer vector whenever the expression (interval >> (8 + 6)) yields the same value. Again, 2^6 = 64 loose timer vectors are needed to organize all timers in this range, and they too can be described by a timer_vec structure. Accordingly, Linux defines the global variable tv3 to represent these 64 loose timer vectors.
Likewise, timers satisfying 0x100000 ≤ interval ≤ 0x3ffffff are placed in the same loose timer vector whenever the expression (interval >> (8 + 6 + 6)) yields the same value. Again, 2^6 = 64 loose timer vectors are needed to organize all timers in this range, described by a timer_vec structure. Accordingly, Linux defines the global variable tv4 to represent these 64 loose timer vectors.

Finally, timers satisfying 0x4000000 ≤ interval ≤ 0xffffffff are placed in the same loose timer vector whenever the expression (interval >> (8 + 6 + 6 + 6)) yields the same value. Again, 2^6 = 64 loose timer vectors are needed to organize all timers in this range, described by a timer_vec structure. Accordingly, Linux defines the global variable tv5 to represent these 64 loose timer vectors.
Finally, for convenience of reference, Linux defines the pointer array tvecs[], whose elements point to the structure variables tv1, tv2, ..., tv5 respectively, as shown in the code above.
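
To make the scheme concrete, the following sketch (illustrative code, not kernel source) mirrors the range tests just described: given idx = expires - timer_jiffies, it reports which of tv1 ... tv5 a timer would land in. The function name which_tvec() is invented for this example:

/* Illustrative only: mirrors the interval ranges described above. */
static int which_tvec(unsigned long idx)
{
        if (idx < 0x100)     return 1;  /* tv1: interval in [0, 0xff] */
        if (idx < 0x4000)    return 2;  /* tv2: interval in [0x100, 0x3fff] */
        if (idx < 0x100000)  return 3;  /* tv3: interval in [0x4000, 0xfffff] */
        if (idx < 0x4000000) return 4;  /* tv4: interval in [0x100000, 0x3ffffff] */
        return 5;                       /* tv5: up to 0xffffffff */
}

For instance, a timer armed 10000 ticks ahead (idx = 0x2710) lands in tv2, in slot (expires >> 8) & 63; an already-expired timer (idx negative when viewed as signed) is placed directly in the tv1 slot about to be scanned, as internal_add_timer() below shows.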
7.6.3 Implementation of the kernel dynamic timer mechanism
In the implementation of the kernel dynamic timer mechanism, three operations are essential: (1) inserting a timer into the timer vector where it belongs; (2) migrating a timer, that is, moving it from its original timer vector to another timer vector; (3) scanning for and executing expired timers.
7.6.3.1 Initialization of the dynamic timer mechanism
The init_timervecs() function initializes the dynamic timer mechanism; it is called only by the sched_init() initialization routine. Its main task is to initialize the list heads in the vec[] arrays of the five structure variables tv1, tv2, ..., tv5 to empty lists, as follows (kernel/timer.c):
void init_timervecs (void)
{
        int i;

        for (i = 0; i < TVN_SIZE; i++) {
                INIT_LIST_HEAD(tv5.vec + i);
                INIT_LIST_HEAD(tv4.vec + i);
                INIT_LIST_HEAD(tv3.vec + i);
                INIT_LIST_HEAD(tv2.vec + i);
        }
        for (i = 0; i < TVR_SIZE; i++)
                INIT_LIST_HEAD(tv1.vec + i);
}
The macro TVN_SIZE in the function above is the size of the vec[] array in the timer_vec structure type, namely 64. The macro TVR_SIZE is the size of the vec[] array in the timer_vec_root structure type, namely 256.

7.6.3.2 The dynamic timer's tick baseline: timer_jiffies
Because dynamic timers are executed in the bottom half of the clock interrupt, several more clock interrupts may occur between the moment the TIMER_BH bottom half is activated and the moment its handler, timer_bh(), actually runs. The kernel must therefore remember when the timer mechanism last ran, that is, it must save the jiffies value of the last run. For this purpose, Linux defines the global variable timer_jiffies in the kernel/timer.c file to record the jiffies value at which the timer mechanism last ran. The variable is defined as follows:
static unsigned long timer_jiffies;
7.6.3.3 Protecting the kernel dynamic timer lists
Since the kernel's dynamic timer lists are a globally shared resource, Linux defines a dedicated spin lock, timerlist_lock, to guarantee mutually exclusive access to them. Any code that wants to access the dynamic timer lists must acquire this spin lock first and release it after the access. It is defined as follows (kernel/timer.c):
/* Initialize both explicitly - let's try to have them in the same cache line */
spinlock_t timerlist_lock = SPIN_LOCK_UNLOCKED;
7.6.3.4 Inserting a timer into the lists
The add_timer() function inserts the timer pointed to by timer into the appropriate timer list. It first calls timer_pending() to check whether the specified timer is already pending in some timer vector; if so, it performs no insertion, merely prints a kernel bug message and returns. Otherwise it calls internal_add_timer() to carry out the actual insertion. The source code is as follows (kernel/timer.c):
void add_timer(struct timer_list *timer)
{
        unsigned long flags;

        spin_lock_irqsave(&timerlist_lock, flags);
        if (timer_pending(timer))
                goto bug;
        internal_add_timer(timer);
        spin_unlock_irqrestore(&timerlist_lock, flags);
        return;
bug:
        spin_unlock_irqrestore(&timerlist_lock, flags);
        printk("bug: kernel timer added twice at %p.\n",
                        __builtin_return_address(0));
}
The internal_add_timer() function inserts a timer that is not currently on any timer vector into the vector where it belongs, based on the timer's expires value. As follows (kernel/timer.c):

static inline void internal_add_timer(struct timer_list *timer)
{
        /*
         * must be cli-ed when calling this
         */
        unsigned long expires = timer->expires;
        unsigned long idx = expires - timer_jiffies;
        struct list_head *vec;

        if (idx < TVR_SIZE) {
                int i = expires & TVR_MASK;
                vec = tv1.vec + i;
        } else if (idx < 1 << (TVR_BITS + TVN_BITS)) {
                int i = (expires >> TVR_BITS) & TVN_MASK;
                vec = tv2.vec + i;
        } else if (idx < 1 << (TVR_BITS + 2 * TVN_BITS)) {
                int i = (expires >> (TVR_BITS + TVN_BITS)) & TVN_MASK;
                vec = tv3.vec + i;
        } else if (idx < 1 << (TVR_BITS + 3 * TVN_BITS)) {
                int i = (expires >> (TVR_BITS + 2 * TVN_BITS)) & TVN_MASK;
                vec = tv4.vec + i;
        } else if ((signed long) idx < 0) {
                /* can happen if you add a timer with expires == jiffies,
                 * or you set a timer to go off in the past
                 */
                vec = tv1.vec + tv1.index;
        } else if (idx <= 0xffffffffUL) {
                int i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
                vec = tv5.vec + i;
        } else {
                /* Can only get here on architectures with 64-bit jiffies */
                INIT_LIST_HEAD(&timer->list);
                return;
        }
        /*
         * Timers are FIFO!
         */
        list_add(&timer->list, vec->prev);
}
The function works as follows:

(1) First, the difference between the timer's expires value and timer_jiffies is computed (note: the dynamic timer mechanism's own time baseline is used here, not jiffies). This difference expresses how long after the last run of the timer mechanism this timer will expire. The local variable idx holds the difference.
(2) Based on the value of idx, determine which timer vector the timer should be inserted into; the exact method is the one described in section 7.6.2. At the end, vec points to the list head of the timer vector where the timer belongs.
(3) Finally, list_add() is called to insert the timer at the tail of the timer queue headed by vec.
7.6.3.5 Modifying a timer's expires value
Once a timer has been inserted into the kernel's dynamic timer lists, its expires value can still be modified. The mod_timer() function implements this, as follows (kernel/timer.c):
int mod_timer(struct timer_list *timer, unsigned long expires)
{
        int ret;
        unsigned long flags;

        spin_lock_irqsave(&timerlist_lock, flags);
        timer->expires = expires;
        ret = detach_timer(timer);
        internal_add_timer(timer);
        spin_unlock_irqrestore(&timerlist_lock, flags);
        return ret;
}
This function first updates the timer's expires member with the new value, then calls detach_timer() to remove the timer from whatever list it was on, and finally calls internal_add_timer() to re-insert it into the appropriate list according to its new expires value.
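
A typical use of mod_timer() is a watchdog that is pushed back each time activity is observed; the following sketch is hypothetical, with an arbitrary five-second window:

/* Postpone an already initialized watchdog timer by five seconds.
 * This works whether or not the timer is currently pending, since
 * detach_timer() simply returns 0 when the timer is on no list. */
static void touch_watchdog(struct timer_list *watchdog)
{
        mod_timer(watchdog, jiffies + 5 * HZ);
}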
The detach_timer() function first calls timer_pending() to check whether the specified timer is on some list. If it is not on any list, detach_timer() does nothing and simply returns 0, indicating failure; otherwise it calls list_del() to remove the timer from the list it is on and returns 1. As follows (kernel/timer.c):

static inline int detach_timer(struct timer_list *timer)
{
        if (!timer_pending(timer))
                return 0;
        list_del(&timer->list);
        return 1;
}
7.6.3.6 Deleting a timer
The del_timer() function removes a timer from the kernel timer queues. It is essentially a higher-level wrapper around detach_timer(), as follows (kernel/timer.c):
int del_timer(struct timer_list *timer)
{
        int ret;
        unsigned long flags;

        spin_lock_irqsave(&timerlist_lock, flags);
        ret = detach_timer(timer);
        timer->list.next = timer->list.prev = NULL;
        spin_unlock_irqrestore(&timerlist_lock, flags);
        return ret;
}
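
For example, cleanup code should delete any timer it armed before freeing the data its callback touches; my_timer is the hypothetical timer from the earlier sketch:

static void cleanup_my_timer(void)
{
        /* del_timer() returns 1 if the timer was still pending,
         * 0 if it had already expired or was never added. */
        if (del_timer(&my_timer))
                printk("my_timer removed before it could fire\n");
}

Note that on SMP builds the callback may still be running on another CPU when del_timer() returns; kernel 2.4 also provides del_timer_sync(), which additionally waits for a running handler to finish.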
7.6.3.7 Timer migration
Because a timer's interval value shrinks as time passes (that is, as jiffies grows), a timer that was not urgent gradually turns into one that is about to expire. For example, after at most 256 ticks the timers in the vector tv2.vec[0] become timers that will expire within the next 256 ticks. Therefore the position of a timer in the kernel's dynamic timer lists must change accordingly. The rule is: when tv1.index wraps back to 0 (meaning the kernel has scanned all 256 timer vectors in tv1, leaving them all empty), tv1 is refilled with the timers from the vector tv2.vec[tv2.index], and tv2.index is incremented by 1 (modulo 64). When tv2.index wraps back to 0 in turn (meaning all 64 timer vectors in tv2 have been poured into tv1, leaving tv2 empty), tv2 is refilled with the timers from tv3.vec[tv3.index], and so on, up to tv5.

The cascade_timers() function performs this timer migration. It takes a single parameter tv, a pointer to a timer_vec structure, and redistributes all the timers in the vector tv->vec[tv->index] into the lower-level timer vectors. As shown below (kernel/timer.c):

static inline void cascade_timers(struct timer_vec *tv)
{
        /* cascade all the timers from tv up one level */
        struct list_head *head, *curr, *next;

        head = tv->vec + tv->index;
        curr = head->next;
        /*
         * We are removing _all_ timers from the list, so we don't have to
         * detach them individually, just clear the list afterwards.
         */
        while (curr != head) {
                struct timer_list *tmp;

                tmp = list_entry(curr, struct timer_list, list);
                next = curr->next;
                list_del(curr); /* not needed */
                internal_add_timer(tmp);
                curr = next;
        }
        INIT_LIST_HEAD(head);
        tv->index = (tv->index + 1) & TVN_MASK;
}
Notes on this function:
(1) First, the pointer head is set to the list_head at the head of the timer vector being cascaded, and the pointer curr is set to the first timer in that vector.
(2) Then a while loop traverses the timer vector tv->vec[tv->index]. Since a timer vector is a doubly-linked circular queue, the loop terminates when curr == head. For each timer scanned, the loop first calls list_del() to remove the current timer from the list and then calls internal_add_timer() to re-determine which timer vector the timer should be placed in.
(3) When the while loop exits, all the timers in tv->vec[tv->index] have been migrated elsewhere (to where they now belong), so the vector itself has become an empty queue. The INIT_LIST_HEAD() macro is therefore used to reset its list head to an empty list.

(4) Finally, tv->index is incremented by 1, modulo 64.
7.6.3.8 Scanning and executing expired timers
The run_timer_list() function performs this task. As mentioned above, it is called by timer_bh(), so kernel timers execute in the bottom half of the clock interrupt; this is important to keep in mind. The global variable timer_jiffies records the time at which the kernel last executed run_timer_list(), so the difference between jiffies and timer_jiffies is the number of clock interrupts that have occurred since the timers were last processed. Clearly, run_timer_list() must perform the timer service for each clock interrupt that occurred in that period. Its source code is as follows (kernel/timer.c):
static inline void run_timer_list(void)
{
        spin_lock_irq(&timerlist_lock);
        while ((long)(jiffies - timer_jiffies) >= 0) {
                struct list_head *head, *curr;
                if (!tv1.index) {
                        int n = 1;
                        do {
                                cascade_timers(tvecs[n]);
                        } while (tvecs[n]->index == 1 && ++n < NOOF_TVECS);
                }
repeat:
                head = tv1.vec + tv1.index;
                curr = head->next;
                if (curr != head) {
                        struct timer_list *timer;
                        void (*fn)(unsigned long);
                        unsigned long data;

                        timer = list_entry(curr, struct timer_list, list);
                        fn = timer->function;
                        data = timer->data;

                        detach_timer(timer);
                        timer->list.next = timer->list.prev = NULL;
                        timer_enter(timer);
                        spin_unlock_irq(&timerlist_lock);
                        fn(data);
                        spin_lock_irq(&timerlist_lock);
                        timer_exit();
                        goto repeat;
                }
                ++timer_jiffies;
                tv1.index = (tv1.index + 1) & TVR_MASK;
        }
        spin_unlock_irq(&timerlist_lock);
}
The body of run_timer_list() is essentially one large while loop that performs the timer service for one clock interrupt per iteration, so (jiffies - timer_jiffies + 1) iterations are executed in total. Each iteration proceeds as follows:

(1) First, check whether tv1.index is 0. If it is, tv1 must be replenished with timers from tv2; but tv2 may itself be empty and need replenishing from tv3, and so on. Therefore a do/while loop calls cascade_timers() to replenish tv1 from tv2, tv2 from tv3, ..., and tv4 from tv5, as needed. Observe that if tvecs[n]->index == 0 (2 ≤ n ≤ 5) before cascade_timers() runs on tvecs[n], its index is necessarily 1 afterwards. Conversely, if the index is not 1 after calling cascade_timers() on tvecs[n], the index cannot have been 0 beforehand, so tvecs[n] did not wrap around and tvecs[n+1] need not donate any timers; the do/while loop can then terminate.
(2) Next, execute all the expired timers in the timer vector tv1.vec[tv1.index]. A goto repeat loop scans the whole queue from head to tail. Because CPU interrupts need not stay disabled while a timer's callback function runs, once detach_timer() has removed the current timer from the queue, spin_unlock_irq() releases the lock and re-enables interrupts; after the timer's function has been executed, spin_lock_irq() re-acquires the lock and disables interrupts again.
(3) After all the expired timers in tv1.vec[tv1.index] have been executed, tv1.vec[tv1.index] should be an empty queue, and this round of timer service ends.
(4) Finally, timer_jiffies is incremented by 1 and tv1.index is incremented by 1, modulo 256. Control then returns to the top of the while loop to begin the next round of timer service.
