This time, I will talk about time itself: how the kernel keeps track of it. Time is something we are all familiar with, which makes it a fitting theme.
First, we need to understand two concepts: the system timer and the dynamic timer. Periodic events are all driven by the system timer, a programmable hardware chip that raises an interrupt at a fixed frequency. That interrupt is the timer interrupt, and its corresponding interrupt handler updates the system time and carries out work that must run periodically. The system timer and the timer interrupt handler are central to the Linux kernel's time management. A dynamic timer, by contrast, is a tool for deferring the execution of code; the kernel can create and destroy dynamic timers on the fly.
The kernel can measure and manage time only with help from hardware. The hardware provides a system timer the kernel uses to gauge elapsed time; the clock can be viewed as the kernel's electronic time source. The system timer triggers the clock interrupt at a preprogrammed frequency called the tick rate. When a clock interrupt occurs, the kernel handles it with a dedicated interrupt handler. The tick rate is defined as a static preprocessor constant, HZ, and the hardware is programmed with the HZ value when the system starts. Each architecture defines its own HZ value in asm/param.h. As just mentioned, the period is 1/HZ seconds. One last note: the HZ value written into the kernel code is not fixed; it is adjustable. Also, a fixed-rate clock interrupt is not strictly required for an operating system; in fact, the kernel can use a dynamically programmed timer to drive pending events, but I will not cover that here.
In the Linux kernel, a variable named jiffies (defined in linux/jiffies.h) records the total number of ticks since the system started. At boot, the kernel initializes the variable to 0, and it is incremented on each clock interrupt. Because there are HZ clock interrupts per second, jiffies increases by HZ every second, and the system uptime in seconds is therefore jiffies/HZ. Like any value a computer represents, the variable has a finite size: when it is incremented past its maximum, it wraps around to 0. This wraparound looks simple, but it actually causes plenty of trouble in code, for example in boundary-condition checks. Fortunately, the kernel provides four macros, defined in linux/jiffies.h, to compare tick counts while handling the wraparound correctly:
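The four macros boil down to signed subtraction, which is what makes them wraparound-safe. Their definitions are essentially the following (paraphrased from linux/jiffies.h, with the type-checking scaffolding omitted):

        #define time_after(a, b)     ((long)(b) - (long)(a) < 0)   /* true if a is after b */
        #define time_before(a, b)    time_after(b, a)              /* true if a is before b */
        #define time_after_eq(a, b)  ((long)(a) - (long)(b) >= 0)  /* a is after or equal to b */
        #define time_before_eq(a, b) time_after_eq(b, a)           /* a is before or equal to b */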
Note: the first ("unknown") parameter is typically jiffies, and the second ("known") parameter is the value to compare against.
If you change the HZ value in the kernel, some user-space programs will produce abnormal results. This is because the kernel exports this value to user space in units of ticks per second, and after that interface had been stable for a long time, applications came to depend on the particular HZ value. If HZ is redefined in the kernel, that constant relationship breaks, because user space does not know the new HZ value. To solve this, the kernel must scale every exported jiffies value. It defines USER_HZ to represent the HZ value that user space sees, and the macro jiffies_to_clock_t() converts a tick count expressed in HZ into one expressed in USER_HZ. How the macro is implemented depends on whether HZ is an integer multiple of USER_HZ. When it is, the form is quite simple (for example, with HZ=1000 and USER_HZ=100, 1000 kernel ticks become 1000/(1000/100) = 100 user-visible ticks):
        #define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
If it is not an integer multiple, the macro uses a more complex algorithm. Similarly, on 64-bit systems the kernel uses the jiffies_64_to_clock_t() function to convert a 64-bit jiffies value from HZ units to USER_HZ units.
The architecture provides two kinds of devices for timekeeping: the system timer and the real-time clock. The system timer provides the periodic interrupt mechanism. The real-time clock (RTC) is a device that stores the system time persistently: even when the system is powered off, it keeps counting by drawing on a small battery on the motherboard. When the system starts, the kernel reads the RTC to initialize the wall time, which is stored in the xtime variable; initializing xtime at startup is the RTC's main role.
With the above concepts in place, we can analyze the clock interrupt handler. It consists of two parts: an architecture-dependent part and an architecture-independent part. The architecture-dependent part is registered with the kernel as the interrupt handler for the system timer, so that it runs when a clock interrupt occurs. Its tasks are as follows:
1. Obtain the xtime_lock to protect access to jiffies_64 and the wall time xtime. 2. Acknowledge or reset the system clock as needed. 3. Periodically update the real-time clock using the wall time. 4. Call the architecture-independent timer routine do_timer().

The architecture-independent routine do_timer() then performs the following tasks: 1. Increment the jiffies_64 variable by 1. 2. Update resource-usage statistics, such as the system time and user time consumed by the current process. 3. Run any expired dynamic timers. 4. Execute the scheduler_tick() function. 5. Update the wall time, which is stored in the xtime variable. 6. Compute the load average.
do_timer() itself looks very simple: its job is just to carry out the skeleton above and delegate the real work to other functions:
        void do_timer(struct pt_regs *regs)
        {
                jiffies_64++;
                update_process_times(user_mode(regs));
                update_times();
        }
The user_mode() macro above examines the state of the processor registers, regs. It returns 1 if the clock interrupt occurred in user space and 0 if it occurred in kernel mode. The update_process_times() function then updates user or system time according to where the clock interrupt occurred:
        void update_process_times(int user_tick)
        {
                struct task_struct *p = current;
                int cpu = smp_processor_id();
                int system = user_tick ^ 1;

                update_one_process(p, user_tick, system, cpu);
                run_local_timers();
                scheduler_tick(user_tick, system);
        }
The update_one_process() function does the actual updating of process times, and its implementation is quite meticulous. Note that because XOR is used, exactly one of user_tick and system is 1 and the other is 0. update_one_process() can therefore avoid a branch and simply add user_tick and system to the corresponding process counters:
        p->utime += user;
        p->stime += system;
This adds 1 to the appropriate counter while leaving the other unchanged. You may have noticed what this implies: when the kernel accounts process time, it classifies it by the mode the processor was in when the interrupt occurred, charging the entire previous tick to the process. In reality, the process may have entered and left kernel mode many times during that tick, and it may not even have been the only process running during it, but this granularity is the best we can do. Next, the run_local_timers() function raises a softirq to handle all expired timers. Finally, scheduler_tick() decrements the currently running process's timeslice and sets the need_resched flag if necessary; on SMP machines it also balances the per-processor run queues. When update_process_times() returns, do_timer() calls update_times() to update the wall time.
        void update_times(void)
        {
                unsigned long ticks;

                ticks = jiffies - wall_jiffies;
                if (ticks) {
                        wall_jiffies += ticks;
                        update_wall_time(ticks);
                }
                last_time_offset = 0;
                calc_load(ticks);
        }
Here, ticks records how many new ticks have occurred since the last update. Normally ticks equals 1, but clock interrupts can be missed, and with them ticks: this can happen if interrupts are disabled for a long time (which is not common and is often a bug). wall_jiffies is then incremented by ticks, so at this point wall_jiffies equals the current jiffies value; update_wall_time() is called to update xtime, and finally calc_load() updates the load average. Once do_timer() has finished, control returns to the architecture-dependent interrupt handler, which performs any remaining work, releases the xtime_lock, and exits. All of the above happens once every 1/HZ of a second.
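For a feel for what update_wall_time() does, here is a simplified sketch of the idea only; the real kernel version uses TICK_NSEC and additional correction logic, so do not read this as the actual implementation:

        /* simplified sketch: advance the wall time xtime by 'ticks' clock ticks */
        static void update_wall_time(unsigned long ticks)
        {
                do {
                        ticks--;
                        xtime.tv_nsec += 1000000000L / HZ;   /* one tick, in nanoseconds */
                        if (xtime.tv_nsec >= 1000000000L) {  /* carry into the seconds field */
                                xtime.tv_nsec -= 1000000000L;
                                xtime.tv_sec++;
                        }
                } while (ticks);
        }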
The wall time mentioned above is what we commonly call the actual time of day. It lives in the variable xtime, a struct timespec defined in kernel/timer.c, as follows:
        struct timespec {
                time_t tv_sec;   /* seconds elapsed since January 1, 1970 (UTC), the epoch */
                long tv_nsec;    /* nanoseconds elapsed since the last second */
        };
Reading or writing the xtime variable requires the xtime_lock, which is not an ordinary spinlock but a sequential lock (seqlock): kernel code must take the appropriate read or write form of the lock and release it when done. From user space, the main interface for obtaining the wall time is gettimeofday(); the corresponding system call in the kernel is sys_gettimeofday():
        asmlinkage long sys_gettimeofday(struct timeval __user *tv, struct timezone __user *tz)
        {
                if (likely(tv != NULL)) {
                        struct timeval ktv;
                        do_gettimeofday(&ktv);
                        if (copy_to_user(tv, &ktv, sizeof(ktv)))
                                return -EFAULT;
                }
                if (unlikely(tz != NULL)) {
                        if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
                                return -EFAULT;
                }
                return 0;
        }
Looking at this function, the interesting work centers on the tv argument. If tv is non-null, do_gettimeofday() is called, which performs the loop read of xtime. If tz is non-null, the function copies the system time zone (stored in sys_tz) back to the user. If copying the wall time or the time zone to user space fails, the function returns -EFAULT; on success it returns 0. Other time-related system calls provided by the kernel have been almost completely superseded by gettimeofday(), and the C library offers wall-time-related calls such as ftime() and ctime() on top of it. The settimeofday() system call sets the current time and requires the CAP_SYS_TIME capability. Apart from updating xtime, the kernel does not use xtime as often as user-space programs do; one notable exception is the filesystem code, which reads xtime to record access timestamps.
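The loop read that do_gettimeofday() performs follows the standard seqlock pattern: read xtime, then retry if a writer raced with us. A minimal sketch of that pattern (simplified, leaving out the architecture-specific sub-tick offset the real function adds) looks roughly like this:

        unsigned long seq;
        struct timespec ts;

        do {
                seq = read_seqbegin(&xtime_lock);  /* snapshot the writer sequence count */
                ts = xtime;                        /* read the wall time */
        } while (read_seqretry(&xtime_lock, seq)); /* retry if a writer intervened */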
So far the topic has been the hardware clock; now we turn to timers (also called dynamic timers or kernel timers). Timers are not periodic: a timer is destroyed after it expires. A timer is represented by struct timer_list, defined in linux/timer.h, as follows:
        struct timer_list {
                struct list_head entry;           /* entry in the linked list of timers */
                unsigned long expires;            /* expiration time, in jiffies */
                spinlock_t lock;                  /* lock protecting this timer */
                unsigned long magic;
                void (*function)(unsigned long);  /* the timer handler function */
                unsigned long data;               /* lone argument passed to the handler */
                struct tvec_t_base_s *base;       /* internal timer bookkeeping */
        };
The kernel provides a set of timer-related interfaces to simplify timer management. They are declared in linux/timer.h, and most are implemented in kernel/timer.c. With these interfaces, what we need to do is simple:
1. Create a timer: struct timer_list my_timer; 2. Initialize the timer: init_timer(&my_timer); 3. Fill in the timer as needed: my_timer.expires = jiffies + delay; my_timer.data = 0; my_timer.function = my_function; 4. Activate the timer: add_timer(&my_timer); (a complete sketch combining these steps follows below).
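Putting the four steps together, here is a minimal sketch using the interface described above; my_function, the message it prints, and the ten-tick delay are made up for illustration:

        #include <linux/timer.h>
        #include <linux/jiffies.h>
        #include <linux/kernel.h>

        static struct timer_list my_timer;

        /* hypothetical handler: runs when the timer expires */
        static void my_function(unsigned long data)
        {
                printk(KERN_INFO "my_timer fired, data=%lu\n", data);
        }

        static void start_my_timer(void)
        {
                init_timer(&my_timer);
                my_timer.expires = jiffies + 10;  /* ten ticks from now */
                my_timer.data = 0;                /* passed to my_function() */
                my_timer.function = my_function;
                add_timer(&my_timer);
        }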
After these steps, the timer is live. Note, however, that a timer is only guaranteed not to run before it expires: it usually runs soon after the timeout, but it may be deferred until the next tick, so timers cannot be used for hard real-time work. To modify a timer, use mod_timer(&my_timer, jiffies + new_delay), which changes the expiration of an already active timer; it also works on a timer that has been initialized but not yet activated, in which case it activates it. mod_timer() returns 0 if the timer was inactive and 1 if it was active; in either case, once mod_timer() returns, the timer is activated with the new expiration time. You can also delete a timer before it expires: del_timer(&my_timer). Beware that on multiprocessor machines the timer handler might already be running on another processor; to wait until any handler running elsewhere has finished before deleting the timer, use the del_timer_sync() function instead. It takes the same argument as del_timer() but cannot be used in interrupt context. Timers run asynchronously with respect to the current code, which means race conditions are possible; this deserves special attention, and in that sense the synchronized deletion is safer than the plain one.
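As a quick usage note on those two calls (new_delay here is a stand-in for whatever tick count you need):

        /* push the (possibly already active) timer back to new_delay ticks from now */
        mod_timer(&my_timer, jiffies + new_delay);

        /* tear down safely on SMP: waits for handlers running on other processors,
           so it must not be called from interrupt context */
        del_timer_sync(&my_timer);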
The kernel runs timers after a clock interrupt occurs; they execute in bottom-half context as softirqs. Specifically, the clock interrupt handler calls update_process_times(), which in turn calls run_local_timers():
        void run_local_timers(void)
        {
                raise_softirq(TIMER_SOFTIRQ);
        }
This raises the TIMER_SOFTIRQ softirq, whose handler runs all expired timers on the current processor. All timers are kept in linked lists, but a single flat list would obviously hurt performance, since it would have to be searched sequentially on every tick. Instead, kernel timers partition the timers into five groups by expiration time, and a timer migrates down through the groups as its expiration draws near. This scheme greatly reduces the overhead of finding expired timers.
Moving to the next topic: kernel code (especially drivers) needs ways to delay execution beyond timers and bottom halves, and the kernel provides several methods for handling the various kinds of delay requests. Here is a summary:
1. Busy waiting (also called busy looping): usually the least desirable method, because the processor spins pointlessly and can do nothing else. It is suitable only when the desired delay is a multiple of the tick period or when precision does not matter much. Implementation is trivial: spin in a loop until the desired number of clock ticks has passed. For example:
        unsigned long delay = jiffies + 10;  /* ten ticks */

        while (time_before(jiffies, delay))
                ;  /* spin */
The drawback is obvious. A better approach lets the kernel reschedule other tasks while your code waits, as shown below:
        unsigned long delay = jiffies + 10;  /* ten ticks */

        while (time_before(jiffies, delay))
                cond_resched();
The cond_resched() function schedules a different task to run, but only if the need_resched flag is set, in other words, only when the system has something more important to do. Because this method invokes the scheduler, it cannot be used from interrupt context, only from process context. In fact, all delay methods should be used from process context, since interrupt handlers must finish as quickly as possible. Moreover, delayed execution should not happen while a lock is held or while interrupts are disabled.
For short delays (shorter than a clock tick) that must be precise, which commonly arise when synchronizing with hardware, that is, waiting briefly for some small action to complete, usually in under 1 ms, jiffies-based delays like the example above are useless. The solution is the two functions defined in linux/delay.h, which do not use jiffies; they handle delays in microseconds and milliseconds respectively, as shown below:
        void udelay(unsigned long usecs);
        void mdelay(unsigned long msecs);
udelay() is implemented as a busy loop that executes a known number of iterations, and the mdelay() function is built on top of udelay(). Because the kernel knows how many loop iterations the processor can complete in a second, udelay() simply scales the requested delay, as a fraction of a second, into the number of iterations it needs to spin. udelay() should be used only for short delays: on fast machines a long delay can cause an overflow, so do not use it for delays longer than 1 ms. Both functions are ultimately busy waits, so avoid them unless necessary.
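A typical (hypothetical) use is giving a device a brief, precise settling time after poking a register; the REG_CTRL offset, the helper, and the 150 microsecond figure below are all invented for illustration:

        #include <linux/delay.h>
        #include <asm/io.h>

        #define REG_CTRL 0x00  /* hypothetical control register offset */

        static void reset_widget(void __iomem *ioaddr, u8 val)
        {
                writeb(val, ioaddr + REG_CTRL);  /* kick the device */
                udelay(150);                     /* busy-wait 150 us for it to settle */
        }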
That sounds a bit alarming, so what should we do instead? The more ideal method for delaying execution is the schedule_timeout() function, which puts the delayed task to sleep until at least the specified time has elapsed and then lets it run again. This method does not guarantee that the sleep lasts exactly the requested time, only that it is at least that long and usually close to it. When the specified time expires, the kernel wakes the task and puts it back on the run queue. Usage looks like this:
        set_current_state(TASK_INTERRUPTIBLE);  /* mark the task as sleeping first */
        schedule_timeout(s * HZ);               /* sleep for s seconds */
The only parameter is the relative delay in jiffies; in this example, the task sleeps interruptibly for s seconds. Note that the task must be set to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE before schedule_timeout() is called; otherwise the task will not go to sleep. Because the function invokes the scheduler, the calling code must be able to sleep: in short, it must be in process context and must not hold any locks. The implementation is quite simple and worth a look in the source; a simplified sketch appears below. When the timer expires, the process_timeout() function is called:
        void process_timeout(unsigned long data)
        {
                wake_up_process((task_t *)data);
        }
This function sets the task's state to TASK_RUNNING and puts it back on the run queue. When the task is rescheduled, execution resumes exactly where it went to sleep, that is, right after the call to schedule(). If the task is awakened early (for example, by a signal), the timer is destroyed and schedule_timeout() returns the amount of time remaining.
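Pulling these pieces together, a simplified sketch of what schedule_timeout() does internally looks like the following; the real function also handles MAX_SCHEDULE_TIMEOUT and invalid timeouts, which are omitted here:

        signed long schedule_timeout(signed long timeout)
        {
                struct timer_list timer;
                unsigned long expire = timeout + jiffies;

                init_timer(&timer);
                timer.expires = expire;
                timer.data = (unsigned long)current;  /* process_timeout() will wake us */
                timer.function = process_timeout;

                add_timer(&timer);
                schedule();                           /* actually go to sleep */
                del_timer_sync(&timer);               /* woken early? kill the timer */

                timeout = expire - jiffies;           /* how much time was left */
                return timeout < 0 ? 0 : timeout;
        }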
Finally, recall from the process scheduling discussion that code in process context can put itself on a wait queue to wait for a specific event. Sometimes, though, a task on a wait queue wants to wait either for a specific event or for a timeout to expire, whichever comes first. In that case, the code can simply call schedule_timeout() instead of schedule(), so the task is also awakened when the specified time runs out. Of course, the code then needs to check why it woke up, because the event may have occurred, the time may have elapsed, or a signal may have been received, and act accordingly.
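A sketch of that combined pattern, using the classic wait-queue style of this era; my_wait_queue, the condition flag, and the five-second timeout are all placeholders:

        DECLARE_WAITQUEUE(wait, current);
        long remaining = 5 * HZ;                       /* wait at most five seconds */

        add_wait_queue(&my_wait_queue, &wait);         /* hypothetical wait queue head */
        while (!condition && remaining) {              /* 'condition' is the awaited event */
                set_current_state(TASK_INTERRUPTIBLE);
                if (signal_pending(current))
                        break;                         /* woken by a signal */
                remaining = schedule_timeout(remaining);
        }
        set_current_state(TASK_RUNNING);
        remove_wait_queue(&my_wait_queue, &wait);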