Hrtimer + clockevent + timekeeping

Last Update:2018-07-30 Source: Internet

Author: User


Kernel 2.6.22 added dynticks and clocksource/clockevent support to the ARM arch, and the i.MX31 BSP changed its clock code accordingly. This article summarizes the recent changes to the kernel clock and timer subsystems.
Generally speaking, soft timers (the timer wheel and hrtimers) are driven by a hardware timer (clock interrupts) and its associated clock source (e.g. the GPT in an SoC). I therefore start with the clock layer, then cover soft timers and kernel timekeeping, and finally look at some applications.

Clock Source
A clock source defines the basic properties and behavior of a clock device, which can typically count, time, and raise interrupts, such as the GPT. The structure is defined as follows:

struct clocksource {
	char *name;
	struct list_head list;
	int rating;
	cycle_t (*read)(void);
	cycle_t mask;
	u32 mult;	/* cycle -> xtime interval; maybe two clock cycles trigger one interrupt (one xtime interval) */
	u32 shift;
	unsigned long flags;
	cycle_t (*vread)(void);
	void (*resume)(void);

	/* timekeeping specific data, ignore */
	cycle_t cycle_interval;	/* GPT counts per OS HZ tick */
	u64 xtime_interval;	/* xtime_interval = cycle_interval * mult */
	cycle_t cycle_last ____cacheline_aligned_in_smp;	/* cycle count at the last update */
	u64 xtime_nsec;	/* shifted nsec remainder of xtime.tv_nsec;
			 * current shifted nsec offset =
			 * xtime_nsec + (xtime.tv_nsec << shift) */
	s64 error;
};

The most important members are read(), cycle_last and cycle_interval. They define, respectively, the interface for reading the clock device's count register, the cycle count saved at the last update, and the number of cycles in one tick period. The values in this structure, whether cycle_t or u64 (cycle_t is actually u64), are cycle counts, not nsec, sec, or jiffies. read() is the interface through which the whole kernel reads an exact monotonic count, and the kernel uses it to compute other times such as jiffies and xtime.
Before clocksource was introduced, each arch had its own way of managing clock devices, mostly hidden in the MSL layer, which made them hard for the kernel core and drivers to access. Clocksource solves this problem by exporting the following interfaces:
1) clocksource_register(): register a clocksource
2) clocksource_get_next(): get the current clocksource device
3) clocksource_read(): read the clock; this ends up calling clocksource->read()
A driver that needs higher timing precision can read the clock device directly through these interfaces.
Of course, the current tick interrupt source also exists in the form of a clocksource.
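
To make the role of mult and shift concrete, here is a minimal user-space sketch of the cycle-to-nanosecond conversion, ns = (cycles * mult) >> shift. The helper names hz_to_mult(), cyc2ns() and elapsed_ns() are hypothetical and chosen for this illustration only, and the rounding used to derive mult is an assumption rather than something taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: choose mult so that (cycles * mult) >> shift
 * yields nanoseconds for a counter running at hz, rounded to nearest. */
static uint32_t hz_to_mult(uint32_t hz, uint32_t shift)
{
    uint64_t tmp;

    assert(hz != 0);
    tmp = NSEC_PER_SEC << shift;
    tmp += hz / 2;
    return (uint32_t)(tmp / hz);
}

/* Convert a cycle count to nanoseconds with one multiply and one shift. */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    return (cycles * mult) >> shift;
}

/* Nanoseconds elapsed since the last saved count. Masking the
 * difference keeps the result correct when a narrow hardware
 * counter wraps around between the two reads. */
static uint64_t elapsed_ns(uint64_t now, uint64_t last, uint64_t mask,
                           uint32_t mult, uint32_t shift)
{
    return cyc2ns((now - last) & mask, mult, shift);
}
```

With a 1 MHz counter and shift = 20, hz_to_mult() gives 1048576000, so 1000 cycles convert to 1000000 ns. The multiply-and-shift form avoids a division on every read, which is why the structure stores the two values instead of the frequency.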

Clock Event
The primary role of a clock event device is to dispatch clock events and to set the next trigger condition. Without clock events, clock interrupts are generated periodically, which is what we know as jiffies and HZ.
The main structure of the clock event device:

struct clock_event_device {
	const char *name;
	unsigned int features;
	unsigned long max_delta_ns;
	unsigned long min_delta_ns;
	unsigned long mult;
	int shift;
	int rating;
	int irq;
	cpumask_t cpumask;
	int (*set_next_event)(unsigned long evt,
			      struct clock_event_device *);
	void (*set_mode)(enum clock_event_mode mode,
			 struct clock_event_device *);
	void (*event_handler)(struct clock_event_device *);
	void (*broadcast)(cpumask_t mask);
	struct list_head list;
	enum clock_event_mode mode;
	ktime_t next_event;
};

The most important members are set_next_event() and event_handler(). The former sets the trigger condition for the next clock event, which generally means reprogramming the timer of the clock device. The latter is the event handler, which is invoked from the clock interrupt ISR. If the clock is used as the tick clock, the handler does essentially what the clock interrupt ISR of earlier kernels did, similar to timer_tick(). Event handlers can be replaced dynamically at run time, which lets the kernel change how clock interrupts are handled as a whole and gives the highres tick and the dynamic tick a chance to be installed dynamically. At present the kernel has three kinds of clock interrupt handling: periodic, highres, and dynamic tick. They are described later.
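
The idea of a handler that can be swapped at run time can be sketched in user space as follows. This is only an illustration of the function-pointer dispatch: struct mock_clock_event_device and the mock_* functions are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clock_event_device: only the
 * replaceable handler and two counters used to observe the effect. */
struct mock_clock_event_device {
    void (*event_handler)(struct mock_clock_event_device *dev);
    unsigned long periodic_ticks;
    unsigned long oneshot_events;
};

/* Plays the role of the periodic tick handler. */
static void mock_handle_periodic(struct mock_clock_event_device *dev)
{
    dev->periodic_ticks++;
}

/* Plays the role of a one-shot (highres style) handler. */
static void mock_handle_oneshot(struct mock_clock_event_device *dev)
{
    dev->oneshot_events++;
}

/* What the clock ISR does: it does not know which policy is active,
 * it simply calls whatever handler is currently installed. */
static void mock_timer_interrupt(struct mock_clock_event_device *dev)
{
    assert(dev != NULL);
    if (dev->event_handler != NULL)
        dev->event_handler(dev);
}
```

Assigning mock_handle_oneshot to dev->event_handler between two interrupts changes the behavior of every later interrupt without touching the ISR, which is the property the real structure relies on.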

Hrtimer & Timer Wheel
First, the timer wheel. It is the jiffies-based timer mechanism that the kernel has always used; its familiar interface includes init_timer(), mod_timer(), del_timer(), and so on.
The advent of hrtimers did not retire the old timer wheel (and it is unlikely to be retired :)). Hrtimers serve as the kernel's precise timers, while the timer wheel is mainly used for timeouts, so the division of labor is clear. Hrtimers organize timers in a red-black tree, while the timer wheel uses linked lists and buckets.
Hrtimers raise timer resolution from the jiffies of the timer wheel to nanoseconds. They are mainly used to provide the nanosleep, POSIX timer, and itimer interfaces to the application layer; drivers and other subsystems that need high-resolution timers use them as well.
Instead of generating HZ periodic tick interrupts per second, the kernel now raises an interrupt at the expiry time of the next hrtimer. The clock interrupt is therefore no longer periodic but timer driven (the next event interrupt is programmed through the clockevent set_next_event interface), and no interrupt occurs as long as no hrtimer is queued. However, to keep system time up to date (process time accounting, jiffies maintenance), an hrtimer that calls tick_sched_timer is queued for every tick_period (NSEC_PER_SEC/HZ; note again that hrtimer precision is in nanoseconds).
The next step is to compare how the kernel handles clock interrupts before and after the introduction of hrtimers. (The analysis is based on the ARM arch source.)
1) No hrtimer
When the kernel boots, time_init(), which runs after setup_arch(), initializes the timer of the corresponding machine structure. The timer initialization function lives in each machine's architecture code; it initializes the hardware clock, registers the interrupt service routine, and enables the clock interrupt. The interrupt service routine clears the interrupt and calls timer_tick(), which executes:
1. profile_tick(); /* kernel profiling; I do not fully understand it */
2. do_timer(1); /* update jiffies */
3. update_process_times(); /* account process time, raise TIMER_SOFTIRQ (timer wheel), recalculate scheduling time slices, and so on */
Finally, the interrupt service routine reprograms the timer so that it interrupts again at the next tick.

Such a framework makes it hard to add high-resolution timers. All the interrupt handling code lives in architecture code and is poorly reused, even though most archs end up writing the same interrupt handler.
2) Hrtimer
With the introduction of clockevent/clocksource in the kernel, the clock interrupt is abstracted as an event, and handling the event is left to the event handler. The handler can be replaced inside the kernel to change the clock interrupt behavior. The clock interrupt ISR then looks like this:

static irqreturn_t timer_interrupt(int irq, void *dev_id)
{
	/* clear timer interrupt flag */
	.....
	/* call clock event handler */
	arch_clockevent.event_handler(&arch_clockevent);
	....
	return IRQ_HANDLED;
}

event_handler is set to tick_handle_periodic() by default when the clockevent device is registered. So right after the kernel boots, clock handling is still periodic and tick interrupts are generated periodically. tick_handle_periodic() does the same work as timer_tick() and then calls clockevents_program_event() => arch_clockevent.set_next_event() to program the timer for the next period. In this way tick-common.c reimplements the original kernel clock handling within the clockevent framework; this is the periodic tick mechanism.

The hres tick mechanism replaces the periodic tick at the first timer softirq, provided certain conditions are met: hres is not disabled on the command line (highres=off), and the clocksource/clockevent supports hres and oneshot operation. The implementation here is rather ugly, as the author's comments also admit: every timer softirq calls hrtimer_run_queues() to check whether hres is active. Checking the clocksource/clockevent conditions in timer_init() and switching to hres directly would seem better; I do not know whether some constraint prevents that. The timer softirq code is as follows:

static void run_timer_softirq(struct softirq_action *h)
{
	tvec_base_t *base = __get_cpu_var(tvec_bases);

	hrtimer_run_queues(); /* switch to hres or nohz when there is a chance */

	if (time_after_eq(jiffies, base->timer_jiffies))
		__run_timers(base); /* timer wheel */
}

The switch itself is simple: replace the current clockevent handler with hrtimer_interrupt(), queue an hrtimer, tick_sched_timer, that expires at the next tick_period, and retrigger the next event.
hrtimer_interrupt() takes the expired hrtimers off the red-black tree and puts them on the corresponding clock_base->cpu_base->cb_pending list; these expired timers are executed in HRTIMER_SOFTIRQ. It then retriggers the next event according to the earliest remaining timer and schedules HRTIMER_SOFTIRQ. The hrtimer softirq then runs the callbacks of the expired timers on cb_pending. The tick_sched_timer hrtimer expires every tick_period, and its execution is similar to that of timer_tick(), except that at the end it calls hrtimer_forward() to requeue itself for the next period, which guarantees that the kernel's internal time accounting is updated correctly every tick_period.

timekeeping

The timekeeping subsystem is responsible for updating xtime, adjusting for error, and providing the gettimeofday/settimeofday interfaces. To make this easier to follow, first some concepts:
Times in Kernel
The basic kinds of time in the kernel:
1) System time
A monotonically increasing value that represents the amount of time the system has been running. It can be computed from the time source, xtime, and wall_to_monotonic.
2) Wall time
A value representing the human time of day, as seen on a wrist-watch. This is the realtime clock: xtime.
3) Time source
A representation of a free-running counter running at a known frequency, usually in hardware, e.g. the GPT. The counter value can be obtained with clocksource->read().
4) Tick
A periodic interrupt generated by a hardware timer, typically with a fixed interval defined by HZ; ticks are counted in jiffies.

These times are related to each other and can be converted:
system_time = xtime + cyc2ns(clock->read() - clock->cycle_last) + wall_to_monotonic
real_time = xtime + cyc2ns(clock->read() - clock->cycle_last)
That is, real time is the number of nanoseconds from 1970 to the present, and system time is the number of nanoseconds from boot to the present.
These are the two most important times, and an hrtimer can set its expiry time based on either of them. This is why two clock bases were introduced.

Clock Base
CLOCK_REALTIME: based on the actual wall time
CLOCK_MONOTONIC: based on the system run time
An hrtimer can choose either of these for its expiry time, that is, the actual wall time or the time relative to system start.
Each base provides a get_time() interface:
CLOCK_REALTIME calls ktime_get_real() to get the real time, which is computed with the real_time equation above.
CLOCK_MONOTONIC calls ktime_get() and obtains monotonic time with the system_time equation.

Timekeeping provides two interfaces, do_gettimeofday() and do_settimeofday(), both of which operate on realtime. The gettimeofday syscall from user space ends up here as well.
do_gettimeofday() calls __get_realtime_clock_ts() to get the time and then converts it to a timeval.
do_settimeofday() stores the user's time in xtime, recalculates the xtime-to-monotonic conversion value, and finally notifies the hrtimer subsystem of the time change.

int do_settimeofday(struct timespec *tv)
{
	unsigned long flags;
	time_t wtm_sec, sec = tv->tv_sec;
	long wtm_nsec, nsec = tv->tv_nsec;

	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
		return -EINVAL;

	write_seqlock_irqsave(&xtime_lock, flags);

	nsec -= __get_nsec_offset();

	wtm_sec = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec);
	wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec);

	set_normalized_timespec(&xtime, sec, nsec); /* recalculate xtime: the user's time minus the nsec elapsed since the last update */
	set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec); /* readjust wall_to_monotonic */

	clock->error = 0;
	ntp_clear();

	update_vsyscall(&xtime, clock);
	write_sequnlock_irqrestore(&xtime_lock, flags);
	/* signal hrtimers about time change */
	clock_was_set();

	return 0;
}
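
The reason wall_to_monotonic is adjusted by (xtime - new time) is that monotonic time must not jump when the wall clock is set. A minimal user-space sketch of that invariant follows. It assumes the time is set exactly at an update boundary, so the __get_nsec_offset() correction is zero, and struct mock_clock, normalize(), mock_settime() and mock_monotonic() are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Store sec/nsec with nsec brought into the range [0, NSEC_PER_SEC). */
static void normalize(struct timespec *ts, time_t sec, long nsec)
{
    while (nsec >= NSEC_PER_SEC) {
        nsec -= NSEC_PER_SEC;
        ++sec;
    }
    while (nsec < 0) {
        nsec += NSEC_PER_SEC;
        --sec;
    }
    ts->tv_sec = sec;
    ts->tv_nsec = nsec;
}

/* Hypothetical pair of kernel globals, gathered into one struct. */
struct mock_clock {
    struct timespec xtime;
    struct timespec wall_to_monotonic;
};

/* Set the wall clock and move the offset by the same amount in the
 * opposite direction, so xtime + wall_to_monotonic stays unchanged. */
static void mock_settime(struct mock_clock *c, const struct timespec *tv)
{
    time_t wtm_sec;
    long wtm_nsec;

    assert(c != NULL && tv != NULL);
    wtm_sec = c->wall_to_monotonic.tv_sec + (c->xtime.tv_sec - tv->tv_sec);
    wtm_nsec = c->wall_to_monotonic.tv_nsec + (c->xtime.tv_nsec - tv->tv_nsec);

    normalize(&c->xtime, tv->tv_sec, tv->tv_nsec);
    normalize(&c->wall_to_monotonic, wtm_sec, wtm_nsec);
}

/* Monotonic time = xtime + wall_to_monotonic. */
static struct timespec mock_monotonic(const struct mock_clock *c)
{
    struct timespec ts;

    normalize(&ts, c->xtime.tv_sec + c->wall_to_monotonic.tv_sec,
              c->xtime.tv_nsec + c->wall_to_monotonic.tv_nsec);
    return ts;
}
```

Calling mock_monotonic() before and after mock_settime() returns the same value, while xtime takes the new wall time. Timers queued on the monotonic base are therefore unaffected, and only timers on the realtime base need to be told about the change.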

Userspace Application
With the introduction of hrtimers, the most useful interfaces for users are the following:

Clock API
clock_gettime(clockid_t, struct timespec *)
Get the time of the given clock
clock_settime(clockid_t, const struct timespec *)
Set the time of the given clock
clock_nanosleep(clockid_t, int, const struct timespec *, struct timespec *)
Put the process to sleep with nanosecond resolution
clock_getres(clockid_t, struct timespec *)
Get the clock resolution, generally in nanoseconds

clockid_t defines four kinds of clock:
CLOCK_REALTIME: system-wide realtime clock. Setting this clock requires appropriate privileges.
CLOCK_MONOTONIC: clock that cannot be set and represents monotonic time since some unspecified point.
CLOCK_PROCESS_CPUTIME_ID: high-resolution per-process timer from the CPU.
CLOCK_THREAD_CPUTIME_ID: thread-specific CPU-time clock.
As mentioned earlier, the latter two relate to process/thread time accounting, such as utime/stime, and I have not studied them closely. The application layer can use these four clocks to improve flexibility and accuracy.
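
A short sketch of how an application might use these calls: it measures a relative sleep against CLOCK_MONOTONIC and queries the clock resolution. The function names measure_sleep_ns() and monotonic_resolution_ns() are made up for this example, and error handling is reduced to returning -1 (an interrupted sleep is not retried).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

static int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Sleep for sleep_ns on CLOCK_MONOTONIC and return the elapsed time
 * measured on the same clock, or -1 if any call fails. */
static int64_t measure_sleep_ns(long sleep_ns)
{
    struct timespec start, end, req;

    assert(sleep_ns >= 0 && sleep_ns < 1000000000L);
    req.tv_sec = 0;
    req.tv_nsec = sleep_ns;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    /* flags = 0 requests a relative sleep */
    if (clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL) != 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return timespec_to_ns(&end) - timespec_to_ns(&start);
}

/* Resolution of CLOCK_MONOTONIC in nanoseconds, or -1 on failure. */
static int64_t monotonic_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return timespec_to_ns(&res);
}
```

Measuring with CLOCK_MONOTONIC rather than CLOCK_REALTIME keeps the result valid even if the wall clock is set during the sleep. On older C libraries these functions may require linking with -lrt.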

Timer API

A timer can be set up per process, either one-shot or periodic.

int timer_create(clockid_t clockid, struct sigevent *restrict evp, timer_t *restrict timerid);
Create a timer.
clockid specifies which clock base the timer is created on.
evp (sigevent) specifies which signal the kernel sends to the process when the timer expires, and the value delivered with the signal; the default is SIGALRM.
timerid returns the ID of the created timer.
In the signal handler, siginfo_t.si_timerid tells you which timer's expiry raised the current signal. In my tests, the maximum number of timers that can be created is tied to the ulimit on pending signals: it cannot exceed the number of pending signals.

int timer_gettime(timer_t timerid, struct itimerspec *value);
Get the time remaining until the timer's next expiration.

int timer_settime(timer_t timerid, int flags, const struct itimerspec *restrict value, struct itimerspec *restrict ovalue);
Set the timer's expiration time and interval period.

int timer_delete(timer_t timerid);
Delete the timer.

These system calls create a POSIX timer backed by an hrtimer, which sends a signal to the process when it expires.
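
The following sketch ties the timer calls together: it arms a one-shot timer on CLOCK_MONOTONIC and waits for its signal synchronously with sigwaitinfo() instead of installing a handler. The function name wait_one_shot_timer(), the choice of SIGRTMIN, and the marker value 42 are assumptions made for this example; the timer is identified through the value attached to the signal (sigev_value) rather than through si_timerid.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Arm a one-shot timer that raises SIGRTMIN after delay_ns and wait
 * for it. Returns 0 if the signal came from our timer, -1 otherwise. */
static int wait_one_shot_timer(long delay_ns)
{
    const int marker = 42; /* arbitrary value carried by the signal */
    sigset_t set;
    struct sigevent sev;
    struct itimerspec its;
    siginfo_t info;
    timer_t timerid;
    int rc = -1;

    assert(delay_ns > 0 && delay_ns < 1000000000L);

    /* Block the signal so it stays pending until sigwaitinfo(). */
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    /* Ask for a signal, carrying our marker, when the timer expires. */
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;
    sev.sigev_value.sival_int = marker;
    if (timer_create(CLOCK_MONOTONIC, &sev, &timerid) != 0)
        return -1;

    /* it_interval stays zero, which makes the timer one-shot. */
    memset(&its, 0, sizeof its);
    its.it_value.tv_nsec = delay_ns;

    if (timer_settime(timerid, 0, &its, NULL) == 0 &&
        sigwaitinfo(&set, &info) == SIGRTMIN &&
        info.si_code == SI_TIMER &&
        info.si_value.sival_int == marker)
        rc = 0;

    timer_delete(timerid);
    return rc;
}
```

Setting it_interval to a non-zero value would turn the same timer into a periodic one. In a multithreaded program pthread_sigmask() should be used in place of sigprocmask(), and older C libraries may need -lrt at link time.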

Summary
The introduction of hrtimers and clockevent/clocksource has greatly improved the kernel's real-time behavior, and clock handling has been abstracted out of the architecture code, which improves code reuse. It also provides strong support for the POSIX time/timer standards, improving the timing accuracy and flexibility of user-space applications. If the use of these syscalls at the application layer is confusing, reading the hrtimer code directly can be a great help in solving the problem and understanding the behavior of the OS.

Resources:
[1] http://tglx.de/projects/hrtimers/ols2006-hrtimers.pdf
[2] http://www.linuxsymposium.org/2006/linuxsymposium_procv1.pdf
[3] Documentation/hrtimers/highres.txt
[4] Documentation/hrtimers/hrtimers.txt
[5] http://sourceforge.net/projects/high-res-timers/

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
