Overview of the real-time preemption patch (to be continued)

Source: Internet
Author: User

A realtime preemption overview (2005-08-10, Paul McKenney)
Overview of the real-time preemption patch

Yang honggang <eagle.rtlinux@gmail.com>
Ref: http://lwn.net/Articles/146861/
----------------------------------------

//// PREEMPT_RT

The core idea of the PREEMPT_RT patch is to minimize the amount of non-preemptible code in the
(Linux) kernel, while also minimizing the amount of code that must be changed to provide that
preemptibility.

In particular, critical sections, interrupt handlers, and interrupt-disabled code sequences are
normally preemptible.

PREEMPT_RT builds on the SMP capabilities of the Linux kernel to add this preemptibility, which
avoids rewriting the entire kernel.

Loosely speaking, a preemption can be thought of as adding a new CPU to the system; the normal
locking primitives are then used to synchronize with whatever the preempting task does.

This description should not be taken too literally. For example, PREEMPT_RT does not actually
generate a CPU hot-plug event for each preemption. The point is that the mechanisms needed to
tolerate (almost) unrestricted preemption are the same ones an SMP environment must already provide.
The following sections show how this idea is put into practice.

//// PREEMPT_RT features

1. Preemptible critical sections
2. Preemptible interrupt handlers
3. Preemptible "interrupt-disabled" code sequences
4. Priority inheritance for in-kernel spinlocks and semaphores
5. Deferred operations
6. Latency-reduction measures

Each is described in turn below.

/// Preemptible critical sections

In PREEMPT_RT, critical sections protected by normal spinlocks (spinlock_t and rwlock_t) are
preemptible, as are RCU read-side critical sections (rcu_read_lock() and rcu_read_unlock()).
Semaphore critical sections are preemptible too, but that is already true of kernels without the
PREEMPT_RT patch.
Preemptibility means that a task may block while acquiring a spinlock. In turn, a spinlock must not
be acquired while interrupts or preemption are disabled. It also means that spin_lock_irqsave()
does not disable hardware interrupts when applied to a spinlock_t.

Quick quiz #1: How can semaphore critical sections be preempted in a non-preemptible kernel?

What if a lock must be acquired while interrupts or preemption are disabled? Use a raw_spinlock_t
instead of a spinlock_t, and keep calling spin_lock() and friends on the raw_spinlock_t.
PREEMPT_RT introduces a set of macros that make spin_lock() behave like an overloaded function in
C++: called on a raw_spinlock_t it acts as a traditional spinlock, and called on a spinlock_t its
critical section is preemptible.
For example, the various _irq primitives (such as spin_lock_irqsave()) disable hardware interrupts
when applied to a raw_spinlock_t but not when applied to a spinlock_t. However, raw_spinlock_t (and
its rwlock_t counterpart, raw_rwlock_t) should be the exception rather than the rule. Only a few
low-level areas, such as the scheduler, architecture-specific code, and RCU, need these raw locks;
they should not be used anywhere else.
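
The overload-like behavior of spin_lock() can be pictured with a small user-space sketch. The
patch does this with its own macros inside the kernel; the sketch below substitutes C11 `_Generic`,
which is an assumption made only for illustration. All names in it (`model_spinlock_t`,
`model_raw_spinlock_t`, `model_spin_lock`, the `*_calls` counters) are hypothetical and are not
kernel identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two kernel lock types. */
typedef struct { int held; } model_raw_spinlock_t;  /* traditional, non-preemptible */
typedef struct { int held; } model_spinlock_t;      /* sleeping, preemptible */

static int raw_lock_calls;   /* counts dispatches to the raw path */
static int rt_lock_calls;    /* counts dispatches to the sleeping path */

static void raw_lock(model_raw_spinlock_t *l)   { l->held = 1; raw_lock_calls++; }
static void rt_lock(model_spinlock_t *l)        { l->held = 1; rt_lock_calls++; }
static void raw_unlock(model_raw_spinlock_t *l) { l->held = 0; }
static void rt_unlock(model_spinlock_t *l)      { l->held = 0; }

/* One name, two behaviors: the static type of the lock selects the function. */
#define model_spin_lock(l) \
    _Generic((l), model_raw_spinlock_t *: raw_lock, \
                  model_spinlock_t *: rt_lock)(l)
#define model_spin_unlock(l) \
    _Generic((l), model_raw_spinlock_t *: raw_unlock, \
                  model_spinlock_t *: rt_unlock)(l)
```

Calling `model_spin_lock(&a)` on a `model_spinlock_t` reaches `rt_lock()`, while the same call on a
`model_raw_spinlock_t` reaches `raw_lock()`, so callers keep one spelling for both lock kinds.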

Because critical sections are preemptible, a given critical section can no longer be assumed to run
on a single CPU: after being preempted, it may migrate to another CPU. Code that uses per-CPU
variables inside a critical section must therefore handle possible preemption itself, because
spinlock_t and rwlock_t no longer do so.
Possible approaches include:
1. Explicitly disable preemption through get_cpu_var() or preempt_disable(), or disable hardware
interrupts.
2. Use a per-CPU lock to protect the per-CPU variables. One way is to use DEFINE_PER_CPU_LOCKED().
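
The second approach can be illustrated with a single-threaded user-space model. It is only a
sketch under stated assumptions: the "CPUs" are array slots, migration is simulated by changing a
variable, and every name (`model_percpu_counter`, `model_get_locked`, `model_migrate`, and so on) is
hypothetical rather than part of the patch.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU slot: the data plus the lock that guards it. */
struct model_percpu_counter {
    int locked;   /* 1 while some task owns this slot */
    long value;
};

static struct model_percpu_counter counters[MODEL_NR_CPUS];
static int model_current_cpu;   /* CPU the simulated task runs on right now */

/* Lock the slot of the CPU we are on now and return it. */
static struct model_percpu_counter *model_get_locked(void)
{
    struct model_percpu_counter *c = &counters[model_current_cpu];
    assert(!c->locked);   /* a real lock would block here instead */
    c->locked = 1;
    return c;
}

static void model_put_locked(struct model_percpu_counter *c)
{
    c->locked = 0;
}

/* Simulate preemption followed by migration to another CPU. */
static void model_migrate(int cpu) { model_current_cpu = cpu; }

static long model_increment_with_migration(int migrate_to)
{
    struct model_percpu_counter *c = model_get_locked();
    long old = c->value;
    model_migrate(migrate_to);   /* preempted in the middle of the update */
    c->value = old + 1;          /* still updates the slot that was locked */
    model_put_locked(c);
    return c->value;
}
```

Because the task keeps a pointer to the slot it locked, the update stays consistent even though
the task finishes on a different CPU from the one it started on.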

Because spin_lock() can now sleep, an additional task state is needed. Consider the following code
snippet, supplied by Ingo Molnar:

    spin_lock(&mylock1);
    current->state = TASK_UNINTERRUPTIBLE;
    spin_lock(&mylock2);                    // [*] may sleep under PREEMPT_RT
    blah();
    spin_unlock(&mylock2);
    spin_unlock(&mylock1);

Because the spin_lock() at [*] may sleep, it can clobber the value of current->state, which would
come as a surprise to blah(). The TASK_RUNNING_MUTEX bit was therefore introduced so that the
scheduler can preserve the previous value of current->state in this case. Although the result may
look a little unfamiliar, it allows critical sections to be preempted with minimal code changes.
It also allows the same code to work in the PREEMPT_RT, PREEMPT, and non-PREEMPT configurations.
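
The hazard with current->state can be shown in a user-space model. This is a minimal sketch, not
the patch's mechanism: it does not reproduce how TASK_RUNNING_MUTEX works internally, it only
contrasts a sleeping lock that overwrites the caller's state with one that saves and restores it.
The types and functions (`model_task`, `model_lock_clobbering`, `model_lock_preserving`) and the
numeric state values are assumptions made for the example.

```c
#include <assert.h>

enum { MODEL_TASK_RUNNING = 0, MODEL_TASK_UNINTERRUPTIBLE = 2 };

struct model_task { long state; };
struct model_lock { int held; };

/* Contended path of a sleeping lock, WITHOUT saving the caller's state. */
static void model_lock_clobbering(struct model_task *t, struct model_lock *l)
{
    t->state = MODEL_TASK_UNINTERRUPTIBLE;  /* go to sleep waiting for the lock */
    /* ... another task runs; later the lock owner wakes us up ... */
    t->state = MODEL_TASK_RUNNING;          /* the wake-up overwrites the state */
    l->held = 1;
}

/* Same path, but the previous state is saved and put back afterwards. */
static void model_lock_preserving(struct model_task *t, struct model_lock *l)
{
    long saved = t->state;
    t->state = MODEL_TASK_UNINTERRUPTIBLE;
    t->state = MODEL_TASK_RUNNING;
    l->held = 1;
    t->state = saved;                       /* the caller sees what it set */
}
```

With the first variant a caller that set its state to `MODEL_TASK_UNINTERRUPTIBLE` before locking
finds `MODEL_TASK_RUNNING` afterwards; with the second variant its own value survives.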

/// Preemptible interrupt handlers

In the PREEMPT_RT environment, almost all interrupt handlers run in process context. Any interrupt
can be marked SA_NODELAY to make it run in interrupt context, but currently only the fpu_irq, irq0,
irq2, and lpptest interrupts are marked SA_NODELAY. Of these, only irq0 (the per-CPU timer
interrupt) is in common use; fpu_irq is for floating-point coprocessor interrupts, and lpptest is
used to measure interrupt latency. Note that software timers (add_timer() and friends) do not run
in interrupt context; they run in process context and are fully preemptible.

Do not use SA_NODELAY lightly: it can greatly increase interrupt and scheduling latency. The
per-CPU timer interrupt uses SA_NODELAY because it is closely tied to scheduling and other core
kernel components. In addition, interrupt handlers marked SA_NODELAY must be written with great
care, as described below; otherwise oopses and deadlocks may occur.

Because the per-CPU timer interrupt (for example, scheduler_tick()) runs in hardware-interrupt
context, any lock it shares with process-context code must be a raw spinlock (raw_spinlock_t or
raw_rwlock_t). When such a lock is acquired from process context, the _irq variants must be used,
for example spin_lock_irqsave(). In addition, hardware interrupts must normally be disabled while
process-context code accesses per-CPU variables shared with an SA_NODELAY interrupt handler. The
next section describes this in more detail.

/// Preemptible "interrupt-disabled" code sequences

At first glance, a preemptible interrupt-disabled code sequence looks like a contradiction in
terms, but it does not conflict with the core idea of PREEMPT_RT.
That idea is to rely on the SMP capabilities of the Linux kernel to handle races with interrupt
handlers. Remember that almost all interrupt handlers run in process context. Any code that
interacts with an interrupt handler must be prepared for that handler to run concurrently on
another CPU.

Therefore, spin_lock_irqsave() and related primitives do not need to disable preemption. This is
safe because even if the interrupt handler preempts the code that holds the spinlock_t, the handler
blocks as soon as it tries to acquire that spinlock_t.
The critical section is therefore still protected.

local_irq_save(), however, still disables preemption, because it has no corresponding lock to rely
on. Using a lock instead of local_irq_save() can reduce scheduling latency, but it can also reduce
SMP performance, so be careful.

Code that must interact with SA_NODELAY interrupt handlers should use raw_local_irq_save() instead
of local_irq_save(), because local_irq_save() does not disable hardware interrupts. Similarly, raw
spinlocks (raw_spinlock_t, raw_rwlock_t, and raw_seqlock_t) should be used when interacting with
SA_NODELAY interrupt handlers. However, raw spinlocks should not be used outside low-level areas
such as the scheduler, architecture-specific code, and RCU.

/// Priority inheritance for in-kernel spinlocks and semaphores

Real-time programmers are usually concerned about priority inversion, which occurs as follows:
* Low-priority task A acquires a resource, such as a lock L.
* Medium-priority task B starts running and preempts task A.
* High-priority task C tries to acquire lock L. Because medium-priority task B has preempted task A
  (so task A cannot release lock L), high-priority task C is blocked.

Priority inversion can delay a high-priority task indefinitely. There are two common solutions:
(1) Suppressing preemption
(2) Priority inheritance
With approach (1), task B cannot preempt task A because preemption is disabled, so priority
inversion cannot occur.
PREEMPT kernels use this approach for spinlocks, but not for semaphores.
Blocking while holding a semaphore is legal, so priority inversion can occur even without
preemption; suppressing preemption for semaphores is therefore pointless. For some real-time
workloads, suppressing preemption adds too much scheduling latency, so it cannot be used even for
spinlocks.

Priority inheritance is used where suppressing preemption does not make sense. The core idea is
that a high-priority task temporarily donates its priority to the low-priority task that holds the
critical resource (lock). Priority inheritance is transitive: if a still higher-priority task D
tries to acquire a second lock that task C already holds, both task C and task A are temporarily
boosted to the priority of task D. The boost is also short-lived: as soon as low-priority task A
releases the lock, it loses its temporarily boosted priority and hands the lock to task C.
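
Priority donation along a chain of locks can be sketched as a small user-space model. It assumes
the transitive case in which task D waits on a second lock held by task C while C itself waits on
the lock held by A. The model is deliberately simplified: larger numbers mean higher priority, a
release drops the owner straight back to its base priority, and the lock is not handed to a waiter.
All names (`model_task`, `model_lock`, `model_acquire`, `model_boost_chain`, `model_release`) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct model_lock;

struct model_task {
    int base_prio;                 /* larger number = higher priority (assumption) */
    int prio;                      /* effective priority, possibly boosted */
    struct model_lock *blocked_on; /* lock this task waits for, or NULL */
};

struct model_lock {
    struct model_task *owner;      /* current holder, or NULL */
};

/* Walk the chain: owner, the lock that owner waits for, that lock's owner, ... */
static void model_boost_chain(struct model_lock *l, int prio)
{
    while (l && l->owner && l->owner->prio < prio) {
        l->owner->prio = prio;
        l = l->owner->blocked_on;
    }
}

/* Task t tries to take l: it either gets it (1) or blocks and donates priority (0). */
static int model_acquire(struct model_task *t, struct model_lock *l)
{
    if (!l->owner) {
        l->owner = t;
        return 1;
    }
    t->blocked_on = l;
    model_boost_chain(l, t->prio);
    return 0;
}

/* Release: the owner drops back to its base priority at once (simplification). */
static void model_release(struct model_task *t, struct model_lock *l)
{
    l->owner = NULL;
    t->prio = t->base_prio;
}
```

In this model, when D blocks on the lock held by C, the walk raises C first and then follows C's
`blocked_on` pointer to raise A as well, which is the transitive boost; `model_release()` removes
the boost as soon as the lock is dropped.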

However, it may be some time before task C actually runs, and in the meantime a task E with even
higher priority may try to acquire lock L. In that case task E "steals" the lock from task C. This
is legal because task C has not yet run and so has not actually acquired lock L. If, on the other
hand, task C starts running before task E tries to acquire lock L, task E can no longer "steal"
the lock. Task E must wait for task C to release lock L, and may temporarily boost task C's
priority so that the lock is released sooner.
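
The two outcomes for task E can be modeled by marking a handed-over lock as "pending" until its new
owner has actually run. This is a sketch under assumptions, not the patch's data structure: the
`pending` flag, the function names, and the convention that larger numbers mean higher priority
are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct model_task { int prio; };   /* larger number = higher priority (assumption) */

struct model_lock {
    struct model_task *owner;
    int pending;   /* 1: owner was handed the lock but has not run yet */
};

/* The releasing task hands the lock to the chosen waiter. */
static void model_hand_over(struct model_lock *l, struct model_task *next)
{
    l->owner = next;
    l->pending = 1;
}

/* The new owner finally gets the CPU and really takes the lock. */
static void model_owner_runs(struct model_lock *l)
{
    l->pending = 0;
}

/* Returns 1 if t now owns the lock, 0 if it has to wait. */
static int model_try_acquire(struct model_lock *l, struct model_task *t)
{
    if (!l->owner || (l->pending && t->prio > l->owner->prio)) {
        l->owner = t;      /* free, or stolen from a pending owner */
        l->pending = 0;
        return 1;
    }
    return 0;
}
```

A higher-priority task succeeds in `model_try_acquire()` only while the lock is still pending;
once `model_owner_runs()` has been called, the same attempt returns 0 and the task has to wait.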

In addition, some locks are held for long periods. "Preemption points" have been added to a number
of these so that the lock holder drops the lock voluntarily when another task needs it. The JBD
(journaling block device) layer contains examples of this.

PREEMPT_RT simplifies the problem by allowing only one task at a time to read-hold a reader-writer
lock or semaphore, although that task may acquire it recursively. Some scalability is lost, but
priority inheritance becomes feasible.
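
The single-reader rule can be sketched as follows. This is a user-space model with hypothetical
names (`model_rwlock`, `model_read_lock`, and so on); functions return 0 where a real lock would
block, so the rule can be exercised without threads.

```c
#include <assert.h>
#include <stddef.h>

struct model_task { int id; };

struct model_rwlock {
    struct model_task *reader;  /* the single task allowed to read-hold */
    int read_depth;             /* how many times that task has read-locked */
    struct model_task *writer;
};

/* Returns 1 on success, 0 if the caller would have to block. */
static int model_read_lock(struct model_rwlock *l, struct model_task *t)
{
    if (l->writer)
        return 0;
    if (l->reader && l->reader != t)
        return 0;               /* a second reader must wait */
    l->reader = t;
    l->read_depth++;
    return 1;
}

static void model_read_unlock(struct model_rwlock *l)
{
    assert(l->read_depth > 0);
    if (--l->read_depth == 0)
        l->reader = NULL;
}

static int model_write_lock(struct model_rwlock *l, struct model_task *t)
{
    if (l->writer || l->reader)
        return 0;
    l->writer = t;
    return 1;
}
```

Since at most one task can be recorded in `reader`, a blocked writer always has exactly one task
to boost, which is what makes priority inheritance tractable here.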

Quick quiz #2: What is a simple and fast way to implement priority inheritance from writers to multiple readers?

In some cases priority inheritance is undesirable for semaphores, for example when a semaphore is
used as an event mechanism rather than as a lock (before the event occurs there is no way to know
which task will post it, and so no way to know which task to boost). The compat_semaphore and
compat_rw_semaphore variants can be used in these cases.
The semaphore primitives (up(), down(), and so on) work on both compat_semaphore and semaphore.
Similarly, the reader-writer semaphore primitives (up_read(), down_write(), and so on) work on both
compat_rw_semaphore and rw_semaphore. Often, however, the completion mechanism is a better choice
for this kind of problem.

To sum up, priority inheritance lets high-priority tasks acquire locks and semaphores in a timely
manner even when they are held by low-priority tasks. PREEMPT_RT's priority inheritance is
transitive, is removed promptly, and handles the case where a high-priority task suddenly needs a
lock intended for a low-priority task. compat_semaphore and compat_rw_semaphore can be used for
event-style semaphores that should not take part in priority inheritance.

/// Deferred operations

Because spin_lock() can now sleep, it is illegal to call it while preemption or interrupts are
disabled. In some cases the operation that needs the spin_lock() is deferred until preemption has
been re-enabled:
* put_task_struct_delayed() queues a put_task_struct() to be run later, when it is legal to acquire
  (for example) the spinlock_t alloc_lock.
* mmdrop_delayed() queues an mmdrop() to be run later, in the same way as put_task_struct_delayed().
* TIF_NEED_RESCHED_DELAYED requests a reschedule, but the reschedule is delayed until the process
  is about to return to user space or until the next preempt_check_resched_delayed(). Either way,
  the point is to avoid needless preemption when a high-priority task that is being woken cannot
  make progress until the current task releases a lock. Without TIF_NEED_RESCHED_DELAYED, the
  high-priority task would immediately preempt the low-priority task, only to block almost at once
  on the lock held by the low-priority task.

Solution: replace a wake_up() that is immediately followed by a spin_unlock() with
wake_up_process_sync().
If the process being woken would preempt the current process, the wake-up is then marked
TIF_NEED_RESCHED_DELAYED.
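
The difference between an ordinary wake-up and a "sync" wake-up can be modeled with two flag bits.
The sketch below is an assumption-laden illustration, not the kernel's implementation: the flag
values, the `model_*` function names, and the way the check point converts a delayed request into
an immediate one are all hypothetical.

```c
#include <assert.h>

enum {
    MODEL_NEED_RESCHED         = 1 << 0,  /* preempt at the next opportunity */
    MODEL_NEED_RESCHED_DELAYED = 1 << 1   /* preempt later, at a check point */
};

/* Larger prio = more urgent (assumption). */
struct model_task { int prio; unsigned flags; };

/* Ordinary wake-up: ask for an immediate reschedule if the sleeper wins. */
static void model_wake_up(struct model_task *cur, const struct model_task *woken)
{
    if (woken->prio > cur->prio)
        cur->flags |= MODEL_NEED_RESCHED;
}

/* "Sync" wake-up: the waker is about to drop a lock, so postpone. */
static void model_wake_up_sync(struct model_task *cur, const struct model_task *woken)
{
    if (woken->prio > cur->prio)
        cur->flags |= MODEL_NEED_RESCHED_DELAYED;
}

/* Check point reached after the unlock: make the delayed request a real one. */
static int model_check_resched_delayed(struct model_task *cur)
{
    if (cur->flags & MODEL_NEED_RESCHED_DELAYED) {
        cur->flags &= ~MODEL_NEED_RESCHED_DELAYED;
        cur->flags |= MODEL_NEED_RESCHED;
        return 1;
    }
    return 0;
}
```

With the sync variant, the current task keeps running long enough to release its lock, so the
woken task does not preempt it only to block again straight away.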

In all of the cases above, the solution is to defer an action until it can be performed more safely or conveniently.
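
The queue-now, run-later pattern behind the delayed variants can be sketched in user space. This
is a minimal model under assumptions: a fixed-size array stands in for the queue, a global flag
stands in for "preemption disabled", and all names (`model_task_struct`, `model_put_task_delayed`,
`model_drain_delayed`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_QUEUE_MAX 8

struct model_task_struct { int usage; int freed; };

static struct model_task_struct *model_delayed[MODEL_QUEUE_MAX];
static size_t model_delayed_count;
static int model_preempt_disabled;   /* 1 while the caller must not sleep */

/* The real work: may need a sleeping lock, so only legal when preemptible. */
static void model_put_task(struct model_task_struct *t)
{
    assert(!model_preempt_disabled);
    if (--t->usage == 0)
        t->freed = 1;
}

/* Safe anywhere: only records the request. Returns 0 if the queue is full. */
static int model_put_task_delayed(struct model_task_struct *t)
{
    if (model_delayed_count == MODEL_QUEUE_MAX)
        return 0;
    model_delayed[model_delayed_count++] = t;
    return 1;
}

/* Called once preemption is enabled again: run everything that was queued. */
static void model_drain_delayed(void)
{
    assert(!model_preempt_disabled);
    for (size_t i = 0; i < model_delayed_count; i++)
        model_put_task(model_delayed[i]);
    model_delayed_count = 0;
}
```

The delayed call does no work that could sleep; the reference is actually dropped only when
`model_drain_delayed()` runs in a context where sleeping is allowed.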

/// Latency-reduction measures

A few changes in PREEMPT_RT exist mainly to reduce scheduling or interrupt latency.
The x86 MMX/SSE hardware is one example. The kernel uses this hardware with preemption disabled,
which means that preemption is re-enabled only after the MMX/SSE instructions have finished. Some
MMX/SSE instructions cause no problem, but others take a long time to execute. The PREEMPT_RT
solution is not to use the slow MMX/SSE instructions.

Another change applies per-CPU variables to the slab allocator, as an alternative to disabling
interrupts.

//// Overview of basic PREEMPT_RT primitives

This section briefly lists the kernel primitives that PREEMPT_RT adds or whose behavior it changes.
/// Locks
* spinlock_t
Critical sections are preemptible. The _irq operations (for example, spin_lock_irqsave()) do not
disable hardware interrupts. Priority inheritance is used to prevent priority inversion. In
PREEMPT_RT, spinlock_t is implemented with rt_mutex (as are rwlock_t, struct semaphore, and struct
rw_semaphore).
* raw_spinlock_t
A special variant of spinlock_t that provides traditional spinlock behavior: critical sections are
not preemptible, and the _irq operations really do disable hardware interrupts. Note that the
normal primitives (for example, spin_lock()) are used on a raw_spinlock_t.
raw_spinlock_t should not be used except in architecture-specific code or low-level scheduling and
synchronization primitives.
* rwlock_t
Critical sections are preemptible. The _irq operations (for example, write_lock_irqsave()) do not
disable hardware interrupts. Priority inheritance is used to prevent priority inversion. To keep
the implementation of priority inheritance simple, only one task at a time may read-hold a given
rwlock_t, although that task may read-acquire the lock recursively.
* RW_LOCK_UNLOCKED(mylock)
This macro now takes the lock itself (mylock) as its argument, which priority inheritance requires.
Unfortunately, this makes it incompatible between PREEMPT_RT and non-PREEMPT_RT kernels, so
DEFINE_RWLOCK() should be used instead of this macro.
* raw_rwlock_t
A special variant of rwlock_t that provides traditional behavior: critical sections are not
preemptible, and the _irq operations really do disable hardware interrupts. As with raw_spinlock_t,
raw_rwlock_t should not be used except in architecture-specific code or low-level scheduling and
synchronization primitives.
* seqlock_t
Critical sections are preemptible. Priority inheritance is applied to the update side (the read
side cannot be involved in priority inversion, because seqlock_t readers do not block writers).
* SEQLOCK_UNLOCKED(name)
DECLARE_SEQLOCK() should be used instead.
* struct semaphore
Now subject to priority inheritance.
* down_trylock()
This primitive can schedule, so it cannot be called with hardware interrupts or preemption
disabled. However, because almost all interrupt handlers run in process context with preemption and
interrupts enabled, this restriction has little effect.
* struct compat_semaphore
A variant of struct semaphore that is not subject to priority inheritance. It is useful when an
event mechanism is needed rather than a sleeping lock.
* struct rw_semaphore
Subject to priority inheritance. Only one task at a time may read-hold a given rw_semaphore,
although that task may read-acquire it recursively.
* struct compat_rw_semaphore
A variant of struct rw_semaphore that is not subject to priority inheritance. It is useful when an
event mechanism is needed rather than a sleeping lock.
Quick quiz #3: Why can't event mechanisms use priority inheritance?
/// Per-CPU variables
/// Interrupt handlers
/// Others
