Priority inversion in embedded real-time systems
Authors: Liu Hui, Meng Fanrong, Xi Jingke  Source: MCU and Embedded System Application
Brief description: In embedded real-time systems, sharing resources among multiple tasks can lead to some strange phenomena. This article analyzes what priority inversion is and what causes it, and presents two effective solutions.
1. Problem statement
Currently, commercial RTOSs with a large market share include VxWorks, pSOS, QNX, LynxOS, VRTX, and Windows CE. The vast majority of these RTOSs are multitasking real-time microkernels that use a priority-based preemptive scheduling policy. The system assigns a priority to each task, and the scheduler ensures that the running task is always the highest-priority ready task. Sometimes, however, a strange phenomenon occurs: because tasks share resources, the highest-priority task is blocked by a low-priority task, and a medium-priority task runs ahead of the high-priority one, which can crash the system. This is known as priority inversion.
2. Priority inversion
An RTOS has two defining features: real-time behavior and multitasking. Real-time means that the system must respond within a specified time; exceeding that limit is a fatal error for the system. Real-time behavior also requires that urgent tasks run before tasks that are not urgent. For these two reasons, an RTOS generally adopts a priority-based preemptive (PBP) scheduling policy. Multitasking is an inherent requirement of embedded systems: today's embedded systems generally need to run several tasks concurrently, so the RTOS must support concurrent task execution.
With concurrent tasks, sharing resources is inevitable. Suppose two tasks, Task 1 and Task 2, run concurrently and both must send their results to a printer. Because there is only one printer, only one task can use it at a time. For example, while Task 1 holds the printer and prints, Task 2 waits. When Task 1 finishes printing, Task 2 moves from waiting to ready, and the next time the RTOS schedules it, it can print. If access to the printer were not controlled this way and Task 1 and Task 2 printed at the same time, the output would be an unreadable jumble. Most RTOSs therefore provide a mechanism called a semaphore to manage shared resources. Any task that wants to use a critical resource (such as the printer) must acquire the semaphore for that resource before entering its critical section (the code in Task 1 or Task 2 that accesses the resource); otherwise, it cannot execute the critical section.
Assume that the system has three tasks: Task 1, Task 2, and Task 3. Task 1 has a higher priority than Task 2, and Task 2 has a higher priority than Task 3.
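The semaphore rule described above can be modeled in a few lines of C. This is a minimal single-threaded sketch, not a real RTOS API: `BinarySem`, `sem_try_take`, and `sem_give` are hypothetical names, task ids are plain integers, and a real kernel would block the caller and make the check-and-take step atomic.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical binary semaphore guarding one shared resource (the printer). */
typedef struct {
    int owner;  /* id of the task holding the semaphore, or NO_OWNER */
} BinarySem;

/* Try to take the semaphore.
 * Returns true if the caller now holds it, false if the caller must wait. */
static bool sem_try_take(BinarySem *s, int task_id)
{
    if (s->owner != NO_OWNER)
        return false;           /* resource busy: caller goes to the waiting state */
    s->owner = task_id;
    return true;
}

/* Give the semaphore back; only the task that holds it may do so. */
static void sem_give(BinarySem *s, int task_id)
{
    assert(s->owner == task_id);
    s->owner = NO_OWNER;
}
```

With a semaphore initialized to `{ NO_OWNER }`, a take by Task 1 succeeds, a take by Task 2 then fails until Task 1 calls `sem_give`, which mirrors the printer example.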
Suppose Task 1 and Task 2 are both blocked for some reason, so the system schedules Task 3. While running, Task 3 acquires the semaphore for a critical resource and enters its critical section. After Task 3 has run for a while, Task 1 is awakened. Under the PBP scheduling policy, Task 1 preempts Task 3 and starts running. After running for a while, Task 1 tries to enter the critical section, but Task 3 still holds the semaphore for that resource, so Task 1 blocks and waits for Task 3 to release it. Some time later, Task 2 becomes ready and the system schedules it. Task 3 cannot run while Task 2 is executing, so neither Task 1 nor Task 3 makes progress until Task 2 finishes, and Task 1 can resume only after Task 3 releases the semaphore. During this time Task 1 may miss its deadline and fail. The system treats the failure of a high-priority task as a major fault; to recover, the watchdog circuit may trigger and reset the system automatically.
From this analysis, the cause of the crash is that Task 1 (the high-priority task) is blocked because it needs the critical resource held by Task 3 (the low-priority task), while Task 2 (the medium-priority task) can preempt Task 3. As a result, Task 2 runs before Task 1. This is priority inversion, as shown in Figure 1.
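The sequence of events above can be reproduced with a small tick-by-tick simulation. The sketch below is illustrative only and makes several assumptions that are not in the original article: a larger `prio` value means a higher priority, time advances in whole ticks, and the release times, work amounts, and critical-section positions in `scenario` are made-up numbers chosen to match the narrative. The names `Task`, `simulate`, and `scenario` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_TASKS 3
#define NO_OWNER  (-1)
#define IDLE      0     /* trace marker: no task ran in this tick */

typedef struct {
    int id;        /* 1 = high, 2 = medium, 3 = low, as in the text */
    int prio;      /* larger value = higher priority (assumption) */
    int release;   /* tick at which the task becomes ready */
    int work;      /* total ticks of CPU time the task needs */
    int cs_start;  /* progress at which it takes the semaphore, -1 if never */
    int cs_end;    /* progress at which it gives the semaphore back */
    int progress;  /* ticks executed so far */
    int finish;    /* tick at which the task completed, -1 while unfinished */
} Task;

/* Task 3 starts first and takes the semaphore, Task 1 wakes at tick 2,
 * and Task 2 becomes ready at tick 3 (illustrative numbers). */
static const Task scenario[NUM_TASKS] = {
    /* id prio release work cs_start cs_end progress finish */
    {  1,  3,   2,      3,    1,       2,     0,       -1 },
    {  2,  2,   3,      3,   -1,      -1,     0,       -1 },
    {  3,  1,   0,      4,    1,       3,     0,       -1 },
};

/* Run a fixed-priority preemptive schedule, with no inversion protection,
 * for `ticks` ticks and record in trace[t] the id of the task that ran. */
static void simulate(Task *tasks, int *trace, int ticks)
{
    int owner = NO_OWNER;  /* index of the task holding the semaphore */

    for (int t = 0; t < ticks; t++) {
        int run = -1;
        for (int i = 0; i < NUM_TASKS; i++) {
            const Task *k = &tasks[i];
            bool ready = k->release <= t && k->progress < k->work;
            /* A task about to enter its critical section must wait
             * while another task holds the semaphore. */
            bool blocked = k->progress == k->cs_start &&
                           owner != NO_OWNER && owner != i;
            if (ready && !blocked && (run < 0 || k->prio > tasks[run].prio))
                run = i;
        }
        if (run < 0) {
            trace[t] = IDLE;
            continue;
        }

        Task *k = &tasks[run];
        if (k->progress == k->cs_start)
            owner = run;                /* take the semaphore */
        k->progress++;
        if (owner == run && k->progress == k->cs_end)
            owner = NO_OWNER;           /* give the semaphore */
        if (k->progress == k->work)
            k->finish = t + 1;
        trace[t] = k->id;
    }
}
```

Copying `scenario` into a mutable array and calling `simulate(tasks, trace, 10)` gives the trace 3 3 1 2 2 2 3 1 1 3. Task 1 blocks at tick 3, Task 2 then runs to completion at tick 6, and Task 1 finishes only at tick 9, which is the inversion described above.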
3. Solutions to priority inversion
Several methods are currently available for resolving priority inversion. Two are commonly used: priority inheritance and priority ceilings.
In the priority inheritance scheme, when a high-priority task waits for a semaphore held by a low-priority task, the low-priority task temporarily inherits the priority of the high-priority task. When the low-priority task releases the semaphore that the high-priority task is waiting for, its priority immediately drops back to its original value. This effectively solves the priority inversion described above. When the high-priority Task 1 needs to enter the critical section, it blocks because Task 3 holds the semaphore for the critical resource. The system then raises the priority of Task 3 to that of Task 1. Task 2, whose priority lies between those of Task 1 and Task 3, cannot be scheduled even if it is ready, because Task 3 now has a higher priority than Task 2 and runs instead. When Task 3 releases the semaphore that Task 1 needs, the system immediately lowers Task 3 to its original priority, so that Task 1 and Task 2 then run in their normal order, as shown in Figure 2. Many RTOSs currently use this method to prevent priority inversion, for example VxWorks from Wind River, a well-known company in the industry.
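The priority bookkeeping behind inheritance can be sketched as follows. This is a simplified model under stated assumptions, not the implementation used by VxWorks or any other RTOS: `Task`, `InheritMutex`, `pi_lock`, and `pi_unlock` are hypothetical names, a larger `prio` value means a higher priority, and the model tracks only a single waiter and does not actually block or wake tasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int base_prio;  /* priority assigned to the task at creation */
    int prio;       /* current effective priority; larger = higher (assumption) */
} Task;

typedef struct {
    Task *owner;    /* task holding the semaphore, or NULL when free */
} InheritMutex;

/* Try to lock. Returns true if the caller got the lock, false if it must
 * block. On contention the owner inherits the caller's priority if that
 * priority is higher than its own. */
static bool pi_lock(InheritMutex *m, Task *caller)
{
    if (m->owner == NULL) {
        m->owner = caller;
        return true;
    }
    if (caller->prio > m->owner->prio)
        m->owner->prio = caller->prio;   /* priority inheritance */
    return false;
}

/* Unlock: the owner drops back to its original priority immediately. */
static void pi_unlock(InheritMutex *m, Task *caller)
{
    assert(m->owner == caller);
    caller->prio = caller->base_prio;
    m->owner = NULL;
}
```

With Task 3 holding the lock, a failed `pi_lock` by Task 1 raises Task 3 to Task 1's priority, so a fixed-priority scheduler would pick Task 3 over Task 2 until `pi_unlock` restores the original value.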
In the priority ceiling scheme, the system associates a ceiling priority with each critical resource. This ceiling is equal to the highest task priority in the system plus 1. When a task enters the critical section, the system assigns the ceiling priority to the task, so the task has the highest priority in the system. When the task leaves the critical section, the system immediately restores its normal priority, so priority inversion cannot occur. In the example above, when Task 3 enters the critical section, its priority is immediately raised to the ceiling priority, which ensures that Task 3 leaves the critical section as soon as possible and releases the semaphore it holds. When the high-priority Task 1 later runs, it does not end up blocked waiting for the low-priority Task 3 to release the semaphore, so the priority inversion described above does not occur. Another advantage of this scheme is that multiple tasks can share a critical resource simply by changing their priority to the ceiling priority of that resource, as shown below.
void TaskA(void)
{
    ...
    SetTaskPriority(res_x_prio);   /* raise to the ceiling priority of resource X */
    /* access shared resource X */
    SetTaskPriority(task_a_prio);  /* restore Task A's normal priority */
    ...
}
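The ceiling rule from this section, including the "highest priority plus 1" definition, can be written out as a small self-contained model. This is a sketch under assumptions rather than a real RTOS interface: `Task`, `Resource`, `ceiling_for`, `enter_critical`, and `exit_critical` are hypothetical names, and a larger `prio` value is assumed to mean a higher priority.

```c
#include <assert.h>

typedef struct {
    int base_prio;  /* normal priority; larger value = higher (assumption) */
    int prio;       /* current effective priority */
} Task;

typedef struct {
    int ceiling;    /* ceiling priority associated with the resource */
} Resource;

/* Ceiling as described in the text: highest task priority in the system plus 1. */
static int ceiling_for(const Task *tasks, int n)
{
    assert(n > 0);
    int highest = tasks[0].base_prio;
    for (int i = 1; i < n; i++) {
        if (tasks[i].base_prio > highest)
            highest = tasks[i].base_prio;
    }
    return highest + 1;
}

/* On entering the critical section the task runs at the ceiling priority. */
static void enter_critical(Task *t, const Resource *r)
{
    t->prio = r->ceiling;
}

/* On leaving the critical section the task returns to its normal priority. */
static void exit_critical(Task *t)
{
    t->prio = t->base_prio;
}
```

For three tasks with priorities 3, 2, and 1, `ceiling_for` returns 4, so Task 3 runs at priority 4 inside the critical section and neither Task 1 nor Task 2 can preempt it there.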
The above describes in detail the causes of and solutions to priority inversion in RTOSs. The 21st century will be the era of embedded systems. A deep understanding of RTOS principles and potential internal problems, such as priority inversion, helps embedded system designers develop more reliable products.