Improvement on Linux real-time performance

Source: Internet
Author: User

A real-time operating system is an operating system that ensures that specific functions are completed within a certain period of time. Its features are as follows:

1) High-precision timing. Timing accuracy is an important factor affecting real-time performance. Real-time applications often need to operate a device, execute a task, or evaluate a time function at a precise moment. This depends not only on the clock precision provided by the hardware, but also on the high-precision timing functions implemented by the real-time operating system.

2) Multi-level interrupt mechanism. A real-time application usually has to handle a variety of external information or events, and these differ in urgency. Some must be answered immediately, and some can be postponed. Therefore, a multi-level nested interrupt mechanism is needed to ensure that the most urgent real-time events are answered and handled in time.

3) Real-time scheduling mechanism. A real-time operating system must not only respond to real-time interrupts in a timely manner, but also schedule and run real-time tasks in a timely manner. However, the processor cannot be rescheduled at arbitrary moments, because a switch between two processes can only happen at a point where switching is safe. Real-time scheduling therefore has two aspects: first, the scheduling policy and algorithm must give priority to real-time tasks; second, the system must provide more "safe switching" points so that real-time tasks can be scheduled promptly.

++ ++ ++ ++

An embedded system is a device that runs with limited space and resources and efficiently implements a specific set of functions. Its development is usually constrained by many objective conditions, such as weak CPU processing capability, small memory, few available peripherals, and a limited power supply. Every embedded system must therefore be developed with great care so that its limited resources are used to the greatest effect. Among the operating systems that run on embedded systems, embedded Linux is gaining more and more attention because it is free, highly reliable, supports a wide range of hardware, and is open source, among other features. Open source code allows developers to modify the Linux kernel for a specific embedded system to meet development requirements and optimize the system.

A major problem in embedded Linux applications is the real-time performance of Linux. A real-time system must respond correctly to external events within a bounded period of time, with the focus on meeting sudden, transient processing demands. Linux, as a traditional time-sharing operating system, focuses more on overall system throughput. How to improve the real-time performance of Linux is a challenge for many embedded system developers.

1. Related research
At present, there are various Linux distributions on the market, but strictly speaking, Linux refers to the Linux kernel maintained by Linus Torvalds (and released through the main site and its mirrors). An embedded Linux system simply means an embedded system based on the Linux kernel. In the rest of this article, "Linux" refers to the Linux kernel. A lot of work is currently being done to improve the real-time performance of Linux. The Linux 2.6 kernel has implemented preemptible kernel task scheduling, but the problem of non-deterministic interrupt latency has not been solved. That is to say, in Linux 2.6, a high-priority process in kernel space can preempt a low-priority process just as in user space, but the time from the arrival of an interrupt to the execution of the first instruction of the interrupt service routine is still uncertain.
In addition to the improvements made by the Linux developers, some organizations and companies have done a lot of work to improve the real-time performance of Linux. Representative projects are FSMLabs' RT-Linux, MontaVista Linux from MontaVista, and RTAI (Real-Time Application Interface) maintained by Paolo Mantegazza. These projects adopt two methods:
(1) Directly modify the Linux kernel. MontaVista Linux uses this method. It turns Linux into a preemptible kernel, called the relatively fully preemptable kernel, implements real-time scheduling policies and algorithms, and adds a fine-grained timer. In this way, Linux becomes a soft real-time kernel.
(2) "Dual-kernel" mode. The RTAI project and RT-Linux use this method. The conventional Linux kernel runs as the lowest-priority task of a new, small real-time kernel, while real-time tasks run at the highest priority. That is, if a real-time task is ready, the real-time task runs. Otherwise, the Linux task runs.
The limitation of MontaVista Linux and RT-Linux is that they are commercial software and do not follow the GNU open source principle. To use them in a system, you need to pay a considerable license fee, which defeats the original purpose of using Linux: open source, free of charge, and able to serve as the basis for your own intellectual property.
RTAI gives up many inherent Linux advantages for the sake of real-time performance: broad hardware support and excellent stability and reliability. Developers need to rework their drivers for the hardware abstraction layer (RTHAL) customized for RTAI, and the results of the huge Linux development community cannot easily be applied to the real-time core.

2. Factors affecting Linux real-time performance
2.1 Task switching and latency
Task switching latency is the time Linux needs to switch from one process to another, that is, the interval between a high-priority process requesting the CPU and the execution of the first instruction of that process. In a real-time system, the shorter the task switching latency, the better. As mentioned earlier, Linux 2.6.x has a preemptible kernel: a high-priority process in kernel space can stop a low-priority process and run at any time, just as in user space. However, there are two exceptions:
(1) a process executing in a critical section cannot be preempted by other processes;
(2) an interrupt service routine cannot be preempted by other processes.

2.2 Priority-based scheduling algorithm
Linux 2.6 uses the O(1) scheduling algorithm. It is a priority-based preemptive scheduler that assigns a priority to each process. Among all tasks waiting to run, the scheduler ensures that the task with the highest priority runs first. A high-priority task can therefore preempt a low-priority task.
The scheduler's overhead is constant and independent of the current system load, which helps the real-time performance of the system. However, the scheduler can take away only the CPU, not the other resources a task holds, so real-time performance is not fundamentally improved. If two tasks need the same resource (such as a cache), and the low-priority task is using the resource when the high-priority task becomes ready, the high-priority task must wait until the low-priority task finishes and releases the resource. This is called priority inversion.

2.3 Interrupt latency and interrupt service routines
Interrupt latency is the interval from the moment a peripheral raises an interrupt signal to the execution of the first instruction of the interrupt service routine (ISR). Real-time task requests triggered by external interrupts make up the main part of a real-time system's workload. How quickly interrupts are answered and how quickly the ISR completes are important indicators of real-time performance. Different ISRs have different execution times, and even the same ISR may take different amounts of time because it has multiple exit paths. While an ISR executes, external interrupts are disabled. So even if the Linux interrupt latency is small in itself, when a peripheral raises an interrupt while another ISR is running, the new interrupt must wait for an ISR whose running time is uncertain and which cannot be interrupted. The interrupt latency of Linux is therefore unpredictable.

3. Real-Time System Performance Improvement
3.1 Task switching and switching points
As mentioned in section 2.1, a process cannot be preempted while it executes in a critical section. To avoid affecting system stability and to reduce debugging and testing time, we do not modify the processes themselves. Instead, a mechanism is introduced to ensure that real-time tasks are executed preferentially: a process is allowed to enter its critical section only if the critical section can finish before the next real-time task starts.
How can the arrival time of the next real-time task's interrupt signal be determined? Generally, interrupt signals are used for tasks whose start time cannot be predicted, so they arrive at completely random times. To make the arrival time predictable, interrupt signal generation is tied to the clock interrupt: an interrupt signal can be generated only at the same time as a clock interrupt. The clock interrupt is generated by the system timer hardware at periodic intervals. The interval is set by the kernel according to the value of HZ, an architecture-dependent constant defined in the <linux/param.h> file. In Linux, HZ is defined as 100 on most platforms, that is, the clock interrupt period is 10 ms. Obviously, this does not meet the timing accuracy requirements of a real-time system. Increasing HZ improves timing precision, but at the cost of higher system overhead, so the balance between real-time requirements and overhead must be weighed carefully. One approach is to measure, through extensive testing, the interval between real-time task interrupt requests and the execution time of processes in the critical section, and then choose a value slightly greater than both the interrupt interval of most real-time tasks and the critical-section execution time.
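The relation between the tick rate and the clock interrupt period can be written down directly. The sketch below is only arithmetic on an assumed tick rate passed in as a parameter; the helper names tick_period_us and ticks_per_run are invented for this example and are not kernel interfaces.

```c
/* Clock-interrupt period in microseconds for a given tick rate.
 * Returns 0 for a non-positive rate. */
long tick_period_us(long hz)
{
    if (hz <= 0)
        return 0;
    return 1000000L / hz;
}

/* Number of clock interrupts taken during a run of the given length.
 * A rough proxy for the extra overhead a higher tick rate brings. */
long ticks_per_run(long hz, long seconds)
{
    if (hz <= 0 || seconds <= 0)
        return 0;
    return hz * seconds;
}
```

For instance, tick_period_us(100) gives 10000, the 10 ms period mentioned above, while raising the rate to 1000 shortens the period to 1000 microseconds and makes the kernel service ten times as many clock interrupts over the same run.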
Linux provides several mechanisms for measuring the execution time of a function. The gettimeofday() function is one of them. Its prototype and the data structure it uses are as follows:
    int gettimeofday(struct timeval *tv, struct timezone *tz);
    struct timeval {
        time_t      tv_sec;   /* seconds */
        suseconds_t tv_usec;  /* microseconds */
    };
gettimeofday() stores the current time in the structure pointed to by tv. The tz argument is generally not needed and can be passed as NULL. Example:
    #include <stdio.h>
    #include <sys/time.h>
    extern void function_in_critical_section(void);  /* code under measurement */
    int main(void)
    {
        struct timeval start_time, end_time;
        double time_used;
        gettimeofday(&start_time, NULL);
        function_in_critical_section();
        gettimeofday(&end_time, NULL);
        /* elapsed microseconds (end minus start), then converted to seconds */
        time_used = 1000000.0 * (end_time.tv_sec - start_time.tv_sec)
                  + (end_time.tv_usec - start_time.tv_usec);
        time_used /= 1000000.0;
        printf("time used: %f s\n", time_used);
        return 0;
    }
In this way, the time a process spends in the critical section function_in_critical_section() is obtained for reference. When HZ is set to 2000, the system clock interrupt period is 0.5 ms, and the timing precision is 20 times finer than with HZ set to 100.
As shown in figure 2, before a process enters the critical section, its average critical-section execution time T(NP) is compared with T(remain), the time remaining before the next real-time task can start. The process is allowed to enter the critical section only when T(NP) is less than or equal to T(remain). Otherwise, the process enters the work queue and waits for the next check.
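The admission check can be sketched in a few lines. This is a simplified user-space model, not kernel code: it assumes that clock interrupts fall on exact multiples of the period, takes T(remain) to be the time left until the next clock interrupt, and uses the hypothetical names time_remaining_us and may_enter_critical_section.

```c
#include <stdbool.h>

/* Time left until the next clock interrupt, in microseconds.
 * now_us:    current time, assumed non-negative
 * period_us: clock-interrupt period, assumed positive */
long time_remaining_us(long now_us, long period_us)
{
    return period_us - (now_us % period_us);
}

/* Admission test: enter the critical section only if its average
 * execution time T(NP) fits into the remaining time T(remain). */
bool may_enter_critical_section(long t_np_us, long t_remain_us)
{
    return t_np_us <= t_remain_us;
}
```

With a 500 microsecond period and a current time of 1200 microseconds, 300 microseconds remain, so a critical section averaging 250 microseconds would be admitted and one averaging 350 microseconds would be queued.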
 
This article uses a simple mathematical model to analyze how much this mechanism improves real-time performance. First, a definition: if a real-time task scheduled to run at time t is postponed until time t', then t' - t is called the system latency, denoted LAT(OS). In the formulas below, T(NP) is the execution time of the non-preemptible critical section and T(sched) is the scheduling time. In ordinary Linux, LAT(OS) is:
LAT(OS) = T(NP) + T(sched)
If the probability that T(NP) ≤ T(remain) is P at any time, the average LAT(OS) in ordinary Linux is
AvLAT(OS) = P [T(NP) + T(sched)] + (1 - P) [T(NP) + 2 T(sched)]
After the above mechanism is introduced, the latency LAT(RT-OS) is fixed:
LAT(RT-OS) = T(sched)
The change in system latency before and after the mechanism is introduced is:
Delta = AvLAT(OS) - LAT(RT-OS) = T(NP) + (1 - P) T(sched)
In a specific system, P is fixed, and in Linux 2.6 T(sched) is fixed once the O(1) algorithm is adopted. The formula leads to the following conclusion: the longer processes spend in the critical section, the more the average system latency drops when the mechanism is introduced, and the more obvious the improvement in real-time performance.
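To get a feel for the numbers, the model can be evaluated directly. The sketch below computes the average latency of ordinary Linux and the latency with the mechanism from the two expressions given above, and takes their difference. The function names and the sample values are assumptions chosen for illustration, and all times share one arbitrary unit.

```c
/* Average latency of ordinary Linux under the model:
 * P * [T(NP) + Ts] + (1 - P) * [T(NP) + 2 * Ts],
 * where Ts is the scheduling term of the formulas. */
double avg_latency_stock(double p, double t_np, double t_sched)
{
    return p * (t_np + t_sched) + (1.0 - p) * (t_np + 2.0 * t_sched);
}

/* Latency once the admission mechanism is in place: the scheduling term only. */
double latency_gated(double t_sched)
{
    return t_sched;
}

/* Reduction in latency, taken as the plain difference of the two values. */
double latency_reduction(double p, double t_np, double t_sched)
{
    return avg_latency_stock(p, t_np, t_sched) - latency_gated(t_sched);
}
```

With P = 0.5, a critical section of 100 units and a scheduling term of 10 units, the average latency is 115 units before and 10 units after, a reduction of 105 units. Doubling the critical-section time to 200 raises the reduction to 205, which matches the conclusion that systems with long critical sections benefit most.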

3.2 Priority ceiling
Consider the following scenario: a low-priority task L and a high-priority task H need the same shared resource. Task H becomes ready shortly after task L starts, finds that the shared resource is in use, and is suspended until task L finishes and releases the resource. At this point a medium-priority task M that does not need the resource appears. Following the priority principle, the scheduler runs task M, which delays task H further, as shown in figure 3. Even worse, if more tasks like it appear (M0, M1, M2, ...), task H may miss its critical deadline and the system may fail.
In a real-time system that is not too complex, the priority ceiling ("top priority") method can solve this problem. The scheme assigns a priority to every resource that may be shared. This priority (resource_x_prio in the pseudocode below) equals the priority of the highest-priority process that may use the resource. The scheduler passes this priority to the process that is using the resource. When the process is done with the resource, its priority (task_a_prio in the pseudocode below) returns to normal. In this way, task L cannot be preempted by task M in the preceding scenario, and task H is not left suspended indefinitely. Sample code for setting the priority ceiling is as follows:
    void task_a(void)
    {
        ......
        set_task_priority(resource_x_prio);  /* raise to the priority of resource x */
        ......                               /* access shared resource x */
        set_task_priority(task_a_prio);      /* restore the task's own priority */
        ......
    }
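The effect of the scheme on the L, M, H scenario can be checked with a toy model. The sketch below is not a scheduler: it only tracks each task's effective priority and picks the ready task with the highest one. All names (struct task, resource_ceiling, acquire_with_ceiling, release_resource, pick_next) are hypothetical, and priorities are assumed to be positive integers where a larger value means higher priority.

```c
#include <stddef.h>

struct task {
    const char *name;
    int base_prio;  /* priority assigned to the task */
    int cur_prio;   /* effective priority, raised while a resource is held */
    int ready;      /* 1 if runnable, 0 if blocked waiting for a resource */
};

/* Priority of a resource: the highest base priority among its users. */
int resource_ceiling(const struct task *users[], size_t n)
{
    int ceiling = 0;

    for (size_t i = 0; i < n; i++)
        if (users[i]->base_prio > ceiling)
            ceiling = users[i]->base_prio;
    return ceiling;
}

/* The holder runs at the resource's priority while it uses the resource. */
void acquire_with_ceiling(struct task *t, int ceiling)
{
    t->cur_prio = ceiling;
}

/* On release, the holder drops back to its own priority. */
void release_resource(struct task *t)
{
    t->cur_prio = t->base_prio;
}

/* Scheduler stand-in: the ready task with the highest effective priority. */
const struct task *pick_next(const struct task *tasks[], size_t n)
{
    const struct task *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (tasks[i]->ready && (best == NULL || tasks[i]->cur_prio > best->cur_prio))
            best = tasks[i];
    return best;
}
```

With L at priority 1, M at 2 and H at 3, and H blocked on the resource, pick_next chooses M as long as L keeps its own priority. Once L acquires the resource at the ceiling of 3, pick_next chooses L, so M can no longer delay the release that H is waiting for.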
3.3 Kernel threads
An interrupt service routine (ISR) cannot be preempted. Once the CPU starts to execute an ISR, no other task can run until the routine ends. Linux ISRs use spin locks to hold the CPU exclusively. An ISR that holds a spin lock cannot sleep, and interrupts are completely disabled while it runs. A kernel thread is created and destroyed by the kernel to execute a specified function. It has its own kernel stack and can be scheduled independently. We use kernel threads to replace ISRs, and mutexes to replace spin locks. A kernel thread can sleep, and external interrupts do not have to be disabled while it executes. When the system receives an interrupt signal, it wakes up the corresponding kernel thread. The kernel thread does the work of the original ISR and then goes back to sleep. In this way, interrupt latency becomes predictable and short.
According to test data from LynuxWorks, on a PC with a 1 GHz Pentium III, the average task response time of the Linux 2.4 kernel is 1133 µs and the average interrupt response time is 252 µs. For the Linux 2.6 kernel, the average task response time is 132 µs and the average interrupt response time is only 14 µs, roughly an order of magnitude better than Linux 2.4. On this basis, the method described here can further shorten the response time of specific interrupts on a specific system and improve the real-time performance of the application.
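The kernel-thread approach of this section can be imitated in user space to show its shape. The sketch below is an analogy built on POSIX threads, not kernel code: a worker thread sleeps on a condition variable protected by a mutex, a short raise_irq function stands in for the arrival of an interrupt and only wakes the worker, and the worker does the handling. Every name here (struct irq_thread, irq_thread_fn, raise_irq, run_irq_demo) is invented for the example.

```c
#include <pthread.h>
#include <stdbool.h>

struct irq_thread {
    pthread_mutex_t lock;  /* a mutex, so the holder is allowed to sleep */
    pthread_cond_t wake;   /* signalled when a simulated interrupt arrives */
    int pending;           /* interrupts raised but not yet handled */
    int handled;           /* interrupts processed by the worker thread */
    bool stop;             /* set to ask the worker to exit */
};

/* Worker: sleeps until woken, handles pending events, sleeps again. */
static void *irq_thread_fn(void *arg)
{
    struct irq_thread *t = arg;

    pthread_mutex_lock(&t->lock);
    for (;;) {
        while (t->pending == 0 && !t->stop)
            pthread_cond_wait(&t->wake, &t->lock);
        if (t->pending == 0)
            break;              /* stop requested and nothing left to do */
        t->pending--;
        t->handled++;           /* stand-in for the real device work */
    }
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Simulated interrupt arrival: record the event and wake the worker. */
static void raise_irq(struct irq_thread *t)
{
    pthread_mutex_lock(&t->lock);
    t->pending++;
    pthread_cond_signal(&t->wake);
    pthread_mutex_unlock(&t->lock);
}

/* Raises n simulated interrupts; returns how many the worker handled,
 * or -1 if the worker thread could not be created. */
int run_irq_demo(int n)
{
    struct irq_thread t = { .pending = 0, .handled = 0, .stop = false };
    pthread_t tid;
    int result = -1;

    pthread_mutex_init(&t.lock, NULL);
    pthread_cond_init(&t.wake, NULL);
    if (pthread_create(&tid, NULL, irq_thread_fn, &t) == 0) {
        for (int i = 0; i < n; i++)
            raise_irq(&t);
        pthread_mutex_lock(&t.lock);
        t.stop = true;
        pthread_cond_signal(&t.wake);
        pthread_mutex_unlock(&t.lock);
        pthread_join(tid, NULL);
        result = t.handled;
    }
    pthread_cond_destroy(&t.wake);
    pthread_mutex_destroy(&t.lock);
    return result;
}
```

The part that runs at "interrupt time" is only the few instructions in raise_irq, so its duration is short and nearly constant, while the variable-length work happens in a thread that the scheduler can preempt. On older C libraries the program may need to be linked with the pthread library.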

4. Summary and prospects
This article discusses how to improve the real-time performance of Linux on the basis of Linux 2.6. A mechanism is introduced under which a process may enter its critical section only if it can finish before the next real-time task starts, which ensures that real-time tasks are always executed preferentially. The priority ceiling ("top priority") method is adopted to avoid priority inversion. Replacing interrupt service routines with kernel threads removes the restriction that an interrupt service routine cannot sleep while it executes. In addition, external interrupts are not disabled during execution, so interrupt latency becomes short and predictable. The disadvantage of this approach is that raising the clock interrupt frequency increases system overhead. To find a balance between better real-time performance and higher overhead, developers have to run many tests on the specific system, and this need for case-by-case analysis limits how widely the method can be applied. Linux is bound to be widely used in embedded systems because it is free, powerful, and supported by numerous tools. We should follow the development of Linux at home and abroad closely and accumulate development experience in this field in order to find our own path.

From http://www.dz3w.com/mcu/linux/0070702.html

++ ++

In addition, Linux real-time performance is limited by the lack of high-precision clock timing. This can also be improved by modifying the kernel.
