Designing and Optimizing Microsoft Windows CE .NET for Real-Time Performance (Part 1)

Document directory
  • Windows CE .NET
  • For Windows CE 3.0
  • More priority levels
  • More control over time and scheduling
  • Timer interrupt
  • OEMIdle function
  • Thread time slice
  • Changes to how priority inversion is handled
  • Interrupt handling and nested interrupts
  • Nested interrupts
  • Interrupt latency
  • ISR latency
  • IST latency

Summary: This article describes in detail the changes made to the Microsoft Windows CE operating system (OS) designed to enhance real-time performance features. It also discusses tools that can be used to test real-time performance and provides representative real-time performance test results for specific hardware configurations.


Content on this page

Introduction
Changes to the kernel
Real-time measurement tools
Performance measurement
Summary

Introduction

Real-time performance is indispensable for high-performance embedded applications that require strict timing responses, for example, telecommunications switching equipment, industrial automation and control systems, medical monitoring equipment, and spatial navigation and guidance systems. Such applications must deliver their responses within specified time constraints.

What is real-time performance? For the Microsoft Windows CE .NET OS, the following list defines real-time performance:

A guaranteed upper bound on scheduling latency for high-priority threads, applying only to the highest-priority thread among all scheduled threads.

A guaranteed upper bound on the latency of scheduling high-priority interrupt service routines (ISRs). The kernel has only small windows during which preemption is disabled, and these windows are short and bounded.

Fine-grained control over the scheduler and how it schedules threads.

It is important to distinguish between a real-time system and a real-time OS (RTOS). A real-time system consists of all the elements (hardware, OS, and applications) needed to meet the system requirements. The RTOS is only one element of the complete real-time system, and it must provide sufficient functionality for the complete real-time system to meet its requirements.

Although earlier versions of Windows CE provided some RTOS functionality, a number of important kernel changes made in Windows CE 3.0 greatly enhanced real-time performance. The Windows CE .NET kernel includes the same real-time enhancements as Windows CE 3.0, plus some additional features. This article describes the following changes made in Windows CE .NET and earlier versions:

Windows CE .NET

Added the ability to specify the page pool size through an OEM-defined variable on the x86 platform.

For Windows CE 3.0

Increased the number of thread priority levels from 8 to 256.

More control over time and scheduling. Applications can control the amount of time given to each thread and tune the scheduler to their advantage. The timer used by the Sleep- and Wait-related application programming interfaces (APIs) is now accurate to one millisecond.

Improved handling of priority inversion.

Full support for nested interrupts.

Reduced ISR and interrupt service thread (IST) latencies.

More fine-grained memory management control.

In addition, this article describes the tools used to test real-time kernel performance and provides real-time performance test results for three different CPUs.


Changes to the kernel

The kernel is the inner core of the Windows CE OS. It schedules and synchronizes threads, handles exceptions and interrupts, loads applications, and manages virtual memory. In Windows CE 3.0, the kernel underwent the following changes to improve performance and reduce latency:

All kernel data structures were moved into physical memory, which largely avoids translation look-aside buffer (TLB) misses while non-preemptible code is executing in the kernel.

All non-preemptible but interruptible sections of the kernel (called KCALLs) were broken into smaller non-preemptible sections. The increased number of sections adds some complexity, but preemption is now disabled for much shorter periods of time.

This section describes further kernel changes to enhance the real-time performance of Windows CE 3.0.

More priority levels

The kernel's scheduler runs higher-priority threads first, and threads of the same priority run in a round-robin fashion. Assigning a priority level to a thread is one way to control when and how much it executes.

Windows CE 3.0 increases the number of priority levels available to threads from 8 to 256, where 0 is the highest priority and 255 is the lowest. Priority levels 0 through 7 in previous versions of Windows CE correspond to levels 248 through 255 in Windows CE 3.0. The additional priority levels allow developers more flexibility in controlling the scheduling of an embedded system and prevent arbitrary applications from degrading system performance because of a limited number of priority levels.

To assign these new priorities, Windows CE 3.0 introduces two new functions: CeSetThreadPriority and CeGetThreadPriority. The new functions look exactly like the SetThreadPriority and GetThreadPriority functions of Windows CE 2.12, but they accept numbers ranging from 0 to 255.
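
As a brief illustration (a minimal sketch, not code from this article), the following fragment raises the calling thread to an elevated priority with CeSetThreadPriority and reads the old level back with CeGetThreadPriority; the value 130 is an arbitrary example level.

#include <windows.h>

// Minimal sketch: raise the calling thread to an elevated priority.
// 0 is the highest priority and 255 the lowest; 130 is only an example.
BOOL RaiseThreadPriority(void)
{
    HANDLE hThread = GetCurrentThread();
    int nOldPriority = CeGetThreadPriority(hThread);   // remember old level

    if (!CeSetThreadPriority(hThread, 130))
        return FALSE;                                   // request rejected

    // nOldPriority can be handed back to CeSetThreadPriority later to
    // restore the original level.
    (void)nOldPriority;
    return TRUE;
}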

More control over time and scheduling

Windows CE 3.0 improves timer performance. Calls to the timer and Sleep functions are accurate to one millisecond, and applications can set the time slice of each thread.

The timer (or system tick) is the rate at which the OS generates and services a timer interrupt. Previously, the timer interval was also the thread time slice, the maximum time a thread can run in the system without being preempted. In Windows CE 3.0, the timer is no longer tied directly to the thread time slice.

Previously, the OEM set both the timer and the time slice as a constant in the OEM adaptation layer (OAL), typically around 25 milliseconds. When the timer fired, the kernel scheduled a new thread if one was ready to run. In Windows CE 3.0, the timer is always set to one millisecond, and a time slice can be set for each thread.

Because the timer changed from an OEM-defined value to one millisecond, an application can call the Sleep(1) function and expect accuracy of about one millisecond. This of course also depends on the thread's priority, the priority of other threads, and whether any ISRs are running. Previously, Sleep(1) returned after one system tick, which meant that if the timer was set to 25 milliseconds, Sleep(1) actually behaved like Sleep(25).
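
As a rough way to observe this behavior (a sketch that assumes GetTickCount's millisecond tick is an adequate reference), the following fragment averages the cost of many Sleep(1) calls; on a one-millisecond system tick the result should be close to one or two milliseconds, whereas on a 25-millisecond tick it would be close to 25.

#include <windows.h>

// Measure the average time actually spent in Sleep(1), in milliseconds.
// Assumes nIterations > 0.
DWORD AverageSleepOneMs(int nIterations)
{
    DWORD dwStart = GetTickCount();
    int i;

    for (i = 0; i < nIterations; i++)
        Sleep(1);

    return (GetTickCount() - dwStart) / nIterations;
}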

Timer interrupt

The kernel now has several new variables that developers can use to determine whether the system tick needs to cause a reschedule. By returning the SYSINTR_NOP flag instead of the SYSINTR_RESCHED flag when appropriate, a fully implemented system tick ISR can keep the kernel from rescheduling unnecessarily. The following variables are available from NK.lib:

dwPreempt is the number of milliseconds before the current thread can be preempted.

dwSleepMin is the number of milliseconds before the first timeout (if any) expires and a reschedule is required.

ticksleft is the number of system ticks that have occurred but have not yet been processed by the scheduler for the sleep queue; any non-zero value therefore forces a reschedule.

Additional logic in the timer ISR optimizes the scheduler and prevents the kernel from doing unnecessary work, as shown in the following code example.

if (ticksleft || (dwSleepMin && (DiffMSec >= dwSleepMin)) ||
    (dwPreempt && (DiffMSec >= dwPreempt)))
    return SYSINTR_RESCHED;
return SYSINTR_NOP;
OEMIdle function

OEMs implement the OEMIdle function, which the kernel calls when there are no threads to schedule. In earlier versions, the timer tick forced the OS out of the idle state and back into the kernel to determine whether a thread was ready to be scheduled. If no thread was ready, the kernel called OEMIdle again. This woke the kernel every 25 milliseconds (or whatever tick length the OEM specified) only to find that there was still nothing to schedule. On battery-powered devices, this wastes valuable battery power.

In Windows CE 3.0, to reduce power consumption now that the tick rate is higher, the OEMIdle function can put the CPU into standby mode for more than one millisecond. OEMs use the dwSleepMin and DiffMSec variables to program the system timer to wake up at the first available timeout. DiffMSec is the number of milliseconds that have elapsed since the last system tick.

The maximum timeout the hardware timer supports may be less than MAX_DWORD, in which case the timer can only be programmed for the longest wait the hardware allows. In all cases, when the system returns from the idle state, the OEMIdle function must update the millisecond counters CurMSec and DiffMSec. CurMSec is the current value of the millisecond counter, that is, the number of milliseconds since boot.
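
The following is only a schematic sketch of that sequence, not real OAL code: HwSetTimerTimeout, HwEnterStandby, and HwMillisecondsSlept are hypothetical hardware helpers (stubbed here so the fragment stands alone), MAX_HW_TIMEOUT_MS is an assumed hardware limit, and the counters are defined locally although the kernel supplies them in a real build. A real OEMIdle implementation must also deal with platform-specific power states and with re-arming the one-millisecond tick.

#include <windows.h>

// Hypothetical hardware helpers -- placeholders, not Windows CE APIs.
#define MAX_HW_TIMEOUT_MS 1000                      /* assumed hardware limit */
static DWORD g_dwProgrammedMs = 1;
static void  HwSetTimerTimeout(DWORD dwMs) { g_dwProgrammedMs = dwMs; }
static void  HwEnterStandby(void)          { /* halt CPU until an interrupt */ }
static DWORD HwMillisecondsSlept(void)     { return g_dwProgrammedMs; }

// In a real OAL these counters are shared with the kernel; they are
// defined here only so the sketch is self-contained.
volatile DWORD CurMSec;     // milliseconds since boot
volatile DWORD DiffMSec;    // milliseconds since the last system tick
DWORD dwSleepMin;           // milliseconds until the first timeout expires

void OEMIdle(DWORD dwIdleParam)
{
    DWORD dwWait, dwSlept;

    (void)dwIdleParam;                 // not used in this sketch

    // Sleep until the first pending timeout, bounded by what the
    // hardware timer can be programmed for.
    dwWait = dwSleepMin ? dwSleepMin : MAX_HW_TIMEOUT_MS;
    if (dwWait > MAX_HW_TIMEOUT_MS)
        dwWait = MAX_HW_TIMEOUT_MS;

    HwSetTimerTimeout(dwWait);
    HwEnterStandby();                  // returns when any interrupt fires

    // Account for the idle time so the kernel's view of time stays correct.
    dwSlept   = HwMillisecondsSlept();
    CurMSec  += dwSlept;
    DiffMSec += dwSlept;

    HwSetTimerTimeout(1);              // restore the 1 ms system tick
}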

Thread time slice

In Windows CE 3.0, the thread time slice (quantum) is flexible enough for applications to set it on a per-thread basis, which allows developers to adapt the scheduler to the needs of the application. Two new functions were added to adjust the quantum: CeGetThreadQuantum and CeSetThreadQuantum. This change allows an application to set a thread's quantum according to the amount of time the thread needs to complete its task. Setting a thread's quantum to zero changes the round-robin scheduling algorithm into a "run to completion" algorithm: only higher-priority threads or hardware interrupts can preempt a thread set to run to completion.

The default quantum is 100 milliseconds, but during the OEM initialization phase the OEM can set dwDefaultThreadQuantum to any value greater than zero to override the system default.
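
As an illustration (a minimal sketch, not code from this article), the fragment below turns the calling thread into a run-to-completion thread by setting its quantum to zero with CeSetThreadQuantum.

#include <windows.h>

// Sketch: make the current thread run to completion (quantum = 0), so it
// is preempted only by higher-priority threads or hardware interrupts.
BOOL MakeRunToCompletion(void)
{
    HANDLE hThread = GetCurrentThread();
    DWORD  dwOldQuantum = CeGetThreadQuantum(hThread);  // remember old value

    if (!CeSetThreadQuantum(hThread, 0))
        return FALSE;                                    // request rejected

    // dwOldQuantum can be handed back to CeSetThreadQuantum later to
    // restore normal round-robin behavior.
    (void)dwOldQuantum;
    return TRUE;
}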

Changes to how priority inversion is handled

To help shorten response times, Windows CE 3.0 changed how it handles priority inversion. Priority inversion occurs when a low-priority thread owns a kernel object that a higher-priority thread needs. Windows CE uses priority inheritance to handle priority inversion: the thread that owns the kernel object blocking the higher-priority thread temporarily inherits that higher priority. This lets the lower-priority thread run and release the resource the higher-priority thread is waiting for. Previously, the kernel processed the entire chain of inverted priorities. Starting with Windows CE 3.0, the kernel guarantees priority inheritance only one level deep.

There are two basic priority-inversion scenarios. The first is simple, and its handling does not change between Windows CE 2.12 and Windows CE 3.0. It occurs, for example, with three running threads: thread A has priority 1, and threads B and C have lower priorities. If thread A runs and then blocks because thread B owns a kernel object that A needs, thread B's priority is raised to A's level so that B can run. If B then blocks because thread C owns a kernel object that B needs, thread C's priority is also raised to A's level so that C can run.

The second, more interesting case starts the same way: thread A can run at a higher priority than threads B and C, thread B owns the kernel object that A needs, and B is already blocked waiting for C to release a kernel object that B needs while C is running. In Windows CE 2.12, when A runs and blocks on B, the priorities of both B and C are raised to A's level so that they can run. In Windows CE 3.0, when A blocks on B, only thread B's priority is raised. By reducing this complexity and modifying the algorithm, the longest KCALL in Windows CE is greatly shortened and bounded.

Interrupt handling and nested interrupts

Real-time applications use interrupts as the way to ensure that the OS attends to external events quickly. In Windows CE, the kernel and the OAL are tuned to optimize interrupt delivery and event dispatching to the rest of the system. Windows CE balances performance against ease of implementation by splitting interrupt processing into two parts: the interrupt service routine (ISR) and the interrupt service thread (IST).

Each hardware interrupt request line (IRQ) is associated with one ISR. When interrupts are enabled and an interrupt occurs, the kernel calls the ISR registered for that interrupt. Because the ISR runs in kernel mode as part of interrupt handling, it is kept as short as possible. Its primary responsibility is to direct the kernel to launch the appropriate IST.

The ISR performs the minimum necessary processing and returns an interrupt identifier to the kernel. The kernel examines the returned identifier and signals the event that links that ISR to its IST. The IST waits on this event; when the kernel sets the event, and the IST is the highest-priority thread ready to run, the IST stops waiting and performs the remaining interrupt processing. Most interrupt handling actually occurs within the IST.
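
The typical driver-side shape of this pattern is sketched below, assuming a driver build environment where InterruptInitialize, InterruptDone, and CeSetThreadPriority are available; the SYSINTR value, the IST priority of 100, and the device handling are placeholders rather than details from this article.

#include <windows.h>

#define SYSINTR_MYDEVICE 16   // hypothetical OEM-assigned logical interrupt ID

// Minimal IST skeleton: wait on the event signaled after the ISR returns,
// service the device, then tell the kernel the interrupt is handled.
DWORD WINAPI MyDeviceIST(LPVOID pContext)
{
    HANDLE hIntrEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

    (void)pContext;

    if (!hIntrEvent ||
        !InterruptInitialize(SYSINTR_MYDEVICE, hIntrEvent, NULL, 0))
        return 1;

    // Run the IST at a high priority; 100 is only an example level.
    CeSetThreadPriority(GetCurrentThread(), 100);

    for (;;)
    {
        // Released by the kernel when the ISR returns SYSINTR_MYDEVICE.
        WaitForSingleObject(hIntrEvent, INFINITE);

        // ... read and clear the device hardware here ...

        InterruptDone(SYSINTR_MYDEVICE);   // re-enable the interrupt
    }
}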

Nested interrupts

In versions earlier than Windows CE 3.0, all other interrupts were disabled while an ISR was running, so the kernel could not handle any other interrupt until the current ISR completed. As a result, even if a higher-priority interrupt became ready, the kernel did not process it until the current ISR finished and returned to the kernel.

To prevent the loss or delay of high-priority interrupts, Windows CE 3.0 adds support for nested interrupts based on priority (where the CPU or other related hardware supports it). When an ISR is running in Windows CE 3.0, the kernel runs the specified ISR as before, but only interrupts of the same or lower priority are disabled. If a higher-priority interrupt becomes ready to run, the kernel saves the state of the running ISR and runs the higher-priority ISR. The kernel can nest as many ISRs as the CPU supports, and ISRs nest in order of hardware priority.

In most cases, OEMs do not have to change their existing ISR code, because the kernel handles the details. If OEMs share global variables between ISRs, changes may be needed. In general, however, an ISR does not know that it has been interrupted by a higher-priority ISR. If an ISR performs timing operations, a noticeable delay can occur, but only when a higher-priority IRQ is triggered.

After the highest-priority ISR finishes, any pending lower-priority ISRs are executed. The kernel then resumes any KCALL that was interrupted. If a thread was being scheduled and the interrupt arrived in the middle of the KCALL, the scheduler continues processing that thread; this lets the kernel continue from where it stopped rather than restarting the thread scheduling, which saves valuable time. Once the pending KCALL completes, the kernel reschedules and begins executing the highest-priority thread.

Interrupt latency

One of the most important real-time features of the kernel is its ability to service an IRQ within a specified period of time. Interrupt latency refers mainly to the software interrupt-handling delay, that is, the time that elapses from when an external interrupt reaches the processor until interrupt processing begins.

For threads locked in memory, where no paging occurs, interrupt latency in Windows CE 3.0 is bounded. This makes it possible to calculate the worst-case latency up to the start of the ISR and up to the start of the IST. The total time until the interrupt is fully handled can then be determined by adding the time spent inside the ISR and the IST.

ISR latency

ISR latency is the time from when the IRQ is asserted at the CPU to when the ISR starts running. The following three time-related variables affect ISR start time:

A is the maximum time during which the kernel turns interrupts off. The kernel rarely turns interrupts off, but when it does, the time for which they are off is bounded.

B is the time between the interrupt arriving at the kernel and the kernel calling the ISR. The kernel uses this time to determine which ISR to run and to save any registers that must be preserved before continuing.

C is the time between the ISR returning to the kernel and the kernel actually finishing its interrupt processing. During this time the kernel completes the ISR's work by restoring any state (such as registers) that was saved before the ISR was called.

The start time of the ISR being measured can be calculated from the current state of other interrupts in the system. If other interrupts are in progress, the start time of the ISR of interest must account for two factors: the number of higher-priority interrupts that occur after the interrupt of interest, and the time spent executing their ISRs. The start time is given by the following formula.

Start of ISR = A + B + Σ (N = 1 to Nisr) [Tisr(N) + C + B]

Here, Nisr is the number of higher-priority interrupts that occur after the interrupt of interest, and Tisr(N) is the time required to execute the Nth of those ISRs. Figure 1 below illustrates the formula.

Figure 1. Graphical representation of the ISR start time formula

If there are no higher-priority interrupts (Nisr = 0), the formula simplifies to the following.

Start of ISR = A + B

Both Windows CE and the OEM affect how long it takes for an ISR to start. Windows CE controls the variables A, B, and C, all of which are bounded. The OEM controls Nisr and Tisr(N), both of which can greatly affect ISR latency.
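
As a purely illustrative plug-in of hypothetical numbers (not measurements from this article): if A = 5 µs, B = 10 µs, C = 5 µs, and one higher-priority interrupt with Tisr(1) = 20 µs occurs, then Start of ISR = 5 + 10 + (20 + 5 + 10) = 50 µs, whereas with no higher-priority interrupts it would be just A + B = 15 µs.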

IST latency

IST latency is the time from the end of ISR execution (that is, when the IST is notified) to the start of IST execution. The following four time-related variables affect IST start time:

B is the time between the interrupt arriving at the kernel and the kernel calling the ISR. The kernel uses this time to determine which ISR to run and to save any registers that must be preserved before continuing.

C is the time between the ISR returning to the kernel and the kernel actually finishing its interrupt processing. During this time the kernel completes the ISR's work by restoring any state (such as registers) that was saved before the ISR was called.

L is the duration of the longest KCALL.

M is the time needed to schedule the thread.

The highest-priority IST starts after the ISR returns to the kernel and the kernel does some work to start the IST. After the ISR returns and signals the IST to run, the IST start time is also affected by the total time spent in all ISRs that run in the meantime. The start time is given by the following formula.

Start of highest-priority IST = C + L + M + Σ (N = 1 to Nisr) [Tisr(N) + C + B]

Figure 2 below illustrates the formula.

Figure 2. Graphical representation of the highest-priority IST start time formula

Both Windows CE and the OEM affect how long it takes for the IST to start executing. Windows CE controls the variables B, C, L, and M, all of which are bounded. The OEM controls Nisr and Tisr(N), both of which can greatly affect IST latency.

Windows CE 3.0 also places the following restriction on ISTs: the event that links an ISR to its IST can be waited on only with the WaitForSingleObject function. Windows CE 3.0 prevents that ISR-IST event from being used with the WaitForMultipleObjects function, which means the kernel can guarantee an upper bound on the time between the event being signaled and the IST being released.
