"Windows is not a real-time operating system." This remark comes up repeatedly on the NTDEV forum. It usually arises when someone tries to write a driver for a device that was not designed with Windows in mind, such as a device that expects software to respond within a short time frame.

A real-time operating system (RTOS) is one that executes in a predictable manner as long as its minimum requirements are met. It must strictly guarantee a response to each request within a bounded time. Real-time environments are divided into soft and hard. In a soft real-time environment, fast response matters most, and missing a single request does not cause overall failure. In a hard real-time environment, by contrast, no unexpected delay is allowed, and any missed request means overall failure. This is the kind of environment a real-time operating system is typically built to support.

In Windows, all requests to the operating system are handled on a best-effort basis, and no response time is guaranteed. Windows can handle tens of thousands of requests per second simply because it does not have to ensure that each request completes within a specified time. For most uses, Windows is designed and optimized for performance and throughput, not for real-time tasks and low latency, and these goals usually conflict. Most optimization work targets average throughput, where the worst outcome is an occasional lost request. A real-time solution instead targets the maximum response time, which means that every individual request or operation counts.

Solutions with real-time requirements are found in many fields, ranging from industrial, medical, military, aerospace, and research applications to high-precision multimedia. A simple example is real-time audio: a software synthesizer must produce audible sound in response to key presses on a MIDI keyboard with a delay of no more than a few milliseconds.
The audio stream must be continuous: any click, pop, or dropout interrupts the stream and degrades the result. This means that every request must meet its deadline. Latency problems in real-time applications, such as the delay between pressing a key and hearing the sound, are in fact not a product of the computer age. A church organist, for example, has to read ahead and press the keys some time before the sound is heard, because the air must flow through a complex mechanism and the sound must travel from a far corner of the church to the ear.

Real-time audio applications such as software synthesizers and audio plugins usually implement their processing logic in user mode. This makes them more exposed to interruptions (such as paging and scheduling) than device drivers running at elevated IRQL (interrupt request level). Finally, for the purposes of this article we ignore certain Windows features, such as interrupt handling at PASSIVE_LEVEL and UMDF, which serve particular purposes and are generally not designed for real-time processing.
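To make the distinction between soft and hard real-time concrete, here is a minimal Python sketch that judges a set of response times against a deadline. The function name, the sample values, and the `soft_miss_budget` parameter are assumptions made for illustration; the article does not define a numeric tolerance for soft real-time.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    missed: int        # number of responses that exceeded the deadline
    miss_ratio: float  # fraction of responses that exceeded the deadline
    hard_ok: bool      # hard real-time: not a single miss is acceptable
    soft_ok: bool      # soft real-time: misses stay within an assumed budget


def evaluate_deadlines(response_times_ms, deadline_ms, soft_miss_budget=0.01):
    """Judge response times against a deadline under both models.

    soft_miss_budget is a hypothetical tolerance (fraction of misses)
    chosen for this sketch only.
    """
    if not response_times_ms:
        raise ValueError("need at least one response time")
    missed = sum(1 for t in response_times_ms if t > deadline_ms)
    ratio = missed / len(response_times_ms)
    return Verdict(
        missed=missed,
        miss_ratio=ratio,
        hard_ok=(missed == 0),
        soft_ok=(ratio <= soft_miss_budget),
    )


# Hypothetical samples: four fast responses and one late one.
samples = [1.2, 0.9, 1.1, 7.5, 1.0]
verdict = evaluate_deadlines(samples, deadline_ms=5.0, soft_miss_budget=0.25)
print(verdict)
```

With these made-up numbers the single late response is tolerable under the soft model but counts as a failure under the hard model, which is the situation a glitch in an audio stream represents.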
Low-Latency Enemies

To understand why Windows is not a suitable operating system for real-time processing, we look closely at an interrupt-driven device whose requests must be handled in user mode within a deadline, and then explain which factors cause unwelcome delays. For an interrupt-driven, time-sensitive device, it is essential that Windows respond to requests in a timely manner. The enemies of real-time processing discussed next are the points at which delay can be introduced. As a rule, the further the IRQL drops while an interrupt is being handled, from ISR to DPC and on to user-mode processing, the more likely long and frequent delays become.
Interrupt/ISR

A device signals that it needs attention from software by generating an interrupt. After the CPU receives the interrupt signal from the device, it runs the ISR (Interrupt Service Routine) registered for that interrupt. Execution of the ISR can be delayed for several reasons: the CPU may have interrupts temporarily disabled, the operating system may be servicing a higher-priority interrupt, some maskable interrupts may be temporarily disabled, or the whole system may be stalled for some other reason.

Interrupts can also be delayed by hardware factors. Unfortunately, such delays cannot be measured by software alone; measuring them requires hardware support or a bus analyzer. On modern Intel architecture systems, hardware latency is known to be very small, typically less than 1 millisecond. In this article we discuss only the software latency in interrupt processing, that is, the time from when the ISR starts executing until the interrupt has been fully handled by software.

Because Windows does not allow control over the priority of device interrupts, any ISR can be preempted by a higher-priority interrupt unless the ISR prevents this by raising IRQL. Latency during ISR processing can also be introduced by operating system interrupts such as the system clock and IPI (Inter-Processor Interrupt) routines, by factors outside the operating system's control such as SMI (System Management Interrupt) routines, and by a variety of other hardware factors discussed later.
DPC

If servicing the interrupt takes more than a short time, the ISR usually queues a DPC (Deferred Procedure Call). A DPC handles the I/O work at a lower IRQL, where device and system interrupts can preempt it. A DPC is also required if the interrupt must be serviced in user mode, because the IRQL restrictions that apply to an ISR do not allow it to signal dispatcher objects or wake user-mode threads.

In general, the DPC executes on the same processor shortly after it is queued. The operating system maintains a DPC queue for each logical processor, so when the queue on your processor is busy, other DPC routines may run before yours and add latency. Because DPC routines run at IRQL DISPATCH_LEVEL (threaded DPCs are an exception and run at PASSIVE_LEVEL), any device or hardware interrupt that arrives while a DPC is running can delay it.

The interval from the moment the CPU receives the interrupt until the DPC routine starts executing is usually called the "ISR to DPC latency". If the driver has a deadline to meet, it must minimize the maximum ISR to DPC latency. This is the key point, so don't miss it: for real-time processing we care about the maximum latency, not the average. The average ISR to DPC latency is usually very good. The maximum ISR to DPC latency, however, can sometimes be surprisingly bad. This is also where artifacts are often introduced (see below).

Executing by the Rules

On the other hand, as a kernel developer, even if your own driver has no real-time requirements, you need to make sure it executes smoothly and avoid spending too much time at elevated IRQL, so that you do not compromise the real-time capability of the system your driver runs on. This includes time spent in ISRs, in DPCs, in code holding spin locks, or at IRQL raised by other means. If you are a BIOS or firmware developer, you need to pay even more attention to this.
MSDN recommends that no DPC or ISR routine run for longer than 100 microseconds. In practice, many drivers violate this rule, at least on specific hardware. When a driver causes high latency, it does not always mean that the software is at fault; in many cases the problem can be attributed entirely to the hardware. Network drivers (especially WiFi), storage port drivers, and the ACPI battery driver are notorious for causing high latency during interrupt processing. These drivers are typically disabled on systems that perform real-time tasks.
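The emphasis on maximum rather than average latency, together with the 100 microsecond guideline quoted above, can be illustrated with a short Python sketch. The sample durations are invented for illustration, and `summarize` is a hypothetical helper, not part of any Windows tool.

```python
from statistics import mean

# Guideline quoted in the text: no ISR or DPC should run longer than this.
BUDGET_US = 100.0


def summarize(durations_us, budget_us=BUDGET_US):
    """Report average and worst-case duration and check the budget."""
    if not durations_us:
        raise ValueError("need at least one measurement")
    worst = max(durations_us)
    return {
        "average_us": mean(durations_us),
        "max_us": worst,
        "over_budget": sum(1 for d in durations_us if d > budget_us),
        # Only the worst case decides whether the budget is met.
        "meets_budget": worst <= budget_us,
    }


# Hypothetical data: 999 quick DPC runs and a single very slow one.
samples = [8.0] * 999 + [2500.0]
report = summarize(samples)
print(report)
```

In this made-up data set the average stays close to 10 microseconds and looks excellent, while the single 2500 microsecond outlier is what a real-time task would actually experience as a missed deadline.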
User Mode

For requests with real-time requirements, it is best to avoid user-mode code in the processing path. However, some scenarios and solutions depend on it. When user-mode code is part of the time-critical path, it must obviously execute as quickly as possible. It should not block, wait, or call Windows API functions. It should use only resident memory, because a hard page fault can stall the executing thread for a long time while the page fault handler resolves it by reading data synchronously from disk, which can take as long as a few seconds.

Note that Windows API functions such as VirtualLock only lock pages into the working set of the process and do not provide the application with memory that is guaranteed to stay resident in RAM. One way to make sure your application gets resident memory is to have a driver lock a user-supplied buffer in memory. There are many ways to implement this, some more complex than others. An easy way is to use METHOD_OUT_DIRECT in an IOCTL: the user application allocates the buffer and passes it to the driver as the output buffer (OutBuffer). The driver can hold on to this buffer, and with it the locked memory, until the application exits. There are also more complex ways to share locked memory between user mode and kernel mode. Their pros and cons are not covered in this article.

A user-mode thread that takes part in interrupt processing is typically managed in a pool of real-time priority threads. It waits in an idle state and is woken by a dispatcher object that is signaled from the DPC routine, such as an event or an I/O completion. The interval from the moment the processor receives the interrupt until the user-mode thread starts running after being woken is often called the "ISR to process latency". It includes the time needed to queue and execute the DPC. In operating system courses this is often referred to as "process dispatch latency".
A solution with deadlines that does part of its work in user mode must minimize the ISR to process latency. When a time-critical user-mode thread runs for a long time, the operating system scheduler may preempt it. This can be avoided by giving the user-mode thread a "super" real-time priority so that its quantum never expires, which involves assigning the process to a job object with a specific scheduling class setting.
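The relationship between the ISR to DPC latency and the ISR to process latency can be sketched from three timestamps taken along the path. The Python below is only an illustration: it assumes the timestamps are tick counts from a high-resolution counter with a known frequency, and the function names and sample values are hypothetical.

```python
def ticks_to_us(ticks, frequency_hz):
    """Convert a tick difference to microseconds."""
    return ticks * 1_000_000 / frequency_hz


def stage_latencies(isr_ticks, dpc_ticks, thread_ticks, frequency_hz):
    """Split the path from interrupt to user-mode thread into stages.

    isr_ticks:    counter value when the ISR started
    dpc_ticks:    counter value when the DPC routine started
    thread_ticks: counter value when the woken user-mode thread started running
    """
    if not (isr_ticks <= dpc_ticks <= thread_ticks):
        raise ValueError("timestamps must be in chronological order")
    return {
        "isr_to_dpc_us": ticks_to_us(dpc_ticks - isr_ticks, frequency_hz),
        "dpc_to_thread_us": ticks_to_us(thread_ticks - dpc_ticks, frequency_hz),
        # The ISR to process latency includes the DPC stage.
        "isr_to_process_us": ticks_to_us(thread_ticks - isr_ticks, frequency_hz),
    }


# Hypothetical timestamps from a 10 MHz counter.
lat = stage_latencies(1_000_000, 1_000_250, 1_001_500, frequency_hz=10_000_000)
print(lat)
```

Because the user-mode stage sits on top of the DPC stage, any delay in queuing or running the DPC shows up in full in the ISR to process figure.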
More Zombies

In addition to hardware interrupts, ISR routines, DPC routines, user-mode execution, and hard page faults, there are other, lower-level enemies of low latency that cannot be ignored because they can cause significant delays. Let's discuss some of them.

Inter-Processor Interrupts (IPIs) are interrupts initiated by the operating system that suspend all device interrupts while a particular routine executes on all logical processors. They can be triggered by hardware and by software. Drivers such as Ndis.sys use this technique, along with a lot of DPC processing, which is why a system with networking enabled is often unsuitable for real-time processing.

SMM (System Management Mode) is a special-purpose operating mode of the processor for handling system-wide functions such as power management, system hardware control, or proprietary OEM code. It is intended for use only by the BIOS or firmware, not by applications or the operating system. System Management Interrupts (SMIs) are CPU interrupts whose routines execute in SMM, outside the operating system. An SMI takes priority over any maskable or non-maskable interrupt. Because an SMI can interrupt any processing at any time, its routine must execute as quickly as possible; otherwise the system is unusable for real-time tasks.

A CPU core running at variable speed can reach a very high temperature within a few milliseconds and is then temporarily placed in "stop clock" mode to cool down. Interrupts on that processor are held off until the CPU can run again.

Design flaws in the CPU can also cause unexplained CPU stalls. Some defects in modern CPUs, related to processor C-states (CPU idle) and P-states (CPU speed), can freeze a processor indefinitely until certain conditions are met. CPU manufacturers publish errata (specification updates) that contain this information.
Many of these causes of delay are outside the developer's control at the software level, unless you can modify the system configuration. It is possible, however, to measure the real-time capability of a system in software, so that end users can check whether their system is suitable.
Measuring Latency

We have listed a number of hazards for a driver with real-time requirements. A driver that supports off-the-shelf hardware with real-time requirements will not succeed on every system it may be installed on. It therefore helps to test a system's capability before installation or purchase, to avoid failures and dissatisfied customers, or to help end users configure their systems to overcome obstacles to running your solution.

There are tools that measure ISR and DPC execution times. One is Xperf, part of the Windows Performance Toolkit, an optional install that is included in the Windows WDK and SDK. For information on how to use Xperf, see "Get Low - Collecting Detailed Performance Data with XPerf". Another tool is LatencyMon, which reports hard page faults in addition to ISR and DPC execution times. It also offers different latency measurements, including ISR to DPC and ISR to user process latency. LatencyMon was written by the author of this article.

As the author of driver code, it is trivial to add measurement points using the KeQueryPerformanceCounter function. To measure the execution time of your DPC routine, simply take the difference between the performance counter values queried at the start and at the end of the routine. The same method can be used to measure ISR to DPC and ISR to user process latency.

A technique used in the past for measuring DPC latency is to install a periodic kernel timer and measure the intervals between its expirations. Windows timers are software-based and depend on the granularity of the clock interrupt. The operating system clock is a global resource for which software components and applications compete, and the request for the smallest clock interval wins. Because Windows timers do not have direct hardware support, they are not very precise.
This is even more true in Windows 8, where a new feature called "dynamic clock ticks" means that the system clock no longer interrupts at fixed intervals but only when the operating system deems it necessary, in order to save power. This makes any measurement method that relies on kernel timers unreliable. If a device needs to be serviced by software at precise times, the hardware needs a timer of its own that can raise an interrupt, so that the software can service it on time. This remains true even though Windows 8.1 introduces a new feature called high-resolution timers.

Starting with Windows Vista, Windows provides a set of functions and classes for retrieving event trace information collected by the operating system kernel. These let you obtain information about the ISRs and DPCs executed in the system, as well as hard page faults. Unfortunately, they do not let you collect the time spent at elevated IRQL for reasons other than ISRs and DPCs, such as code holding spin locks and IPIs.

One way to measure maximum SMI execution time and unexplained processor stalls is to poll in a tight loop at HIGH_LEVEL, the highest IRQL, and measure the gaps in the loop. This heuristic approach, however, does not measure the execution of software-initiated SMI routines. Similarly, you can measure IPI execution time by running such a loop at an IRQL just below IPI_LEVEL. One thing to keep in mind is that some versions of the operating system use an IRQL management technique called "lazy IRQL", in which the IRQL value stored by the operating system does not always correspond to the processor's task priority state.

Interrupt Affinity

Windows gives you full control over which processors in the system execute your threads, by setting affinity, but offers almost no control over which processor handles device interrupts.
Which processor executes the ISR for a particular device depends mostly on the hardware, and Windows uses the interrupt affinity set up by the BIOS. Some chipsets distribute interrupts across all processors, while others deliver interrupts only to CPU 0. For software that must run on arbitrary hardware, this makes it futile to try to avoid interrupts by choosing which processor to run on. The situation is different when you can supply the system yourself, which we discuss later. The GetProcessorSystemCycleTime API function is a quick and easy way to find out which processors in the system are handling interrupts and DPCs.
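The counter-difference and periodic-timer techniques described in this section can be tried out in user mode with a rough Python analogue. This is a sketch under stated assumptions, not the kernel-mode method: it uses `time.perf_counter_ns` as a stand-in for a performance counter and `time.sleep` as a stand-in for a periodic timer, and the helper names are hypothetical.

```python
import time


def measure_timer_jitter(period_s=0.001, iterations=50):
    """Sleep repeatedly and record how late each wake-up is, in microseconds.

    The counter is read at the start and end of each wait, and the
    requested period is subtracted from the measured interval.
    """
    overshoots_us = []
    for _ in range(iterations):
        start_ns = time.perf_counter_ns()
        time.sleep(period_s)
        elapsed_ns = time.perf_counter_ns() - start_ns
        overshoots_us.append((elapsed_ns - period_s * 1e9) / 1000.0)
    return overshoots_us


def worst_and_average(values):
    """Return (maximum, mean) of the measured values."""
    return max(values), sum(values) / len(values)


overshoots = measure_timer_jitter(period_s=0.001, iterations=20)
worst, average = worst_and_average(overshoots)
print(f"average overshoot {average:.1f} us, worst overshoot {worst:.1f} us")
```

The numbers vary from run to run and from system to system, and on a loaded machine the worst overshoot can be many times the average. As the text notes, software timers depend on the clock interrupt, so results from a loop like this say more about timer granularity and scheduling than about ISR or DPC latency as such.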
Hardware Control

As mentioned earlier, if you are developing a real-time driver for a device or solution, it may be deployed on systems that cannot meet its requirements because of their specific configuration. If you are fortunate enough to have control over the systems your solution runs on, you have options for avoiding failure. Developers of off-the-shelf products do not have these options. Depending on the deadlines and market requirements of your solution, selecting a specific system configuration lets you deliver a solution that you can be confident will work.

Once the system is completely under your control, you also have the opportunity to add hardware that supports your latency-sensitive tasks. One option is to implement the real-time logic in hardware, for example with an FPGA or DSP development board. There are also real-time (software) extensions for Windows that provide real-time behavior without additional hardware, such as IntervalZero RTX and TenAsys INtime. These products handle the real-time logic by running a separate operating system alongside Windows.

Even without extra hardware or real-time extensions, there is still a lot you can do when you control the whole system. If the response time (or accuracy) requirements of your solution are not very strict, say 10 milliseconds or more, you can deliver a solution that runs on Windows and meets its deadlines by configuring a Windows system with known hardware and drivers that do not introduce high latency. By carefully selecting the motherboard chipset and the drivers in the configuration, you can control which processors interrupts are routed to. This allows you to reserve one or more processors for time-critical tasks, which effectively avoids latency caused by ISRs and DPCs.
Control over which CPUs handle ISRs, DPCs, and threads can be taken further by starting the operating system with specific processor group parameters once the configuration has been tested. An important part of a custom configuration is power management. As discussed earlier, some CPU and BIOS power management features can cause delays in real-time processing. Disabling these features, where possible, avoids such delays.

Users in the audio industry have done a lot of research on configuring Windows workstations for low latency, so there is plenty of information on the web. Searching with the keyword DAW (Digital Audio Workstation) is recommended. There are also system builders who specialize in tailoring and supplying desktops and notebooks for low-latency tasks on Windows.
Summary

As you can see, getting a Windows system to handle real-time tasks involves measurement, some confusion, and even consideration of factors that are hard to pin down. Even after you have configured and tested a system for your solution, you may still feel uncomfortable giving your customers a guarantee that it will hold up under testing and analysis. Remember: Windows is not an RTOS. If your solution needs Windows to guarantee response times, it will in most cases depend on the end user's configuration. The question then becomes how often your real-time requirements will not be met, what the consequences are when they are not, and whether any tuning or system configuration can improve the outcome. Hopefully this article has explained the problem, how to address some of the issues, and what resources are available to kernel developers who need to deliver a time-sensitive solution on Microsoft Windows.

Original link: https://www.osr.com/nt-insider/2014-issue3/windows-real-time/
http://thehub.musiciansfriend.com/tech-tips/Tech-tip-optimizing-windows-for-daws
Translation of: Windows and Real-Time, by Daniel Terhell