When we call something reliable, we mean that it is trustworthy and predictable. For software, however, several other attributes must also hold before code can be called reliable.
Software must be resilient, meaning that it continues to function correctly in the face of internal and external disruptions. It must be recoverable, so that it knows how to restore itself to a previously known consistent state. It must be predictable, so that it provides timely and expected service. It must be non-disruptive, meaning that changes and upgrades do not affect its services. Finally, it must be production ready, meaning that it contains few bugs and requires only a limited number of updates. If these conditions are met, the software can truly be called reliable.
These key attributes of reliable code depend on different factors: some depend on the overall architecture of the software, some on the operating system that runs it, and others on the tools used to develop the application and the framework on which it is built. Resilience is a property that depends on every layer, and an application is only as resilient as its weakest link.
Now consider an application based on the Microsoft® .NET Framework. Such applications delegate to the runtime certain operations that either do not exist in a native environment (such as just-in-time compilation of IL code) or would otherwise be under the developer's direct control (such as memory management). In terms of reliability, the platform itself can introduce its own points of failure that affect the reliability of the applications running on it. It is important to understand where these failures can occur and which techniques can be used to build more reliable .NET-based applications.
Understanding Run-time Failures
Some exceptional events can occur at any time, in any section of code. These events are collectively referred to as asynchronous exceptions, and they include resource exhaustion (out of memory and stack overflow), thread aborts, and access violations. (Here, the access violation is one that occurs in the runtime itself while managed code is executing.)
The last case is not very interesting: if such an event does occur, it means that a serious bug has been hit in the common language runtime (CLR) implementation, and that bug should be fixed. The first two cases, however, warrant further analysis.
In theory, we would expect the runtime to handle resource exhaustion gracefully, so that it never affects the ability of application code to keep running. But that is only theory, and reality is considerably more complicated.
To illustrate the issue, let's first look at how some common server applications handle out-of-memory (OOM) events. Server applications that require high availability, such as ASP.NET and Exchange Server 2007, achieve it through AppDomain and process recycling. The operating system provides a very robust mechanism for cleaning up memory and most other resources used by a process: all of that cleanup happens when the process terminates.
On the client, by the time memory pressure is high enough that even a small allocation can fail, the whole system is usually so overloaded and paging so heavily that it has become largely unresponsive. At that point the user would rather press the reset button or reach for Task Manager than wait for any recovery code to run. In a sense, the user's first reaction is to perform manually the same action that ASP.NET or Exchange 2007 performs automatically.
Some OOM conditions are not even caused by any particular problem in the code that is running. Other processes on the computer, or other AppDomains in the same process, may drain the available resource pool and cause an allocation to fail. In this sense, resource exhaustion should be considered asynchronous: it can occur at any point in the execution of code, and it may depend on environmental factors that are external to, and independent of, the running code.
The problem becomes even more serious because the runtime may itself allocate memory in order to carry out its own operations. Here are several examples of allocations made by the runtime that can fail in a resource-constrained environment:
Boxing and unboxing
Class loading, which is deferred until the first time a class is used
Remoting operations on MarshalByRef objects
Some operations on strings
Security checks
JIT compilation of methods
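Two of the items in the list above, boxing and string operations, are easy to observe directly. The following is only a minimal sketch in C#, and the class and method names (HiddenAllocationDemo, BoxingAllocatesEachTime, ConcatenationAllocates) are hypothetical names chosen for illustration. It does not simulate an out-of-memory condition; it only shows that code containing no explicit `new` can still create new objects, and every such allocation is a point where a resource-constrained process could fail.

```csharp
using System;

// Hypothetical demo class; the names are illustrative and not part of any library.
public static class HiddenAllocationDemo
{
    // Boxing the same int twice produces two distinct objects, which shows
    // that each boxing operation creates a new object of its own.
    public static bool BoxingAllocatesEachTime()
    {
        int value = 42;
        object first = value;   // box: the int is copied into a new object
        object second = value;  // box again: a second, separate object
        return !object.ReferenceEquals(first, second);
    }

    // Concatenating two non-empty strings at run time produces a new string
    // object rather than reusing either operand.
    public static bool ConcatenationAllocates(string left, string right)
    {
        string combined = left + right;
        return !object.ReferenceEquals(combined, left)
            && !object.ReferenceEquals(combined, right);
    }

    public static void Main()
    {
        Console.WriteLine(BoxingAllocatesEachTime());
        Console.WriteLine(ConcatenationAllocates("out of ", "memory"));
    }
}
```

Both calls in Main are expected to report that a new object was created. The point of the sketch is that neither method looks like it allocates anything, which is why code cannot assume that a given statement is free of allocation failures.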