Revisiting high availability
I have never found a shortage of questions about high availability or disaster recovery, but the previous installment of The WebSphere Contrarian (which dealt with high availability options for WebSphere Application Server management) seems to have increased the number of these questions recently. So here I will continue with the topic of high availability and add some thoughts on how to achieve both high availability (HA) and continuous availability (CA). Before discussing either, let's make sure we agree on the following two terms:
High availability: An infrastructure (or an application running on it) cannot sustain an unplanned outage for more than a few seconds or minutes at a time, although an outage of that length would not seriously affect the business. In addition, it is acceptable to occasionally bring the application down for a few hours for planned maintenance.
Continuous availability: An infrastructure (or an application running on it) cannot be interrupted at all. Essentially, there is zero tolerance for any outage, planned or unplanned. This level of availability is often referred to as "five 9s" or 99.999% availability, which translates to a total of just over 5 minutes of planned and unplanned outages in a year.
As an aside, some people claim that they "only" need "four 9s" (99.99%) availability or a similar number, assuming that such availability is categorized as HA. In practice, there is virtually no meaningful difference between 99.99% and 99.999% availability over a one-year period. If you do the math, you will find that 99.99% availability allows a total of just over 52 minutes of outage per year. In other words, you can tolerate unplanned outages no better than you could at 99.999% availability, and you cannot make any allowance for planned outages either.
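The arithmetic behind these figures can be checked with a short calculation. The following is a minimal sketch that assumes a 365-day year; the function name `allowed_downtime_minutes` is chosen here for illustration and does not come from any product.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the total outage budget per year, in minutes."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("four 9s", 99.99), ("five 9s", 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes per year")
```

Either budget is far smaller than the few hours that a typical planned maintenance window needs, which is why both levels effectively rule out planned outages.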
I will not go into the specifics of running WebSphere Application Server Network Deployment here, as these are well documented in this book and in this article. After reading one or both of those references, it should be clear that a single Network Deployment cell can provide HA, given carefully managed procedures and detailed planning. That said, running two cells requires slightly more administrative effort (because you need to administer two cells), but it brings advantages that greatly reduce administrative complexity from an operational standpoint, whether the environment is HA or CA.
Hardware isolation
Two (or more) cells do not require separate hardware, because you can use coexistence to run multiple cells on the same hardware. However, it makes more sense for each cell to have its own hardware, because separate hardware provides complete hardware and network isolation between the cells. If a server, server frame, router, or other device within a cell becomes unusable, the hardware isolation ensures that the failure does not affect the other cells. The cell with the faulty device is taken offline, and production is served by the remaining cells. Repair work on the faulty device does not affect the other cells, and once the repair is complete, it can be tested independently without affecting production.
Multiple cells, each with separate hardware, also provide a path for hardware upgrades. When a server or server frame is being swapped out, or memory and CPUs are being upgraded, a cell can be rotated out of production while the other cells handle the load. As with a repair, once the upgrade is complete, the combination of hardware and software can be tested, and if problems during the upgrade cause a hardware failure, no users are adversely affected.
Software isolation
With multiple cells, you can apply software maintenance (such as fix packs, patches, and so on) to the operating system, the infrastructure middleware, or the application itself by rotating a cell out of production. Once the upgrade is complete, the combination of hardware and software can be tested, and if problems occur during the upgrade, no users are adversely affected.
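The rotation described above can be sketched as a simple loop. This is only an illustration under stated assumptions: the `Cell` class, the `rolling_maintenance` function, and the `apply_fix_pack` and `smoke_test` callbacks are hypothetical names, and the sketch models the state of each cell in memory rather than calling any real WebSphere administration interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cell:
    """Hypothetical model of one cell; not a WebSphere API object."""
    name: str
    in_production: bool = True
    version: str = "1.0"


def rolling_maintenance(
    cells: List[Cell],
    apply_update: Callable[[Cell], None],
    verify: Callable[[Cell], bool],
) -> List[str]:
    """Update cells one at a time so that some cell always serves production."""
    if len(cells) < 2:
        raise ValueError("rolling maintenance needs at least two cells")
    log = []
    for cell in cells:
        cell.in_production = False  # rotate this cell out of production
        if not any(c.in_production for c in cells):
            raise RuntimeError("no cell left to serve production")
        log.append(f"{cell.name}: out of production")
        apply_update(cell)  # the same update step is run for every cell
        if not verify(cell):
            # Leave the failed cell offline; users stay on the other cells.
            log.append(f"{cell.name}: verification failed, left offline")
            break
        cell.in_production = True  # rotate the cell back in
        log.append(f"{cell.name}: back in production")
    return log


def apply_fix_pack(cell: Cell) -> None:
    cell.version = "1.1"  # stand-in for the real maintenance step


def smoke_test(cell: Cell) -> bool:
    return cell.version == "1.1"  # stand-in for a real post-upgrade test


cells = [Cell("cellA"), Cell("cellB")]
for line in rolling_maintenance(cells, apply_fix_pack, smoke_test):
    print(line)
```

The loop stops at the first failed verification, so a bad update stays confined to the cell that is already out of production.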
Obviously, the situation becomes more complex if the application upgrade requires a corresponding change to a shared database schema. When you use multiple cells for this type of upgrade, you need to consider additional database update strategies, because the cells that are still running in production will not recognize the new database schema.
For database updates, attempting to update a single database server that is still processing application data requests introduces operational complexity similar to trying to meet HA or CA service level requirements with a single cell while applying hardware or software maintenance at the same time. This is why the additional administrative work of multiple cells (which should be as simple as running the same administration script twice, once per cell) is a worthwhile trade-off for the resulting simplification of operational procedures.