The term dual-machine hot standby has a broad sense and a narrow sense.

In the broad sense, two servers back each other up and run the same service for an important workload. When one server fails, the other takes over its tasks, so the system keeps providing service automatically, without manual intervention. The standby server thus solves the problem of keeping the service running when the primary server fails. In practice more than two servers may be involved, which makes a server cluster. Dual-machine hot standby generally requires a shared storage device, although in some cases two independent servers can be used. It also requires professional cluster software or dual-machine software.

In the narrow sense, dual-machine hot standby refers specifically to the active/standby mode. Server data, including database data, is either written to two or more servers at the same time or kept on a shared storage device, and only one server runs the service at any time. When the running server fails, the standby server is activated by software detection (usually heartbeat diagnosis), so that the application is fully restored to normal use within a short time. (Related articles: Dual-machine Hot Standby, dual-machine mutual standby and dual-machine duplex.)

Typical dual-machine hot standby software such as PCL HA illustrates the typical modes:
- active/active mode
- active/standby mode

In fact, dual-machine hot standby may also expand into a multi-machine cluster (multi-machine cluster mode).

Dual-machine hot standby is generally used for applications that have a database or other data. Application servers in front of the data (or other services that perform no data writes) belong instead to the field of server load balancing.

Why is dual-machine hot standby required?
Dual-machine hot standby is designed to deal with server faults.
Server faults have many possible causes, such as equipment faults, operating system faults, and software faults. It generally takes technicians anywhere from 10 minutes to several hours, or even a few days, to restore a failed server to a normal state. In practice it usually takes more than a few hours, unless the server is simply restarted (which may leave the underlying risk in place). If the technicians are not on site, restoration takes even longer. For some important systems, users cannot tolerate such long service interruptions. Dual-machine hot standby is therefore needed to avoid long interruptions and keep the system reliably in service over the long term.

Deciding whether to use dual-machine hot standby
The correct approach is to analyze how important the system is and how much service interruption can be tolerated. That is: how long can your users wait for service to be restored, and how much damage results if it cannot be restored?

When considering dual-machine hot standby, note that it generally involves a switchover process, which may take about one minute. The service may be briefly interrupted during the switchover, and it returns to normal once the switchover is complete. Dual-machine hot standby is therefore not seamless or uninterrupted, but it does ensure that normal service is restored quickly after a system fault, so that the business is not seriously affected. Without it, a server failure may interrupt the service for several hours, with potentially very serious consequences.

The probability of a server fault is much higher than that of a fault in a switch or storage device, because servers are much more complex than switches and storage devices.
A server is a complex system that combines hardware, an operating system, and application software. Equipment faults can interrupt service, and software problems can equally prevent a server from working normally.

It should also be pointed out that other protective measures, such as disk arrays (RAID) and data backup, are very important but cannot replace dual-machine hot standby.
Dual-machine hot standby: shared storage versus pure software
Two typical hot standby approaches are available for databases. The first, the standard one, has both servers use a shared storage device (generally a shared disk array or a storage area network, SAN) with dual-machine software installed to implement hot standby; this is called the shared approach. The second is implemented purely in software and is generally called the pure software or mirroring approach.

In the shared approach, the database is placed on the shared storage device. The server providing the service reads and writes directly on that device, and after a switchover the other server reads the data from the same device.

In the pure software approach, mirroring software copies data to the other server in real time, so that both servers hold the same data; if one server fails, the service can switch to the other promptly.

The pure software approach can reduce cost to some extent, but it has obvious disadvantages:
1. Reliability is relatively poor: real-time data replication between the two servers is a fragile link.
2. Once a server is interrupted, complicated data synchronization and recovery are required, and the system is unprotected during that period.
3. There is no transaction mechanism. Because replication takes place at the file and disk layers, whether a copy succeeds has no bearing on the database's transaction operations, so there is a considerable risk of incomplete data.
It is therefore recommended that you do not choose a pure software solution unless you have to.

There is also a database parallel solution. Instead of copying files or disks, it routes and distributes database operations directly at the front end, updates the databases in parallel under a transaction mechanism, and can also provide database services in parallel. This approach is currently very successful for SQL Server applications and is much better than the shared storage plus dual-machine software approach.

Differences between dual-machine hot standby, mutual standby, and duplex
Dual-machine hot standby is the active/standby method. Server data, including database data, is either written to two or more servers at the same time or kept on a shared storage device. When the active server fails, the standby machine is activated by software detection (usually heartbeat diagnosis), so that the application is fully restored to normal use within a short time.

Dual-machine mutual standby builds on hot standby: two relatively independent applications run on the two machines at the same time, and each machine is also set up as the backup for the other. When one server fails, the other takes over the failed server's application within a short time, which keeps the application available. This is really an application of hot standby that avoids needing four servers to give two applications hot standby protection.

In dual-machine duplex, two or more servers are all active and run the same application at the same time, which maintains overall performance and provides both load balancing and mutual backup. It requires disk cabinet (array) storage, preferably a SAN. For database services it also needs support from the database software and is complicated to implement; for web servers or application servers it is relatively simple.
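The mutual standby idea described above, where each server normally runs its own application and absorbs its partner's application on failure, can be sketched roughly as follows. This is only an illustration: the function name, the server names, and the service names are all hypothetical, and real dual-machine software would also handle storage, network addresses, and application start-up.

```python
def services_after_failure(assignment: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Return the service placement after `failed` goes down (hypothetical helper)."""
    survivors = [server for server in assignment if server != failed]
    if not survivors:
        raise RuntimeError("no surviving server to take over")
    target = survivors[0]
    # Copy the surviving servers' service lists so the input is not modified.
    result = {server: list(apps) for server, apps in assignment.items() if server != failed}
    # The survivor takes over everything the failed server was running.
    result[target].extend(assignment[failed])
    return result


# Normal operation: each machine runs one application and backs up the other.
normal = {"server-a": ["database"], "server-b": ["mail"]}
print(services_after_failure(normal, "server-a"))  # {'server-b': ['mail', 'database']}
```

After the takeover one machine carries both workloads, which is why mutual standby places higher performance demands on each server than plain active/standby.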
Relationship between dual-machine hot standby and data backup
When planning dual-machine hot standby, some users ask: I already have RAID and tape backup, so do I still need a dual-machine system? Or: if I have a dual-machine system, do I still need tape backup?

RAID and data backup are both important. However, RAID only addresses hard disk failures, and backup only addresses recovery after a problem has occurred. Once the server itself has a problem, whether in its hardware or its software, the service may be interrupted, so RAID and data backup cannot solve the problem of service interruption. For systems that must provide continuous, reliable application service, a dual-machine configuration is very important. Simply consider how long it would take to get your server working again if it broke down, and whether your users could accept that, and the importance of a dual-machine system becomes clear.

RAID and tape backup remain necessary as well. RAID greatly improves system reliability at low cost, and it is far less complex than a dual-machine system. Hard disks are, after all, the most heavily used and most vulnerable components in the system. With RAID, a disk fault can be repaired easily, and the number of switchovers caused by server downtime is reduced.

Data backup is an essential measure, because RAID and dual-machine systems both keep real-time copies: any software error, virus, or operator mistake affects all copies of the data at the same time. Data must therefore be backed up (whatever media is used, at least one offline backup is recommended) so that it can be restored when it is damaged or lost.
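The point that real-time copies propagate mistakes, while an offline backup does not, can be shown with a toy model. This is a minimal sketch, not how any RAID or dual-machine product is implemented: the class `MirroredStore` and its methods are invented for illustration, and in-memory dictionaries stand in for disks.

```python
import copy


class MirroredStore:
    """Two replicas kept identical by applying every write to both (toy model)."""

    def __init__(self) -> None:
        self.replicas = [{}, {}]

    def write(self, key: str, value: str) -> None:
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key: str) -> None:
        # A deletion is replicated just as faithfully as a write.
        for replica in self.replicas:
            replica.pop(key, None)

    def offline_backup(self) -> dict:
        """Point-in-time copy that later operations cannot touch."""
        return copy.deepcopy(self.replicas[0])


store = MirroredStore()
store.write("invoice-1", "paid")
backup = store.offline_backup()

store.delete("invoice-1")   # an operator mistake, mirrored instantly
print(store.replicas)       # [{}, {}]
print(backup)               # {'invoice-1': 'paid'}
```

Both live replicas lose the record at the same moment, and only the point-in-time copy can restore it.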
Dual-machine hot standby vs single-machine fault tolerance
The mainstream server fault tolerance technologies in use today are server cluster technology, dual-machine hot standby technology, and single-machine fault tolerance technology. Their fault tolerance levels run from low to high in that order: server clusters offer the lowest level, and single-machine fault tolerance the highest. The fault tolerance requirements of the industries they serve therefore also run from low to high. This article mainly introduces the latter two technologies, starting with dual-machine hot standby.
I. Dual-machine hot standby technology
Dual-machine hot standby is a highly fault-tolerant solution that combines hardware and software. It consists of two server systems, an external shared disk array cabinet (or, where there is none, a RAID card in each server), and the corresponding dual-machine hot standby software, as shown in Figure 1.
Figure 1
In this solution, the operating system and applications are installed on the local system disks of the two servers, while the data of the whole network system is managed and backed up centrally through the disk array. With centralized data management, the dual-machine system reads and stores the data of all sites directly on the central storage device, which is looked after by professional staff; this greatly improves the security and confidentiality of the data. User data is stored on the external shared disk array. When one server fails, the standby machine actively takes over from the host so that network services are not interrupted.

The dual-machine hot standby system uses a heartbeat to maintain contact between the primary system and the standby system. The "heartbeat" is a communication signal that the primary and standby systems send each other at a fixed interval to report their current running status. If the heartbeat signal indicates that the host system is faulty, or the standby system stops receiving the host's heartbeat, the high availability management software concludes that the host has failed. The host then stops working and hands its system resources over to the standby system, which takes its place and keeps network services running.

Depending on how the two servers work, a dual-machine solution can operate in three modes: dual-machine hot standby mode, dual-machine mutual standby mode, and dual-machine duplex mode. Each is introduced briefly below.

Hot standby mode is usually called active/standby mode. The active server is working and the standby server is monitoring and ready. Server data, including database data, is written to two or more servers at the same time (each server usually uses a RAID disk array) to keep the data synchronized in real time. When the active server fails, the standby machine is activated, either by software detection or manually, so that the application is fully restored to normal use within a short time. Typical applications are securities fund servers or market quote servers. This mode is widely used, but because one server sits in the standby state for long periods, computing resources are wasted.

In mutual standby mode, two independent applications run on the two machines at the same time, and each machine is also set up as the backup for the other. When one server fails, the other takes over the failed server's application within a short time, which keeps the application available. The performance demands on each server are relatively high, so relatively well-configured servers are needed.

Duplex mode is a form of cluster. Both servers are active and run the same application at the same time, which maintains overall performance and provides both load balancing and mutual backup. Disk cabinet (array) storage is required, preferably a SAN. Web servers and FTP servers are often deployed this way.
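The heartbeat-based failure detection described above can be sketched from the standby machine's point of view. This is a simplified illustration under stated assumptions, not the logic of any particular dual-machine product: the class name, the one-second interval, and the three-missed-beats threshold are all invented for the example, and time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Standby-side view of the primary's heartbeat (illustrative only)."""
    interval: float = 1.0    # seconds between expected heartbeats (assumed)
    max_missed: int = 3      # missed beats tolerated before takeover (assumed)
    last_beat: float = 0.0   # time the most recent heartbeat arrived
    role: str = "standby"    # becomes "active" after a takeover

    def on_heartbeat(self, now: float, healthy: bool = True) -> None:
        """Record a heartbeat; a beat that reports a fault triggers takeover."""
        self.last_beat = now
        if not healthy:
            self.take_over()

    def check(self, now: float) -> str:
        """Called periodically; take over if heartbeats have stopped arriving."""
        timeout = self.interval * self.max_missed
        if self.role == "standby" and now - self.last_beat > timeout:
            self.take_over()
        return self.role

    def take_over(self) -> None:
        # Real software would take control of the shared storage and start
        # the application here; this sketch only records the role change.
        self.role = "active"


monitor = HeartbeatMonitor(interval=1.0, max_missed=3)
monitor.on_heartbeat(now=10.0)
print(monitor.check(now=12.0))  # standby (2.0 s of silence is within the 3.0 s timeout)
print(monitor.check(now=13.5))  # active  (3.5 s of silence exceeds the timeout)
```

Both triggers mentioned in the text appear here: a heartbeat that reports a fault, and the absence of heartbeats for longer than the timeout.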
II. Single-machine fault tolerance technology
As the analysis above shows, dual-machine hot standby uses two server systems with identical configurations, and the fault tolerance in a server cluster solution likewise relies on multiple servers. The single-machine fault tolerance technology introduced in this section instead achieves a high level of fault tolerance within one server. Its fault tolerance capability is far higher than that of server clusters and dual-machine hot standby, so it is better suited to industries with extremely demanding requirements, such as securities, telecommunications, finance, and medical care.

Previously, when a cluster system failed, the server's operation had to be interrupted and the service switched to the backup server, and some time passed before the fault was repaired and service restored. The cost and losses incurred during that time are what users least want to see. The biggest advantage of a fault-tolerant server is that it can automatically isolate the faulty module, switch modules and have the damaged part serviced without interrupting operation, and automatically resynchronize once the physical fault has been cleared, which effectively removes that worry. For this reason, fault-tolerant servers are challenging the hot standby and cluster technologies that emerged in previous years and are becoming increasingly popular. Even more notably, the technology can be implemented on industry-standard servers (IA architecture servers), and this highly competitive cost advantage has drawn a great deal of attention to fault-tolerant servers.
A fault-tolerant server duplicates all the hardware in the system, including the CPU, memory, and I/O bus, and keeps the copies in step by locking the CPU clock frequency. True fault tolerance is achieved through synchronous operation of all redundant components, so a fault in any one component causes neither a system pause nor data loss. Many current fault-tolerant systems are servers based on the IA architecture and fully compatible with Windows 2000, which brings to IA servers a level of fault tolerance that was previously achievable only outside them. This technology raises the reliability of an IA server to 99.999%, so that the server runs continuously.

Dual-machine hot standby and fault-tolerant servers are positioned slightly differently because of the difference in availability: hot standby can achieve 99.9% availability, whereas fault-tolerant servers can achieve 99.999%. As a result, dual-machine hot standby is mostly used in industries with less stringent business continuity requirements, such as public security systems, military systems, or individual manufacturing enterprises, whose applications can tolerate a short interruption. Industries with high requirements, such as telecommunications, finance, securities, and medical care, are the domain of fault-tolerant servers.

Note that dual-machine hot standby is not the same as a server cluster, a point that confuses many readers. Dual-machine hot standby usually requires the two servers to have the same configuration, whereas a server cluster has no strict requirement in this regard.
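To make the availability figures quoted above concrete, the short calculation below converts an availability percentage into the downtime it permits per year. It assumes a 365-day year and treats availability as the fraction of total time the service is up; the function name is invented for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def annual_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR * 60


for label, pct in [("dual-machine hot standby", 99.9), ("fault-tolerant server", 99.999)]:
    print(f"{label}: {pct}% -> {annual_downtime_minutes(pct):.1f} minutes/year")
# dual-machine hot standby: 99.9% -> 525.6 minutes/year
# fault-tolerant server: 99.999% -> 5.3 minutes/year
```

In other words, 99.9% allows nearly nine hours of interruption a year, while 99.999% allows only about five minutes.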
In addition, dual-machine hot standby requires at least two servers, so software procurement (operating system, middleware, dual-machine software, and so on), software maintenance and upgrades, and hardware upgrades require more than double the investment of the single-machine fault tolerance approach. Moreover, when the dual-machine software itself fails, maintenance is harder, which causes customers great difficulty. Therefore, although the hardware cost of a single fault-tolerant server is higher than that of a dual-machine configuration, its total cost of ownership (TCO) is far lower. In terms of configuration flexibility, however, dual-machine hot standby has the advantage: many hot standby solutions are assembled by system integrators who combine server products from different manufacturers to meet the needs of different customers. Overall, though, fault-tolerant servers are the future trend.

Common knowledge about hot standby
What is dual-machine hot standby?
Dual-machine hot standby uses two servers that back each other up to run the same service. One is the working machine (primary server) and the other is the backup machine (standby server). Under normal conditions, the working machine serves the application system while the backup machine monitors its running status (the working machine also checks whether the backup machine is healthy). If the working machine develops a fault while the application system is running, the backup machine takes over from it and continues to support key application services, so that the system keeps running without interruption.
Under what circumstances should I use dual-machine hot standby?
Decide based on the importance of the system and on how much service interruption end users can tolerate. For example, how long can network users wait for service to be restored, and what are the consequences if it cannot be restored quickly? Servers that carry an enterprise's key business applications need extremely high stability and availability and must provide uninterrupted service, so dual-machine hot standby is recommended for them.

I already have RAID and tape backup. Do I still need a dual-machine system? Or, if I have a dual-machine system, do I still need tape backup?
RAID and data backup are both important. However, RAID only addresses hard disk failures, and backup only addresses recovery after a problem has occurred. Once the server itself has a problem, whether in its hardware or its software, the service may be interrupted, so RAID and data backup cannot prevent service interruptions. A dual-machine system is still necessary where application services must be continuous and reliable. Data backup, in turn, is an essential measure for data security, because RAID and dual-machine systems both keep real-time copies: any software error, virus, or operator mistake affects all copies of the data at the same time. Therefore, even when key services run on a dual-machine solution, data must be backed up so that it can be restored when it is damaged or lost.

How do I select and implement a hot standby configuration?
1. Start from the application and, with high availability as the goal, analyze whether the requirement is real.
2. Select the specific devices, software, and models. Note that compatibility problems exist between different software products and storage devices such as hard disks, so consult professionals before purchasing to avoid ending up with dual-machine software that is incompatible with your storage devices.
3. After implementation is complete, test the system to confirm that it works properly. While it is in service, also check regularly that switchover still works normally.