Original address: http://blog.csdn.net/hilyoo/article/details/7704280
1. CAP theory
1) CAP theory identifies three basic properties:
- Consistency: any read operation always returns the result of the most recently completed write;
- Availability: every operation always returns within a bounded time;
- Partition tolerance: consistency and availability can still be provided in the presence of network partitions.
CAP theory states that the three cannot all be satisfied at the same time. There are many objections to the theory, but it remains highly valuable as a reference.
The theory is not an excuse for a design to ignore any of the three requirements; it only says that all three cannot be satisfied absolutely. In practice a project rarely requires absolute consistency or absolute availability, but it must find a balance and an optimum among the three.
For distributed data systems, partition tolerance is a basic requirement, so the design of a distributed data system usually comes down to a trade-off between consistency and availability (reliability). Most discussion of system performance and architecture likewise revolves around consistency and availability.
2) CAP in engineering practice: OpenStack Swift
Measured against CAP theory, OpenStack's distributed object storage system Swift satisfies availability and partition tolerance and does not guarantee strong consistency (it is optional); instead it achieves eventual consistency. If a Swift GET request does not include the 'X-Newest' header, the read may not return the latest version of the object; once the object has gone unmodified for longer than the consistency window, subsequent GET operations will return the latest version, which guarantees eventual consistency. Conversely, if the request does include the 'X-Newest' header, the GET always reads the latest object, which is strong consistency.
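As a small illustration of the read behavior just described, here is a minimal Python sketch using the requests library. The storage URL, token, container, and object names are hypothetical placeholders; in a real deployment they come from Keystone authentication.

```python
# Minimal sketch of reading a Swift object with and without the X-Newest header.
# The storage URL, token, container, and object names below are placeholders.
import requests

STORAGE_URL = "http://swift-proxy:8080/v1/AUTH_demo"   # hypothetical endpoint
TOKEN = "example-auth-token"                           # hypothetical token

def get_object(container, obj, newest=False):
    """Fetch an object; with newest=True, ask the proxy to return the newest replica."""
    headers = {"X-Auth-Token": TOKEN}
    if newest:
        headers["X-Newest"] = "true"
    resp = requests.get(f"{STORAGE_URL}/{container}/{obj}", headers=headers)
    resp.raise_for_status()
    return resp.content

# Eventually consistent read (may return a stale replica):
data = get_object("photos", "cat.jpg")
# Read that queries the replicas and returns the most recent copy:
latest = get_object("photos", "cat.jpg", newest=True)
```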
In the OpenStack architecture, much work needs to be done to ensure high availability. The availability of the OpenStack architecture is therefore discussed below:
Building OpenStack High Availability (HA, high availability)
2. OpenStack High Availability (OpenStack HA)
To figure out how to achieve high availability, you need to know which services are prone to being unreliable. Start with a rough picture of OpenStack's structure. OpenStack consists of five major components: compute (Nova), identity management (Keystone), image management (Glance), the front-end management dashboard, and object storage (Swift). Nova is the core compute and control component and includes services such as nova-compute, nova-scheduler, nova-volume, nova-network, and nova-api. The following diagram, borrowed from http://ken.people.info, shows the five components of OpenStack and their functions:
The next diagram depicts the functionality and service structure of each component. Like most distributed systems, OpenStack divides its nodes into two functional roles: control nodes and compute nodes. The control node provides all services other than nova-compute. These components and services can be installed independently and combined as needed. nova-compute runs on every compute node; it is either assumed to be trustworthy or given a backup machine for failover (although configuring a backup for every compute node seems to cost more than the benefit it brings). High reliability of the control node is therefore the main problem, and each of its components has its own high-reliability requirements and solutions.
(1) Since there is only one control node, and it is responsible for managing and controlling the entire system, what happens when the control node cannot provide normal service? This is the classic single point of failure (SPOF, single point of failure) problem.
High availability basically cannot be achieved with a single node; far more often it is achieved by designing the system so that a failed machine is taken over as quickly as possible, which costs considerably more.
For a single point of failure, the usual solution is redundant equipment or hot standby: hardware errors or human mistakes can always cause one or more nodes to fail, and node maintenance or upgrades sometimes require temporarily stopping some nodes, so a reliable system must be able to withstand the loss of one or more nodes.
Common deployment modes are: active-passive (primary/standby) mode, active-active (dual-active) mode, and cluster mode.
(2) So how do you build a redundant control node? And what other ways are there to achieve highly reliable control?
Many people will first think of an active-passive setup: use a heartbeat mechanism or something similar to keep a backup, and fail over to it to achieve high reliability. OpenStack does not natively support multiple control nodes, and with Pacemaker each of the various services has to implement this backup, monitoring, and switchover itself.
Looking carefully at the services provided by the control node, they are mainly nova-api, nova-network, nova-scheduler, nova-volume, plus Glance, Keystone, and the MySQL database, and these services are provided independently of each other. nova-api, nova-network, Glance, and others can run separately on every compute node; RabbitMQ can work in active-standby mode; and MySQL can use a redundant high-availability cluster.
Each of these is discussed below:
1) High reliability of nova-api and nova-scheduler
Each compute node can run its own nova-api and nova-scheduler, with load balancing used to make sure they work correctly. In this way, when the control node fails, the nova-api instances on the compute nodes carry on as usual (a hedged sketch of client-side failover across such redundant API endpoints is shown after item 2 below).
2) High reliability of nova-volume
For nova-volume there is currently no complete HA (high availability) solution, and there is still a lot of work to do. However, since nova-volume is driven by iSCSI, that protocol can be combined with DRBD, or with an iSCSI-capable high-reliability hardware solution, to achieve high reliability.
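The following is a minimal, hypothetical sketch of client-side failover across several redundant nova-api endpoints, as mentioned in item 1 above. The endpoint hostnames are placeholders; a real deployment would more commonly put a load balancer such as HAProxy in front of the API.

```python
# Hypothetical sketch: try each redundant nova-api endpoint in turn and use
# the first one that answers. Hostnames and the request path are placeholders.
import requests

NOVA_API_ENDPOINTS = [
    "http://compute-01:8774",   # one nova-api per compute node (hypothetical)
    "http://compute-02:8774",
    "http://compute-03:8774",
]

def call_nova_api(path, token, timeout=3):
    """Return the JSON response from the first healthy nova-api endpoint."""
    last_error = None
    for endpoint in NOVA_API_ENDPOINTS:
        try:
            resp = requests.get(endpoint + path,
                                headers={"X-Auth-Token": token},
                                timeout=timeout)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as exc:
            last_error = exc    # endpoint is down or unhealthy; try the next one
    raise RuntimeError(f"all nova-api endpoints failed: {last_error}")
```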
3) High reliability of the network service nova-network
OpenStack networking already has a variety of high-reliability options; often you only need to use the --multi_host option to run the network service in high-availability mode. This is described in Existing High Availability Options for Networking.
Solution 1: multi-host
Multiple hosts. Configure nova-network on every compute node. Each compute node then implements NAT, DHCP, and gateway functions itself, which inevitably adds some overhead; this can be combined with a hardware gateway to avoid running the gateway function on every compute node. In this mode every compute node needs nova-compute, nova-network, and nova-api installed, and must be able to reach the external network. See Nova Multi-host Mode against SPoF for details.
Solution 2: failover
Failover. Service can be switched to a hot standby within 4 seconds; see https://lists.launchpad.net/openstack/msg02099.html for details. The disadvantages are that a backup machine is required and there is a 4-second delay.
Solution 3: multi-NIC
Multi-NIC technology. VMs are bridged to multiple networks, so each VM has two outgoing routes available for failover. This, however, requires listening on multiple networks and designing a switchover strategy.
Solution 4: hardware gateway
Hardware gateway. An external gateway device must be configured. Because VLAN mode requires a gateway per network, while the hardware gateway approach can only provide a single gateway for all instances, this solution cannot be used in VLAN mode.
Solution 5: Quantum (in OpenStack's next release, Folsom)
The goal of Quantum is to progressively implement a fully functional virtual network service. It remains compatible with the old nova-network functions such as Flat and FlatDHCP, and it implements functionality similar to multi_host, enabling OpenStack to work in active-backup high-availability mode.
Quantum requires only one running instance of nova-network and therefore cannot work in multi_host mode.
Quantum allows a single tenant to have multiple private, dedicated L2 networks; with QoS enforcement this should, in the future, allow workloads such as Hadoop clusters to run well on Nova nodes.
For installing and using Quantum, the article Quantum Setup gives an introduction.
4) High reliability of Glance and Keystone
OpenStack images can be stored in Swift, and Glance can run on multiple hosts. Integrating OpenStack ImageService (Glance) with Swift describes how to use Swift as Glance's storage back end.
The cluster management tool Pacemaker is a powerful high-availability solution that manages multi-node clusters and handles service switchover and migration; it can be used together with Corosync and Heartbeat. Pacemaker can flexibly implement active-passive, N+1, N-to-N, and other modes.
Bringing-high-availability-openstack-keystone-and-glance describes how to achieve high reliability of Keystone and Glance through Pacemaker. After the OCF agents are installed on each node, they can tell the other nodes whether the Glance and Keystone services are running correctly, and help Pacemaker start, stop, and monitor these services.
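To give a feel for what such monitoring boils down to, here is a hedged Python illustration of an OCF-style monitor action: probe the service's API endpoint and translate the result into the exit codes Pacemaker understands. The endpoint URLs are local defaults used as placeholders; the real Keystone and Glance OCF agents are shell scripts shipped with the resource-agent packages, not this code.

```python
# Illustration only: an OCF-style "monitor" probe for Keystone or Glance.
# Pacemaker interprets exit code 0 as "running" and 7 as "not running".
import sys
import requests

OCF_SUCCESS = 0        # resource is running
OCF_NOT_RUNNING = 7    # resource is cleanly stopped / not running

SERVICES = {
    "keystone": "http://127.0.0.1:5000/",   # placeholder local endpoints
    "glance":   "http://127.0.0.1:9292/",
}

def monitor(name):
    """Return an OCF-style exit code for the named service."""
    try:
        requests.get(SERVICES[name], timeout=2)
        return OCF_SUCCESS
    except requests.RequestException:
        return OCF_NOT_RUNNING

if __name__ == "__main__":
    service = sys.argv[1] if len(sys.argv) > 1 else "keystone"
    sys.exit(monitor(service))
```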
5) High reliability of Swift object storage
In general, no extra HA needs to be built for OpenStack's distributed object storage system Swift, because Swift was designed from the start to be distributed (with no master node), fault tolerant, redundant, able to recover data, scalable, and highly reliable. Some of Swift's advantages are listed below, which illustrates this point:
- Built-in replication (N copies of accounts, containers, objects), 3x+ data redundancy compared to 2x on RAID: a built-in redundancy mechanism; RAID typically keeps only two copies, while Swift keeps at least three.
- High availability: high reliability.
- Easily add capacity, unlike RAID resize: easy storage expansion.
- Elastic data scaling with ease: convenient capacity expansion.
- No central database: no central node.
- Higher performance, no bottlenecks: high performance without bottleneck limits.
6) High reliability of the message queue service RabbitMQ
A RabbitMQ failure can result in lost messages; several HA mechanisms are available:
- The publisher confirms mechanism notifies the sender once a message has been written to disk, so after a failure it is known which messages need to be re-sent.
- A multi-machine cluster mechanism; however, the failure of a node can easily make its queues unavailable.
- Active-passive (primary/standby) mode, which fails over when a fault occurs, but starting the standby machine may introduce delay or even fail.
In terms of disaster tolerance and availability, RabbitMQ provides durable queues, so that unhandled messages can be persisted to disk and survive a crash of the queue service. To avoid losing messages in the window between sending a message and writing it to disk, RabbitMQ introduces the publisher confirm mechanism, which confirms that a message has actually been written to disk (a code sketch of these mechanisms follows at the end of this subsection). For clustering it offers two modes, active/passive and active/active. In active/passive mode, for example, once a node fails the passive node is activated immediately and quickly takes over from the failed active node, assuming responsibility for message delivery:
Figure: active/passive cluster (image from the RabbitMQ official website)
The active-passive model has the problems mentioned above, so a dual-active (active-active) mechanism based on the RabbitMQ cluster was introduced to solve them. The article http://www.rabbitmq.com/ha.html details RabbitMQ's high-reliability deployment and principles.
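Below is a minimal sketch of the durable-queue and publisher-confirm mechanisms described above, using the pika client library. The broker hostname and queue name are placeholders.

```python
# Minimal sketch: a durable queue plus publisher confirms with pika.
# Host and queue name are placeholders.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit-host"))
channel = connection.channel()

# Durable queue: the queue definition survives a broker restart.
channel.queue_declare(queue="tasks", durable=True)

# Publisher confirms: basic_publish now raises an error if the broker
# does not acknowledge the message.
channel.confirm_delivery()

channel.basic_publish(
    exchange="",
    routing_key="tasks",
    body=b"example message",
    properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    mandatory=True,
)
print("message confirmed by the broker")
connection.close()
```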
7) High reliability of MySQL database
MySQL Cluster by itself is not highly reliable. A commonly used way to build highly reliable MySQL is the active-passive (master/standby) mode: use DRBD for master/standby disaster recovery, Heartbeat or Corosync for heartbeat monitoring, service switchover and even failover, and Pacemaker for switching and controlling the services (resources), or similar mechanisms. The main approach is to use Pacemaker to build a MySQL active-passive high-availability cluster.
One important technique here is DRBD (Distributed Replicated Block Device), which is often used in place of a shared disk.
It works as follows: when a write request is issued to the designated disk device on host A, the data is passed to host A's kernel, then a kernel module sends the same data to host B's kernel, and host B writes it to its own designated disk device. In this way the data on the two hosts stays synchronized and write operations become highly available. DRBD normally runs with one primary and one secondary; all reads, writes, and mounts can only be performed on the primary node, but the primary and secondary roles can be swapped. Here is an introduction to DRBD.
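As a purely conceptual illustration of the synchronous-replication idea behind DRBD, the toy Python sketch below acknowledges a write only after both the local copy and the peer's copy have been written. DRBD itself is a kernel module operating on block devices; the peer address and file path here are placeholders.

```python
# Toy user-space illustration of synchronous replication (DRBD-like idea):
# a write returns only after the local write and the peer's acknowledgement.
import os
import socket

PEER = ("host-b", 7789)                  # placeholder peer address
LOCAL_FILE = "/tmp/drbd-demo-local.img"  # placeholder local "device"

if not os.path.exists(LOCAL_FILE):
    open(LOCAL_FILE, "wb").close()       # create the demo file if missing

def replicated_write(offset, data):
    """Write locally, then wait for the peer's acknowledgement before returning."""
    # 1. Local write.
    with open(LOCAL_FILE, "r+b") as f:
        f.seek(offset)
        f.write(data)
    # 2. Ship the same data to the peer and wait for its ACK
    #    (synchronous, analogous to DRBD protocol C).
    with socket.create_connection(PEER, timeout=5) as conn:
        conn.sendall(offset.to_bytes(8, "big") + data)
        if conn.recv(3) != b"ACK":
            raise IOError("peer did not acknowledge the write")
```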
Hafornovadb-openstack describes how to make OpenStack's database highly reliable through Pacemaker using only a shared disk, without DRBD.
Novazookeeperheartbeat describes the use of ZooKeeper for heartbeat detection.
MySQL HA with Pacemaker describes the use of Pacemaker to provide a highly reliable MySQL service, which is also a common solution. Galera, an open-source synchronous multi-master cluster for MySQL/InnoDB, offers many advantages (synchronous replication, reads and writes on any node, automatic membership control, automatic node joining, low latency, and so on) and is also worth a look. The working relationship of Pacemaker, DRBD, and MySQL is shown in the following diagram:
As for other options, according to the MySQL Performance Blog, the availability of several MySQL high-availability solutions compares as follows:
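To show why a Pacemaker-managed floating virtual IP makes such a MySQL failover largely transparent to applications, here is a hedged sketch using the pymysql library: the client always connects to the VIP and simply reconnects after a failover. The VIP address, credentials, and example query are hypothetical.

```python
# Hypothetical sketch: connect to MySQL through a Pacemaker-managed virtual IP
# and retry after a failover. Address, credentials, and query are placeholders.
import time
import pymysql

MYSQL_VIP = "192.168.1.100"   # floating IP that always points at the active node

def query_with_retry(sql, retries=5, delay=2):
    """Run a query against whichever node currently holds the VIP."""
    for _ in range(retries):
        try:
            conn = pymysql.connect(host=MYSQL_VIP, user="nova",
                                   password="secret", database="nova")
            try:
                with conn.cursor() as cur:
                    cur.execute(sql)
                    return cur.fetchall()
            finally:
                conn.close()
        except pymysql.err.OperationalError:
            # The active node may be in the middle of failing over;
            # wait a moment and reconnect to the same VIP.
            time.sleep(delay)
    raise RuntimeError("MySQL still unreachable after failover retries")
```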
3. Building a highly available OpenStack (high-availability OpenStack)
In general, high availability means creating redundant backups. Common strategies are:
- Cluster mode. Multiple machines work together and every instance is backed up in several copies, with no central node; examples are the distributed object storage system Swift and nova-network's multi-host mode.
- Autonomous mode. Sometimes a single point of failure (SPOF) can be avoided simply by letting each node work autonomously, removing the master-slave dependence on a central control node; for example, each nova-api instance is responsible only for its own node.
- Active-passive (primary/standby) mode. A common mode in which the passive node listens and acts as a backup and is switched to promptly on failure; examples are the MySQL high-availability cluster, and Glance and Keystone made highly available with Pacemaker and Heartbeat.
- Dual-master (active-active) mode. In this mutually supporting mode, RabbitMQ achieves high availability with an active-active cluster whose nodes replicate the queues among themselves, so architecturally there is no need to worry about the passive node failing to start or the switchover delay being too long.
In summary, OpenStack deployment and application are constantly being tried out, and OpenStack itself is continuously optimized and improved; tuning has to happen in practice. Practice matters a great deal: good designs and ideas need to be proven in practice.
Each of the OpenStack services has been described above; this is merely a starting point for discussion. I hope to experiment more (you can refer to the earlier one-click deployment of OpenStack) and, time permitting, to add some high-reliability options to the OneStack/HAStack deployment. For discussions and descriptions of OpenStack high availability, see: http://docs.openstack.org/, http://wiki.openstack.org, http://www.hastexo.com/blogs/florian/2012/03/21/high-availability-openstack, Existing High Availability Options for Networking, Bringing-high-availability-openstack-keystone-and-glance, Quantum Setup, MySQL HA with Pacemaker, and http://www.rabbitmq.com/ha.html.