Ignacio M. Llorente of the OpenNebula project recently published a blog article entitled "Eucalyptus, CloudStack, OpenStack and OpenNebula: A Tale of Two Cloud Models", which analyzes the differences between the four cloud management platforms from the perspective of application scenarios. Ignacio argues that VMware vCloud and AWS represent two typical scenarios, data center virtualization and on-demand provisioning of computing resources, and that each of the four open source platforms essentially takes either VMware vCloud or AWS as its reference prototype while differing from that prototype in implementation details. He describes the distance between an open source cloud platform and its reference prototype as flexibility, and uses data center virtualization, on-demand provisioning, low flexibility, and high flexibility as the axes of a four-quadrant chart in which the four platforms occupy different positions, as shown in the following illustration. Ignacio further points out that the chart is not meant to show that one open source cloud platform is superior to the others, but that different platforms suit different customer requirements and different scenarios. The private cloud market today is large, and customer needs and application scenarios vary widely, so no single cloud management platform can cover every scenario. Eucalyptus, CloudStack, OpenStack and OpenNebula will both compete and cooperate in the future, and through that competition and cooperation each will find its own market and customers.
I basically agree with Ignacio M. Llorente that different cloud management platforms suit different customer requirements and scenarios, and that no cloud management platform can cover them all. By the same token, Eucalyptus has scenarios it handles well and scenarios it does not. As a Eucalyptus employee, I naturally hope that users from all walks of life will build their private clouds with Eucalyptus. But to get the most out of Eucalyptus, I recommend that potential customers first understand what Eucalyptus is (and is not), what it can (and cannot) do, and how to plan, implement, and use a private cloud based on Eucalyptus.
What Eucalyptus is (and is not)
Eucalyptus is an open source cloud management platform that is highly compatible with AWS. Other cloud management platforms that use AWS as a reference prototype (such as OpenStack) are compatible with the AWS API to some degree, but only Eucalyptus treats fidelity to the AWS API as a matter of corporate strategy and core competency. Faithful compatibility with the AWS API means that customers can keep using their existing AWS-compatible tools, scripts, and images (AMIs) in a private cloud environment, migrate workloads and data between a Eucalyptus private cloud and the AWS public cloud, or use a Eucalyptus private cloud as the development and test environment while using the AWS public cloud for production.
In Ignacio M. Llorente's four-quadrant chart of cloud management platforms, VMware vCloud and AWS represent two typical scenarios: data center virtualization and on-demand provisioning of computing resources. A cloud management platform aimed at data center virtualization usually takes a bottom-up architecture designed to handle the complexity of the data center. A platform aimed at on-demand computing resources usually takes a top-down architecture designed to deliver computing resources through a simple and efficient interface. These differences in design philosophy make it difficult for one cloud management platform to have all the features of both VMware vCloud and AWS. Eucalyptus's high compatibility with AWS means that Eucalyptus is not a substitute for VMware vCloud or VMware vCenter. Eucalyptus and VMware are trying to solve different problems in different scenarios, and therefore have different capabilities and features. Users often compare Eucalyptus with VMware vCloud or VMware vCenter feature by feature, without realizing that they are not the same type of software and cannot be compared directly in terms of functions or features.
It is worth mentioning that although Eucalyptus is an open source product, the Eucalyptus company does not provide software customization services for individual customers. Potential customers often ask whether we can provide a customized version. As the main developer of the open source project, Eucalyptus certainly has the ability to build specific versions for specific customers, but it generally does not. A customized version is a modified product, which means that users of a customized version may face unpredictable risks when upgrading to a later release of Eucalyptus. Customization may solve some problems for the customer in the short term, but in the long run it brings more problems than it solves. (Eucalyptus is an open source project, so customers who are willing and have the development capability can of course customize Eucalyptus themselves. Users who do so, however, can run into the same upgrade problems.)
What can Eucalyptus do?
The latest release of Eucalyptus is 3.2.1, which works well for development and test environments and for supporting a variety of scalable web services. What both scenarios have in common is the extensive use of non-persistent virtual machine instances (ephemeral instances), with Elastic Block Store (EBS) volumes holding the persistent data. Eucalyptus also supports persistent virtual machine instances that boot from EBS (BfEBS), but for architectural reasons a large number of BfEBS instances in a cluster reduces the performance of the entire cluster. The more BfEBS instances a cluster runs, the worse the cluster performs. We therefore do not recommend that customers run a large number of BfEBS instances on Eucalyptus.
Eucalyptus is also a poor fit for disk IO-intensive applications, such as database applications that need high-speed disk reads and writes. Strictly speaking, this is not a problem with Eucalyptus itself but with the underlying virtualization technology. Current virtualization technologies such as VMware ESX, Xen, and KVM have largely solved the performance loss for CPU and memory, but still incur some performance loss for disk IO. We therefore do not recommend that customers run disk IO-intensive applications, including heavily loaded databases, on virtual machines.
In VMware vCenter, system administrators can customize network parameters for each application according to its characteristics. A system administrator who expects the same ability in Eucalyptus will soon be disappointed. To provide computing resources in a simple and efficient way, Eucalyptus manages the network configuration of the entire private cloud as automatically as possible, leaving little room for system administrators to tune it freely.
Eucalyptus hardware topology
Next, we introduce a few typical hardware topologies to give you a deeper understanding of the scenarios that suit Eucalyptus. The topology diagrams use the following abbreviations:
CLC - Cloud Controller, the front-end component in Eucalyptus
Walrus - the Eucalyptus object storage service, similar to Amazon S3
CC - Cluster Controller, manages a Eucalyptus cluster
SC - Storage Controller, provides the Elastic Block Store (EBS) service for a Eucalyptus cluster
NC - Node Controller, a compute node
GE - Gigabit network
10GE - 10-Gigabit network
FC - Fibre Channel
SAN - SAN storage
The above topology shows a small Eucalyptus private cloud with only one compute cluster. In this private cloud, all servers are connected to a gigabit network switch. The CLC and Walrus are deployed together on one physical server, the CC and SC are deployed together on another, and SAN storage is connected to the Walrus and SC servers via Fibre Channel. To reduce costs further, the SAN storage devices can be replaced with DAS storage devices.
Under such a topology, Eucalyptus uses the open source iSCSI TGT driver to provide the EBS service. Extensive practice has shown that the open source iSCSI TGT driver has a stability problem: under heavy storage load it can crash for no apparent reason. (This problem is not specific to Eucalyptus; it occurs in any application that uses the open source iSCSI TGT driver.) An unstable EBS service means that a running virtual machine may suddenly lose access to its attached EBS volumes, and that a BfEBS instance booted from EBS may suddenly crash. Such a topology is not much of a problem in development and test environments where data persistence requirements are low, but we do not recommend it for production environments that require a high level of data persistence.
In addition to the EBS stability problem, the bottleneck of this topology is the gigabit link between the SC and the rest of the compute cluster. The throughput of the entire compute cluster accessing the EBS service is limited by gigabit bandwidth; the instance counts below imply an effective throughput of roughly 80 to 100 MB/s. Assuming an average load of 2 MB/s per EBS instance, a compute cluster can support 40 to 50 EBS instances at the same time. At an average of 4 MB/s per EBS instance, it can support only 20 to 25. Note that 4 MB/s is comparable to the throughput of a mid-range USB 2.0 flash drive and far below that of the 7200 RPM SATA drives commonly found in older laptops, while 2 MB/s is extremely low. If the customer wishes to use the EBS service heavily under such a topology, we recommend connecting the SC to the switch through a 10-gigabit link. (Many access switches now have two to four 10-gigabit ports, so the cost of this change is small.) After this simple change, the effective throughput of the EBS service increases about tenfold, which essentially eliminates the bandwidth bottleneck.
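The bandwidth arithmetic in the paragraph above can be reproduced with a short Python sketch. The function name is hypothetical, and the 80 to 100 MB/s effective throughput is an assumption inferred from the instance counts quoted in the text rather than a measured value.

```python
def max_ebs_instances(effective_mb_per_s: float, per_instance_mb_per_s: float) -> int:
    """Return how many EBS-backed instances a shared storage link can sustain.

    effective_mb_per_s: usable throughput of the link between the SC and the
        compute cluster (assumed, not measured).
    per_instance_mb_per_s: average sustained EBS load of one instance.
    """
    if effective_mb_per_s < 0 or per_instance_mb_per_s <= 0:
        raise ValueError("throughput values must be positive")
    return int(effective_mb_per_s // per_instance_mb_per_s)


if __name__ == "__main__":
    # Assumed effective throughput range of a gigabit link, in MB/s.
    for link in (80, 100):
        for load in (2, 4):
            count = max_ebs_instances(link, load)
            print(f"link={link} MB/s, load={load} MB/s per instance -> {count} instances")
```

Multiplying the link throughput by ten, as a 10-gigabit connection would roughly do, raises every result by the same factor.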
In a production environment, we recommend that customers use an IP SAN storage device to provide the EBS service. The above topology shows a Eucalyptus private cloud that can be used in production. In this private cloud, all Eucalyptus front-end components (CLC, Walrus, CC, SC) are deployed on separate physical servers, and all components that may carry heavy traffic (IP SAN, Walrus, CC, SC) are connected to the private cloud via gigabit networks. Eucalyptus currently supports IP SAN devices from multiple vendors, such as EMC, EqualLogic, and NetApp. With these devices, Eucalyptus uses the vendor's officially supported iSCSI driver to provide the EBS service, which is far more stable and reliable than the open source iSCSI TGT driver.
Even so, we do not recommend that customers run a large number of BfEBS instances on Eucalyptus. Eucalyptus was designed to encourage users to use non-persistent virtual machine instances as much as possible. In that case the virtual machine disk image is stored on the compute node, disk IO inside the virtual machine puts no load on the private cloud network, and virtual machines running on different compute nodes hardly affect each other. A BfEBS instance, by contrast, is a persistent instance that interacts with the storage device constantly and puts heavy pressure on network bandwidth. Our general recommendations to customers are therefore: (1) use non-persistent virtual machine instances whenever possible; (2) if you must use a BfEBS instance, make the operating system disk as small as possible, on the order of gigabytes; and (3) separate the operating system disk from the data disk, that is, start a small BfEBS instance and attach a larger EBS volume to store persistent data.
The above topology can be further modified to separate the storage network from the service network. This way storage traffic does not affect service traffic, and health monitoring of the private cloud components can also be done over the storage network.
When a compute cluster reaches its capacity limit, a new compute cluster can be added to the private cloud to expand it, which results in the above topology.
Recommendations for the use of Eucalyptus
As we said before, different cloud management platforms have different design philosophies and suit different scenarios. Some potential customers expect that investing in hardware X and installing software Z in topology Y will let them cope with every type of application; this is basically unrealistic. Cloud computing in the style of AWS is more a new concept for managing and allocating computing resources than a new technology. Using an AWS-like cloud service requires users to understand some basic concepts, and often requires some modification of the application to achieve the best results. The general recommendations we provide to Eucalyptus customers include:
(1) Use non-persistent virtual machine instances whenever possible;
(2) When a BfEBS instance must be used, keep the disk image as small as possible;
(3) Use EBS volumes to store persistent data, and separate the operating system disk from the data disk;
(4) Do not run disk IO-intensive applications on virtual machines;
(5) Do not scale virtual machines vertically; increase the processing capacity of the application by scaling horizontally;
(6) Achieve high availability of the application through load balancing;
(7) Avoid live migration at the virtual machine level; when a physical server needs maintenance, use the Eucalyptus maintenance mode.
As shown in the figure above, we recommend that users deploy the same application to multiple compute clusters.
When a compute cluster fails, the application is still available, but its processing power is reduced.
For planned system maintenance, you can migrate the application load to a compute cluster that does not need maintenance, and then carry out maintenance on the idle compute cluster. This preserves both the availability and the processing capacity of the application.
It should be noted that the "load migration" mentioned above does not mean moving a running virtual machine from one compute cluster to another. It means creating a new virtual machine instance from the same EMI in another compute cluster and redirecting the application load to the new instance through the load balancing settings. So far, Eucalyptus does not provide live migration at the virtual machine level (similar to VMware vMotion). In cloud services like AWS, end users are completely isolated from the underlying infrastructure through multiple levels of abstraction. The end user sees only his own virtual machines as computing resources. Which physical server a virtual machine runs on, and how loaded that server is, is something the end user should not need to know. Exposing live migration of virtual machines to end users would therefore violate the principle of isolating end users from the infrastructure through abstraction.
A feature called maintenance mode is about to be provided in Eucalyptus 3.3. It allows the system administrator to flag a specific physical server in a compute cluster as being under maintenance. The system then automatically migrates all virtual machine instances on that physical server to other physical servers in the same compute cluster. This lets system administrators maintain a specific physical server without draining the entire compute cluster of its load.
When the load on the application increases, the processing capacity of the application is increased by scaling out.
The instance monitoring (Monitoring), Elastic Load Balancing (ELB), and Auto Scaling (AS) features to be provided in Eucalyptus 3.3 make it easier to scale an application horizontally. Simply put, instance monitoring lets users monitor their virtual machine instances, including CPU, memory, disk IO, network IO, and so on. Auto Scaling lets the user set some simple trigger parameters, and automatically creates or destroys virtual machine instances when the monitored values reach those triggers. Elastic Load Balancing is responsible for updating the load balancing policy associated with the application, automatically adding newly created virtual machine instances and removing old ones.
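The trigger logic described above can be illustrated with a minimal Python sketch. It is not the Eucalyptus or AWS Auto Scaling API: the `ScalingPolicy` class, its field names, and the one-instance step size are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical trigger parameters for threshold-based scaling."""
    scale_out_above: float  # metric value above which one instance is added
    scale_in_below: float   # metric value below which one instance is removed
    min_instances: int
    max_instances: int


def desired_instance_count(current: int, metric: float, policy: ScalingPolicy) -> int:
    """Decide how many instances should run given one monitored metric sample."""
    if metric > policy.scale_out_above:
        target = current + 1
    elif metric < policy.scale_in_below:
        target = current - 1
    else:
        target = current
    # Never leave the configured bounds, whatever the metric says.
    return max(policy.min_instances, min(policy.max_instances, target))


if __name__ == "__main__":
    # Example: average CPU utilisation in percent as the monitored metric.
    policy = ScalingPolicy(scale_out_above=70.0, scale_in_below=20.0,
                           min_instances=2, max_instances=6)
    for cpu in (85.0, 50.0, 10.0):
        print(cpu, desired_instance_count(3, cpu, policy))
```

In a real deployment the load balancer would then register the new instances or deregister the removed ones, which is the role the text assigns to Elastic Load Balancing.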
To sum up, this gives the general application architecture that we propose to our customers. As you can see, it is basically consistent with the familiar web application architecture. A typical web application may need only minor changes (or none at all, if horizontal scaling was fully considered when it was first designed) to take full advantage of the features of a Eucalyptus private cloud.
In addition, regarding the use of the EBS service, interested users are encouraged to read the article "Why EBS is a bad idea". Although I do not entirely agree with its point of view, many of the author's observations and reflections deserve careful thought from cloud administrators and application developers.
Eucalyptus capacity planning
Readers now have a general idea of what Eucalyptus is (and is not), what it can (and cannot) do, and how to use a private cloud based on Eucalyptus. If the descriptions above match your scenario and you want to build your private cloud with Eucalyptus, we recommend doing some simple capacity planning first. The basic approach to capacity planning is to map the predictable load onto physical resources, to make sure the private cloud has enough capacity to carry the application and business load. In general, the parameters to examine include CPU (number of physical cores), memory, network (throughput), and storage (throughput and IOPS).
For CPU and memory, a simple conversion can be made using standardized VM types, and the result can be expressed as the number of instances of a given VM type that a particular hardware device can support.
For the network, it is necessary to sample the network behavior of the applications that will run on the private cloud, obtain their real traffic characteristics, and compare them with the network configuration of the private cloud.
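As a rough illustration of this conversion, the following Python sketch computes how many instances of a VM type fit on one compute node. The VM type names and sizes, the reserved-memory parameter, and the rule of one vCPU per physical core (no overcommit) are assumptions for the example, not Eucalyptus defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmType:
    """A standardized VM type; the values used below are hypothetical."""
    name: str
    vcpus: int
    memory_gb: int


def instances_per_node(cores: int, memory_gb: int, vm: VmType,
                       reserved_memory_gb: int = 0) -> int:
    """Return how many instances of `vm` fit on one compute node.

    Assumes one vCPU per physical core (no CPU overcommit) and that
    `reserved_memory_gb` is set aside for the host operating system.
    """
    by_cpu = cores // vm.vcpus
    by_memory = (memory_gb - reserved_memory_gb) // vm.memory_gb
    # The scarcer resource determines the capacity.
    return max(0, min(by_cpu, by_memory))


if __name__ == "__main__":
    # Hypothetical node: 16 physical cores, 64 GB RAM, 4 GB kept for the host.
    for vm in (VmType("small", 1, 2), VmType("medium", 2, 4), VmType("highmem", 1, 8)):
        print(vm.name, instances_per_node(16, 64, vm, reserved_memory_gb=4))
```

Multiplying the per-node result by the number of compute nodes gives a first estimate of cluster capacity for that VM type.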
For storage, the capacity of a cluster is often limited by IOPS rather than by throughput. Last month Graziano Obertelli, co-founder of the Eucalyptus company, wrote a blog article titled "Will My Internet Be Faster?", with a lengthy discussion of how to do capacity planning based on IOPS. I have translated Graziano Obertelli's blog article as "Will my network become faster?", which can be used for reference.
Service level agreement
Customers often ask us: "After deploying Eucalyptus, can you guarantee that my cloud hosts will never go down?" Frankly, neither Eucalyptus nor any other cloud management platform can guarantee that.
Downtime is an interesting parameter in the data center world, and it is usually discussed in terms of a service level agreement (SLA). A service level agreement is part of a service contract and is usually expressed as an annual uptime percentage. The total downtime allowed in a year follows directly from that percentage. Common SLA terms include 99% (3.65 days of downtime a year, that is, 87.6 hours), 99.9% (about 8.8 hours a year), and 99.99% (about 53 minutes a year). Linode's SLA is 99.9% (8.8 hours a year), AWS's SLA is 99.95% (4.4 hours a year), Aliyun's SLA is 99.9% (8.8 hours a year), and Shanda Cloud makes no SLA commitment. In general, a higher SLA means higher investment in hardware, software, and human resources. So far, no product or technology can guarantee a 100% SLA.
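The downtime figures above follow from simple arithmetic, sketched here in Python. The function name is hypothetical, and a year is assumed to have 365 days (8760 hours).

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Return the downtime per year, in hours, permitted by an annual uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.95, 99.99):
        hours = allowed_downtime_hours(sla)
        print(f"{sla}% uptime -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why higher SLA terms cost disproportionately more to deliver.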
Other
Successful use of cloud computing requires attention to hardware and topology, software architecture, application scenarios, capacity planning, service level agreements, and more. As we said earlier, cloud computing in the style of AWS is more a new concept for managing and allocating computing resources than a new technology. Using an AWS-like cloud service requires users to understand some basic concepts, and often requires some modification of the application to achieve the best results. Successful cloud computing requires both service providers and service consumers to prepare for it. If you just want to use cloud computing the old way, it is almost impossible to succeed.