IBM launched the Blue Cloud computing platform on November 15, 2007, offering it to customers as a cloud computing platform they can buy. It comprises a range of cloud computing products that allow computing to run in a network-style environment by building a distributed, globally accessible resource structure that is not limited to local machines or to remote server farms (i.e., server clusters).
Through IBM's technical white paper we can glimpse the inner structure of the Blue Cloud computing platform. Blue Cloud builds on IBM's expertise in large-scale computing and is based on open standards and open-source software, supported by IBM software, systems technology, and services. Simply put, Blue Cloud is based on the cloud infrastructure of the IBM Almaden Research Center and includes Xen and PowerVM virtualization, Linux operating system images, and the Hadoop file system with its parallel processing framework. Blue Cloud is supported by IBM Tivoli software, which manages the servers to ensure optimal performance under the given demands. This includes software that can allocate resources across multiple servers in real time, giving customers a seamless experience while accelerating performance and ensuring stability even in the most demanding environments. The newly released Blue Cloud offering helps users build their own cloud computing environments: it integrates Tivoli, DB2, WebSphere, and hardware products (currently x86 blades) to construct a distributed, globally accessible resource structure for the enterprise. According to IBM's plan, the first Blue Cloud products, supporting blade server systems based on POWER and x86 processors, would be launched in 2008, followed by a cloud environment based on System z mainframes and another based on high-density rack clusters.
In the IBM cloud computing white paper, we can see the following configuration of the Blue Cloud computing platform.
Figure 4 illustrates the high-level architecture of Blue Cloud computing. As the figure shows, the Blue Cloud computing platform consists of a data center together with IBM Tivoli provisioning software (Tivoli Provisioning Manager), IBM Tivoli monitoring software (IBM Tivoli Monitoring), the IBM WebSphere Application Server, the IBM DB2 database, and a number of virtualization components. The architecture in the diagram mainly describes the back-end architecture of the cloud and does not cover the front-end user interface.
There is nothing special about the hardware platform of the Blue Cloud, but its software platform differs from earlier distributed platforms, mainly in its use of virtual machines and its deployment of Apache Hadoop for large-scale data processing. Hadoop was developed by the open-source community on the basis of material Google has published: it provides the Hadoop Distributed File System, which is modeled on the Google File System, together with a corresponding Map/Reduce programming framework. Counterparts of Google's Chubby lock service and of the BigTable distributed database management system (ZooKeeper and HBase, respectively) are being developed in the same ecosystem. Because Hadoop is open source, users can modify it directly to suit the specific needs of their applications. IBM's Blue Cloud offering integrates the Hadoop software directly into its cloud computing platform.
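To make the Map/Reduce programming model concrete, the following is a minimal, self-contained sketch of a word count in Python. It only simulates, within one process, the map, shuffle/sort, and reduce phases that Hadoop distributes across many nodes, and the sample documents are purely illustrative.

```python
# A toy, self-contained illustration of the Map/Reduce model that Hadoop
# implements (word count). In a real Hadoop job the map and reduce functions
# run on many nodes and the framework performs the shuffle/sort step.
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle(pairs):
    """Shuffle/sort: group values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups.items()

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    for word, counts in groups:
        yield word, sum(counts)

if __name__ == "__main__":
    documents = ["blue cloud computing", "cloud computing platform"]
    for word, count in reduce_phase(shuffle(map_phase(documents))):
        print(word, count)
```

In an actual Hadoop job the same map and reduce logic would be submitted to the cluster (for example through Hadoop Streaming), with HDFS and the framework handling data placement, sorting, and fault tolerance.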
Virtualization in the Blue Cloud
From the structure of the Blue Cloud we can also see that the software stack running on each node differs significantly from a traditional software stack, because virtualization technology is used inside the Blue Cloud. Virtualization can be applied at two levels in the cloud. One level is virtualization in hardware: on IBM System p servers, hardware logical partitions (LPARs) can be created, and the CPU resources of these logical partitions can be managed through IBM Enterprise Workload Manager. Combined with resource-allocation policies applied during actual operation, this approach allows an appropriate share of resources to be assigned to each logical partition. The granularity of logical partitioning on System p is one-tenth of a central processing unit (CPU).
The other level of virtualization is achieved in software; the Blue Cloud computing platform uses Xen. Xen is open-source virtualization software that can run additional operating systems on top of an existing Linux host, so that software can be deployed and operated flexibly inside virtual machines.
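As a hedged illustration of how such software virtualization might be driven programmatically, the following sketch uses the libvirt Python bindings to connect to a Xen host and list its guests. It assumes the libvirt-python package is installed and that the host exposes the Xen driver at the URI shown, which may differ per installation.

```python
# A minimal sketch of inspecting Xen guests through the libvirt Python
# bindings, assuming libvirt-python is installed and the local host runs
# the Xen driver (the connection URI may differ per installation).
import libvirt

conn = libvirt.open("xen:///system")
try:
    # List every defined guest and report its state, vCPU count, and memory.
    for dom in conn.listAllDomains():
        state, max_kib, mem_kib, vcpus, _ = dom.info()  # memory is in KiB
        print(f"{dom.name()}: state={state} vcpus={vcpus} mem={mem_kib // 1024} MiB")
finally:
    conn.close()
```

The same API can also define, start, and stop guests, which is how a management layer like the one in the Blue Cloud could automate the lifecycle of its virtual machines.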
Managing cloud computing resources through virtual machines has particular benefits. Because a virtual machine is a special piece of software that can fully emulate the execution of the hardware, an operating system can run on top of it, and a complete set of runtime-environment semantics is preserved. The entire execution environment can therefore be packaged up and moved to another physical node, which isolates the execution environment from the physical hardware and makes it easy to deploy whole application modules. In general, applying virtualization technology to a cloud computing platform yields several desirable properties.
1. The cloud management platform can dynamically relocate a virtual machine onto whichever physical machine it requires without stopping the applications running inside it (see the sketch after this list), which is far more flexible than the process-migration approaches used before virtualization was adopted.
2. Host resources are used more efficiently: several lightly loaded virtual machines can be consolidated onto the same physical node, so that the physical nodes left idle can be powered off to save energy.
3. By dynamically migrating virtual machines across physical nodes, load balancing can be achieved independently of the applications. Because a virtual machine contains the entire virtualized operating system and application environment, the whole runtime environment migrates together and the applications themselves need no modification.
4. Deployment is also more flexible, since a prepared virtual machine can be deployed directly onto the physical computing platform.
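The live migration mentioned in items 1 and 3 can be sketched with the same libvirt bindings used above. This is only an illustrative sketch: the hypervisor URIs and the guest name are hypothetical.

```python
# A hedged sketch of live-migrating a running virtual machine between two
# physical hosts with libvirt; host names and the guest name are illustrative.
import libvirt

SRC_URI = "xen+ssh://host-a/system"   # source hypervisor (assumed reachable)
DST_URI = "xen+ssh://host-b/system"   # destination hypervisor

src = libvirt.open(SRC_URI)
dst = libvirt.open(DST_URI)
try:
    dom = src.lookupByName("guest01")  # running guest to move
    # VIR_MIGRATE_LIVE keeps the guest running while its memory is copied,
    # so the applications inside the VM are not stopped during the move.
    dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)
finally:
    src.close()
    dst.close()
```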
In a nutshell, virtualization gives a cloud computing platform an extremely flexible set of capabilities; without it, the platform would face many limitations.
Storage Structure in the Blue Cloud
The storage architecture of the Blue Cloud computing platform is also important, since the operating system, the service programs, and the users' application data are all kept in the storage system. Cloud computing does not exclude any useful storage architecture, but the architecture must be matched to the application requirements to obtain the best performance. Overall, a cloud storage architecture takes one of two forms: a clustered file system such as the Google File System, or a storage area network (SAN) based on block devices.
When designing the storage architecture of a cloud computing platform, capacity is not the only concern. In fact, as hard disks keep growing and getting cheaper, current disk technology makes it easy to obtain large capacity simply by using many disks. Compared with capacity, the read and write speed of disk data is the more important problem for cloud storage. A single disk is very likely to limit how fast an application can access its data, so in practice the data must be spread over multiple disks and read and written in parallel to achieve higher throughput. How data is placed is therefore a very important issue in a cloud computing platform: in actual use, data needs to be distributed across the disks of many nodes. Current storage technology offers two ways to achieve this: a clustered file system similar to the Google File System, or a block-based storage area network (SAN).
The Google File System has already been described earlier. The IBM Blue Cloud computing platform uses its open-source counterpart, HDFS (the Hadoop Distributed File System). In this approach the disks are attached inside the nodes, while a shared distributed file-system namespace is exposed externally and redundancy is provided at the file-system level to improve reliability. Combined with a suitable distributed data-processing model, this approach can raise overall data-processing efficiency. This Google File System style of architecture differs greatly from a SAN.
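As a hedged sketch of what working with such a shared namespace looks like from a client, the following Python example uses the third-party hdfs package (HdfsCLI) against a WebHDFS endpoint; the namenode address, port, user, and file path are all illustrative assumptions.

```python
# A hedged sketch of writing to and reading from HDFS, assuming the
# third-party "hdfs" package (HdfsCLI) and a reachable WebHDFS endpoint;
# the namenode address and file path below are illustrative.
from hdfs import InsecureClient

# WebHDFS endpoint of the (hypothetical) namenode.
client = InsecureClient("http://namenode.example.com:9870", user="hadoop")

# Write a file; HDFS splits it into blocks, spreads them over the datanodes'
# disks, and keeps redundant replicas at the file-system level.
client.write("/data/sample.txt", data=b"blue cloud sample record\n", overwrite=True)

# Read the file back through the same shared namespace.
with client.read("/data/sample.txt") as reader:
    print(reader.read().decode())
```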
A SAN is another storage-architecture choice for cloud computing platforms, and it too appears in the Blue Cloud: IBM also provides a way to connect a SAN to the Blue Cloud computing platform. Figure 5 is a schematic diagram of a SAN system.
As Figure 5 shows, a SAN is a network on the storage side that joins multiple storage devices into a single storage area network. The front-end hosts access the back-end storage devices over this network, and because access is at the block-device level it is independent of the front-end operating system. Several connectivity options are available for a SAN. One is a Fibre Channel network, which can drive fast fibre disks and suits environments with high performance and reliability requirements. Another is Ethernet with the iSCSI protocol, which runs over an ordinary LAN and thereby reduces cost. Because the disk devices in a storage area network are not bound to a single host, the structure is very flexible, and a host can access multiple disk devices to gain performance. Within the storage area network, a virtualization engine maps logical devices onto physical devices and manages the front-end hosts' reads and writes of the back-end data; the virtualization engine is therefore a very important management module in a storage area network.
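To illustrate the role of that virtualization engine, here is a deliberately simplified Python sketch of a logical-to-physical block mapping. It models the idea only, does not correspond to any particular SAN product, and all names and sizes are made up.

```python
# A simplified, illustrative model of what a SAN virtualization engine does:
# it maps blocks of a logical volume onto extents of physical disks, so the
# front-end host sees one logical device regardless of the physical layout.
EXTENT_BLOCKS = 1024  # blocks per extent (arbitrary for this sketch)

# Logical volume "lv0" built from extents on two (hypothetical) physical disks.
mapping = {
    "lv0": [("disk-A", 0), ("disk-B", 0), ("disk-A", 1)],  # (disk, extent index)
}

def resolve(volume, logical_block):
    """Translate a logical block number into (physical disk, physical block)."""
    extent = logical_block // EXTENT_BLOCKS
    offset = logical_block % EXTENT_BLOCKS
    disk, phys_extent = mapping[volume][extent]
    return disk, phys_extent * EXTENT_BLOCKS + offset

print(resolve("lv0", 1500))  # -> ('disk-B', 476): the second extent lives on disk-B
```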
SANs and distributed file systems such as the Google File System are not competing systems; they are two options to choose between when building a cluster. Where a SAN is selected, an upper-level semantic interface still has to be provided so that applications can read and write, which means a file system must be built on top of the SAN; the Google File System, being just a distributed file system, could itself be built on top of a SAN. Overall, SANs and distributed file systems can provide similar functionality, such as error handling. Which to use, and how, has to be decided by the applications built on top of the cloud platform.
Unlike Google, IBM does not provide externally accessible web applications based on cloud computing, mainly because IBM is not an Internet company but an IT services company. Of course, IBM's internal applications, and the software services it will provide to its customers in the future, will be based on the cloud computing architecture.