The road to HPC and cloud integration


Management challenges of cloud computing and supercomputing

High-performance computing (HPC) is increasingly recognized as the third pillar of scientific research, alongside experiment and theory. It is even seen as a partial reflection of a country's comprehensive national strength: countries have invested heavily in building large supercomputers, and TOP500 records are broken again and again. However, the growing scale of computation brings new challenges for HPC system management.

1. With fixed resources, job scheduling on a multi-user HPC system must balance system utilization against the quality of service users receive. The two goals are hard to satisfy together: a scheduling target of short job waiting times tends to lower utilization, while a target of high utilization may lengthen the time users' jobs wait.

2. As HPC applications become more widespread, they differ at every level of the software stack, from the underlying operating system to the parallel computing middleware (such as MPI) and the application software itself. They also place different demands on physical-layer hardware such as compute, network, and storage. HPC applications can, for example, be subdivided into compute-intensive, throughput-intensive, and data-intensive classes. How should job scheduling take these constraints into account?

3. HPC architectures are also evolving. Heterogeneous systems that combine GPUs with multi-core processors speed up computation, but they also make HPC job scheduling more complex.
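The tension described in point 1 can be illustrated with a textbook queueing model. The sketch below is not a scheduler from this article: it assumes the whole system behaves like a single M/M/1 queue (random arrivals, one server, exponentially distributed job lengths), which is a strong simplification of a real cluster. The function name `mean_queue_wait` and its parameters are hypothetical and chosen only for this illustration.

```python
def mean_queue_wait(utilization: float, service_rate: float = 1.0) -> float:
    """Mean time a job spends waiting in an M/M/1 queue (assumed toy model).

    utilization  -- rho = arrival_rate / service_rate, must be in [0, 1)
    service_rate -- mu, jobs completed per unit time while the server is busy
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    # Standard M/M/1 result: W_q = rho / (mu * (1 - rho))
    return utilization / (service_rate * (1.0 - utilization))


if __name__ == "__main__":
    # With service_rate = 1.0 the wait is expressed in mean job lengths.
    for rho in (0.5, 0.7, 0.9, 0.95, 0.99):
        wait = mean_queue_wait(rho)
        print(f"utilization {rho:.2f} -> mean wait {wait:6.1f} job lengths")
```

In this model the mean wait grows without bound as utilization approaches 1, which is one way to see why a scheduler cannot drive both metrics to their ideal values at once.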

Cloud computing is a new service delivery model in which users obtain resources from service providers, such as applications, development environments, hardware platforms, virtual or physical servers, and services. All of these are provided on demand, and users pay only for the resources they have used or are using. Virtualization is a key technology in cloud computing: it can abstract one physical device into multiple logical devices. This model makes the allocation of computing and other resources more flexible and more reliable, and makes systems easier to scale and upgrade.

The flexible usage model of cloud computing offers new opportunities for solving the system management problems of HPC. Since cloud computing emerged, one much-discussed question has been whether HPC can be run on commercial cloud services. Researchers at the University of Texas at Austin compared the performance of a virtual cluster built from Amazon EC2 compute nodes with that of a physical cluster. The basic configuration is as follows:

Using a variety of typical parallel benchmark suites, they found that for OpenMP-based shared-memory parallel programs the performance of the EC2 cluster dropped by about 7% to 21%, while for MPI-based distributed-memory parallel programs the degradation was about 40% to 1000%. The main causes of the degradation are virtualization and the interconnection network. Virtualization is a major constraint on HPC: beyond the performance loss, some hardware devices, such as GPUs and InfiniBand, cannot be virtualized.

In addition, the pursuit of ever greater computing capacity and capability may exceed what a single supercomputing center can provide. Scheduling the resources of multiple computing centers across domains addresses this problem better. However, no standard has yet emerged for cross-domain interoperability between multiple clouds, and the idea remains largely at the concept stage.

We propose High Performance Elastic Computing (HPEC), a new computing framework that combines high-performance computing, grid computing, and cloud computing. HPEC can manage and schedule multiple cloud computing resources across domains, supports heterogeneous computing environments with GPUs and multi-core processors, and lets users flexibly request and manage computing, storage, and network resources. These resources can be virtualized or directly physical. HPEC supports rapid deployment of multi-node clusters and their upper-layer software as a High Performance Computing Platform as a Service (HPCPaaS), for both compute-intensive (MPI) and data-intensive (MapReduce) applications.
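To make the idea of scheduling across domains under hardware constraints more concrete, here is a minimal sketch. It is not the HPEC design, which this section does not specify. The `Pool` and `Job` types, their fields, and the placement rule are all hypothetical. In particular, the rule that MPI jobs go only to physical pools with InfiniBand is an assumption motivated by the EC2 results quoted earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pool:
    """A hypothetical resource pool in one administrative domain."""
    name: str
    domain: str
    free_nodes: int
    has_gpu: bool
    has_infiniband: bool
    virtualized: bool


@dataclass(frozen=True)
class Job:
    """A hypothetical job request; kind is "mpi" or "mapreduce"."""
    name: str
    kind: str
    nodes: int
    needs_gpu: bool = False


def eligible(job: Job, pool: Pool) -> bool:
    """Return True if the pool satisfies the job's hard constraints."""
    if pool.free_nodes < job.nodes:
        return False
    if job.needs_gpu and not pool.has_gpu:
        return False
    # Assumed policy: MPI jobs need physical nodes with a fast interconnect.
    if job.kind == "mpi" and (pool.virtualized or not pool.has_infiniband):
        return False
    return True


def place(job: Job, pools: List[Pool]) -> Optional[Pool]:
    """Pick the eligible pool that leaves the fewest idle nodes (best fit)."""
    candidates = [p for p in pools if eligible(job, p)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.free_nodes - job.nodes)


if __name__ == "__main__":
    pools = [
        Pool("campus-physical", "domain-a", 16, True, True, False),
        Pool("partner-cloud", "domain-b", 64, False, False, True),
    ]
    jobs = [Job("cfd", "mpi", 8), Job("logs", "mapreduce", 32)]
    for job in jobs:
        chosen = place(job, pools)
        print(job.name, "->", chosen.name if chosen else "no eligible pool")
```

Best fit is only one possible tie-breaking rule. A real cross-domain scheduler would also have to weigh data location, cost, and queue state, none of which are modelled here.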

This article summarizes related research and application progress in China and abroad, discusses the architecture and key technical challenges of HPEC, and finally gives a brief introduction to the preliminary research on HPEC at the Network and Information Center of Shanghai Jiao Tong University.
