Cloud computing evolution and challenges


This article surveys the evolution and challenges of cloud computing. The growth of the IT industry has produced infrastructure that delivers software and computing power as a service, commonly known as cloud services or cloud computing. It is an emerging approach to sharing infrastructure that lets users reach their applications from any connected device, anywhere. Large pools of systems can be linked together to deliver a wide variety of IT services. Several factors are driving demand for such environments: the proliferation of connected devices, real-time data streams, the adoption of service-oriented architecture (SOA), and the rapid growth of Web 2.0 applications such as search, open collaboration, social networking, and mobile commerce. In addition, improvements in the performance of digital components have greatly expanded the scale of IT environments, further strengthening the need for unified cloud management. Cloud computing clearly has a bright future.

Evolution of cloud computing

Cloud computing is currently a hot topic in the IT industry. It is not a revolutionary new invention, however, but the result of the continuous evolution of data-management technology.

By the end of the last century, distributed computing, parallel computing, and grid computing were already quite mature. They form the technical foundation on which cloud computing was built.

In the late 1980s, grid computing emerged as a way to harness a large number of systems to solve a single problem, usually a scientific one, and it in turn paved the way for cloud computing. The focus of grid computing is the ability to move a workload to the location of the required computing resources, which in most cases are remote and continuously available. Typically, a grid is a server cluster on which a large task can be split into many small tasks that run in parallel across those servers. From this perspective, the grid can be regarded as a single virtual server. The grid also requires applications to conform to the interface standards of the grid software.
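The split-and-combine pattern described above can be sketched in a few lines of Python; the task (summing numbers) and worker count here are purely illustrative stand-ins for real grid workloads:

```python
from concurrent.futures import ThreadPoolExecutor

def subtask(chunk):
    # One small task: sum a slice of the data (stands in for real work).
    return sum(chunk)

def run_on_grid(data, workers=4):
    """Split one large task into many small tasks, run them in parallel
    across the 'grid', and combine the partial results."""
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(subtask, chunks)
    return sum(partials)

print(run_on_grid(list(range(1000))))  # 499500
```

From the caller's point of view, `run_on_grid` behaves like one virtual server, which is exactly the abstraction the grid presents.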

Utility computing and SaaS (software as a service) can be seen as two early forms of cloud computing. Cloud computing now encompasses not only these two forms but others as well, such as web services, platform as a service (PaaS), and managed service providers (MSPs).

In the 1990s, the concept of virtualization expanded from virtual servers to higher levels of abstraction: first virtual platforms, then virtual applications. Utility computing treats a cluster as a virtual platform and charges for computation under a metered business model. More recently, SaaS has raised virtualization to the level of the application; its business model charges not for the resources consumed but for the value of the application delivered to subscribers. Such services deliver programs to thousands of users through the browser. From the user's point of view, this saves money on servers and software licenses; from the vendor's, maintaining a single program is enough, which reduces costs. Salesforce.com is by far the best known of these services. SaaS is most commonly applied to human-resource-management applications; Google Apps and Zoho Office are similar services.

The concept of cloud computing grew out of utility computing and SaaS. The advantage of the "cloud" lies in its infrastructure management. The maturation and continuous improvement of virtualization technology provide powerful support for this management, enabling the cloud to automatically deploy, rebuild images, rebalance workloads, monitor, and process change requests systematically, so that the underlying resources are managed and utilized more effectively.

Challenges of cloud computing

As an emerging technology expected to significantly reduce costs, cloud computing is increasingly sought after by companies. At the same time, however, a series of new and challenging problems has emerged.

First, a cross-cutting issue in cloud computing is that vendors must weigh functionality against development cost. At present, early cloud services provide far narrower APIs than traditional database systems: they offer only a minimal query language and limited consistency guarantees. This shifts more programming burden onto developers, but it allows service providers to offer more predictable services and service-level agreements, which is difficult to achieve with a full-featured SQL database. Building more complete functionality on top of existing cloud services with only minor changes will require more experience and more work.
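To make the trade-off concrete, here is a hypothetical sketch of the kind of minimal interface such early services exposed; the class and method names are assumptions for illustration, not any vendor's actual API:

```python
class MinimalCloudStore:
    """Sketch (hypothetical) of the narrow API early cloud data services
    exposed: put/get/delete by key only -- no joins, no ad-hoc SQL, and
    far weaker consistency guarantees than a full SQL database."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

store = MinimalCloudStore()
store.put("user:1", {"name": "Ada"})
print(store.get("user:1"))  # {'name': 'Ada'}
```

Anything beyond key lookup, such as a join or an aggregate, must be written in application code, which is the extra programming burden the text describes.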

Second, manageability is extremely important in cloud computing, and it too brings new challenges. Compared with traditional systems, managing cloud computing is more complicated because of limited human intervention, highly variable workloads, and a variety of shared devices. In most cases there are no database administrators or system administrators to assist with cloud-based application development. Even a single user's load can change dramatically over time: for a customer who occasionally needs resources orders of magnitude above the usual level, the scalable provisioning of cloud computing is economical. Tuning mixed workloads is difficult, yet in this setting it is unavoidable. At the same time, service tuning depends largely on the sharing mode of the shared devices. For example, Amazon's EC2 exposes hardware-level virtual machines as its programming interface, whereas Salesforce.com implements "multi-tenant" virtual machines hosting many independent schemas; other virtualization schemes are possible as well. For the workloads running above the platform, each solution offers different visibility and different degrees of control. These variations force us to reconsider the traditional roles and responsibilities of resource management across layers.
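The Salesforce-style multi-tenant sharing mode can be illustrated with a toy data structure; all names here are hypothetical, not Salesforce's actual implementation:

```python
class MultiTenantTable:
    """Hypothetical sketch of the 'multi-tenant' sharing mode: many
    customers share one logical table, and every query is implicitly
    scoped to a tenant -- in contrast to EC2-style sharing, where each
    customer receives an entire hardware-level virtual machine."""

    def __init__(self):
        self._rows = []  # each entry: (tenant_id, record)

    def insert(self, tenant_id, record):
        self._rows.append((tenant_id, record))

    def query(self, tenant_id):
        # Tenant isolation: only this tenant's rows are ever visible.
        return [r for t, r in self._rows if t == tenant_id]

table = MultiTenantTable()
table.insert("acme", {"deal": 100})
table.insert("globex", {"deal": 250})
print(table.query("acme"))  # [{'deal': 100}]
```

The isolation lives in the query layer rather than in hardware, which is why the provider's visibility into, and control over, each tenant's workload differs so much from the virtual-machine model.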

In the late 1990s, researchers began to study self-management techniques, and the need for manageability has accelerated this work. Cloud computing systems require adaptive online techniques, which in turn drive the development of disruptive adaptive methods for new architectures and APIs, including the flexibility to relax the traditional SQL language and transactional semantics.

Next, the sheer scale of cloud computing brings new challenges. Existing SQL databases cannot simply handle the thousands of nodes deployed in the cloud. On the storage side, it remains unclear whether to adopt different transaction implementations, different storage technologies, or both in order to address these constraints; the database community currently has many proposals on this question. Existing cloud services have begun to explore simple, practical approaches, but more work is needed to integrate the good ideas already present in cloud computing practice. For query processing and optimization, exhaustively searching a plan space involving thousands of processing nodes takes far too long to be feasible, so limits must be placed on the plan space or on the search itself. Finally, how to program in the cloud environment is still unclear. We need a better understanding of the realities of cloud computing, including its performance limits and application requirements, to guide design.
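The idea of capping the plan search can be sketched as follows; the cost model is a made-up stand-in for a real optimizer's cost estimates:

```python
from itertools import permutations

def pick_join_order(tables, cost, max_plans=100):
    """Sketch of bounding the optimizer's search: rather than costing all
    n! join orders, stop after max_plans candidates and keep the cheapest
    seen so far, trading plan quality for bounded search time."""
    best_order, best_cost = None, float("inf")
    for i, order in enumerate(permutations(tables)):
        if i >= max_plans:
            break
        c = cost(order)
        if c < best_cost:
            best_order, best_cost = order, c
    return best_order, best_cost

# Toy cost model (an assumption for illustration): positions later in the
# order are weighted more heavily, so large tables should come first.
sizes = {"a": 10, "b": 5, "c": 1}
order, c = pick_join_order(
    ["a", "b", "c"],
    cost=lambda o: sum((i + 1) * sizes[t] for i, t in enumerate(o)),
)
print(order, c)  # ('a', 'b', 'c') 23
```

With thousands of nodes the true plan space explodes factorially, so a cutoff like `max_plans` (or a restriction on which plan shapes are enumerated at all) becomes a necessity rather than an optimization.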

In addition, the sharing of physical resources in cloud infrastructure creates new data-security and privacy risks: data can no longer rely on the physical boundaries of machines or networks for protection. Cloud computing therefore offers a wealth of opportunities to synthesize and accelerate existing security work. The key to success is whether we can accurately characterize the cloud's application scenarios and correctly grasp the real incentives of service providers and customers.

Finally, as cloud computing becomes more popular, new application scenarios are expected to emerge, and they will bring new challenges. For example, we predict the appearance of specialized services that preload large data sets (such as stock prices, weather histories, and online search data). Extracting useful information from a combination of private and public data is attracting growing attention, and it raises a new problem: drawing useful information from heterogeneous data that is structured, semi-structured, or unstructured. It also suggests that cross-cloud services will inevitably appear. In scientific grid computing this problem is already prominent: even within a single discipline, there are many shared data servers in different geographic locations, and broadly the same is true in most companies. A federated cloud architecture will not reduce this difficulty; it will only add to it.

Examples of cloud computing

Finally, let us briefly analyze the cloud computing platforms of the two companies leading the trend: Google and IBM.

For the foreseeable future, Google's position in cloud computing remains unshakable, and its open platform reflects the essence of the cloud computing model. Most of the basic software Google's cloud services require is open source, which means users are free to obtain and modify the code. Since 2003, Google has for several years published papers in the top conferences and journals of computer-systems research, revealing its internal distributed data-processing methods and showing the outside world the core technologies of its cloud computing. Google's cloud computing technology is in fact tailored to Google's specific web applications. Given the enormous scale of its internal data, Google built an infrastructure on distributed, parallel clusters, using software to cope with the node failures that occur constantly in large clusters. The cloud infrastructure Google uses consists of four independent yet tightly integrated systems: the cluster-based Google File System (GFS); the Map/Reduce programming model, designed for the characteristics of Google's applications; the distributed lock service Chubby; and BigTable, a simplified large-scale distributed database developed by Google.

GFS (Google File System) was designed for Google's own applications. A GFS cluster consists of a single master server and multiple chunk servers, and it is accessed by many clients.
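That division of labor can be sketched in miniature; the class and method names below are illustrative, not Google's actual interfaces:

```python
class GFSMaster:
    """Sketch of the GFS division of labor: the single master holds only
    metadata -- which chunk servers hold which chunk -- while clients
    fetch the data itself directly from the chunk servers."""

    def __init__(self):
        self.chunk_map = {}  # (filename, chunk_index) -> [server ids]

    def locate(self, filename, chunk_index):
        return self.chunk_map.get((filename, chunk_index), [])

class ChunkServer:
    def __init__(self):
        self.chunks = {}  # (filename, chunk_index) -> bytes

    def read(self, filename, chunk_index):
        return self.chunks[(filename, chunk_index)]

def gfs_read(master, servers, filename, chunk_index):
    # Ask the master where the chunk lives, then read from a replica.
    for server_id in master.locate(filename, chunk_index):
        return servers[server_id].read(filename, chunk_index)
    raise FileNotFoundError(filename)

master = GFSMaster()
servers = {"cs1": ChunkServer(), "cs2": ChunkServer()}
servers["cs1"].chunks[("log", 0)] = b"hello"
servers["cs2"].chunks[("log", 0)] = b"hello"   # replica on a second server
master.chunk_map[("log", 0)] = ["cs1", "cs2"]
print(gfs_read(master, servers, "log", 0))  # b'hello'
```

Keeping bulk data off the master is what lets a single metadata server coordinate a cluster serving many clients at once.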

To give people unfamiliar with distributed systems the opportunity to build on large-scale clusters, Google also designed and implemented Map/Reduce, a programming model for large-scale data processing. With it, programmers who are not distributed-systems specialists can write applications for large clusters without worrying about cluster reliability, scalability, and so on: application writers need only focus on the application itself, while the platform handles the cluster. Map/Reduce achieves reliability by distributing large-scale operations over the data set to each node in the network; each node periodically reports back completed work and status updates. If a node stays silent longer than a preset interval, the master node (like the master server in the Google File System) marks that node as dead and sends the work assigned to it to another node. Each operation uses an atomic rename of a named file to ensure that no conflicts arise between parallel threads; when a file is renamed, the system may also copy it to a name other than the task name.
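The map-group-reduce flow described above can be sketched in a single process; the function names are illustrative, and the real system's distribution, re-execution of dead nodes, and atomic renames are omitted:

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    """Minimal single-process sketch of the Map/Reduce flow: map each
    input to (key, value) pairs, group the pairs by key, then reduce
    each group to a final value."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# Classic word count expressed in this model:
counts = map_reduce(
    ["the cloud", "the grid"],
    mapper=lambda line: [(w, 1) for w in line.split()],
    reducer=lambda word, ones: sum(ones),
)
print(counts)  # {'the': 2, 'cloud': 1, 'grid': 1}
```

The application writer supplies only `mapper` and `reducer`; everything between them (grouping here, plus scheduling and fault tolerance in the real system) is the platform's job.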

The third part of Google's cloud computing platform is BigTable, which extends database systems to the distributed platform. To handle the large volumes of formatted and semi-formatted data inside Google, the company built BigTable, a large-scale database system with weak consistency requirements. In addition to these components, Google has also built a distributed program scheduler, the distributed lock service, and other related cloud computing service platforms.
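BigTable's data model, as described in Google's paper, is a sparse map keyed by row, column, and timestamp; a toy version (illustrative only, with hypothetical names) might look like:

```python
class TinyBigTable:
    """Illustrative sketch of BigTable's data model: a sparse map from
    (row key, column, timestamp) to an uninterpreted value, with reads
    returning the newest version at or before a given timestamp."""

    def __init__(self):
        self._cells = {}  # (row, column) -> [(timestamp, value)], newest first

    def put(self, row, column, value, ts):
        versions = self._cells.setdefault((row, column), [])
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column, ts=float("inf")):
        for t, v in self._cells.get((row, column), []):
            if t <= ts:
                return v
        return None

t = TinyBigTable()
t.put("com.example", "contents", "<html>v1</html>", ts=1)
t.put("com.example", "contents", "<html>v2</html>", ts=2)
print(t.get("com.example", "contents"))        # <html>v2</html>
print(t.get("com.example", "contents", ts=1))  # <html>v1</html>
```

Versioned cells like these let readers see a consistent older snapshot without blocking writers, which fits the weak-consistency design the text describes.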

Blue Giant IBM has also bet heavily here, naming its effort the "Blue Cloud" program. IBM has every advantage needed to develop a cloud computing business: application servers, storage, management software, middleware, and more, so it naturally will not let such an opportunity pass. It recently introduced the idea of a "new enterprise data center," which combines the advantages of the web-centric cloud computing model with today's enterprise data center. An article from the China Cloud Computing Network presents the "new enterprise data center" model and its infrastructure-services framework.

The new enterprise data center will be a center of virtualization and efficient management. It adopts some of the tools and technologies used in web-centric clouds, generalizes them for a wider range of customers, and enhances them with support for secure transactional workloads. Through an efficient, shared infrastructure, organizations can respond quickly to new business needs, analyze large amounts of information in real time, and make informed business decisions based on real-time data. The new enterprise data center is an evolutionary model that provides an efficient, dynamic approach to aligning IT with business goals. As shown in Figure 3, from a high-level architectural perspective, its infrastructure services can be divided logically into layers. The physical hardware layer is virtualized to provide a flexible, adaptable platform with improved resource utilization. Above it sit the virtualization-environment layer and the management layer, which are the keys to the new enterprise data center's infrastructure services: together, they ensure that resources within the data center are managed effectively and can be deployed and configured quickly. In addition, the new enterprise data center is designed to handle mixed-mode workloads.
