A thorough interpretation of cloud computing, IT's "star of tomorrow"

Source: Internet
Author: User

Cloud computing, the distributed computing technology proposed by Google, makes it easy for developers to build global application services by automating communication, task assignment, and distributed storage across a large number of standardized (non-heterogeneous) computers.

Cloud computing derives from distributed parallel computing, but is better than the grid at data processing

Cloud technology can be considered a subset of grid technology. Both are designed to hide the complexity of the system so that users do not need to know how it operates internally.

Vendors follow Google in promoting cloud services, but define them differently

Different vendors define cloud computing differently, but they broadly share the concept of a "cloud model": any application service provided over the Internet that users can reach through a browser, without needing to know where the servers are or how they operate internally, is called a cloud computing service.

Yahoo uses Hadoop to process 4 PB of Web pages

Yahoo architecture engineer Vivek Ratan said: "The largest cloud computing task currently running on the Hadoop framework is the job Yahoo uses to build its web page index database. It uses 10,000 processor cores on Linux at the same time, processes 1 trillion web links, and computes 300 TB of index data from 4 PB of data."

Trend Micro uses cloud technology to meet the challenge of moving large amounts of data around the world

Trend Micro began using parallel computing and grid technology four years ago to provide cloud services.

With the cloud platform, researchers in both the United States and China can share virus analysis data through the same computing platform.

Cloud technology lowers the SaaS threshold, so small companies can do business all over the world

Yu Xiaoxian, Deputy Director of the Institute of Information and Communications, said that even an enterprise without the capability or financial resources to build its own server room and network infrastructure can leverage Amazon's or Google's infrastructure to provide global SaaS (Software as a Service) services.

Core cloud technology: MapReduce

MapReduce, a key technology for cloud computing, is a programming model for solving problems and a way for developers to break a problem down. It was first proposed by Google and later adopted by the open source cloud technology Hadoop.

Cloud computing derives from parallel computing, but is better than the grid at data processing

Recently, wherever he went, from an academic forum in Singapore to academic exchange events in southern Taiwan, Lin, who leads the grid computing team at Academia Sinica, was asked the same question: "What is the difference between the cloud computing that Google talks about and grid computing?"

"Cloud technology can be regarded as a subset of grid technology," Lin said. "The goal is the same: to hide the complexity of the system so that users can simply use it without understanding how it works internally."

Lin believes that grid technology encompasses cloud technology, but that the grid can handle more complex problems, while cloud technology can be seen as a commercial outcome of grid technology.

"Cloud computing developed from the distributed parallel computing techniques and concepts of grid technology. The industry has repackaged the original technology under a new term and simply uses a different metaphor," he added. "This matters to the computer industry because it helps the public understand the technology."

Similarly, Huang Weishing, a program lead in the business and planning management group at the National Center for High-performance Computing, believes that cloud computing and grid computing generally share the same philosophy. He said: "Users do not need to know what the server is or where it is. They submit their request and get results back. This is the idea behind cloud computing, and it is also the idea behind grid computing."

Analyzing the differences between the two further, Huang Weishing said: "Although cloud computing comes from parallel computing technology and does not depart from the philosophy of grid computing, it focuses more on data processing."

Processing small amounts of data at a time leads cloud computing to an implementation different from grid computing

In terms of the type of data processed, Huang Weishing said: "Cloud computing suits tasks in which data is processed very frequently but only a small amount of data is handled each time."

Cloud computing vs. grid computing

Main drivers
  Cloud computing: Information service providers (such as Google, Yahoo, IBM, and Amazon)
  Grid computing: Academic institutions (such as CERN, Academia Sinica, and the National Center for High-performance Computing)

Degree of standardization
  Cloud computing: No standardization; each vendor adopts a different technical framework.
  Grid computing: Standardized protocols and trust mechanisms exist.

Open source scope
  Cloud computing: Partly open source. The Hadoop framework is open source, but Google's GFS and its Bigtable database system are not.
  Grid computing: Completely open source

Network domain restrictions
  Cloud computing: Within an enterprise's internal network domain
  Grid computing: Across enterprises and across administrative network domains

Hardware supported in a single computing cluster
  Cloud computing: Personal computers of the same standard specification (e.g. x86 processor, hard disk, 4 GB memory, Linux)
  Grid computing: Mixed heterogeneous servers (different processors, operating systems, compiler versions, etc.)

Data characteristics handled well
  Cloud computing: Each operation involves a small amount of data (it can run on a single personal computer), but the operation must be repeated a very large number of times.
  Grid computing: Applications in which a single operation involves a large amount of data, for example satellite signal analysis at several GB per run.

Take web search as an example. Each operation only needs to compare one page, and the data compared may be no more than 1 MB. But the world has billions of pages, so comparing all of them adds up to a very considerable total volume of data. Huang Weishing believes this differs from the kind of work grid computing is good at: grid computing suits scientific research problems, such as analyzing data returned from satellites, where each file analyzed is several gigabytes in size.

Even though cloud computing shares the parallel computing philosophy of grid technology, it is better suited to tasks that process a small amount of data at a time, so Huang Weishing believes that cloud computing differs from grid computing in practice.

He further explained: "Take web search, for example. The page compared each time is not a large file and needs only a small amount of processor resources, so a large number of personal computers can be used to perform web search. Building a grid out of personal computers is more difficult, because grid computing requires large processing resources."
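
The kind of workload described above, many small and independent comparisons, can be sketched roughly as follows. This is only an illustration: the `pages` data, the `matches` function, and the use of a local thread pool as a stand-in for a large number of personal computers are all assumptions, not anything the article describes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for billions of small web pages (each well under 1 MB).
pages = {
    "page1": "cloud computing hides the servers from users",
    "page2": "grid computing integrates heterogeneous servers",
    "page3": "hadoop is an open source cloud framework",
}

def matches(item, keyword="cloud"):
    """One small task: check a single page against the keyword."""
    url, text = item
    return url if keyword in text.split() else None

def search(pages, workers=4):
    # Each page is compared independently, so a task can go to any worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(matches, pages.items()))
    return sorted(url for url in hits if url is not None)
```

Because no task depends on another, adding workers only changes how the pages are spread out, not the result of `search(pages)`.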

So the real difference is that cloud computing can combine a large number of personal computers to provide services, while grid computing relies on high-performance computers that provide large amounts of computing resources.

The ideal of grid technology is to allow any server to join a computing grid and contribute large amounts of computing power, so the technical difficulty lies in handling the heterogeneity of different servers, operating systems, and even compiler versions.

Google's approach to cloud computing, by contrast, uses a large number of personal-computer-class servers of the same specification to run cloud computing programs, so there is no heterogeneity to deal with. This simplifies the architecture of the parallel computing system, makes it easier to coordinate message passing between servers, and improves the overall performance of distributed processing. Many of Google's products and services, such as Google Search, Gmail, Google Maps, and Google Docs, use this cloud computing technology, meeting the needs of a large number of users with the computing resources of many low-cost servers.

Cloud terminology

Cloud computing (Cloud Computing): The distributed computing technology proposed by Google that makes it easy for developers to build global application services. Cloud computing technology automates communication, task allocation, and distributed storage across a large number of standardized (non-heterogeneous) computers.

Grid computing (Grid Computing): Over a network, standardized protocols and trust mechanisms are used to integrate heterogeneous servers across network domains into a computing cluster that shares computing resources, storage resources, and so on.

In-the-cloud service or cloud service (Cloud Service): The supplier provides services over the Internet, and users only need a browser to use them, without knowing how the supplier's servers work.

MapReduce model: A key technology Google uses in cloud computing that lets developers write programs to process large amounts of data. A map program first cuts the data into independent chunks and distributes them to a large number of computers for processing; a reduce program then merges the results and outputs what the developer needs.

Hadoop: An open source cloud computing framework written in Java. It also implements Google's cloud computing techniques, but its distributed file system differs from Google's. In 2006, Yahoo became the main contributor to and user of the project.
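
The MapReduce entry above can be illustrated with a small single-machine sketch. It is not Google's or Hadoop's implementation: the word-count task and the function names `map_chunk`, `shuffle`, `reduce_counts`, and `word_count` are assumptions chosen for illustration, and the chunks are processed in a plain loop rather than on separate computers.

```python
from collections import defaultdict

def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group emitted values by key so each key can be reduced independently."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Reduce step: merge all values for one key into a single result."""
    return key, sum(values)

def word_count(chunks):
    pairs = []
    for chunk in chunks:  # on a cluster, each chunk would go to a different machine
        pairs.extend(map_chunk(chunk))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())
```

For example, `word_count(["the cloud", "the grid and the cloud"])` counts "the" three times and "cloud" twice, regardless of how the text was cut into chunks.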

Vendors follow Google in promoting cloud services, but define them differently

Since the beginning of last year, Google has promoted cloud computing even more vigorously. Lee Kai-Fu, Google Global Vice President, said: "Cloud computing is Google's most important key technology and the future trend of network applications." He believes that, besides individual users using Google's cloud services, enterprises of any size can use Google's services to meet their internal IT needs, or use platforms provided by Google, such as Google App Engine, to serve users around the world from the cloud computing environment Google provides.

The success of Google's own web services, and its promotion of cloud computing, has led other companies such as Yahoo, IBM, Microsoft, and HP to say that they too have cloud computing products and technologies, or will use cloud computing to provide services in the future.

However, the definition of cloud computing differs from vendor to vendor. Only in a broad sense do they share the concept of a "cloud model". Regardless of the type of service or the information architecture that runs it, as long as an application service is provided over the Internet and users can use it through a browser without knowing where the servers are or how they work, the service, and the technology behind it, is called cloud computing. Under this broad definition, other vendors also call such applications cloud services or in-the-cloud services.

Amazon leverages virtualization technology to provide cloud computing services

Amazon offers the EC2 (Elastic Compute Cloud) and S3 (Simple Storage Service) services, which are implemented differently from Google's cloud services.

EC2 uses Xen virtualization technology to provide virtual execution environments (virtual machine instances), allowing an enterprise that rents an instance to run its own applications. Amazon offers several types of instance. The extra large instance, for example, includes gigabytes of memory, 8 EC2 compute units (similar to 4 dual-core virtual processors), 1690 GB of storage, and a 64-bit platform.

An enterprise only needs to package its operating system, web server, and applications into an image file, upload it to EC2, and call the commands EC2 provides to run the image and start the service. It is like having a physical server: the enterprise controls the operating system running in the instance.

EC2 provides an environment for running image files, but it does not preserve data after an instance finishes, so Amazon also provides the distributed storage service S3 for enterprises to save the output of EC2 computations. Amazon also provides a number of ready-made images that package common operating systems, web servers, and database systems into execution environment templates. An enterprise can run these images in its rented instances and then upload its own web applications to start providing services, without spending time on system installation and setup.

In addition, Amazon lets an enterprise dynamically adjust the specifications of its rented instances while its application is running. For example, when a rented instance reaches full load or runs short of bandwidth, the enterprise can dynamically expand the computing resources available to it, or add new instances to share the traffic.

Cloud computing from different vendors is built on different architectures

Through Xen virtualization technology, Amazon lets enterprises run their own services without having to maintain physical servers. Broadly speaking, this is also a cloud service that hides computational complexity. It is completely different from the technology Google uses to implement cloud computing.

In fact, other companies such as Microsoft and Yahoo are also developing their own cloud computing platforms, and their theories and implementation architectures differ.

Google only began publicizing these cloud computing technologies last year. In fact, as early as 2004, Jeffrey Dean, a senior engineer in Google's infrastructure group, and his colleagues published MapReduce, the core technical model of Google's cloud computing, along with Google's experience of using it, at the OSDI (Operating Systems Design and Implementation) symposium. In his paper, Jeffrey Dean points out that the model was inspired by Lisp and other functional languages, combining the original map and reduce programming concepts into the new MapReduce model. Google found that the MapReduce model is ideal for distributed computation over large amounts of data.
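
The two functional programming concepts mentioned above can be shown in a few lines. As an assumption for illustration, this sketch uses Python's built-in `map` and `functools.reduce` as stand-ins for the functional-language primitives; the word list is made up.

```python
from functools import reduce

words = ["cloud", "grid", "hadoop"]

# map: apply the same function to every element independently.
lengths = list(map(len, words))

# reduce: fold the mapped results into a single value.
total = reduce(lambda acc, n: acc + n, lengths, 0)
```

Because each `map` call touches only one element, the calls could run on different machines, which is the property that makes the combined model a natural fit for distributed processing.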





The open source community draws on Google's experience to develop the Hadoop framework

After Google published MapReduce in 2004, the open source community implemented Hadoop, a MapReduce framework written in Java, so that Java developers can easily write cloud computing applications. When Hadoop developer Doug Cutting joined Yahoo in 2006, Yahoo became the main contributor to and user of Hadoop. Many of Yahoo's services, such as web search, are already cloud computing applications developed with the Hadoop framework.

Hadoop is currently the only open source cloud computing framework. Although it differs slightly from Google's cloud computing technology, its core design concepts come from Google's MapReduce model and distributed file system architecture. A number of Google engineers are also involved in developing Hadoop, such as Christophe Bisciglia, the engineer who launched Google's cloud computing program.

Google cloud computing architecture: MapReduce (programming model), Bigtable (database model), GFS (Google File System, file system)

Yahoo uses Hadoop to process 4 PB of web pages

Yahoo architecture engineer Vivek Ratan, who is also one of the Hadoop framework developers, said: "The largest cloud computing task currently running on the Hadoop framework is the job Yahoo uses to build its web page index database. It uses 10,000 processor cores on Linux at the same time, processes 1 trillion web links, and computes 300 TB of index data from 4 PB of data." He added: "On the same hardware, handling the same task with Hadoop saves one third of the time compared with the cluster computing approach used before."
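
To put the quoted figures in proportion, the arithmetic below works out the average input per core and the ratio of index data to input. It is a rough sketch: decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even split across cores are assumptions, and `remaining_time` with its `previous_hours` argument is a hypothetical helper, since the article gives no absolute run times.

```python
# Figures quoted in the article; decimal units are an assumption.
INPUT_PB = 4        # raw data read by the indexing job
OUTPUT_TB = 300     # index data produced
CORES = 10_000      # processor cores used at the same time

input_tb = INPUT_PB * 1000
input_gb_per_core = input_tb * 1000 / CORES   # average input handled by each core, in GB
output_ratio = OUTPUT_TB / input_tb           # fraction of the input that remains as index data

def remaining_time(previous_hours, saved_fraction=1 / 3):
    """Run time after saving the quoted one third; previous_hours is a hypothetical input."""
    return previous_hours * (1 - saved_fraction)
```

Under these assumptions, each core handles about 400 GB of input on average, and the index is 7.5% of the size of the input.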




The advantages of cloud computing for handling large amounts of data have also attracted investment from many enterprises. IBM, for example, announced last year that it was working with Google on a Blue Cloud program in mainland China, using the Hadoop framework for scientific computing or to provide cloud computing services.

Compared with the architecture of traditional supercomputers or mainframes, Vivek Ratan believes that the system architecture of cloud computing is a completely different design paradigm: a traditional mainframe is designed for vertical scaling, whereas cloud computing such as Hadoop or Google's adopts a horizontal scaling design.

Vertical scaling means continuously improving the computing power of a single server, for example by fitting it with more computing cores, to increase the amount of data an application can handle. Horizontal scaling means increasing the number of servers to increase the amount of data the application can handle, without improving the computing power of any single server. So with a horizontally scaled design such as Hadoop, as an application service gains more users and has more data to process, new servers only need to be added, without modifying the original application code.
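
The point that servers can be added without touching application code can be sketched as follows. This is a simplified single-process model, not how Hadoop distributes work: the record format, the `process_partition` and `run_on_cluster` names, and hash partitioning with `zlib.crc32` are all assumptions made for illustration.

```python
from collections import Counter
import zlib

def process_partition(records):
    """Application code: count records per user. It never mentions the server count."""
    return Counter(user for user, _ in records)

def run_on_cluster(records, n_servers):
    """Infrastructure code: spread records over n_servers and merge the results."""
    partitions = [[] for _ in range(n_servers)]
    for record in records:
        key = record[0]
        # A stable hash of the key decides which server receives the record.
        partitions[zlib.crc32(key.encode()) % n_servers].append(record)
    total = Counter()
    for part in partitions:  # each "server" runs the same application code
        total.update(process_partition(part))
    return total, [len(p) for p in partitions]
```

Calling `run_on_cluster(records, 2)` and `run_on_cluster(records, 8)` yields the same counts; only the second value, the number of records per server, changes.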





Vivek further points to two advantages of horizontal scaling. "Because you can increase computing power by adding servers on a large scale, you don't need very expensive servers; PC-class computers are sufficient," he said. A large mainframe costs tens of millions of dollars, but for the same money a business can buy hundreds of PCs and, by integrating them with Hadoop, obtain more computing power than a single mainframe. In other words, costs are lower and computational efficiency is higher.

Another, greater advantage is improved fault tolerance. Although a single mainframe has high computing power, relying on it is like putting all the eggs in one basket: once the machine fails, the applications running on it stop completely and cannot provide service, and even with a standby system, switching the service over takes some time. In the Hadoop framework, a master host cuts a job into many parts and assigns them to many computers. Even if several computers fail, the master host can immediately hand the parts they were running to idle computers, and the overall application service is not interrupted.

Vivek says that even if one tenth of the computers fail while a single job is running, the job can continue. "Performance will be slower, but the job will not be interrupted," he said. Network administrators only need to restore a backup image of the operating environment onto a new machine, and it can soon join the Hadoop computing environment to provide service.
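
The reassignment behavior described above can be modeled with a toy scheduler. This is a sketch under stated assumptions, not Hadoop's actual scheduling logic: the `run_job` function, the round-robin assignment, and the `failing` set that marks broken machines are all hypothetical.

```python
from collections import deque

def run_job(tasks, workers, failing):
    """Simulate a master assigning tasks and re-queuing work from failed machines."""
    alive = list(workers)
    pending = deque(tasks)
    done = {}          # task -> worker that completed it
    reassigned = 0
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("all workers failed")
        task = pending.popleft()
        worker = alive[turn % len(alive)]
        turn += 1
        if worker in failing:
            alive.remove(worker)   # the master stops using the failed machine
            pending.append(task)   # and hands its task to another machine later
            reassigned += 1
            continue
        done[task] = worker
    return done, reassigned
```

With ten workers and one marked as failing, every task still completes; the job simply runs on nine machines instead of ten.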





Reducing costs is only a short-term benefit; speed will lead to innovative applications

Cost reduction is the most obvious benefit of cloud computing. Jianli of the Google Institute of Engineering said: "Large enterprises are gradually feeling the rising costs of servers, storage, and maintenance, as well as of the staff needed to manage them. Most of the enterprises that are currently interested see the cost benefits of cloud computing first." He added: "After a while, these companies will see the speed value of cloud computing."

Jianli believes that in the past, many enterprises providing network applications were subject to technical constraints: they worried that their computing environment could not cope with processing large amounts of data and large numbers of online users, so the services they could offer were limited. With cloud computing, a large amount of data can be processed at lower cost to provide users with almost real-time information services. Jianli said: "Speed is the key to creating new applications. The larger the amount of data, the more you can feel the difference in speed."

Trend Micro uses the speed of cloud services to provide new network security services. About four years ago, Trend Micro began using parallel computing techniques to provide business users with a web content filtering service. As the need for web content inspection grows, enterprises and general users want Trend Micro's security protection to filter out web content such as phishing sites or malicious links, to keep users safe on the Internet.

But viruses evolve faster and faster, and malicious programs in web pages grow increasingly complex. Trend Micro found that it needed enormous computing power just to analyze the content of 4.7 billion web pages a day, and virus pattern updates must be released continuously so that users are never left with a gap in protection.

"When the outside environment changes rapidly, users want security vendors to provide protection, so the speed of product updates is important," said Yang Xining, project manager in Trend Micro's research and development department.

In addition, Trend Micro's research teams are spread across China, the United States, and Japan, and the amount of data they need to analyze each day reaches the terabyte level. Moving that data between countries is slow and very costly in network bandwidth, which also affects the speed of product updates.

Yang Xining pointed out that Trend Micro uses open source cloud technology and grid technology to move its services into the cloud (in-the-cloud). This not only lets the services respond quickly and speeds up the development of solutions, but also solves the problem of processing large amounts of data across regions. She said: "An analysis task that used to take a day to run now returns results in a few seconds. If users had to wait a day to obtain security protection, it would be too late. For the company, cloud services are indispensable."

Trend Micro recognized the power of cloud computing early and began investing while the relevant technology was still being developed. Google and Yahoo have been promoting cloud computing this year and are also working with many universities to train cloud computing developers. Jianli believes that the time is ripe for companies to use cloud computing and that the entry threshold will keep getting lower. He said: "Future applications will all be related to cloud computing. Companies that want to innovate can start thinking about how to use the technology."