"Philosophy" and "green" in cloud computing data center

Source: Internet
Author: User
Keywords: server, data center, split, growth rate

China IDC Circle, July 17: Using cloud computing means handing information processing tasks over to a data center. Most next-generation data centers adopt virtualization, which divides one physical server into multiple logical virtual machines that can run several information processing tasks at the same time. This greatly improves the utilization of expensive, scarce computing resources.

People of my generation are familiar with the great man's philosophical teaching that everything divides into two. (In 1957, in "A Dialectical Approach to Inner-Party Unity", Mao Zedong stated plainly: "One divides into two. This is a universal phenomenon, and this is dialectics." See Selected Works of Mao Zedong, Volume 5, p. 498.) Server virtualization in the next-generation data center is therefore a matter of dividing the server into two (in fact, into N). The great man asserted that dividing into two is universal. Applied to server virtualization, this means that dividing a physical server into multiple virtual machines suits most of the information processing tasks we usually encounter, and in most cases the nature and scale of the task need no special consideration.

Indeed, over the past ten or twenty years, the volume and scale of ordinary day-to-day business information processing has probably grown roughly in proportion to GDP, for which an average annual growth rate of 10% already counts as quite high. It is well known, however, that the information processing capacity of IT equipment grew much faster than GDP over the same period. This is why splitting a physical server into virtual partitions is so generally applicable: ordinary day-to-day business tasks, even relatively large ones, should have no trouble sharing a physical server in a multi-tenant fashion.
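To make the utilization argument concrete, here is a minimal sketch of consolidating small workloads onto shared physical servers. Everything in it is an assumption for illustration: the workload sizes are invented, demand is expressed as a percentage of one host's capacity, and the first-fit-decreasing packing rule is simply a convenient heuristic, not something the article prescribes.

```python
def consolidate(demands, capacity=100):
    """Pack workloads onto hosts with first-fit decreasing (illustrative heuristic)."""
    hosts = []  # each host is the list of demands placed on it
    for demand in sorted(demands, reverse=True):
        for host in hosts:
            if sum(host) + demand <= capacity:
                host.append(demand)  # fits on an existing host
                break
        else:
            hosts.append([demand])  # no host had room, so start a new one
    return hosts


def mean_utilization(hosts, capacity=100):
    """Average fraction of capacity in use across all hosts."""
    return sum(sum(host) for host in hosts) / (len(hosts) * capacity)


# Hypothetical workloads, each needing this percentage of one physical server.
demands = [10, 25, 15, 30, 20, 5, 35, 10]

dedicated = [[d] for d in demands]  # one physical server per workload
shared = consolidate(demands)       # workloads share servers as virtual machines

print(len(dedicated), mean_utilization(dedicated))
print(len(shared), mean_utilization(shared))
```

With these invented numbers, eight lightly loaded dedicated servers collapse onto two shared ones, which is the sense in which "dividing" a server raises utilization.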

In recent years, however, some types of data have grown far faster than GDP. According to a forecast IDC published a few years ago, the global volume of data will have grown sixfold between 2006 and the end of this year. By 2010, approximately 70% of that data will be generated by individuals, yet organizations will be responsible for managing at least 85% of it in terms of security, privacy, reliability, and compliance ("While nearly 70% of the digital universe will be generated by individuals by 2010, organizations will be responsible for the security, privacy, reliability and compliance of at least 85% of the information"). IDC further emphasizes that unstructured data makes up more than 95% of the total ("over 95% of the digital universe is unstructured data"). The most typical unstructured data comes from Web 2.0 applications and large-scale mobile communications, so the capacity to store, process, and analyze this Web 2.0 data must grow at a correspondingly high rate.

Searching web content with a search engine is a typical case of large-scale processing of Web 2.0 data. A single server is powerless against a task of this kind, let alone a single server divided into multiple virtual machines that each handle such a task. For this class of problem, splitting no longer applies; combining is the point, that is, how to combine multiple servers to solve one problem. MapReduce, the most popular approach of recent years, connects multiple servers so that they work together on one large-scale data processing task. MapReduce can thus be seen as "combining two (or many) into one" applied to the servers in a data center. Because a job addresses a single problem, MapReduce usually does not run on a virtualized computing or storage platform. In fact, MapReduce's idea of "moving computation to the data" implies a tightly coupled compute-and-storage architecture: the map step of a data processing problem is sent in parallel to many map worker nodes, and each node pairs its CPU one-to-one with a local disk, to which it writes the intermediate results of its processing.
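The map and reduce steps described above can be sketched in a few lines. This is a single-process toy under stated assumptions, not a real MapReduce framework: the word-count job, the function names, and the use of temporary directories to stand in for each node's local disk are all hypothetical, and the "nodes" run one after another here rather than in parallel.

```python
import json
import os
import tempfile
from collections import defaultdict


def map_on_node(node_dir, local_chunk):
    """Map step: count words in the chunk held by this node and write the
    intermediate (word, count) pairs to the node's own local disk."""
    counts = defaultdict(int)
    for line in local_chunk:
        for word in line.lower().split():
            counts[word] += 1
    path = os.path.join(node_dir, "intermediate.json")
    with open(path, "w") as f:
        json.dump(counts, f)
    return path


def reduce_all(intermediate_paths):
    """Reduce step: merge the per-node intermediate files into one result."""
    totals = defaultdict(int)
    for path in intermediate_paths:
        with open(path) as f:
            for word, n in json.load(f).items():
                totals[word] += n
    return dict(totals)


def run_job(chunks):
    """Send the map step to the node that already holds each chunk, then reduce."""
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for i, chunk in enumerate(chunks):
            node_dir = os.path.join(root, f"node{i}")  # stands in for a local disk
            os.makedirs(node_dir)
            paths.append(map_on_node(node_dir, chunk))
        return reduce_all(paths)


# Hypothetical input: each inner list is the data already stored on one node.
chunks = [
    ["split the server", "combine the servers"],
    ["combine the data"],
]
result = run_job(chunks)
print(result)
```

The point of the sketch is the data flow: the computation travels to where each chunk lives, and only the small intermediate files are read back for the reduce step.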

We know that when the servers in a data center do not use virtualization, the utilization of computing resources is very low. Because MapReduce does not run on a virtualized platform, it is very inefficient in its use of computing and storage resources and in energy saving and environmental friendliness. For example, dynamically balancing load across a group of non-virtualized worker nodes is a difficult task. Web search engines, we note, have a significant carbon footprint. By one estimate, the electricity that data center servers consume for two Google searches is enough to boil a pot of water. How to achieve green solutions for "combining" problems in the data center, for example by running MapReduce on a virtualized platform with dynamically adjustable performance, is a very meaningful research topic.

(Responsible editor: Duqing first)
