Cluster servers are widely used mainly because of their high price-performance ratio: a large number of nodes compensates for the limited processing power of each one. The rapid growth in node counts has inevitably driven a rapid growth in power consumption. IDC (International Data Corporation) data show that China's total spending on power and cooling for servers reached $1.9 billion in 2007. Beijing and Guangdong alone spent $320 million and $360 million respectively, together accounting for 35.8% of the national total. According to statistics, IT products in China consumed a total of 30 to 50 billion kWh of electricity in 2007, roughly the annual output of the Three Gorges hydropower station. Energy consumption on this scale has attracted attention from all sides.
The number of cluster nodes is growing rapidly, but utilization remains low. Take the x86 server cluster, the most common type, as an example: its utilization is generally believed to be below 30%, and according to IBM the average utilization of Intel-based servers is only 10%. The result is a great deal of wasted power.
Requirements for cluster power management
Power management is part of cluster infrastructure management and is concerned with two questions: how to relate the cluster's power consumption to its actual load, and how to minimize the cluster's total power consumption without affecting applications. The ultimate aim is to quantify and optimize energy use. Because power consumption is analyzed here from the cluster's perspective, the absolute power of a single node is out of scope; replacing a node's CPU with a more energy-efficient one, for example, is not considered.
Power distribution of modern computer room
Cluster servers are generally installed in a dedicated computer room, which typically has a raised floor, standard cabinets, a UPS (uninterruptible power supply), precision air conditioning and other equipment.
It can be seen that servers and cooling equipment together account for more than 80% of the computer room's total power consumption. The power drawn by cooling equipment is clearly tied to the heat the servers generate while running, so managing cluster power well becomes the key issue.
Based on the above, cluster power management can be summed up as the following functions:
Heat distribution and cooling equipment control
The analysis of computer room power distribution above shows that the cooling system's power consumption is second only to that of the IT equipment, yet in practice much of it is wasted. Figure 1 is a three-dimensional simulation of the temperature distribution in an existing computer room. It shows that because load is distributed unevenly across the room and over time, temperature is uneven as well: there are hot spots (the red areas in the figure) and cold spots (the blue areas). The room is a typical unsteady thermodynamic system. Existing rooms, however, are generally designed for heat dissipation as if they were steady-state systems, which wastes a large amount of energy; research shows that the effective cooling capacity is less than 50%. The inevitable direction of development is therefore to build a thermodynamic heat dissipation model of the room and to drive cooling with real-time monitoring data and a power allocation strategy based on cluster power consumption.
Development trend of cluster power management
Most devices in existing data centers, such as disk arrays, servers, UPS units and air conditioning, are managed separately. I expect management to develop in two directions. On the one hand, task scheduling and the adjustment of all equipment will be unified and will respond to changes in the external environment and fluctuations in load, achieving optimal scheduling in a broader sense. On the other hand, the granularity of management will become finer: for example, the tasks of each node will be adjusted, the frequency of each CPU core will be set individually, and cooling will be controlled zone by zone.
Real-time monitoring and analysis of cluster power consumption
Monitoring cluster power consumption is the basis of power management, and it can be done in two ways. The first is to add a power sensor (power meter) to the server's power supply module and read it directly through the board's out-of-band management module (BMC). This method is simple and direct, but its precision is limited, currently only about ±10%. The power sensor must also sit on the AC side of the power supply, and because blade servers share power supplies, the power of an individual blade cannot be measured, so blade servers are not supported. The second is for the server manufacturer to calibrate the server's power under various load conditions after production. When the user runs the monitoring software on a node, it estimates the node's power from the load. As long as the manufacturer calibrates enough load samples, this method can achieve higher precision, and because it combines hardware and software, it supports both rack and blade servers.
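The calibration-based approach can be illustrated with a minimal sketch. It assumes, purely for illustration, that calibration is a table of (CPU utilization, measured watts) pairs and that the monitoring software interpolates linearly between them. The table values and the function name estimate_power are hypothetical; a real calibration would likely cover more load dimensions than CPU utilization alone.

```python
from bisect import bisect_right

# Hypothetical factory calibration table: (CPU utilization %, measured watts).
# The values are illustrative only, not measurements from any real server.
CALIBRATION = [(0, 120.0), (25, 160.0), (50, 200.0), (75, 230.0), (100, 250.0)]


def estimate_power(utilization, table=CALIBRATION):
    """Estimate node power in watts by linear interpolation between calibration points."""
    loads = [load for load, _ in table]
    # Clamp readings outside the calibrated range to the nearest endpoint.
    if utilization <= loads[0]:
        return table[0][1]
    if utilization >= loads[-1]:
        return table[-1][1]
    # Find the two calibration points that bracket the reading.
    hi = bisect_right(loads, utilization)
    (x0, y0), (x1, y1) = table[hi - 1], table[hi]
    return y0 + (y1 - y0) * (utilization - x0) / (x1 - x0)
```

With this table, a node at 37.5% utilization falls halfway between the 25% and 50% samples and is estimated at 180 W. The more samples the table holds, the smaller the interpolation error, which matches the point that precision depends on how many load samples are calibrated.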
With accurate real-time power monitoring of the cluster in place, power and energy consumption can be calculated, and effective and ineffective power consumption can be separated by analyzing the cluster's power consumption under different load states. If a job scheduling system is used, the energy consumption can be calculated directly.
Control of cluster peak power consumption
Peak power control rests on three considerations. First, the maximum IT power consumption that the cluster's UPS and cooling units can support must never be exceeded, so users often have to provision extra redundant equipment whose utilization is very low. If the upper limit of cluster power consumption can be controlled, less redundant equipment is needed and wasted investment is reduced. Second, UPS and cooling units also impose requirements on the power density of each cabinet. Designing for too high a density raises cost significantly, while designing for too low a density wastes floor space, so a suitable power density must be chosen. Previously, power density could only be determined from the servers' rated power, which is almost never reached in practice, so such designs inevitably wasted space and over-invested in power and cooling equipment. Accurate power cap adjustment at the workgroup, cabinet and cluster levels can significantly improve equipment utilization. Third, different types of applications have different load characteristics. High-performance computing usually has high CPU utilization and is sensitive to communication latency between nodes, while Internet applications care more about reading data quickly and their CPU utilization is not very high. Even within a single application the load often fluctuates widely. In the figure below, the left panel shows the load of a company's internal mail server and the right panel shows the autocorrelation function of that load; the load is clearly periodic. Adjusting the power limit to match the application's load characteristics will significantly improve server efficiency.
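The periodicity argument can be made concrete with a small sketch of the autocorrelation function of a load series. The mail server data from the figure is not available, so the series below is synthetic (a made-up 24-hour sine cycle over 7 days), and the function names are chosen here for illustration only.

```python
import math


def autocorrelation(series, max_lag):
    """Return the normalized autocorrelation of series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]


def dominant_period(series, max_lag):
    """Return the lag (> 0) of the highest local maximum of the autocorrelation, or None."""
    acf = autocorrelation(series, max_lag)
    # A local maximum at a nonzero lag indicates that the load repeats with that period.
    peaks = [
        lag for lag in range(1, max_lag)
        if acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
    ]
    return max(peaks, key=lambda lag: acf[lag]) if peaks else None


# Synthetic hourly load with a 24-hour cycle over 7 days (made-up data).
load = [50 + 30 * math.sin(2 * math.pi * hour / 24) for hour in range(24 * 7)]
period = dominant_period(load, max_lag=48)
```

For this synthetic series the strongest autocorrelation peak is at a lag of 24 samples, that is, a daily cycle. A power limit schedule could then follow that cycle, raising the limit ahead of the daily peak and lowering it in the trough.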
Historical load analysis and power allocation strategy
The applications running on a server do not change frequently, so real-time load and power data can be saved as historical data, and the monitoring system can automatically analyze the characteristics and trend of the load and adjust accordingly. Existing automatic control techniques offer a variety of strategies; their details are beyond the scope of this article and are not listed here. In principle, though, a strategy must both let power allocation respond quickly to load fluctuations and avoid adjusting so often that power is wasted. A sound power allocation strategy lets administrators concentrate on analyzing load characteristics instead of spending effort on specific values and the timing of each adjustment. If a job scheduling system is used, a new algorithm can be introduced to dispatch jobs according to the cluster's power distribution and improve system efficiency.
Unified management and fine-grained management are complementary: unified management is the foundation of fine-grained management, and fine-grained management is the means by which unified management is realized.
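The article does not name a specific strategy, so the following is only one possible sketch of the stated principle: respond quickly when load rises, but avoid overly frequent adjustments. It uses a simple deadband rule. The function names and the headroom and deadband values are assumptions made for illustration.

```python
def next_power_cap(current_cap, demand, headroom=1.10, deadband=0.05):
    """Return the power cap (watts) to apply for the next control interval.

    current_cap: cap currently in force
    demand: recent measured or forecast power demand
    headroom: target cap as a multiple of demand (assumed value)
    deadband: relative drift below which the cap is left alone (assumed value)
    """
    target = demand * headroom
    # Fast path: demand has reached the cap, so raise it at once.
    if demand >= current_cap:
        return target
    # Otherwise move only when the target has drifted outside the deadband,
    # which suppresses adjustments caused by small fluctuations.
    if abs(target - current_cap) / current_cap > deadband:
        return target
    return current_cap


def simulate(demands, initial_cap):
    """Apply next_power_cap to a sequence of demand samples and return the cap after each."""
    caps = []
    cap = initial_cap
    for demand in demands:
        cap = next_power_cap(cap, demand)
        caps.append(cap)
    return caps


# Made-up demand samples in watts: small jitter, a surge, then a drop.
caps = simulate([200, 202, 198, 260, 258, 150], initial_cap=220.0)
```

In this run the cap changes only twice across six samples: once when demand surges past the cap and once when it falls well below it. The small fluctuations around 200 W and 260 W leave the cap untouched.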