Data center air-conditioning systems rarely fail in the depths of winter. Murphy's law ensures that such failures tend to occur in the heat of summer. Whenever cooling is interrupted, the data center begins to warm up, and that can threaten servers and other equipment. If your cooling capacity is so marginal that you cannot shut down a single computer room air conditioner (CRAC) for maintenance, you are already on the path to a crisis. This article describes some best practices for maintaining CRAC units. Before going further, please note that all types of computer room air conditioning are commonly called CRACs, but strictly speaking a chilled-water unit is a computer room air handler (CRAH).
(i) Don't let cooling become a regret
Cooling has always been a critical part of the modern data center, and maintaining the CRAC units that provide it matters even more. With so much invested in cooling equipment, and with the computing resources depending on it, you would expect cooling to be protected against failure, but often it is not. In pursuit of energy efficiency we have lately tried to "right-size" everything, which makes each unit more critical and leaves less margin for failure. Even so, there are limits to how much equipment can be added, and there is anxiety about preventive maintenance that requires a shutdown. Worse, maintenance contracts are often considered too expensive: over a few years the fees can add up to the price of a new CRAC unit. In addition, CRAC service is often performed by facilities staff without a checklist stating what must be verified, adjusted or replaced, and without a fixed inspection schedule. In short, when maintenance amounts to little more than the occasional service call, with no thorough preventive program or no maintenance at all, cooling failure can become the leading cause of downtime.
(ii) Planned cooling shutdowns and set points
Let us first correct the excessive concern about short-term temperature rises. In 2008 ASHRAE TC 9.9 widened its temperature limits so that equipment can run at 27 degrees Celsius (80.6 degrees Fahrenheit) and can keep working for several days at 32 degrees Celsius (89.6 degrees Fahrenheit) without harm to the equipment or its warranty. These parameters have been accepted by all major hardware manufacturers. Even so, most data centers still set cooling to temperatures lower than actually required. In fact, even where cooling capacity is marginal or there are no redundant units, a single CRAC unit can usually still be shut down for a few hours of thorough preventive maintenance without pushing the data center past these limits. Taking a unit out of service for a few hours in a day will not drastically change the temperature of the whole data center, and it is far better than losing the unit entirely in the hottest part of the year and running the computer room without air conditioning for days or weeks. ASHRAE also defines a limit on the rate of temperature rise, which we cover in other chapters. If a maintenance shutdown makes the temperature rise faster than the ASHRAE-recommended value, you should consider a professional cooling assessment.
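As a rough illustration of the check described above, the following Python sketch takes timestamped inlet temperatures logged during a maintenance shutdown and reports the steepest rate of rise and the peak temperature. The function names and the sample readings are hypothetical. The rate limit is passed in as a parameter because the ASHRAE value is not given in this article and must come from the ASHRAE guidelines themselves. The 27 and 32 degree thresholds are the figures quoted above.

```python
from datetime import datetime


def rise_rate_c_per_hour(readings):
    """Return the steepest rise (deg C per hour) between consecutive readings.

    readings: list of (datetime, inlet_temp_c) tuples sorted by time.
    """
    steepest = 0.0
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        if hours <= 0:
            continue  # skip duplicate or out-of-order timestamps
        steepest = max(steepest, (c1 - c0) / hours)
    return steepest


def assess_shutdown(readings, max_rate_c_per_hour,
                    recommended_max_c=27.0, allowable_max_c=32.0):
    """Summarize a shutdown log against the limits discussed in the text.

    max_rate_c_per_hour is an assumption supplied by the caller; look up
    the real figure in the ASHRAE guidelines before relying on it.
    """
    rate = rise_rate_c_per_hour(readings)
    peak = max(temp for _, temp in readings)
    return {
        "rate_c_per_hour": rate,
        "peak_c": peak,
        "rate_exceeded": rate > max_rate_c_per_hour,
        "above_recommended": peak > recommended_max_c,
        "above_allowable": peak > allowable_max_c,
    }


if __name__ == "__main__":
    log = [
        (datetime(2024, 7, 1, 10, 0), 22.0),
        (datetime(2024, 7, 1, 10, 30), 23.0),
        (datetime(2024, 7, 1, 11, 0), 25.5),
    ]
    # 4.0 deg C per hour is a placeholder limit, not an ASHRAE figure.
    print(assess_shutdown(log, max_rate_c_per_hour=4.0))
```

If "rate_exceeded" comes back true during a short planned shutdown, that is the situation in which the text suggests a professional cooling assessment.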
While on the subject of operating parameters, do not forget the most overlooked item in cooling maintenance: the set points. Check every air conditioner to make sure all units are set to maintain the same temperature and humidity, and ideally verify that they all display the same readings. If units have different set points, they may fight each other, consuming a lot of energy while actually reducing the cooling effect. Relocating sensors on the basis of measured results can also help achieve uniform control. A commonly overlooked fact is that the factory sensor location is not necessarily the best one. Over time, temperature or humidity control can also drift because of failing sensors or changes in how equipment is installed, leaving the units unable to maintain a good environment. You may consider raising the set points in line with the ASHRAE guidelines, but make sure server inlet temperatures stay within the ASHRAE limits and never exceed the maximum inlet temperature. This will improve cooling efficiency and reduce wear on the air-conditioning equipment.
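The set point comparison can be partly automated. The sketch below is only an illustration: it assumes the set points have already been collected by hand or exported from a building management system, and the unit names, tolerances and function name are all hypothetical. It flags any unit whose temperature or humidity set point differs from the median of the group by more than a tolerance.

```python
from statistics import median


def find_setpoint_outliers(units, temp_tolerance_c=0.5, rh_tolerance_pct=2.0):
    """Flag units whose set points differ from the group median.

    units: dict mapping unit name -> (temp_setpoint_c, rh_setpoint_pct).
    Returns a dict mapping unit name -> list of mismatched parameters.
    The tolerances are illustrative defaults, not vendor values.
    """
    temp_median = median(t for t, _ in units.values())
    rh_median = median(rh for _, rh in units.values())

    outliers = {}
    for name, (temp, rh) in units.items():
        problems = []
        if abs(temp - temp_median) > temp_tolerance_c:
            problems.append("temperature")
        if abs(rh - rh_median) > rh_tolerance_pct:
            problems.append("humidity")
        if problems:
            outliers[name] = problems
    return outliers


if __name__ == "__main__":
    room = {
        "CRAC-1": (22.0, 45.0),
        "CRAC-2": (22.0, 45.0),
        "CRAC-3": (20.0, 45.0),  # cooler set point: may fight its neighbors
        "CRAC-4": (22.0, 50.0),  # higher humidity set point
    }
    print(find_setpoint_outliers(room))
```

Comparing against the median rather than a fixed target keeps the check useful after the set points have been deliberately raised for the whole room.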
(iii) What CRAC unit maintenance includes
The most important task in CRAC unit maintenance is replacing the filters. Dirty filters increase the load on the motor and reduce cooling capacity. If a filter turns out to be dirtier than expected when it is replaced, trace the contamination back to its source. Dust particles also accumulate in the filters and heat sinks of computer hardware, raising internal temperatures. The most common sources of contamination are storing items in the data center and unpacking boxes there, neither of which should ever be allowed.
(iv) Mechanical equipment maintenance
The mechanical items to maintain depend on the type of CRAC unit, but where belts are involved their tension must be set correctly. Belt deflection should be kept within the factory specification. Too tight, and the belt and bearings carry unnecessary load; too loose, and the belt slips and performance drops. Self-tensioning belts have been available for more than five years, but for other belts annual replacement may be a good rule of thumb. In any case, replace belts at the interval the manufacturer recommends, even if they still seem to be working well. It is equally important to check that the motor mounts and pulleys are tight. Lubricate wherever it is called for, but take care not to over-lubricate and cause leaks or splatter. Clean mechanical systems generally run more reliably and last longer.
One problem that is often overlooked is abnormal noise. Operations staff should make a habit of listening for changes in sound, which may be an early warning of trouble, even when the change is intermittent or develops slowly. Service technicians may not pick up on such changes, but they must not be ignored, because they are often a precursor to serious trouble.
(v) Refrigerant levels and electrical testing
The refrigerant level of direct expansion (DX) units should be checked at least once a year. A low refrigerant level may indicate a leak, which must be found and repaired immediately. The proportional valves of chilled-water (CRAH) units need to be tested regularly to confirm that they control and operate properly.
Make sure the condensate drain is not blocked and that the condensate pump works properly. Depending on conditions, there may be no condensation for months, which means the pump sits idle and dry. That is the time to add water so that the pump runs and stays in working order.
Humidifiers also need to be checked frequently. Steam canisters may need replacing, and infrared humidifiers may have built up scale that needs cleaning. Ultrasonic humidifiers can clog if the water filter is not replaced regularly. Note that humidifier service intervals depend heavily on water quality, and a water quality analysis can help determine how often parts need replacing.
Another often overlooked aspect is electrical testing. Just because a CRAC unit is running does not mean everything is fine. The current draw (in amperes) of the various components should be logged on an ongoing basis, along with motor speed. A changing current trend or a slowing motor probably points to a deeper problem. Before taking any electrical readings, first check that the power connections are tight. A clamp-on meter can shift cables or loosen connections such as fire-alarm shutdown wiring, which could cut power to the whole data center. The AC power cable connections should be included in the annual infrared thermal scan.
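To make "changing current trend" concrete, here is a minimal Python sketch that fits a least-squares line through a series of logged readings and reports the fitted drift as a percentage of the average reading. The readings, the 10 percent threshold and the function names are illustrative assumptions. What counts as a significant drift depends on the equipment and on the manufacturer's guidance.

```python
def trend_slope(values):
    """Least-squares slope of values against their index (units per reading)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to compute a trend")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def drift_percent(values):
    """Fitted change across the whole log, as a percentage of the mean."""
    mean_y = sum(values) / len(values)
    return trend_slope(values) * (len(values) - 1) / mean_y * 100.0


def is_drifting(values, threshold_pct=10.0):
    """True if the fitted drift, up or down, exceeds the threshold.

    threshold_pct is a placeholder, not a manufacturer figure.
    """
    return abs(drift_percent(values)) > threshold_pct


if __name__ == "__main__":
    compressor_amps = [10.0, 10.5, 11.0, 11.5]  # hypothetical monthly log
    print(round(drift_percent(compressor_amps), 1), is_drifting(compressor_amps))
```

The same functions work for a motor speed log, where a negative drift corresponds to the slowing motor mentioned above.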
(vi) Setting aside time for external maintenance
Maintaining the external components of the cooling system (chillers, pumps, cooling towers and valves) is a major undertaking that is beyond the scope of this article, and one that IT engineers generally know little about. Shutdowns of this equipment, however, must be coordinated with the IT staff, especially where there is no redundancy, because the work can affect cooling for the entire data center. Facilities staff usually take the maintenance of these large components seriously but often neglect to exercise the manual valves. Shutoff and bypass valves may have gone unused for many years and are often located outdoors. Valves usually fail because of corrosion, which can even leave them inoperable. They should be cleaned externally, protected from the elements where necessary, and exercised on a regular schedule so that they work when needed. If a valve has to be replaced, the job can be scheduled for a time when the impact on the data center is smallest.
In short, vendor maintenance contracts are well worth considering, and they can cover monthly, quarterly, semi-annual and annual service. For almost all data centers, coverage for eight hours a day, five days a week (8x5) is sufficient. A temperature rise lasting a few days will not do much real harm, so you can save the extra cost of 24/7 coverage. If maintenance is performed in-house or by a third party, it should follow the manufacturer's maintenance procedures strictly. Whoever is responsible, IT operations should make sure maintenance is thorough and complete by tracking maintenance calls, keeping copies of the related documents, recording the problems found and how they were resolved, documenting the preventive maintenance work performed, and confirming that the results match expectations.
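One possible way to keep the records described above is a simple structured log. The Python sketch below is an assumption-laden illustration, not a prescribed format: the field names, task names and the day counts used for each service interval are all hypothetical. It stores one record per completed task and lists the unit and task pairs that are overdue or whose last result did not match expectations.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Approximate day counts for the service intervals named in the text.
# These are illustrative; use the intervals in your contract or manual.
INTERVAL_DAYS = {"monthly": 30, "quarterly": 91, "semi-annual": 182, "annual": 365}


@dataclass
class MaintenanceRecord:
    unit: str
    task: str
    cadence: str              # one of the INTERVAL_DAYS keys
    performed_on: date
    problems_found: str = ""
    resolution: str = ""
    result_as_expected: bool = True


def next_due(record):
    """Date on which the same task should next be performed."""
    return record.performed_on + timedelta(days=INTERVAL_DAYS[record.cadence])


def needs_attention(log, today):
    """Return sorted (unit, task) pairs that are overdue or had a bad result."""
    latest = {}
    for rec in log:
        key = (rec.unit, rec.task)
        if key not in latest or rec.performed_on > latest[key].performed_on:
            latest[key] = rec  # keep only the most recent record per task
    return sorted(
        key for key, rec in latest.items()
        if next_due(rec) < today or not rec.result_as_expected
    )


if __name__ == "__main__":
    log = [
        MaintenanceRecord("CRAC-1", "filter change", "monthly", date(2024, 1, 5)),
        MaintenanceRecord("CRAC-2", "belt inspection", "annual", date(2024, 2, 1)),
        MaintenanceRecord("CRAC-3", "refrigerant check", "annual", date(2024, 2, 10),
                          problems_found="low level", result_as_expected=False),
    ]
    print(needs_attention(log, today=date(2024, 3, 1)))
```

A spreadsheet with the same columns would serve equally well. The point is that every visit leaves a record that someone reviews.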