An earlier BKJIA article described some of the most foolish data center errors. In this issue, we present the corresponding preventive measures: having covered the offense, we now turn to the defense. Let's take a look at how to prevent these errors.
Data center downtime is commonly attributed to equipment faults or to a chain reaction following an emergency. In practice, however, human error is the leading cause. According to a study by the Uptime Institute, about 70% of data center problems are caused by human error. Clearly, people themselves are one of the biggest hazards to the data center.
How can this problem be solved? "There is no doubt that the human errors that cause data center downtime can be avoided by following a few simple steps," said Ahmad Moshiri, Director of Power Technology Support. The following are best practices for avoiding faults caused by human error in the data center.
1. Shield the emergency power off button
Emergency Power Off (EPO) buttons are generally located near the data center doors. These buttons often have no cover and no label, so in an emergency it is easy to cut power to the entire data center by mistake. Label the EPO button or fit it with a cover so that it cannot be pressed accidentally.
2. Follow the documented procedures
Working step by step through the documentation supplied by the vendor can reduce or eliminate misoperations during maintenance tasks. In addition, the backup plan should include emergency response measures.
3. Label components correctly
To operate the power system correctly and safely, all switches and breakers must be correctly labeled, and a single-line diagram of the data center's electrical system must be available so that operations are carried out in the right sequence. Before each operation, carefully check that the device labels are correct.
4. Use consistent operating methods
Data center administrators sometimes become complacent: they do not follow standard operating procedures, they forget or skip steps, or they work from memory and shut down the wrong device. It is therefore essential to keep all operating instructions up to date and to follow them strictly.
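One way to make skipped steps harder is to track a procedure as an ordered checklist that refuses out-of-order confirmations. The following is only a minimal sketch under that assumption; the `Procedure` class and the step names are hypothetical and do not come from the original article or from any vendor tool.

```python
from dataclasses import dataclass, field


@dataclass
class Procedure:
    """An ordered operating procedure whose steps must be confirmed in sequence."""
    name: str
    steps: list[str]
    completed: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        return len(self.completed) == len(self.steps)

    def confirm(self, step: str) -> None:
        """Record a step as done; reject it if it is not the next documented step."""
        if self.is_complete():
            raise ValueError("procedure already complete")
        expected = self.steps[len(self.completed)]
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.completed.append(step)


# Hypothetical maintenance procedure; the step names are illustrative only.
ups_bypass = Procedure(
    name="UPS maintenance bypass",
    steps=["verify device label", "notify operations desk", "transfer load to bypass"],
)
ups_bypass.confirm("verify device label")
try:
    ups_bypass.confirm("transfer load to bypass")  # tries to skip a step
    skipped_allowed = True
except ValueError:
    skipped_allowed = False
```

Here the attempt to jump straight to the load transfer is rejected, so the operator has to return to the documented sequence rather than rely on memory.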
5. Provide ongoing training
Ensure that everyone with access to the data center, including IT, emergency, security, and facility maintenance personnel, has basic knowledge of the equipment, so that power is not shut off by mistake.
6. Enforce a secure access policy
An organization without a data center sign-in policy runs a high security risk. In particular, external visitors entering the data center must be escorted, so that the data center administrator knows who has come in and when they leave.
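The sign-in policy described above can be sketched as a small visitor log that requires an escort at sign-in and records entry and exit times. This is an illustrative sketch only; `AccessLog`, `Visit`, and the sample names and timestamps are assumptions made for the example, not part of the original article.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Visit:
    visitor: str
    escort: str
    entered: datetime
    left: Optional[datetime] = None  # None while the visitor is still inside


class AccessLog:
    """Records who entered the data center, who escorted them, and when they left."""

    def __init__(self) -> None:
        self.visits: list[Visit] = []

    def sign_in(self, visitor: str, escort: str, when: datetime) -> Visit:
        if not escort:
            raise ValueError("external visitors must be escorted")
        visit = Visit(visitor, escort, when)
        self.visits.append(visit)
        return visit

    def sign_out(self, visitor: str, when: datetime) -> None:
        for visit in self.visits:
            if visit.visitor == visitor and visit.left is None:
                visit.left = when
                return
        raise LookupError(f"{visitor} is not signed in")

    def on_site(self) -> list[str]:
        """Names of visitors who have signed in but not yet signed out."""
        return [v.visitor for v in self.visits if v.left is None]


# Hypothetical sample data.
log = AccessLog()
log.sign_in("vendor technician", escort="facility engineer", when=datetime(2024, 1, 8, 9, 0))
log.sign_in("auditor", escort="security officer", when=datetime(2024, 1, 8, 9, 30))
log.sign_out("vendor technician", when=datetime(2024, 1, 8, 11, 15))
print(log.on_site())  # ['auditor']
```

Keeping the exit time on the same record as the entry makes it easy to answer the question the policy cares about: who is in the room right now.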
7. Enforce food and beverage policies
Short circuits caused by liquids are the greatest risk to critical computer components. It is best to post a notice at the door prohibiting any food or beverage in the data center and to set up a monitoring mechanism. Any violation should be strictly penalized in accordance with the regulations.
8. Avoid contaminants
Poor indoor air quality can let dust particles and debris get into servers and other IT infrastructure. Most of these problems can be solved by requiring anyone who enters the data center to wear anti-static shoes, or by placing a mat outside the data center door. In addition, equipment should be unpacked outside the data center. If a box is carried into the data center with the equipment, the chance of fibers from the box ending up on racks and other IT infrastructure increases greatly.
Original article: How to Prevent Downtime Due to Human Error. Author: Rich Miller
[This article is an original BKJIA article. Reprints must credit the source and author.]