On the afternoon of March 30, 2015, a colleague came out of the server room and told me that a disk on one of the servers was showing an amber light.
I went to the server room and looked at the server in question. The front panel of the Dell R720 was flashing: Fault detected on drive 0 in disk drive bay 1. In other words, disk 0 in drive bay 1 had reported a failure.
The LED for disk 0 was indeed lit amber, and the situation did not look good. This server runs the company's most important ERP database, so I had to be extra careful in handling the fault. I decided to fix the problem that night rather than let it drag on.
The most critical step before touching anything is to back up the data, and of course the backup must go to a different storage device.
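A backup is only useful if the copy on the other storage is intact, so it can be worth comparing checksums before going any further. The following is a minimal sketch of such a check, not the procedure used that evening; the function names `sha256_of` and `backup_matches` are hypothetical, and it assumes the backup is an ordinary file copy (for example a database dump) reachable from the same machine.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large dump files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(source, backup):
    """True if the backup copy has the same size and checksum as the source."""
    source, backup = Path(source), Path(backup)
    # Cheap check first: differing sizes mean the copy is incomplete.
    if source.stat().st_size != backup.stat().st_size:
        return False
    return sha256_of(source) == sha256_of(backup)
```

Calling `backup_matches` with the path of the original dump and the path of its copy on the second storage device returns `False` if the copy is truncated or altered.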
By the time the backup finished it was 6:20. I notified colleagues that the server was going down and stopped the business services running on it. I chose not to do the work online because I was worried that the rebuild might cause a disk bottleneck, or that heavy read/write load might trigger a new failure.
I rebooted the server into the RAID controller BIOS and found that the previous engineer had configured the server's 6 disks as RAID 10. I shut down the server, pulled the faulty disk, and prepared to replace it with the hot spare from the disk enclosure of the old (decommissioned) file server.
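To make clear what a 6-disk RAID 10 array can and cannot survive, here is a small sketch of the arithmetic. It assumes 2 TB per disk (the size of the replacement disk sought later in this account) and assumes the mirror pairs are disks (0,1), (2,3) and (4,5); the real pairing depends on how the controller was configured, and the function names are hypothetical.

```python
def raid10_usable_tb(disk_count, disk_size_tb):
    """Usable capacity of RAID 10: half of the raw capacity."""
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disk_count // 2 * disk_size_tb


def raid10_survives(failed, disk_count):
    """True if no mirror pair has lost both of its disks.

    Assumed pairing (hypothetical): (0, 1), (2, 3), (4, 5), ...
    """
    failed = set(failed)
    return not any({first, first + 1} <= failed
                   for first in range(0, disk_count, 2))


print(raid10_usable_tb(6, 2))       # usable capacity in TB
print(raid10_survives({0}, 6))      # one failed disk
print(raid10_survives({0, 1}, 6))   # both disks of one mirror pair
```

Under these assumptions the array offers 6 TB of usable space and keeps running with disk 0 failed, but losing disk 0's mirror partner before the rebuild finishes would destroy the array. That is why a single amber light on this server deserved immediate attention.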
But when I pulled out the faulty disk I got a surprise: it was a SATA disk, not a server-grade SAS disk. I checked the other disks in the R720, and they were all SATA as well. No wonder a server barely a year old already had a disk failure; it had not been built professionally in the first place.
So the SAS disk from the old server could not be used for the RAID rebuild. The replacement must be the same type of disk as the others in the array.
My colleague searched for a long time without finding a 2 TB SATA disk of the same type, so I told him to pull one out of the monitoring host and use that. Borrowing parts from another machine is really not good practice, but there was no better option.
After inserting the replacement disk, I powered the server back on. Judging from the disk LEDs, no RAID rebuild had started on the new disk.
I restarted the server, entered the RAID controller BIOS, and manually configured the newly inserted disk as a hot spare. At that point the controller automatically began rebuilding the array onto the disk. I kept all services on the host stopped during the rebuild, which ran from about 8 o'clock to 11 o'clock, roughly 3 hours in total.
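As a rough sanity check on that duration, the average rebuild rate can be estimated from the figures above. This sketch assumes the controller rewrote the entire 2 TB disk (decimal terabytes) and treats the 3 hours as exact, although the times were only approximate; the function name is hypothetical.

```python
def rebuild_rate_mb_s(disk_bytes, hours):
    """Average rebuild throughput in MB/s (1 MB = 10**6 bytes)."""
    return disk_bytes / (hours * 3600) / 1e6


# Assumption: the full 2 TB was written during the 3-hour rebuild.
rate = rebuild_rate_mb_s(2 * 10**12, 3)
print(f"{rate:.0f} MB/s")  # prints "185 MB/s"
```

That is a high sustained rate for a SATA disk, so the real figures were probably a little different, but it does suggest that a rebuild with no competing load can run close to the speed of the disk itself. A rebuild with the ERP database online would have had to share the disks with production traffic and would likely have taken longer.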
In fact, rebuilding online is allowed, but I am always cautious. As the saying goes: sail with care and your boat will last ten thousand years.
If you take a little longer to restore service, your colleagues will grumble. If you save time but corrupt the data, the boss will invite you in for tea.
This article is from the "Anthony Big Group" blog; please be sure to keep this source: http://52czy.blog.51cto.com/3704825/1626581
Experience: RAID rebuild after a bad disk on a Dell R720 server