DBAs, is your RAID 5 array safe?

Source: Internet
Author: User

This article is mainly compiled from a forum thread (forum address: http://bbs.et8.net/bbs/showthread.php?t=1026112).

Many people have seen a RAID 5 array die on a server: one disk drops out, and a second disk fails right after it.

Quote: RAID 5 also protects data with parity, but instead of storing the parity on a dedicated disk it spreads the parity blocks across all the disks. Damaged data can then be rebuilt from the data and parity on the other disks. The usable capacity is n-1 disks. If two disks fail, the data is lost.

In theory, the probability of two disks failing at the same time is very low, so why does it happen?

Quote: From a mathematical point of view, the MTBF of each disk is about 500,000 to 1,500,000 hours (that is, roughly one disk failure every 50 to 150 years). In practice this ideal is rarely reached: under most real-world cooling and mechanical conditions, a disk's working life is significantly shorter. Given the service life of each disk, any disk in the array can fail. Statistically, the failure rate of an array of N disks is N times that of a single disk. Taking these factors together, if the array has a fair number of disks and their effective MTBF is short, disk failures are likely within the expected service life of the array (for example, one failure every few months or every few years).

What is the probability that two disks fail at the same time? If the MTBF of the RAID 5 array is taken to be on the order of MTBF^2, such an event occurs about once every 10^15 hours (far more than ten thousand years), so whatever the working conditions, the probability is very low. Mathematically the probability exists, but in practice we would not need to consider it. Yet two disks do sometimes fail at the same time, and we cannot ignore the possibility completely. The cause of two disks failing together basically has nothing to do with MTBF.

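The MTBF arithmetic in the quote can be checked with a few lines of Python. This is only a rough sketch: it assumes independent disks with a constant failure rate, and the 8-disk array and the derated 50,000-hour MTBF are hypothetical values chosen for illustration, not figures from the thread.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored


def mtbf_years(mtbf_hours: float) -> float:
    """Convert an MTBF given in hours to years."""
    return mtbf_hours / HOURS_PER_YEAR


def array_first_failure_years(mtbf_hours: float, n_disks: int) -> float:
    """Mean time until the first disk failure in an array of n_disks.

    Assumes independent disks with a constant failure rate, so the
    array's failure rate is n_disks times that of a single disk.
    """
    return mtbf_years(mtbf_hours) / n_disks


if __name__ == "__main__":
    # Datasheet range quoted above, plus a hypothetical derated value
    # standing in for poor cooling or mechanical conditions.
    for hours in (500_000, 1_500_000, 50_000):
        print(f"MTBF {hours:>9,} h: {mtbf_years(hours):6.1f} years per disk, "
              f"{array_first_failure_years(hours, 8):5.2f} years for 8 disks")
```

With the hypothetical derated value, the first failure in an 8-disk array is expected within months rather than decades, which is the point the quote makes about real-world conditions.
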
Here we need to introduce a concept most people are not familiar with: BER, the bit error rate of a hard disk. It is an important parameter describing a disk's performance and a measure of its reliability. It expresses the probability of hitting an unrecoverable read error when reading data from the disk. Statistically such errors are rare, and the figure is usually given as the number of bits read per read error. As disk capacity grows, more bits have to be read to cover the whole drive, so the chance of hitting a read error grows with it, because the bit error rate has not improved in step with capacity. Reading an entire 1 TB drive means reading far more data than reading a 300 GB drive, so it is more likely to run into an error during a RAID rebuild.
How likely is this error? Put another way, how many GB of data can you read before hitting one unrecoverable read error? Read this article:
http://lenciel.cn/docs/scsi-sata-reliability/
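As a rough way to get a feel for the question, the sketch below treats every bit as failing independently with probability equal to the BER. The BER values of 1e-14 and 1e-15 are assumed, datasheet-style figures chosen for illustration and are not taken from the linked article; the function names are hypothetical.

```python
import math


def p_unrecoverable_read(bytes_read: float, ber: float) -> float:
    """Probability of at least one unrecoverable read error.

    Models each bit as failing independently with probability `ber`,
    so P = 1 - (1 - ber) ** bits. log1p/expm1 keep precision for tiny ber.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ber))


def mean_bytes_per_error(ber: float) -> float:
    """Average amount of data read per unrecoverable error, in bytes."""
    return 1 / ber / 8


if __name__ == "__main__":
    GB, TB = 1e9, 1e12  # decimal units, as on drive labels
    for ber in (1e-14, 1e-15):  # assumed values, not from the article
        print(f"BER {ber:g}: one error per "
              f"{mean_bytes_per_error(ber) / TB:.1f} TB read on average")
        for size in (300 * GB, 1 * TB):
            print(f"  full read of {size / GB:6.0f} GB: "
                  f"{p_unrecoverable_read(size, ber):.2%}")
```

Under this simple model the risk depends only on how many bits are read and on the BER, so a larger drive or a worse BER raises it directly.
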

Hard disks come in different classes. Enterprise, server, and data-center drives used SCSI or Fibre Channel, while commercial and consumer drives used IDE; today the corresponding types are SAS and SATA. Their MTBF (mean time between failures) figures are close, but the BER of low-cost SATA drives is much higher than that of expensive, low-BER SCSI drives.
In other words, unreadable sectors are a much more serious problem on SATA than on SCSI. Another view from the thread is that the concrete difference lies in the firmware: when a bad sector cannot be read or written, a consumer drive spends more than a minute trying to correct the error and, if it cannot, remaps it to a spare sector. That delay exceeds what the array controller will tolerate, so the controller drops the disk. Enterprise drives do this work in the background without pausing for a minute or so, so the array keeps running; on this view, the BER itself is no different.

According to the calculation in the article, the probability of being unable to read every sector of a 1 TB hard disk reaches 56%. So if you use cheap, large-capacity SATA disks, you should expect that a RAID rebuild after a disk failure will not complete. If you build RAID 5 from 1 TB SATA disks, then when one disk fails, the remaining two or more disks (RAID 5 needs at least three) are all but certain to hit a read error, and the rebuild fails.
That is why, in the past, when RAID 5 was built from small disks, it was rare for two disks to fail at the same time. Now that disks are so large, the probability of trouble has grown with them. For anyone running RAID, reading the entire disk is a routine event. Even if the system quietly never reports bad sectors in files you never read, that only skips the reporting step: a full read still finds every bad sector, and the 56% catches up with you.

There are also so-called surveillance-class enterprise SATA drives. The principle is a firmware change: when the drive hits a read error it skips it instead of retrying (a standard drive automatically re-reads a sector on a CRC error until it gets correct data). That is perfectly reasonable for surveillance data (most surveillance drives are written constantly but rarely read, unless an incident occurs and the footage has to be played back).

A hot spare cannot fix undetected errors on the existing disks of a RAID 5 array. A hot spare can only take the place of a disk that has gone offline so the RAID can be rebuilt. If the other disks contain errors, the rebuild may still fail. The solution is to run regular scrubs or other error checks while the array is healthy.

Ordinary hardware RAID cards that plug into a PCI/PCI-X/PCIe slot, and RAID 5 integrated on the motherboard, have no data scrub function at all. Only enterprise storage has it (IBM DS3000/4000/5000, Dell MD3000, etc.), and even there you cannot see the results of the check. At most you see a CRC failure for some disk in the log, then the red light comes on and the enclosure tells you to replace the disk. You have no way of knowing whether the disk is completely dead, has read errors, or has bad sectors. In short, you are completely in the dark.
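To illustrate why scrubbing a healthy array helps, here is a toy model in Python. It is a sketch under heavy assumptions, not how any particular controller works: disks are lists of byte blocks, `None` marks a block the drive can no longer read, parity lives on one fixed disk instead of rotating, and all function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_array(data_disks):
    """Append a parity disk to the data disks (parity not rotated, for brevity)."""
    parity = [xor_blocks(stripe) for stripe in zip(*data_disks)]
    return [list(disk) for disk in data_disks] + [parity]


def scrub(disks):
    """Read every stripe; repair a single unreadable block from the others.

    Returns (repaired, unrecoverable) lists of stripe indexes.
    """
    repaired, unrecoverable = [], []
    for stripe in range(len(disks[0])):
        bad = [d for d, disk in enumerate(disks) if disk[stripe] is None]
        if len(bad) == 1:
            good = [disk[stripe] for disk in disks if disk[stripe] is not None]
            disks[bad[0]][stripe] = xor_blocks(good)  # rewrite the lost block
            repaired.append(stripe)
        elif len(bad) > 1:
            unrecoverable.append(stripe)
    return repaired, unrecoverable


def rebuild(disks, failed):
    """Rebuild disk `failed` from the survivors; fail on any unreadable block."""
    survivors = [disk for d, disk in enumerate(disks) if d != failed]
    rebuilt = []
    for stripe in range(len(survivors[0])):
        blocks = [disk[stripe] for disk in survivors]
        if any(block is None for block in blocks):
            raise IOError(f"read error in stripe {stripe}; rebuild failed")
        rebuilt.append(xor_blocks(blocks))
    return rebuilt


if __name__ == "__main__":
    array = make_array([[b"AAAA", b"BBBB"], [b"CCCC", b"DDDD"]])
    array[1][1] = None           # latent bad sector nobody has noticed
    print(scrub(array))          # array still healthy: the block is repaired
    print(rebuild(array, 0))     # a later rebuild of disk 0 now succeeds
```

If disk 0 had failed before the scrub ran, `rebuild` would have raised on stripe 1, which is the double-failure scenario described above: the second "failure" is just a latent read error discovered too late.
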
(RAID on ZFS does have a data scrub function.)

Summary: the probability of RAID 5 losing two disks at once

1. The larger the disks, the higher the probability; for example, RAID 5 built from smaller, GB-class disks is safer than RAID 5 built from 1 TB disks.
2. The more disks in the RAID 5, the higher the probability of hitting a BER sector; for example, RAID 5 on three disks is safer than RAID 5 on six disks.
3. The cheaper the disks, the higher the probability of hitting a BER sector; for example, RAID 5 on SCSI/FC/SAS disks is safer than RAID 5 on IDE/SATA disks.
4. The more data stored on the RAID 5, the higher the probability of hitting a BER sector; for example, a RAID 5 holding some GB of data is safer than one holding 1 TB of data. (During a REBUILD, only the sectors that hold data are read; some cards read the entire disk whether it holds data or not.)

With RAID 1/RAID 10, only one disk has to be read for a rebuild, so the probability of failure is lower than with RAID 5, which must read all the surviving disks. If a group of RAID 1 disks fails, no force-online operation is needed, because the data is still there: it can be read without the RAID group, even after moving the disk to another machine that has no RAID card. With RAID 5, if you cannot force the second disk back online, you cannot read anything at all.

For databases, RAID 1+0 is the way to go!

Backup is essential.

When a RAID is degraded, we recommend backing up the data first if the volume of important data is not large. This backup must of course go to a different machine; it cannot go to the RAID that is currently degraded.

If another disk goes OFFLINE during a REBUILD and the RAID volume goes OFFLINE with it, the RAID can no longer be rebuilt. If the disk that went OFFLINE can be identified, the data can be recovered by forcing it online (some controllers do not have this option).
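The summary points can be compared numerically with a small Python sketch. It assumes independent bit errors, a full read of every surviving disk in a RAID 5 rebuild, and a read of only the mirror partner in a RAID 1/10 rebuild. The BER values, the disk counts, and the `fill` parameter (the fraction of each disk the controller actually reads) are illustrative assumptions, and the function names are hypothetical. The output depends entirely on the assumed BER and will not necessarily match the 56% figure cited from the referenced article.

```python
import math

GB, TB = 1e9, 1e12  # decimal units, in bytes


def p_read_error(bytes_read, ber):
    """P(at least one unrecoverable read error), independent bit errors."""
    return -math.expm1(bytes_read * 8 * math.log1p(-ber))


def raid5_rebuild_risk(n_disks, disk_bytes, ber, fill=1.0):
    """A RAID 5 rebuild has to read all n_disks - 1 surviving disks."""
    return p_read_error((n_disks - 1) * disk_bytes * fill, ber)


def raid10_rebuild_risk(disk_bytes, ber, fill=1.0):
    """A RAID 1/10 rebuild reads only the failed disk's mirror partner."""
    return p_read_error(disk_bytes * fill, ber)


if __name__ == "__main__":
    sata, sas = 1e-14, 1e-15  # assumed BER values, for illustration only
    cases = [
        ("RAID 5, 3 x 300 GB, BER 1e-14", raid5_rebuild_risk(3, 300 * GB, sata)),
        ("RAID 5, 3 x 1 TB,   BER 1e-14", raid5_rebuild_risk(3, TB, sata)),
        ("RAID 5, 6 x 1 TB,   BER 1e-14", raid5_rebuild_risk(6, TB, sata)),
        ("RAID 5, 6 x 1 TB,   BER 1e-15", raid5_rebuild_risk(6, TB, sas)),
        ("RAID 5, 6 x 1 TB,   half full", raid5_rebuild_risk(6, TB, sata, 0.5)),
        ("RAID 10, 1 TB disks, BER 1e-14", raid10_rebuild_risk(TB, sata)),
    ]
    for label, risk in cases:
        print(f"{label}: {risk:.1%} chance of a read error during rebuild")
```

Each row varies one factor from the summary (disk size, disk count, disk class, fill level, RAID level), so the ordering of the results mirrors the four points and the RAID 1/10 comparison.
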

 

This article is from the "Technical Achievement dream" blog; please be sure to retain this source: http://weipengfei.blog.51cto.com/1511707/1006565
