Traditional RAID 1.0: pick a few hard drives, build a RAID 5 group, create a LUN on it, and map the LUN to the host by capacity. Its problems:
1. Reconstruction time grows rapidly as single-disk capacity increases.
2. Performance: all reads and writes of a LUN are confined to one disk group, and SSDs or other media added to the array later cannot contribute to that LUN.
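To make the limitation concrete, here is a minimal Python sketch (the class and disk names are invented for illustration, not any vendor's real data structures): in RAID 1.0 a LUN is bound to exactly one disk group, so every I/O and any rebuild can only touch that group's member disks.

```python
from dataclasses import dataclass

@dataclass
class DiskGroup:
    """A fixed set of member disks arranged as one RAID 5 group."""
    name: str
    disks: list[str]            # e.g. ["disk0", "disk1", "disk2", "disk3"]

@dataclass
class Lun:
    """In RAID 1.0 a LUN is bound to exactly one disk group."""
    name: str
    size_gb: int
    group: DiskGroup

    def io_targets(self) -> list[str]:
        # Every read/write of this LUN can only touch the disks of its single
        # owning group -- new SSDs added elsewhere contribute nothing to it.
        return self.group.disks

# Example: a 4-disk RAID 5 group mapped to the host as one LUN.
rg0 = DiskGroup("RG0", ["disk0", "disk1", "disk2", "disk3"])
lun0 = Lun("LUN0", size_gb=500, group=rg0)
print(lun0.io_targets())        # ['disk0', 'disk1', 'disk2', 'disk3']
```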
The improved RAID 1.5:
Pick a few hard disks and build a RAID 5 group; select multiple RAID 5 groups to form a pool; cut the pool into equal-sized extents; pick extents to compose a LUN.
1. Reads and writes span more hard disks: a single LUN can span multiple RAID groups, and data on different media can be tiered automatically. (Automated Tiered Storage, ATS, is the ability to migrate blocks of data between different disk types and RAID levels, striking the right balance between performance and space usage and avoiding so-called hotspots.)
2. However, a rebuild caused by a failed disk is still performed only inside a single RAID group, so rebuild time is not reduced.
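A rough sketch, again with made-up structures, of how the RAID 1.5 pooling could be modeled: several RAID 5 groups are cut into equal-sized extents and a LUN is assembled round-robin from extents of different groups, so its I/O spans more disks, while a rebuild still stays inside one group.

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass
class RaidGroup:
    name: str
    disks: list[str]

@dataclass
class Extent:
    group: RaidGroup            # an extent lives entirely inside one RAID group
    index: int

def build_lun(pool: list[RaidGroup], extent_count: int) -> list[Extent]:
    """Assemble a LUN by taking extents round-robin from every group in the pool."""
    groups = cycle(pool)
    return [Extent(next(groups), i) for i in range(extent_count)]

pool = [RaidGroup("RG0", ["d0", "d1", "d2"]),
        RaidGroup("RG1", ["d3", "d4", "d5"]),
        RaidGroup("RG2", ["d6", "d7", "d8"])]

lun = build_lun(pool, extent_count=6)
print({e.group.name for e in lun})          # the LUN spans RG0, RG1 and RG2

# A failed disk, however, is still rebuilt only inside its own group:
failed_group = pool[0]
print("rebuild confined to:", failed_group.disks)
```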
The formidable RAID 2.0: the following uses Huawei's HVS as an example; the figures and text come from the HVS RAID 2.0+ material, and different vendors implement RAID 2.0 with varying methods and technical parameters.
1. Group the physical hard disks inside the system into several storage tiers according to the performance of the different media.
2. Divide the space of every hard disk into 64 MB large logical blocks (Chunks).
3. Combine chunks from different hard disks into chunk groups (CKG) according to a RAID policy.
4. Cut each chunk group (CKG) into 512 KB~64 MB logical blocks (Extents); the default is 4 MB and the size is configurable.
5. Compose 1 to n of these finer-grained extents into a Volume/File on demand.
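The layering above can be sketched as follows; this is an illustrative model only, with simplified RAID 5 accounting and invented names, not the actual HVS implementation.

```python
from dataclasses import dataclass

CHUNK_MB = 64          # fixed chunk size described in step 2
EXTENT_MB = 4          # default extent size (configurable 512KB~64MB, step 4)

@dataclass
class Chunk:
    disk: str          # which physical disk this 64MB block lives on
    index: int

@dataclass
class ChunkGroup:
    """A CKG: chunks taken from different disks, bound together by a RAID level."""
    chunks: list[Chunk]
    raid_level: str = "RAID5"

    def extents(self) -> int:
        # How many 4MB extents one CKG yields (one parity chunk excluded for RAID 5).
        data_chunks = len(self.chunks) - 1 if self.raid_level == "RAID5" else len(self.chunks)
        return data_chunks * CHUNK_MB // EXTENT_MB

def make_ckg(disks: list[str], index: int) -> ChunkGroup:
    # One chunk per disk -- the members of a CKG always sit on different disks.
    return ChunkGroup([Chunk(d, index) for d in disks])

ckg = make_ckg(["disk0", "disk1", "disk2", "disk3", "disk4"], index=0)
print(ckg.extents())   # 64 extents of usable 4MB data from this 5-disk RAID 5 CKG
```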
RAID used to be built on whole hard disks; now it is built on chunks. The advantages:
1. Reconstruction time drops dramatically. A damaged chunk is only 64 MB and can be rebuilt in a few seconds; when a whole hard disk fails, only the chunks that actually hold data are reconstructed, and the rebuild writes are spread across many target hard disks.
2. The smaller the shard at each level, the less data has to be migrated and reconstructed, so rebuild time can be reduced further; but smaller shards also mean more management overhead and fewer disks that can be managed.
In the figure, when hard disk 1 fails and corrupts data in CKG0 and CKG1, only chunks 14 and 16, which hold actual data, are damaged. Free chunks are picked at random from the pool for reconstruction (yellow squares represent chunks with no data), and the randomly selected chunks are guaranteed to sit on different hard disks. Here chunk 61 on disk 6 and chunk 81 on disk 8 are selected, and the data is reconstructed from the other member disks onto these two chunks.
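A toy model of the chunk-level rebuild just described, built around a hypothetical rebuild_targets helper: only the data-bearing chunks of the failed disk are rebuilt, and each replacement chunk is drawn from free space on a different surviving disk.

```python
import random

def rebuild_targets(failed_disk: str,
                    used_chunks: list[int],
                    free_chunks: dict[str, int],
                    seed: int = 0) -> dict[int, str]:
    """For every data-bearing chunk of the failed disk, pick a spare chunk on
    some other disk; no two replacement chunks land on the same disk here,
    mirroring the 'guaranteed to be on different hard disks' rule."""
    rng = random.Random(seed)
    candidates = [d for d, free in free_chunks.items() if d != failed_disk and free > 0]
    chosen = rng.sample(candidates, k=len(used_chunks))
    return dict(zip(used_chunks, chosen))

# free_chunks: spare chunks left on each surviving disk (illustrative numbers).
free = {"disk2": 3, "disk3": 1, "disk4": 2, "disk5": 4, "disk6": 2, "disk7": 0, "disk8": 1}
plan = rebuild_targets("disk1", used_chunks=[14, 16], free_chunks=free)
print(plan)   # e.g. {14: 'disk6', 16: 'disk8'} -- only two chunks need rebuilding
```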
What is the significance of reducing reconstruction time? A storage system pursues a balance of performance and reliability. Under a single-parity scheme (RAID 6, with its double parity, and RAID 1+0 are special cases), data has no redundancy protection while a rebuild is in progress: if another disk fails during reconstruction, the data is lost and unrecoverable. So for a storage system, one of the most important indicators of reliability is the RAID reconstruction time: the shorter, the better. RAID reconstruction time = amount of data to rebuild / rebuild write speed. Take a 2 TB, 7.2K-rpm disk as an example: at a rebuild write speed of roughly 30 MB/s, reconstruction takes around 18 hours, and even longer during business peaks. Since the write speed is essentially fixed, reducing the amount of data that must be rewritten is the key way to shorten reconstruction (see the calculation below). As for IBM XIV, it does not use RAID at all: every hard disk is cut into 1 MB pieces and a pseudo-random algorithm keeps two copies of each piece on different nodes (somewhat like RAID 10), so disk recovery time and performance are comparable to RAID 2.0, but capacity utilization is at most 50%.
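A worked version of the formula above, comparing a whole-disk rebuild with a chunk-level rebuild; the 2 TB size and 30 MB/s speed are the figures quoted above, while the 40% fill level and 8 parallel target disks are purely illustrative assumptions.

```python
def rebuild_hours(data_mb: float, write_mb_per_s: float) -> float:
    """Rebuild time = data volume / rebuild write speed."""
    return data_mb / write_mb_per_s / 3600

TWO_TB_MB = 2 * 1000 * 1000     # 2 TB expressed in MB (decimal, as disk vendors count)

# Traditional RAID: the whole 2 TB disk must be rewritten at ~30 MB/s.
print(f"full-disk rebuild : {rebuild_hours(TWO_TB_MB, 30):.1f} h")        # ~18.5 h

# RAID 2.0: only chunks that actually hold data are rebuilt, and the writes
# are spread across many target disks in parallel (both figures assumed).
used_mb = TWO_TB_MB * 0.4       # assume the disk is 40% full
parallel_targets = 8            # assume rebuild writes fan out to 8 disks
print(f"chunk-level rebuild: {rebuild_hours(used_mb, 30 * parallel_targets):.1f} h")  # ~0.9 h
```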