Analysis of read/write processes in the MD module-4

Last Update:2014-06-16 Source: Internet

Author: User


MD handles read errors and write errors differently. A write error is relatively simple to handle. A read error is more involved: MD recomputes the data that could not be read and writes it back to the disk. First, let's look at how write errors are handled.

1. An error occurs while writing data.

If a write fails, the BIO_UPTODATE bit of the bio is not set when the callback function raid5_end_write_request() runs. The callback calls md_error(), which marks the rdev as Faulty, clears its In_sync flag, and increments the degraded count (degraded++). The raid5 daemon is then woken up; if a spare disk is available, recovery is performed. That process will be discussed later. The STRIPE_HANDLE bit of the stripe is set so that the stripe continues to be processed.
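The bookkeeping described above can be sketched in user-space C. This is only a toy model, not kernel code: the structs `toy_rdev`, `toy_array` and `toy_stripe` and both functions are hypothetical stand-ins for the real rdev, mddev, stripe and callbacks, and the guard against counting an already failed disk twice is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's rdev, mddev and stripe. */
struct toy_rdev {
    bool faulty;        /* models the Faulty flag */
    bool in_sync;       /* models the In_sync flag */
};

struct toy_array {
    int degraded;       /* number of failed member disks */
    bool daemon_woken;  /* models waking the raid5 daemon */
};

struct toy_stripe {
    bool handle;        /* models the STRIPE_HANDLE bit */
};

/* Models the effect the text attributes to md_error() for one member disk. */
static void toy_md_error(struct toy_array *arr, struct toy_rdev *rdev)
{
    assert(arr && rdev);
    if (rdev->faulty)
        return;             /* assumption: a disk is only counted once */
    rdev->faulty = true;
    rdev->in_sync = false;
    arr->degraded++;
    arr->daemon_woken = true;
}

/* Models the write-completion callback; uptodate == false means the write failed. */
static void toy_end_write(struct toy_array *arr, struct toy_rdev *rdev,
                          struct toy_stripe *sh, bool uptodate)
{
    if (!uptodate)
        toy_md_error(arr, rdev);
    sh->handle = true;      /* the stripe is queued for further processing */
}
```

Calling toy_end_write() with uptodate set to false on a healthy disk leaves the disk marked faulty and out of sync, raises degraded to 1, and flags the stripe for handling.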

In the handle_stripe5 function, the number of failed disks is counted in the variable failed. RAID 5 tolerates the failure of one disk. If failed > 1, the array can no longer work. handle_stripe5 handles this case: when the condition if (failed > 1 && to_read + to_write + written) is satisfied, all requests in the stripe are returned as failures.
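As a rough illustration of this check, the following sketch counts failed members and evaluates the quoted condition. The function names and the plain bool array are hypothetical; the real code inspects per-device flags inside the stripe.

```c
#include <assert.h>
#include <stdbool.h>

/* Count failed members; faulty[i] is true when disk i has failed. */
static int toy_count_failed(const bool *faulty, int disks)
{
    int failed = 0;

    assert(faulty && disks > 0);
    for (int i = 0; i < disks; i++)
        if (faulty[i])
            failed++;
    return failed;
}

/* Mirrors the quoted test: failed > 1 && to_read + to_write + written. */
static bool toy_must_fail_requests(int failed, int to_read, int to_write,
                                   int written)
{
    return failed > 1 && (to_read + to_write + written) != 0;
}
```

With one failed disk the requests are still served in degraded mode; with two failed disks and any outstanding request, the sketch reports that the requests must be failed.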

If only one disk has failed and there are partial (non-full-block) write requests for the failed rdev, the data on the other disks must be read. Why? When writing, the parity has to be computed from the data of the whole stripe. To keep the parity correct, we need the original content of the block on the failed disk, and only then can the partial update be applied to its buffer. That way the buffer for the failed disk holds correct data. Since the failed disk cannot be read, its block has to be computed from the data on the other disks, so those disks must be read first.
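The reconstruction relies on the RAID 5 property that the XOR of all blocks in a stripe (data plus parity) is zero, so any one block equals the XOR of the others. The sketch below shows that idea and the subsequent partial update. It is an illustration only: TOY_BLOCK, toy_compute_block and toy_partial_write are hypothetical names, and the real compute_block() works on stripe cache pages rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK 8   /* hypothetical block size, tiny for illustration */

/* Rebuild blocks[missing] as the XOR of every other block in the stripe
 * (the remaining data blocks plus the parity block). */
static void toy_compute_block(unsigned char blocks[][TOY_BLOCK], int disks,
                              int missing)
{
    assert(disks > 1 && missing >= 0 && missing < disks);
    memset(blocks[missing], 0, TOY_BLOCK);
    for (int i = 0; i < disks; i++) {
        if (i == missing)
            continue;
        for (size_t b = 0; b < TOY_BLOCK; b++)
            blocks[missing][b] ^= blocks[i][b];
    }
}

/* Apply a partial write on top of the rebuilt block. */
static void toy_partial_write(unsigned char *block, size_t off,
                              const unsigned char *src, size_t len)
{
    assert(off <= TOY_BLOCK && len <= TOY_BLOCK - off);
    memcpy(block + off, src, len);
}
```

After toy_compute_block() the buffer of the failed disk holds its original content, so a partial write changes only the bytes it covers and the new parity can be computed over complete blocks.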

If the write to the failed disk is a full-block write, there is no need to pre-read data from the other, healthy disks for that purpose. Processing moves on to deciding between rmw (read-modify-write) and rcw (reconstruct-write). With a failed disk, rmw is clearly not practical: rmw reads the old data from the disks that have write requests, and the old data on the failed disk could only be obtained by reading the other disks and computing it. The rmw count is therefore raised by 2 * disks. The rcw count is raised by 2 * disks in the same way when rcw would need a block from the failed disk. The aim is for rcw to be chosen for the write.
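A minimal sketch of such a cost comparison follows. It assumes a simplified model in which each cost is the number of blocks that would have to be read, an unreadable block adds a penalty of 2 * disks, and a tie goes to rcw. The struct toy_dev and both functions are hypothetical and only approximate the flag tests in handle_stripe5.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-block state inside one stripe. */
struct toy_dev {
    bool towrite;    /* a write request targets this block */
    bool overwrite;  /* that write covers the whole block */
    bool uptodate;   /* block content is already cached in the stripe */
    bool insync;     /* the underlying disk can be read */
};

/* Count the reads each strategy would need; unreadable blocks are penalized. */
static void toy_count_costs(const struct toy_dev *dev, int disks, int pd_idx,
                            int *rmw, int *rcw)
{
    assert(dev && rmw && rcw && pd_idx >= 0 && pd_idx < disks);
    *rmw = 0;
    *rcw = 0;
    for (int i = 0; i < disks; i++) {
        /* rmw needs the old content of every written block and of the parity */
        if ((dev[i].towrite || i == pd_idx) && !dev[i].uptodate)
            *rmw += dev[i].insync ? 1 : 2 * disks;
        /* rcw needs every data block that is not fully overwritten */
        if (!dev[i].overwrite && i != pd_idx && !dev[i].uptodate)
            *rcw += dev[i].insync ? 1 : 2 * disks;
    }
}

/* Assumption of this sketch: on a tie, reconstruct-write is preferred. */
static bool toy_prefer_rcw(int rmw, int rcw)
{
    return rcw <= rmw;
}
```

For a four-disk stripe with the parity on disk 3 and a full-block write to failed disk 1, the model gives rmw = 9 (penalty of 8 plus the parity read) and rcw = 2 (the two other data blocks), so rcw is selected.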

After that, for a partial write the data of the failed disk is computed with compute_block(), and for a full-block write the remaining blocks are read according to the rcw method. At this point all the data is ready, that is, the following condition holds:

    if (locked == 0 && (rcw == 0 || rmw == 0) &&
        !test_bit(STRIPE_BIT_DELAY, &sh->state))

compute_parity5 then calculates the parity, and the data is written to disk. A question may arise: how is the data for the failed disk written? At the end of handle_stripe, the following check is made:

    rcu_read_lock();
    rdev = rcu_dereference(conf->disks[i].rdev);
    if (rdev && test_bit(Faulty, &rdev->flags))
        rdev = NULL;
    if (rdev)
        atomic_inc(&rdev->nr_pending);
    rcu_read_unlock();

This RCU lock is very important and is worth studying for anyone interested. The code sets the rdev pointer according to the state of the rdev: if the pointer is NULL, no command is issued to the physical disk. This completes the handling of a write error. Next, let's look at what happens when a read error occurs.
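The effect of that check can be modeled without RCU in a single-threaded sketch. The struct toy_member and both functions are hypothetical, and the counter is a plain int instead of an atomic. The point is only that a faulty or absent member yields NULL and therefore receives no I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a member disk (rdev). */
struct toy_member {
    bool faulty;      /* models the Faulty flag */
    int nr_pending;   /* models rdev->nr_pending */
};

/* Returns the disk to send I/O to, or NULL if none should be sent.
 * RCU is deliberately left out: this sketch is single-threaded. */
static struct toy_member *toy_pick_target(struct toy_member *rdev)
{
    if (rdev && rdev->faulty)
        rdev = NULL;
    if (rdev)
        rdev->nr_pending++;
    return rdev;
}

/* Count how many member disks would actually receive a command. */
static int toy_submit_stripe(struct toy_member **slots, int disks)
{
    int submitted = 0;

    assert(slots && disks > 0);
    for (int i = 0; i < disks; i++)
        if (toy_pick_target(slots[i]) != NULL)
            submitted++;
    return submitted;
}
```

For a stripe with two healthy disks, one faulty disk and one empty slot, only the two healthy disks are counted and only their nr_pending values are incremented.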

2. An error occurs while reading data.

Like a write error, a read error first shows up in the completion callback, here raid5_end_read_request(). Unlike a write error, however, the command may be retried. Some checks are made before the retry. For example, if the array is already degraded, the command is not retried and the array is treated as failed. Likewise, if too many read errors have occurred on the device, no retry is made. If the command is not retried, md_error() is called. Otherwise, the R5_ReadError flag is set on the corresponding device of the stripe and the stripe is processed again.
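The retry decision can be summarized as a small function. This sketch covers only the two checks mentioned above. The names are hypothetical, and max_read_errors is an assumed threshold parameter rather than the value the kernel actually compares against.

```c
#include <assert.h>

enum toy_read_action {
    TOY_READ_RETRY,      /* set the read-error flag and handle the stripe again */
    TOY_READ_FAIL_DISK   /* give up and call the error handler for the disk */
};

/* Decide what to do after a failed read on one member disk. */
static enum toy_read_action toy_on_read_error(int array_degraded,
                                              int read_errors,
                                              int max_read_errors)
{
    assert(array_degraded >= 0 && read_errors >= 0 && max_read_errors >= 0);
    if (array_degraded > 0)
        return TOY_READ_FAIL_DISK;   /* no redundancy left to repair from */
    if (read_errors > max_read_errors)
        return TOY_READ_FAIL_DISK;   /* the device is failing too often */
    return TOY_READ_RETRY;
}
```

A first read error on a healthy array yields TOY_READ_RETRY, while the same error on a degraded array, or on a device that has exceeded the threshold, yields TOY_READ_FAIL_DISK.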

In the handle_stripe5 function, a read request for a block on the failed disk still requires reading the data on the other disks so that the missing data can be computed. Once the data on the other disks has been read, compute_block() is called to compute the data of the failed disk. handle_stripe5 then checks the following condition:

    if (failed == 1 && !conf->mddev->ro &&
        test_bit(R5_ReadError, &sh->dev[failed_num].flags)
        && !test_bit(R5_LOCKED, &sh->dev[failed_num].flags)
        && test_bit(R5_UPTODATE, &sh->dev[failed_num].flags)
        )

If this condition is met, the R5_ReWrite flag of the failed device is set and the data computed for the failed disk is written back to that disk.

If the rewrite succeeds, the data is back on the disk and the read error has been repaired. Otherwise, the failure is handled like any other failed write request, as described in section 1.
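The rewrite condition quoted in this section can be expressed as a predicate over plain booleans. This is a sketch under the assumption that each flag maps to one field. The struct toy_sdev and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device stripe flags for the block that failed to read. */
struct toy_sdev {
    bool read_error;  /* models R5_ReadError */
    bool locked;      /* models R5_LOCKED */
    bool uptodate;    /* models R5_UPTODATE: data has been recomputed */
};

/* True when the recomputed data should be written back to the disk. */
static bool toy_should_rewrite(int failed, bool read_only,
                               const struct toy_sdev *dev)
{
    assert(dev);
    return failed == 1 && !read_only &&
           dev->read_error && !dev->locked && dev->uptodate;
}
```

The predicate only becomes true once the block has been recomputed (uptodate) and no other operation holds it (not locked). A read-only array never attempts the rewrite.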

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
