Linux Kernel Adventures: Reading the MD Source Code, Part 14: RAID5 Reads Not Contained in a Single Chunk
When reprinting, please credit the source: http://blog.csdn.net/liumangxiong
If a read is not contained in a single chunk, it touches at least two chunks: the data has to be read from each chunk separately, and the combined result is then returned to the upper layer. Below we will see how a complete bio read request is split into multiple sub-requests sent to the disks, and how the results that come back are reassembled and returned to the upper layer.
4097  logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
4098  last_sector = bi->bi_sector + (bi->bi_size>>9);
4099  bi->bi_next = NULL;
4100  bi->bi_phys_segments = 1;  /* over-loaded to count active stripes */
The start of the request is computed first. Because the smallest unit in which MD issues data requests to the disks is STRIPE_SECTORS, the request is aligned to that size here. The request starts at logical_sector and ends at last_sector. Line 4100 reuses bi_phys_segments to count the stripes issued for this bio; it is set to 1 up front so that the bio cannot be released prematurely.
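The alignment arithmetic is easy to try out in user space. What follows is only a minimal sketch, not the kernel code: it assumes STRIPE_SECTORS is 8 (one 4 KiB page of 512-byte sectors), and the plain `sector_t` typedef and the helper names `align_to_stripe` and `count_stripes` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;      /* stand-in for the kernel type */

/* Assumption: a 4 KiB stripe unit expressed in 512-byte sectors. */
#define STRIPE_SECTORS 8u

/* Round a start sector down to the stripe boundary, as line 4097 does. */
sector_t align_to_stripe(sector_t bi_sector)
{
    /* The mask trick only works for a power-of-two unit. */
    assert((STRIPE_SECTORS & (STRIPE_SECTORS - 1)) == 0);
    return bi_sector & ~((sector_t)STRIPE_SECTORS - 1);
}

/* Count how many stripe units a request of bi_size bytes touches. */
unsigned count_stripes(sector_t bi_sector, uint32_t bi_size)
{
    sector_t logical_sector = align_to_stripe(bi_sector);
    sector_t last_sector = bi_sector + (bi_size >> 9);
    unsigned n = 0;

    for (; logical_sector < last_sector; logical_sector += STRIPE_SECTORS)
        n++;
    return n;
}
```

For a request that starts at sector 5 and is 8192 bytes long, `count_stripes` returns 3: the walk visits the stripe units beginning at sectors 0, 8 and 16.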
4102  for (; logical_sector < last_sector; logical_sector += STRIPE_SECTORS) {
4103      DEFINE_WAIT(w);
4104      int previous;
4105
4106  retry:
4107      previous = 0;
4108      prepare_to_wait(&conf->wait_for_overlap, &w, TASK_UNINTERRUPTIBLE);
....
4134
4135      new_sector = raid5_compute_sector(conf, logical_sector,
4136                                        previous,
4137                                        &dd_idx, NULL);
4138      pr_debug("raid456: make_request, sector %llu logical %llu\n",
4139               (unsigned long long)new_sector,
4140               (unsigned long long)logical_sector);
4141
4142      sh = get_active_stripe(conf, new_sector, previous,
4143                             (bi->bi_rw&RWA_MASK), 0);
This loop splits the request into stripes and issues each one in turn. Stripe handling also has to be mutually exclusive: no two threads may operate on the same stripe at the same time. If, say, the resync thread were syncing a stripe while raid5d was writing it, the result would be unpredictable.
Line 4103: declare the wait-queue entry used for mutually exclusive stripe access.
Line 4108: join the wait queue.
Line 4135: from the array's logical sector, compute the physical offset sector on the disk, along with the corresponding data disk number and parity disk number.
Line 4142: obtain a stripe for that physical offset sector.
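To make the sector mapping on line 4135 more concrete, here is a simplified user-space sketch. It is not raid5_compute_sector itself: it assumes RAID5 with the left-symmetric layout and no reshape in progress, and the `struct geom` type and `compute_sector` name are invented for this example. The arithmetic follows that layout as I understand it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical, stripped-down array geometry for this sketch. */
struct geom {
    unsigned raid_disks;        /* total member disks, parity included */
    unsigned chunk_sectors;     /* chunk size in 512-byte sectors */
};

/*
 * Map an array-logical sector to a per-disk sector, assuming RAID5
 * with the left-symmetric layout. Outputs the data disk (dd_idx)
 * and the parity disk (pd_idx) of the row the sector falls in.
 */
sector_t compute_sector(const struct geom *g, sector_t r_sector,
                        unsigned *dd_idx, unsigned *pd_idx)
{
    unsigned data_disks = g->raid_disks - 1;
    sector_t chunk_offset, chunk_number, stripe;
    unsigned dd;

    assert(g->raid_disks >= 2 && g->chunk_sectors > 0);

    chunk_offset = r_sector % g->chunk_sectors;  /* offset inside the chunk */
    chunk_number = r_sector / g->chunk_sectors;  /* which data chunk */
    stripe = chunk_number / data_disks;          /* which row of chunks */
    dd = (unsigned)(chunk_number % data_disks);  /* data chunk within the row */

    /* Parity moves back one disk per row; the data chunks follow it. */
    *pd_idx = data_disks - (unsigned)(stripe % g->raid_disks);
    *dd_idx = (*pd_idx + 1 + dd) % g->raid_disks;

    return stripe * g->chunk_sectors + chunk_offset;
}
```

With three disks and 8-sector chunks, logical sector 19 lies in the second row of chunks: parity sits on disk 1, the data on disk 2, and the per-disk sector is 11.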
4144      if (sh) {
....
4186          if (test_bit(STRIPE_EXPANDING, &sh->state) ||
4187              !add_stripe_bio(sh, bi, dd_idx, rw)) {
4188              /* Stripe is busy expanding or
4189               * add failed due to overlap.  Flush everything
4190               * and wait a while
4191               */
4192              md_wakeup_thread(mddev->thread);
4193              release_stripe(sh);
4194              schedule();
4195              goto retry;
4196          }
4197          finish_wait(&conf->wait_for_overlap, &w);
4198          set_bit(STRIPE_HANDLE, &sh->state);
4199          clear_bit(STRIPE_DELAYED, &sh->state);
4200          if ((bi->bi_rw & REQ_SYNC) &&
4201              !test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
4202              atomic_inc(&conf->preread_active_stripes);
4203          release_stripe_plug(mddev, sh);
4204      } else {
4205          /* cannot get stripe for read-ahead, just give-up */
4206          clear_bit(BIO_UPTODATE, &bi->bi_flags);
4207          finish_wait(&conf->wait_for_overlap, &w);
4208          break;
4209      }
4210  }
The first time I read this code I was in too much of a hurry and could not find its focal point, like someone who grew up in a noisy city, is dazzled by its surface, and has no idea what they really want from life. Only when I calmed down and looked again did I find the most important statement, on line 4187: the add_stripe_bio function. From that moment the stripe is no longer alone; now that it holds a bio it is ready to join the stripe-processing flow, and the stripe's eventful journey begins. After line 4198 and release_stripe_plug on line 4203, the new stripe is formally added to the processing list (conf->handle_list). People spend the first half of their lives looking for the entrance and the second half looking for the exit. Here the read stripe has found its entrance, so where is the exit? Anyone who has read LDD knows the answer: for a block device driver that does not use the default request queue, the make_request function is the entrance and bio_endio is the exit. So next we head for that exit. After release_stripe_plug the first stop is handle_stripe, which calls analyse_stripe, and that function sets to_read:
3245      if (test_bit(R5_Wantfill, &dev->flags))
3246          s->to_fill++;
3247      else if (dev->toread)
3248          s->to_read++;
Back in the handle_stripe function:
3472  if (s.to_read || s.non_overwrite
3473      || (conf->level == 6 && s.to_write && s.failed)
3474      || (s.syncing && (s.uptodate + s.compute < disks))
3475      || s.replacing
3476      || s.expanding)
3477      handle_stripe_fill(sh, &s, disks);
to_read triggers handle_stripe_fill, whose job is to set the flags that mark what needs to be read:
2696      set_bit(R5_LOCKED, &dev->flags);
2697      set_bit(R5_Wantread, &dev->flags);
2698      s->locked++;
Next comes ops_run_io, which sends the read request to the disk. The completion callback for the read request is raid5_end_read_request:
1745  if (uptodate) {
1746      set_bit(R5_UPTODATE, &sh->dev[i].flags);
....
1824  rdev_dec_pending(rdev, conf->mddev);
1825  clear_bit(R5_LOCKED, &sh->dev[i].flags);
1826  set_bit(STRIPE_HANDLE, &sh->state);
1827  release_stripe(sh);
This function does two things: it sets the R5_UPTODATE flag, and it calls release_stripe again to hand the stripe back to handle_stripe for processing. We enter analyse_stripe carrying the R5_UPTODATE flag:
3231      if (test_bit(R5_UPTODATE, &dev->flags) && dev->toread &&
3232          !test_bit(STRIPE_BIOFILL_RUN, &sh->state))
3233          set_bit(R5_Wantfill, &dev->flags);
3234
3235      /* now count some things */
3236      if (test_bit(R5_LOCKED, &dev->flags))
3237          s->locked++;
3238      if (test_bit(R5_UPTODATE, &dev->flags))
3239          s->uptodate++;
3240      if (test_bit(R5_Wantcompute, &dev->flags)) {
3241          s->compute++;
3242          BUG_ON(s->compute > 2);
3243      }
3244
3245      if (test_bit(R5_Wantfill, &dev->flags))
3246          s->to_fill++;
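The way these flags steer a device from "needs reading" to "needs filling" can be modelled in a few lines of user-space C. This is only a sketch under my own assumptions: the flag values, the two structs and the `analyse_dev` name are invented for illustration, and only the branches quoted in this article are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the real definitions live in raid5.h. */
enum {
    UPTODATE = 1 << 0,    /* models R5_UPTODATE */
    LOCKED   = 1 << 1,    /* models R5_LOCKED */
    WANTFILL = 1 << 2,    /* models R5_Wantfill */
};

struct dev_model {
    unsigned flags;
    bool toread;          /* true while a read bio is still queued */
};

struct state_model {
    int to_read, to_fill, locked, uptodate;
};

/* Add one device's contribution to the per-stripe counters. */
void analyse_dev(struct dev_model *dev, bool biofill_run,
                 struct state_model *s)
{
    assert(dev && s);

    /* Data has arrived and a reader is waiting: ask for a biofill. */
    if ((dev->flags & UPTODATE) && dev->toread && !biofill_run)
        dev->flags |= WANTFILL;

    if (dev->flags & LOCKED)
        s->locked++;
    if (dev->flags & UPTODATE)
        s->uptodate++;

    /* A device counts towards to_fill or to_read, never both. */
    if (dev->flags & WANTFILL)
        s->to_fill++;
    else if (dev->toread)
        s->to_read++;
}
```

On the first pass the device has a queued bio and no data, so it is counted in to_read. Once the disk read has marked it up to date, the same call counts it in to_fill instead.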
Line 3233 sets the R5_Wantfill flag and line 3246 increments to_fill; then we return to handle_stripe once more:
3426  if (s.to_fill && !test_bit(STRIPE_BIOFILL_RUN, &sh->state)) {
3427      set_bit(STRIPE_OP_BIOFILL, &s.ops_request);
3428      set_bit(STRIPE_BIOFILL_RUN, &sh->state);
3429  }
The stripe sets STRIPE_OP_BIOFILL. Whenever you see s.ops_request being set, you should know at once that the function that handles this field is raid_run_ops, with the real work done in __raid_run_ops:
1378  if (test_bit(STRIPE_OP_BIOFILL, &ops_request)) {
1379      ops_run_biofill(sh);
1380      overlap_clear++;
1381  }
The corresponding handler is ops_run_biofill:
812  static void ops_run_biofill(struct stripe_head *sh)
813  {
814      struct dma_async_tx_descriptor *tx = NULL;
815      struct async_submit_ctl submit;
816      int i;
817
818      pr_debug("%s: stripe %llu\n", __func__,
819               (unsigned long long)sh->sector);
820
821      for (i = sh->disks; i--; ) {
822          struct r5dev *dev = &sh->dev[i];
823          if (test_bit(R5_Wantfill, &dev->flags)) {
824              struct bio *rbi;
825              spin_lock_irq(&sh->stripe_lock);
826              dev->read = rbi = dev->toread;
827              dev->toread = NULL;
828              spin_unlock_irq(&sh->stripe_lock);
829              while (rbi && rbi->bi_sector <
830                     dev->sector + STRIPE_SECTORS) {
831                  tx = async_copy_data(0, rbi, dev->page,
832                                       dev->sector, tx);
833                  rbi = r5_next_bio(rbi, dev->sector);
834              }
835          }
836      }
837
838      atomic_inc(&sh->count);
839      init_async_submit(&submit, ASYNC_TX_ACK, tx, ops_complete_biofill, sh, NULL);
840      async_trigger_callback(&submit);
841  }
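Each copy requested in the loop on lines 829 to 834 only covers the part of the bio that overlaps the stripe page. The sketch below models that overlap calculation in user space. It is an assumption-laden toy, not async_copy_data: the bio is reduced to one flat buffer, STRIPE_SECTORS is taken to be 8, and `struct toy_bio` and `copy_page_to_bio` are names I made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t sector_t;
#define STRIPE_SECTORS 8u               /* assumes 4 KiB pages */
#define SECTOR_SIZE    512u

/* Toy bio: one flat buffer instead of an array of segments. */
struct toy_bio {
    sector_t sector;        /* first array sector of the request */
    uint32_t size;          /* length in bytes, a multiple of 512 */
    unsigned char *buf;     /* destination buffer of `size` bytes */
};

/*
 * Copy from a stripe page into the part of the bio that overlaps it.
 * `page` holds STRIPE_SECTORS sectors starting at `page_sector`.
 * Returns the number of bytes copied.
 */
size_t copy_page_to_bio(struct toy_bio *bio, const unsigned char *page,
                        sector_t page_sector)
{
    sector_t bio_end, page_end, start, end;
    size_t len;

    assert(bio->size % SECTOR_SIZE == 0);

    bio_end = bio->sector + bio->size / SECTOR_SIZE;
    page_end = page_sector + STRIPE_SECTORS;
    start = bio->sector > page_sector ? bio->sector : page_sector;
    end = bio_end < page_end ? bio_end : page_end;

    if (start >= end)
        return 0;                       /* the two ranges do not overlap */

    len = (size_t)(end - start) * SECTOR_SIZE;
    memcpy(bio->buf + (start - bio->sector) * SECTOR_SIZE,
           page + (start - page_sector) * SECTOR_SIZE, len);
    return len;
}
```

For a 16-sector bio starting at sector 5, the page at sector 0 contributes only its last three sectors, while the page at sector 8 is copied in full.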
At last the truth is revealed. I can't help marvelling at how the code is wrapped in layer upon layer, like a mysterious birthday present that you unwrap one layer at a time, or like threading through one old alley after another to find a little tavern. But whatever else, the code holds nothing back from you; it is sincere. And the more complex the code, the more charm and grace it has, provided you know how to find your way into its heart; when you finally understand it you will be astonished, and the thrill of that conquest stays with you for a long time. After conquering code in one style after another, your pursuit is no longer limited to the material but rises to the spiritual, like the European architect who designed a cathedral that then took more than 600 years to build: the Gothic Cologne Cathedral. That is what I call art. By then you and I will be long gone, but that spirit is the realm you and I will always aspire to.
Line 823: the read from disk has just completed, so the data in dev->page is now copied into the buffer of the waiting bio (async_copy_data is called with its first argument, frombio, set to 0), and at the same time dev->toread is moved to dev->read. The copies are first described by a DMA descriptor; lines 839 and 840 submit the request to the DMA engine, and once it completes, the callback passed in on line 839, ops_complete_biofill, is invoked:
769  static void ops_complete_biofill(void *stripe_head_ref)
770  {
771      struct stripe_head *sh = stripe_head_ref;
772      struct bio *return_bi = NULL;
773      int i;
774
775      pr_debug("%s: stripe %llu\n", __func__,
776               (unsigned long long)sh->sector);
777
778      /* clear completed biofills */
779      for (i = sh->disks; i--; ) {
780          struct r5dev *dev = &sh->dev[i];
781
782          /* acknowledge completion of a biofill operation */
783          /* and check if we need to reply to a read request,
784           * new R5_Wantfill requests are held off until
785           * !STRIPE_BIOFILL_RUN
786           */
787          if (test_and_clear_bit(R5_Wantfill, &dev->flags)) {
788              struct bio *rbi, *rbi2;
789
790              BUG_ON(!dev->read);
791              rbi = dev->read;
792              dev->read = NULL;
793              while (rbi && rbi->bi_sector <
794                     dev->sector + STRIPE_SECTORS) {
795                  rbi2 = r5_next_bio(rbi, dev->sector);
796                  if (!raid5_dec_bi_active_stripes(rbi)) {
797                      rbi->bi_next = return_bi;
798                      return_bi = rbi;
799                  }
800                  rbi = rbi2;
801              }
802          }
803      }
804      clear_bit(STRIPE_BIOFILL_RUN, &sh->state);
805
806      return_io(return_bi);
807
808      set_bit(STRIPE_HANDLE, &sh->state);
809      release_stripe(sh);
810  }
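The bookkeeping on lines 796 to 798 can be tried out with a toy model. The sketch below is single-threaded and uses a plain int where the kernel uses an atomic counter packed into bi_phys_segments; `struct toy_bio` and all three function names are hypothetical. It also assumes, as I understand the submit path, that the initial count of 1 is dropped by the submitter once its loop has finished.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bio carrying only what the bookkeeping needs. */
struct toy_bio {
    struct toy_bio *next;       /* models bi_next */
    int active_stripes;         /* models the overloaded bi_phys_segments */
    int ended;                  /* set once the completion step has run */
};

/* Drop one reference; returns how many are still outstanding. */
int dec_active_stripes(struct toy_bio *bio)
{
    assert(bio->active_stripes > 0);
    return --bio->active_stripes;
}

/* One stripe is done with `bio`; queue the bio if that was the last one. */
void stripe_done(struct toy_bio *bio, struct toy_bio **return_list)
{
    if (dec_active_stripes(bio) == 0) {
        bio->next = *return_list;
        *return_list = bio;
    }
}

/* Walk the return list and complete every bio on it; returns the count. */
int return_all(struct toy_bio *list)
{
    int n = 0;

    while (list) {
        struct toy_bio *next = list->next;

        list->next = NULL;
        list->ended = 1;        /* stands in for bio_endio(bi, 0) */
        list = next;
        n++;
    }
    return n;
}
```

A bio attached to three stripes only lands on the return list when the third stripe reports completion; the first two calls to `stripe_done` leave the list empty.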
If you have trained sharp eyes that take in ten lines at a glance, you will have spotted return_io on line 806. Yes, this is the exit I mentioned earlier:
177  static void return_io(struct bio *return_bi)
178  {
179      struct bio *bi = return_bi;
180      while (bi) {
181
182          return_bi = bi->bi_next;
183          bi->bi_next = NULL;
184          bi->bi_size = 0;
185          bio_endio(bi, 0);
186          bi = return_bi;
187      }
188  }
At last we reach bio_endio. Time for a drink to celebrate. Celebrated enough? Then here are two questions to think about:
1) Why is return_bi not a single bio, but a list chained through bi_next?
2) Since return_io has already finished, why do lines 808/809 put the stripe back on the processing list?