QoS mechanism in Linux-DM-ioband (3)


This article explains the ioband mechanism.

The ioband principle is simple: an ioband device defines multiple groups, each with its own weight and threshold, and the ioband driver controls the QoS of the I/O requests issued by these groups. Control is token based: tokens are allocated to groups in proportion to their weights, and a policy may count tokens per request or per sector.

DM-ioband involves several important data structures:

struct ioband_device: represents an ioband block device under /dev/mapper/. It holds several ioband_groups, including at least one default group.

struct ioband_group: represents a group attached to an ioband device. Each group has its own weight and policy. An ioband_group carries two bio lists, c_prio_bios and c_blocked_bios; the former holds struct bio entries with higher priority.

ioband_device->g_issued[BLK_RW_ASYNC] and ioband_device->g_issued[BLK_RW_SYNC] indicate the number of blocked and issued bios of the whole device, per direction.

ioband_group->c_issued[BLK_RW_SYNC] indicates the number of blocked bios in the group.

static void suspend_ioband_device(struct ioband_device *, unsigned long, int): set_device_suspended sets the dev_suspended flag, and set_group_down and set_group_need_up set the iog_going_down and iog_need_up flags. wake_up_all then wakes every process waiting on ioband_device->g_waitq and ioband_group->c_waitq. For bios that have already been mapped, queue_delayed_work plus flush_workqueue pushes them through the work queue. Finally, wait_event_lock_irq is called to wait until every bio request on the ioband_device has been flushed. Incidentally, the wait_event_lock_irq implementation here is very similar to a pthread condition variable.

static void resume_ioband_device(struct ioband_device *dp): clears the dev_suspended, iog_going_down, and iog_need_up flags and wakes up everything waiting on ioband_device->g_waitq_suspend. g_waitq_suspend appears only in ioband_map, because once the ioband_device is suspended, all incoming bios hang there.

static void ioband_group_stop_all(struct ioband_group *head, int suspend): sets the iog_suspended and iog_going_down flags for all groups and flushes the queued bios via the g_ioband_wq work queue.

static void ioband_group_resume_all(struct ioband_group *head): clears the flags set above.

In the device-mapper architecture, ioband is a struct target_type just like linear, striped, and snapshot. It is defined as follows:

static struct target_type ioband_target = {
	.name            = "ioband",
	.module          = THIS_MODULE,
	.version         = {1, 14, 0},
	.ctr             = ioband_ctr,
	.dtr             = ioband_dtr,
	.map             = ioband_map,
	.end_io          = ioband_end_io,
	.presuspend      = ioband_presuspend,
	.resume          = ioband_resume,
	.status          = ioband_status,
	.message         = ioband_message,
	.merge           = ioband_merge,
	.iterate_devices = ioband_iterate_devices,
};

static int ioband_ctr(struct dm_target *ti, unsigned argc, char **argv)

ioband_ctr first calls alloc_ioband_device to create an ioband_device. alloc_ioband_device calls create_workqueue("kioband") to create the workqueue_struct member g_ioband_wq, then initializes a series of ioband_device member variables and finally returns the newly created and initialized ioband_device structure pointer.

static void ioband_dtr(struct dm_target *ti)

Calls ioband_group_stop_all to stop all group requests on the ioband device (setting the iog_going_down and iog_suspended flags), cancels the delayed work_struct, and destroys every group on the ioband device. Note that the groups on an ioband device are stored in a red-black tree, rather than the btree used by the device mapper.

static int ioband_map(struct dm_target *ti, struct bio *bio, union map_info *map_context)

Note that ioband_map treats synchronous and asynchronous requests separately; fields such as g_issued[2], g_blocked[2], g_waitq[2], and c_waitq[2] exist precisely to control sync and async requests independently.

The ioband_group is obtained from dm_target->private, and the ioband_device from ioband_group->c_banddev. The subsequent steps are as follows:

  1. If the ioband_device is suspended, call wait_event_lock_irq to wait until it resumes.
  2. Call ioband_group_get to find the ioband_group corresponding to the bio.
  3. The prevent_burst_bios function is very interesting. As I understand it: if the current task is a kernel thread (is_urgent_bio looks like a placeholder implementation; the author suggests the bio structure should eventually carry a field marking urgent bios), device_should_block is called to decide whether the whole device blocks. The judgment is based on the io_limit parameter: once synchronous requests exceed io_limit, all synchronous requests on the device block, and asynchronous requests are handled analogously. If the caller is not a kernel thread, group_should_block decides whether the current group blocks, and each policy judges this differently: the weight-based policies end up calling is_queue_full, while the bandwidth-based policy calls range_bw_queue_full. Both functions are examined further below.
  4. If should_pushback_bio returns true, the bio is requeued and DM_MAPIO_REQUEUE is returned.
  5. Next, check ioband_group->c_blocked[2] to see whether the request must block, and call room_for_bio_sync to check whether io_limit is full. If neither blocks, the bio can be submitted; otherwise hold_bio suspends it. The core of hold_bio is ioband_device->g_hold_bio, whose function pointer points to ioband_hold_bio, which simply puts the bio on the ioband_group->c_blocked_bios queue. (The author thinks c_blocked_bios should really be two queues, one for synchronous and one for asynchronous requests.)
  6. If the bio can be submitted, ioband_device->g_can_submit is called. g_can_submit is policy specific: the weight-based policy calls is_token_left, while the bandwidth-based policy calls has_right_to_issue. Both functions are examined further below.
  7. If g_can_submit returns false, the bio cannot be submitted and still goes to hold_bio. Here queue_delayed_work delays one jiffy and then kicks the ioband_device->g_ioband_wq work queue, which ends up calling ioband_device->g_conductor.work.func(work_struct *).
  8. If the bio can be submitted, prepare_to_issue(struct ioband_group *, struct bio *) is called. It first increments the ioband_device->g_issued counter and then calls ioband_device->g_prepare_bio, another policy-specific hook: prepare_token is called in the weight policy, range_bw_prepare_token in the bandwidth policy, and iosize_prepare_token in the weight-iosize policy.

static int ioband_end_io(struct dm_target *ti, struct bio *bio, int error, union map_info *map_context)

Calls should_pushback_bio to check whether the ioband_group has been suspended; if so, DM_ENDIO_REQUEUE is returned and the bio is requeued. Otherwise, if blocked bios exist, the ioband_device->g_ioband_wq work queue is kicked, and everything waiting on g_waitq_flush is woken up.

static void ioband_conduct(struct work_struct *work)

ioband_conduct is the function the kernel work queue calls for delayed processing. Its parameter points to the ioband_device->g_conductor.work structure, from which the struct ioband_device can be recovered. The procedure is as follows:

  1. Call release_urgent_bios to move all bios on ioband_device->g_urgent_bios to the issue_list.
  2. If the ioband_device has blocked bio requests, select an ioband_group according to policy; the chosen group must have blocked bios and must not have a full io_limit. For that group, release_bios calls release_prio_bios and release_norm_bios, whose purpose is to move the blocked bios onto the issue_list.
  3. release_prio_bios walks the bios on ioband_group->c_prio_bios (if the group cannot currently submit, for example because its tokens are used up, it returns R_BLOCK directly). For each bio it calls make_issue_list, placing it on the issue_list or pushback_list; if the group's c_blocked drops to 0, the group's block flags iog_bio_blocked_sync/iog_bio_blocked_async are cleared and anything waiting on ioband_group->c_waitq[2] is woken up. Finally, prepare_to_issue is called.
  4. release_norm_bios walks the bios on ioband_group->c_blocked_bios; their number is nr_blocked_group(ioband_group *) - ioband_group->c_prio_blocked. The remaining code is the same as in release_prio_bios.
  5. If release_bios returns R_YIELD, this group has used up all its tokens and must yield the chance to submit; queue_delayed_work is called again to wait for the next round of processing.
  6. To resubmit the blocked bios of the whole device, first clear the blocking flags dev_bio_blocked_sync/dev_bio_blocked_async on the ioband_device and wake up everything waiting on wait_queue_head_t ioband_device->g_waitq[2].
  7. If the ioband_device still has blocked bios and the issue_list is still empty after the code above, essentially all groups have exhausted their tokens; the work is requeued to wait for the next run.
  8. Finally, for every bio on the issue_list, call the generic method generic_make_request to hand it to the underlying block device; for every bio on the pushback_list, call ioband_end_io to complete it (in most cases with an EIO error).

----------------------------------------------------------------

The following describes the policies in dm-ioband. In ioband_ctr, policy_init is called to initialize the specified policy. Currently, ioband offers these policies: default, weight, weight-iosize, and range-bw.

Weight policy: a weight-based bio allocation policy, analyzed below method by method.

dp->g_group_ctr / dp->g_group_dtr = policy_weight_ctr / policy_weight_dtr: create/destroy a weight-based group.

dp->g_set_param = policy_weight_param: calls set_weight and init_token_bucket to set the weight value. From set_weight we can see that ioband_groups are organized in an rbtree. If ioband_group->c_parent == NULL, this is the default group or a new root group, so the ioband_device parameters g_root_groups, g_token_bucket, and g_io_limit are used for initialization; otherwise the ioband_group is the child of another ioband_group (a rare configuration), and the parent group's parameters are used instead.

dp->g_should_block = is_queue_full: decides whether the queue is full by checking against ioband_group->c_limit.

dp->g_restart_bios = make_global_epoch: called when the ioband_groups on this ioband_device have used up their tokens, to allocate a new round of tokens.

dp->g_can_submit = is_token_left: checks whether tokens remain (by computing iopriority). It first looks at ioband_group->c_token; then it checks whether the group's epoch lags behind the epoch of the whole ioband_device (each epoch increment represents one token refresh; the counter only grows). If so, all tokens granted since then (nr_epoch * ioband_group->c_token_initial) are added, iopriority is recomputed, and the result is returned.
PS: note that the priority of an I/O request depends on both the group's remaining tokens and its initial token grant, which keeps I/O requests in groups with few tokens from being starved.

dp->g_prepare_bio = prepare_token: in the weight policy, each I/O request consumes one token. prepare_token calls consume_token, which updates g_dominant and g_expired, subtracts 1 from ioband_group->c_token, and adds 1 to ioband_group->c_consumed. There is also a g_yield_mark used to yield I/O gracefully; it is not covered here, see the source code for details.
