Analysis of read/write processes in the MD module-1

Source: Internet
Author: User
Tags: quiesce

MD is a virtual device driver layer. It is a block device driver and has the characteristics of one, so it implements the block device operations interface:

static struct block_device_operations md_fops =
{
    .owner           = THIS_MODULE,
    .open            = md_open,
    .release         = md_release,
    .ioctl           = md_ioctl,
    .getgeo          = md_getgeo,
    .media_changed   = md_media_changed,
    .revalidate_disk = md_revalidate,
};

(Most operations on MD are performed through the md_ioctl interface; a few can also be performed through sysfs.)
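To make the ioctl path concrete, here is a minimal user-space sketch that queries an array with the GET_ARRAY_INFO request from <linux/raid/md_u.h>. The helper name print_md_info and the choice of fields to print are my own for illustration, and the sketch assumes a Linux system with the kernel's user-space headers installed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/major.h>        /* MD_MAJOR, used by the md ioctl numbers */
#include <linux/raid/md_u.h>    /* GET_ARRAY_INFO, mdu_array_info_t */

/* Hypothetical helper: print basic facts about an md array.
 * Returns 0 on success, -1 if the device cannot be opened or queried. */
int print_md_info(const char *path)
{
    mdu_array_info_t info;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* On an md device this request is handled by the md ioctl handler. */
    if (ioctl(fd, GET_ARRAY_INFO, &info) < 0) {
        close(fd);
        return -1;
    }

    printf("level=%d raid_disks=%d active=%d failed=%d\n",
           info.level, info.raid_disks, info.active_disks, info.failed_disks);
    close(fd);
    return 0;
}
```

Calling print_md_info("/dev/md0") would print one line for an existing array, provided the caller has permission to open the device; for a path that cannot be opened, it returns -1.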

In my understanding, the MD module can be divided into two parts: the control and management part, and the raid level implementation part. The control and management part is the overall framework that drives the raid-level modules. How are the two linked? Anyone who has read the code knows that it is through struct mdk_personality, defined in md_k.h. This struct mainly defines a set of function pointers (much like the operations an elevator defines in io scheduling). Each raid level implements these functions, although some levels leave some of them unimplemented. The structure is as follows:

struct mdk_personality
{
    char *name;
    int level;
    struct list_head list;
    struct module *owner;
    int (*make_request)(request_queue_t *q, struct bio *bio);
    int (*run)(mddev_t *mddev);
    int (*stop)(mddev_t *mddev);
    void (*status)(struct seq_file *seq, mddev_t *mddev);
    /* error_handler must set ->faulty and clear ->in_sync
     * if appropriate, and should abort recovery if needed
     */
    void (*error_handler)(mddev_t *mddev, mdk_rdev_t *rdev);
    int (*hot_add_disk)(mddev_t *mddev, mdk_rdev_t *rdev);
    int (*hot_remove_disk)(mddev_t *mddev, int number);
    int (*spare_active)(mddev_t *mddev);
    sector_t (*sync_request)(mddev_t *mddev, sector_t sector_nr, int *skipped, int go_faster);
    int (*resize)(mddev_t *mddev, sector_t sectors);
    int (*check_reshape)(mddev_t *mddev);
    int (*start_reshape)(mddev_t *mddev);
    int (*reconfig)(mddev_t *mddev, int layout, int chunk_size);
    /* quiesce moves between quiescence states
     * 0 - fully active
     * 1 - no new requests allowed
     * others - reserved
     */
    void (*quiesce)(mddev_t *mddev, int state);
};

make_request: the function that processes block device requests.

run: the start function of each raid module; it performs tasks such as memory allocation and thread creation.

stop: the stop function of each raid module; it releases resources.

status: the /proc file system interface.

error_handler: the interface for handling read/write errors.

hot_add_disk: adds a hot spare disk to the array during reconstruction.

hot_remove_disk: the interface for removing a failed disk during reconstruction.

spare_active: the interface for activating a hot spare.

check_reshape: the interface for checking an array extension.

start_reshape: the interface for starting an array extension.


These functions are registered when the raid module is loaded.
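The registration idea can be modeled in user space with a simple linked list of operation tables that the framework searches by raid level. This is only a sketch of the pattern: struct personality, register_personality and find_personality are hypothetical names chosen for illustration, and the struct keeps just a few fields of the real one.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, user-space stand-in for struct mdk_personality. */
struct personality {
    const char *name;
    int level;
    struct personality *next;   /* stands in for struct list_head list */
    int (*run)(void);           /* one representative operation */
};

/* Head of the list of registered personalities. */
static struct personality *pers_list = NULL;

/* Hypothetical helper: add a personality to the head of the list. */
static void register_personality(struct personality *p)
{
    assert(p != NULL);
    p->next = pers_list;
    pers_list = p;
}

/* Hypothetical helper: find the personality for a raid level, or NULL. */
static struct personality *find_personality(int level)
{
    for (struct personality *p = pers_list; p != NULL; p = p->next)
        if (p->level == level)
            return p;
    return NULL;
}

/* Toy "run" implementations that just report their level. */
static int raid1_run(void) { return 1; }
static int raid5_run(void) { return 5; }

static struct personality raid1_pers = { "raid1", 1, NULL, raid1_run };
static struct personality raid5_pers = { "raid5", 5, NULL, raid5_run };
```

After register_personality(&raid1_pers) and register_personality(&raid5_pers), a call such as find_personality(5)->run() reaches the raid5 implementation, while find_personality(6) returns NULL because no such module has registered.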

With that background, let's turn to the main topic. An MD device in Linux can be accessed through the file system. Once an MD device has been created, user-mode read/write requests reach it through the file system. Anyone familiar with the Linux kernel knows that the function used to forward requests is generic_make_request, which eventually calls the make_request_fn method of the target device's request queue. For a request sent to an MD device, this method is provided by the md layer's make_request function (declared in struct mdk_personality and implemented by each raid module), so its behavior depends on the raid algorithm. In essence, the md layer's make_request redistributes the bio sent by the file system to the member disks according to the raid algorithm, and it in turn calls generic_make_request to issue the resulting bios. When the target is a physical device, make_request_fn is the system's __make_request, and the request enters the io scheduling layer (which I will analyze in a later section).
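The two-level dispatch described above can be sketched as a small user-space model. Everything here is a simplification for illustration: submit stands in for generic_make_request, the struct layouts are invented, and the mapping (sector % 2 picks the disk, sector / 2 is the sector on that disk) is a toy striping rule, not the mapping of any real raid personality.

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* Toy bio: only a target device and a sector number. */
struct bio {
    struct device *dev;
    unsigned long sector;
};

typedef int (*make_request_fn)(struct device *dev, struct bio *bio);

struct device {
    const char *name;
    make_request_fn make_request;   /* per-device request method */
    struct device *member[2];       /* member disks (toy array only) */
    unsigned long last_sector;      /* last sector queued on a disk */
    int queued;                     /* number of requests queued */
};

/* Model of generic_make_request: hand the bio to its target's method. */
static int submit(struct bio *bio)
{
    assert(bio != NULL && bio->dev != NULL);
    return bio->dev->make_request(bio->dev, bio);
}

/* Model of the physical disk path: just record the request. */
static int disk_make_request(struct device *dev, struct bio *bio)
{
    dev->last_sector = bio->sector;
    dev->queued++;
    return 0;
}

/* Model of a personality's make_request: remap and resubmit. */
static int toy_md_make_request(struct device *dev, struct bio *bio)
{
    struct bio child;

    child.dev = dev->member[bio->sector % 2];
    child.sector = bio->sector / 2;
    return submit(&child);          /* goes back through the generic path */
}

static struct device disk0 = { "disk0", disk_make_request, { NULL, NULL }, 0, 0 };
static struct device disk1 = { "disk1", disk_make_request, { NULL, NULL }, 0, 0 };
static struct device md0 = { "md0", toy_md_make_request, { &disk0, &disk1 }, 0, 0 };
```

Submitting a bio for sector 7 of md0 first invokes toy_md_make_request, which resubmits it as sector 3 of disk1, where disk_make_request records it. The point of the model is that the same entry function serves both levels and only the per-device method differs.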

Here we take raid5 as an example to analyze how make_request forwards a bio. Other raid algorithms are not described here.
