22. Linux Block Device Driver Framework: Detailed Analysis


1. Everything we covered before was character device drivers. As a quick review:

Character device driver:

When the application layer reads or writes (read()/write()) a character device, the data is transferred byte by byte (character by character), with no buffering, because the data volume is small. Data cannot be accessed at random positions. Examples: keys, LEDs, mouse, keyboard.

2. This section begins the block device driver.

Block device:

A block device is a class of I/O device. When the application layer reads or writes such a device, data is transferred in units of sectors; if the amount of data is smaller than a sector, a buffer is required. Data can be read and written at random positions on the device. Examples: ordinary files (*.txt, *.c, etc.), hard disks.

3. Block device structure:

    • Segment: consists of several blocks; in the Linux memory-management sense, it is part of a memory page, or a whole memory page.
    • Block: the basic unit in which Linux (the kernel or a file system) processes data; typically consists of one or more sectors. (A Linux operating-system concept.)
    • Sector: the basic unit of the block device hardware; typically between 512 and 32768 bytes, 512 bytes by default.

4. Using a .txt file as an example, let's briefly analyze the block device flow:

For example, suppose we want to write a small amount of data at some position in a .txt file. Because a block device writes data sector by sector, but we must not destroy the other data in the file, a "buffer" is introduced: the whole sector is first read into the buffer, the cached data is modified there, and then the entire buffer is written back to the sector holding that part of the file. If we repeatedly write tiny amounts of data to the file, the same sector would be read out and written back over and over, wasting a lot of time on hard disk I/O. The kernel therefore provides a queue mechanism: while the file is open, read and write requests are optimized (sorted, merged, and so on) to improve the efficiency of accessing the drive.

(PS: The kernel performs this optimization, sorting, and merging of the queue in the elv_merge() function, which is analyzed later.)

5. Now let's analyze the block device framework.

When we write data to a *.txt file, the file system converts the operation into accesses to sectors on the block device by calling the ll_rw_block() function; from this function on, we enter the device layer.

5.1 First, let's analyze the ll_rw_block() function (fs/buffer.c):

/* rw: read/write flag; nr: length of bhs[]; bhs[]: array of buffers to read or write */
void ll_rw_block(int rw, int nr, struct buffer_head *bhs[])
{
    int i;

    for (i = 0; i < nr; i++) {
        struct buffer_head *bh = bhs[i];   /* get each of the nr buffer_heads */
        ...
        if (rw == WRITE || rw == SWRITE) {
            if (test_clear_buffer_dirty(bh)) {
                ...
                submit_bh(WRITE, bh);      /* submit the buffer_head with the WRITE flag */
                continue;
            }
        } else {
            if (!buffer_uptodate(bh)) {
                ...
                submit_bh(rw, bh);         /* submit the buffer_head with the other flags */
                continue;
            }
        }
        unlock_buffer(bh);
    }
}

The buffer_head structure is our buffer descriptor; it stores various information about the buffer, as shown below:

struct buffer_head {
    unsigned long b_state;             /* buffer state flags */
    struct buffer_head *b_this_page;   /* list of buffers in this page */
    struct page *b_page;               /* the page this buffer is stored in */
    sector_t b_blocknr;                /* logical block number */
    size_t b_size;                     /* size of the block */
    char *b_data;                      /* pointer to the data within the page */
    struct block_device *b_bdev;       /* block device this buffer belongs to */
    bh_end_io_t *b_end_io;             /* I/O completion routine */
    void *b_private;                   /* data for the completion routine */
    struct list_head b_assoc_buffers;  /* associated mapping list */
    struct address_space *b_assoc_map; /* mapping this buffer is associated with */
    atomic_t b_count;                  /* buffer usage count */
};

5.2 Next we enter submit_bh(); the submit_bh() function is as follows:

int submit_bh(int rw, struct buffer_head *bh)
{
    struct bio *bio;                  /* define a bio (block input/output), i.e. block device I/O */
    ...
    bio = bio_alloc(GFP_NOIO, 1);     /* allocate a bio */

    /* construct the bio from the buffer_head (bh) */
    bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9); /* store the starting sector number */
    bio->bi_bdev = bh->b_bdev;                   /* store the corresponding block device */
    bio->bi_io_vec[0].bv_page = bh->b_page;      /* physical page holding the buffer */
    bio->bi_io_vec[0].bv_len = bh->b_size;       /* length of the data */
    bio->bi_io_vec[0].bv_offset = bh_offset(bh); /* byte offset within the page */

    bio->bi_vcnt = 1;                 /* count of bio_vec entries */
    bio->bi_idx = 0;                  /* index value */
    bio->bi_size = bh->b_size;        /* total size of the data */

    bio->bi_end_io = end_bio_bh_io_sync; /* set the I/O completion callback */
    bio->bi_private = bh;             /* points back to the buffer_head */
    ...
    submit_bio(rw, bio);              /* submit the bio */
    ...
}

The submit_bh() function constructs a bio from the bh and then calls submit_bio() to submit it.

5.3 The submit_bio() function is as follows:

void submit_bio(int rw, struct bio *bio)
{
    ...
    generic_make_request(bio);
}

It finally calls generic_make_request() to submit the bio to the corresponding block device's request queue; generic_make_request() mainly implements the submission handling of the bio.

5.4 The generic_make_request() function is as follows:

void generic_make_request(struct bio *bio)
{
    if (current->bio_tail) {               /* non-NULL: a bio is already being submitted */
        *(current->bio_tail) = bio;        /* chain this bio onto the previous bio->bi_next */
        bio->bi_next = NULL;
        current->bio_tail = &bio->bi_next; /* the next bio will be chained onto this one */
        return;
    }

    BUG_ON(bio->bi_next);
    do {
        current->bio_list = bio->bi_next;
        if (bio->bi_next == NULL)
            current->bio_tail = &current->bio_list;
        else
            bio->bi_next = NULL;
        __generic_make_request(bio);       /* call __generic_make_request() to submit the bio */
        bio = current->bio_list;
    } while (bio);
    current->bio_tail = NULL;              /* deactivate */
}

From the code and comments above, __generic_make_request() is called only on the first entry into generic_make_request(), when current->bio_tail is NULL.

__generic_make_request() first obtains the request queue q from the block_device the bio belongs to, then checks whether the device is a partition; if it is, the sector address is recalculated relative to the whole disk. Finally it calls the queue's make_request_fn member to complete the submission of the bio.

5.5 The __generic_make_request() function is as follows:

static inline void __generic_make_request(struct bio *bio)
{
    request_queue_t *q;
    int ret;
    ...
    do {
        q = bdev_get_queue(bio->bi_bdev);  /* get the request queue q from bio->bi_bdev */
        ...
        ret = q->make_request_fn(q, bio);  /* submit the bio to the request queue q */
    } while (ret);
}

So what does this q->make_request_fn() actually do? Let's search for where it is initialized.

Searching for make_request_fn, we find it is initialized from the mfn parameter in the blk_queue_make_request() function.

Continuing to search for blk_queue_make_request(), we look for who calls it and what is passed as the mfn argument.

We find that it is called in the blk_init_queue_node() function.

So q->make_request_fn() ultimately executes the __make_request() function.

5.6 Let's look at the __make_request() function to see what it does with the request queue q and the submitted bio:

static int __make_request(request_queue_t *q, struct bio *bio)
{
    struct request *req;              /* a request in the block device's own request queue */
    ...
    /* (1) try to sort/merge the incoming bio into an existing request on q */
    el_ret = elv_merge(q, &req, bio);
    ...
    init_request_from_bio(req, bio);  /* merge failed: build a separate request from the bio */
    add_request(q, req);              /* and add it to the request queue q */
    ...
    __generic_unplug_device(q);       /* (2) run the request queue's processing function */
}

1) The elv_merge() function above implements the kernel's elevator algorithm (elevator merge). Like a real elevator, it uses a direction flag and services requests going one way (up or down) before reversing.

For example, suppose the request queue holds the following 6 requests:

4 (in), 2 (out), 5 (in), 3 (out), 6 (in), 1 (out)    // in: write the queued data to the sector; out: read the sector into the queue

After sorting and merging, the writes to sectors 4, 5, 6 are issued together first, and then the reads of sectors 1, 2, 3, just as an elevator services all requests in one direction before turning around.

2) The __generic_unplug_device() function above is as follows:

void __generic_unplug_device(request_queue_t *q)
{
    if (unlikely(blk_queue_stopped(q)))
        return;
    if (!blk_remove_plug(q))
        return;
    q->request_fn(q);
}

It executes q's request_fn() member, which finally runs the request queue's processing function.

6. To summarize the framework analyzed in this section, the call chain is: ll_rw_block() -> submit_bh() -> submit_bio() -> generic_make_request() -> __generic_make_request() -> q->make_request_fn() (i.e. __make_request()) -> elv_merge()/add_request() -> __generic_unplug_device() -> q->request_fn().

7. Here q->request_fn is a request_fn_proc, i.e. a pointer to the request queue's handler function.

7.1 Where does this request queue handler q->request_fn come from?

Let's refer to the kernel's own block device driver drivers/block/xd.c.

In its entry function we find these lines:

static struct request_queue *xd_queue;               /* define a request queue xd_queue */

xd_queue = blk_init_queue(do_xd_request, &xd_lock);  /* allocate a request queue */

The blk_init_queue() function prototype is as follows:

request_queue *blk_init_queue(request_fn_proc *rfn, spinlock_t *lock);
//  *rfn:  a request_fn_proc, the handler function used to process the request queue
//  *lock: a spinlock guarding access to the queue, defined with DEFINE_SPINLOCK()

Clearly, do_xd_request() is hooked onto xd_queue->request_fn, and the request_queue is then returned.

7.2 Let's see how the request queue's handler do_xd_request() processes requests:

static void do_xd_request(request_queue_t *q)
{
    struct request *req;

    if (xdc_busy)
        return;

    /* (1) keep fetching requests from the queue until it is empty */
    while ((req = elv_next_request(q)) != NULL) {
        int res = 0;
        ...
        for (retry = 0; (retry < XD_RETRIES) && !res; retry++)
            /* read/write the disk sectors through the request's buffer member;
             * returns 0 on failure, 1 on success */
            res = xd_readwrite(rw, disk, req->buffer, block, count);

        end_request(req, res);  /* finish this request; res == 0 means read/write failed */
    }
}

(1) Why fetch requests in a while loop?

Because q is a request queue and may hold many requests. They were merged and sorted earlier by the elevator function elv_merge(), so they must also be fetched through the elevator function elv_next_request().

From the code and comments above, the request queue q in the kernel is ultimately handed to the driver for processing, and the driver reads and writes the sectors.

8. Finally, let's look at the entry function of drivers/block/xd.c to see how a block device driver is created:

static DEFINE_SPINLOCK(xd_lock);        /* define the spinlock used by the request queue */
static struct request_queue *xd_queue;  /* define a request queue xd_queue */

static int __init xd_init(void)         /* entry function */
{
    if (register_blkdev(XT_DISK_MAJOR, "xd"))  /* 1. register a block device, visible in /proc/devices */
        goto out1;

    xd_queue = blk_init_queue(do_xd_request, &xd_lock); /* 2. allocate a request queue; later assigned to the gendisk's queue member */
    ...
    for (i = 0; i < xd_drives; i++) {
        ...
        struct gendisk *disk = alloc_disk(64); /* 3. allocate a gendisk; 64 = number of minor device numbers, i.e. partitions */

        /* 4. set up the gendisk structure */
        disk->major = XT_DISK_MAJOR;   /* set the major device number */
        disk->first_minor = i << 6;    /* set the first minor device number */
        disk->fops = &xd_fops;         /* set the block device operation functions */
        disk->queue = xd_queue;        /* set the request queue managing this device's I/O requests */
        ...
        xd_gendisk[i] = disk;
    }
    ...
    for (i = 0; i < xd_drives; i++)
        add_disk(xd_gendisk[i]);       /* 5. register the gendisk structure */
}

The gendisk (generic disk) structure stores the device's disk information, including the request queue, the partition list, the set of block device operation functions, and so on. The structure is as follows:

struct gendisk {
    int major;                            /* major device number */
    int first_minor;                      /* first minor device number */
    int minors;                           /* number of minors (partitions); 1 means the disk cannot be partitioned */
    char disk_name[32];                   /* device name */
    struct hd_struct **part;              /* partition table information */
    int part_uevent_suppress;
    struct block_device_operations *fops; /* block device operation set */
    struct request_queue *queue;          /* request queue; a pointer used to manage the device's I/O requests */
    void *private_data;                   /* private data */
    sector_t capacity;                    /* capacity in 512-byte sectors */
    ...
};

9. So registering a block device driver requires the following steps:

    1. Register a block device (register_blkdev())
    2. Allocate a request queue (blk_init_queue())
    3. Allocate a gendisk structure (alloc_disk())
    4. Set the members of the gendisk structure
    5. Register the gendisk structure (add_disk())

Original: https://www.cnblogs.com/lifexy/p/7651667.html
