Principles of the Linux buffer cache (high-speed buffer area)

Source: Internet
Author: User

File system: the buffer cache

First, why is a buffer cache (the high-speed buffer area) needed at all, instead of accessing the data on the block device directly? The reason is that I/O devices are far slower than memory. If every small read or write went straight to the disk, the kernel would spend most of its time waiting and the disk would wear out quickly. The buffer cache sits between the two. When data is needed, the kernel first looks for it in the buffer cache; if it is found there, the kernel takes it directly from the buffer and carries on. Data to be stored is likewise written into a buffer first and only later written to disk. This avoids operating the I/O device on every access.

Within physical memory, the buffer cache lies between the kernel area and the main memory area. Refer to the memory layout figure in the fully annotated Linux 0.11 source code.

The buffer cache has two parts: the buffer head structures and the buffer blocks. Each buffer block is the same size as a logical disk block on the block device. Each buffer head describes one buffer block: it points to the block and records its attributes.

[Figure: structure of the buffer cache]
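The buffer head can be sketched as a C structure. The field names below are the ones used in the buffer_init() listing later in this article, plus b_blocknr for the logical block number; the exact field order and types are recalled from Linux 0.11 and should be treated as an assumption. The helper buffer_head_reset() is hypothetical and only illustrates how a header is tied to its data block.

```c
#include <stddef.h>

#define BLOCK_SIZE 1024              /* size of one buffer block in bytes */

struct task_struct;                  /* opaque here: the kernel's process descriptor */

/* One header per buffer block. Layout is an assumption, see the text above. */
struct buffer_head {
    char *b_data;                    /* points to the 1024-byte data block */
    unsigned long b_blocknr;         /* logical block number on the device */
    unsigned short b_dev;            /* device number (0 means unused) */
    unsigned char b_uptodate;        /* 1 if the data in the block is valid */
    unsigned char b_dirt;            /* 1 if modified and not yet written back */
    unsigned char b_count;           /* number of users of this block */
    unsigned char b_lock;            /* 1 while the block is locked */
    struct task_struct *b_wait;      /* processes waiting for the unlock */
    struct buffer_head *b_prev;      /* previous header in the same hash chain */
    struct buffer_head *b_next;      /* next header in the same hash chain */
    struct buffer_head *b_prev_free; /* previous header in the free list */
    struct buffer_head *b_next_free; /* next header in the free list */
};

/* Hypothetical helper: attach a data block to a header and mark it unused. */
void buffer_head_reset(struct buffer_head *bh, char *data)
{
    bh->b_data = data;
    bh->b_blocknr = 0;
    bh->b_dev = 0;
    bh->b_uptodate = bh->b_dirt = bh->b_count = bh->b_lock = 0;
    bh->b_wait = NULL;
    bh->b_prev = bh->b_next = NULL;
    bh->b_prev_free = bh->b_next_free = NULL;
}
```

Note that the header and the data block are separate pieces of memory; only the b_data pointer connects them.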

How does the kernel relate a buffer block to the physical device? For example, when data destined for a device is stored in a buffer block, how does the kernel know where on disk to write it? The answer is that the buffer head stores the device number and the logical block number of the buffered data. Together they uniquely identify the block device and the data block that the buffer holds. To find out quickly whether a given data block is already in the cache, the kernel manages the buffers with a hash table and a list of free buffer blocks. The hash function used in Linux 0.11 is: #define _hashfn(dev,block) (((unsigned)(dev^block))%NR_HASH), where NR_HASH is the length of the hash array.

[Figure: the hash table and the free buffer list]
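The lookup described above can be sketched in user space as follows. This is a minimal illustration, not the kernel code: the header is reduced to the four fields the lookup needs, the value 307 for NR_HASH is an assumption recalled from Linux 0.11, and the function names hashfn, insert_into_hash and find_buffer are chosen here for the example.

```c
#include <stddef.h>

#define NR_HASH 307   /* assumed table length, see the text above */

/* Reduced header: only the fields the hash lookup needs. */
struct buffer_head {
    unsigned short b_dev;          /* device number */
    unsigned long b_blocknr;       /* logical block number */
    struct buffer_head *b_prev;    /* previous header in this hash chain */
    struct buffer_head *b_next;    /* next header in this hash chain */
};

struct buffer_head *hash_table[NR_HASH];

/* XOR the device and block numbers, then reduce to a table index. */
unsigned hashfn(unsigned dev, unsigned block)
{
    return (dev ^ block) % NR_HASH;
}

/* Insert a header at the front of its hash chain. */
void insert_into_hash(struct buffer_head *bh)
{
    unsigned i = hashfn(bh->b_dev, (unsigned)bh->b_blocknr);

    bh->b_prev = NULL;
    bh->b_next = hash_table[i];
    if (bh->b_next)
        bh->b_next->b_prev = bh;
    hash_table[i] = bh;
}

/* Walk one chain and compare both numbers; NULL means "not cached". */
struct buffer_head *find_buffer(unsigned dev, unsigned long block)
{
    struct buffer_head *bh;

    for (bh = hash_table[hashfn(dev, (unsigned)block)]; bh != NULL; bh = bh->b_next)
        if (bh->b_dev == dev && bh->b_blocknr == block)
            return bh;
    return NULL;
}
```

With these definitions, device 0x301 and block 5 hash to slot 158, and so does block 1846 on the same device. Both headers then share one chain, which is why find_buffer() compares the device and block numbers rather than trusting the slot alone.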

In the diagram, the double-headed arrows represent the doubly linked pointers between buffer heads that hash to the same table entry; they correspond to the b_prev and b_next fields. The dashed lines represent the pointers that link the buffer blocks in the free list, and free_list is the head pointer of the free list.

The kernel uses the getblk() function to obtain a suitable buffer block. It first calls get_hash_table() to check whether a buffer block for the given device number and logical block number already exists in the hash table. If so, it returns the pointer to the corresponding buffer head directly. Otherwise it scans the whole free list from its head, looking for an available free buffer. Several free buffers may qualify, so the most suitable one is chosen by a weight computed from the modified (dirty) flag and the lock flag in the buffer head. A free block that is neither modified nor locked is used at once.

If no free block is found, the current process goes to sleep and searches again when it resumes. If the chosen block is locked, the process also sleeps and waits for the other process to unlock it. If the block was taken by another process during the sleep, the search must start over. If it was not taken, the kernel checks whether the block has been modified (not yet written to disk); if so, it writes the block out and waits for it to be unlocked. If the block has been taken by another process in the meantime, the search for a free block starts over once more.

One more case must be handled: while the current process slept, another process may have added the very block we need to the hash queue. The hash queue is therefore searched one last time, and if the block is found there, the whole procedure is repeated from the beginning.

In the end we hold a free buffer block that is not referenced by any process, not locked and not modified. Its reference count is set to 1, the other flags are reset, and the buffer head is removed from the free list. After the device number and logical block number have been set, the buffer head is inserted at the head of the corresponding hash table entry and at the tail of the free list. Finally, a pointer to the buffer head is returned.

[Figure: flow chart of getblk()]
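The choice among candidate free blocks can be sketched as follows. This is a user-space simplification and not the kernel's getblk(): it leaves out sleeping, write-back and the final hash re-check. The BADNESS weighting, which counts the dirty flag twice as heavily as the lock flag, is believed to match Linux 0.11 but should be treated as an assumption, as should the helper name pick_free_buffer.

```c
#include <stddef.h>

/* Reduced header: only the fields the selection needs. */
struct buffer_head {
    unsigned char b_dirt;            /* 1 if modified and not yet written */
    unsigned char b_lock;            /* 1 while locked */
    unsigned char b_count;           /* number of users */
    struct buffer_head *b_prev_free; /* previous header in the free list */
    struct buffer_head *b_next_free; /* next header in the free list */
};

/* Weight of a candidate: a modified block costs more than a locked one. */
#define BADNESS(bh) (((bh)->b_dirt << 1) + (bh)->b_lock)

/*
 * Walk the circular free list once and return the unreferenced block
 * with the lowest weight, or NULL if every block is in use.
 */
struct buffer_head *pick_free_buffer(struct buffer_head *free_list)
{
    struct buffer_head *tmp = free_list;
    struct buffer_head *best = NULL;

    if (free_list == NULL)
        return NULL;
    do {
        if (tmp->b_count == 0 &&
            (best == NULL || BADNESS(tmp) < BADNESS(best))) {
            best = tmp;
            if (BADNESS(tmp) == 0)   /* clean and unlocked: cannot do better */
                break;
        }
        tmp = tmp->b_next_free;
    } while (tmp != free_list);
    return best;
}
```

A NULL result corresponds to the "no free block, go to sleep" case in the text. A result with a nonzero weight corresponds to the cases where the caller must first wait for the unlock or write the block back.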

getblk() may return either a new free block or a buffer block that already contains the data we need. The block read function bread() therefore checks the buffer's update flag (b_uptodate) to learn whether the data it contains is valid. If the data is valid, the block is returned to the caller directly. Otherwise bread() calls the low-level block read/write function ll_rw_block() and sleeps while the data is transferred from the physical device into the buffer block. After waking up, it checks the flag again; if the data is still not valid, it releases the buffer block and returns NULL.

[Figure: flow chart of bread()]
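The control flow of bread() can be sketched in user space like this. It is an illustration under stated assumptions, not the kernel function: the buffer is passed in instead of being obtained from getblk(), and the device read plus the sleep are replaced by a hypothetical callback of type read_fn that returns 1 on success and 0 on failure.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 1024

/* Reduced buffer: data stored inline to keep the example self-contained. */
struct buffer_head {
    char b_data[BLOCK_SIZE];
    unsigned char b_uptodate;      /* 1 if b_data holds valid data */
    unsigned char b_count;         /* number of users */
};

/* Hypothetical stand-in for ll_rw_block() followed by the wait. */
typedef int (*read_fn)(struct buffer_head *bh);

struct buffer_head *bread_sketch(struct buffer_head *bh, read_fn read_block)
{
    if (bh == NULL)
        return NULL;
    bh->b_count++;                 /* getblk() would hand the block out referenced */
    if (bh->b_uptodate)            /* data already valid: no device access needed */
        return bh;
    if (read_block(bh))            /* start the read and wait for it to finish */
        bh->b_uptodate = 1;
    if (bh->b_uptodate)            /* re-check after "waking up" */
        return bh;
    bh->b_count--;                 /* read failed: release the block */
    return NULL;
}

/* Two fake devices for trying the sketch out. */
int fake_read_ok(struct buffer_head *bh)
{
    memset(bh->b_data, 0xAB, BLOCK_SIZE);
    return 1;
}

int fake_read_fail(struct buffer_head *bh)
{
    (void)bh;
    return 0;
}
```

Once a block has been read successfully, later calls return it without touching the device, even if the device would now fail. That is the saving the buffer cache exists to provide.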

When a program is done with a buffer block, it calls brelse() to release it. brelse() also wakes up any processes that are sleeping while they wait for a free buffer block.
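Releasing a block amounts to dropping one reference and notifying waiters. The sketch below assumes a user-space setting: the wake-up is modelled by a hypothetical counter named wakeups, the wait for an unlock is omitted, and releasing a block that nobody holds is reported with a return value of -1 instead of being handled by the kernel.

```c
#include <stddef.h>

/* Reduced header: only the fields the release needs. */
struct buffer_head {
    unsigned char b_count;         /* number of users */
    unsigned char b_lock;          /* 1 while locked */
};

int wakeups;   /* hypothetical: counts how often waiters would be woken */

/* Returns 0 on success, -1 if the block was not referenced at all. */
int brelse_sketch(struct buffer_head *buf)
{
    if (buf == NULL)
        return 0;                  /* nothing to release */
    /* The kernel would first wait here until the block is unlocked. */
    if (buf->b_count == 0)
        return -1;                 /* releasing an already free block is a bug */
    buf->b_count--;                /* drop our reference */
    wakeups++;                     /* processes waiting for a buffer may retry */
    return 0;
}
```

A block whose count reaches 0 becomes a candidate for reuse by getblk(); its contents stay in the cache until it is actually reused.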

Finally, apart from the device drivers themselves, all upper-level code that reads or writes block devices must go through the buffer cache management routines. The link between the upper layers and the drivers is made mainly by the bread() and ll_rw_block() functions.

The end of the buffer area is chosen at startup (in main() of init/main.c in Linux 0.11) according to the amount of physical memory:


    memory_end = (1<<20) + (EXT_MEM_K<<10);   /* 1 MB plus extended memory (in KB) */
    memory_end &= 0xfffff000;                 /* round down to a 4 KB page boundary */
    if (memory_end > 16*1024*1024)            /* at most 16 MB is used */
        memory_end = 16*1024*1024;
    if (memory_end > 12*1024*1024)            /* more than 12 MB: buffer area ends at 4 MB */
        buffer_memory_end = 4*1024*1024;
    else if (memory_end > 6*1024*1024)        /* more than 6 MB: buffer area ends at 2 MB */
        buffer_memory_end = 2*1024*1024;
    else
        buffer_memory_end = 1*1024*1024;      /* otherwise it ends at 1 MB */
    main_memory_start = buffer_memory_end;    /* main memory begins where the buffers end */
    #ifdef RAMDISK
    main_memory_start += rd_init(main_memory_start, RAMDISK*1024);
    #endif
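To see what these thresholds give for concrete machines, the same arithmetic can be wrapped in a small function. This is a sketch: struct mem_layout and compute_layout are names invented for the example, and the RAM disk adjustment is left out.

```c
#include <stddef.h>

/* Hypothetical container for the three values computed at startup. */
struct mem_layout {
    unsigned long memory_end;        /* end of usable physical memory */
    unsigned long buffer_memory_end; /* end of the buffer area */
    unsigned long main_memory_start; /* start of the main memory area */
};

/* ext_mem_k: extended memory above 1 MB, in KB (EXT_MEM_K in the listing). */
struct mem_layout compute_layout(unsigned long ext_mem_k)
{
    struct mem_layout m;

    m.memory_end = (1UL << 20) + (ext_mem_k << 10);
    m.memory_end &= 0xfffff000UL;             /* round down to a 4 KB page */
    if (m.memory_end > 16UL * 1024 * 1024)    /* cap at 16 MB */
        m.memory_end = 16UL * 1024 * 1024;
    if (m.memory_end > 12UL * 1024 * 1024)
        m.buffer_memory_end = 4UL * 1024 * 1024;
    else if (m.memory_end > 6UL * 1024 * 1024)
        m.buffer_memory_end = 2UL * 1024 * 1024;
    else
        m.buffer_memory_end = 1UL * 1024 * 1024;
    m.main_memory_start = m.buffer_memory_end;
    return m;
}
```

For example, a machine with 16 MB in total (15 MB of extended memory) gets a buffer area ending at 4 MB, one with 8 MB gets 2 MB, and one with 4 MB gets 1 MB. Because the comparisons are strict, a machine with exactly 12 MB falls into the 2 MB case.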

The initialization function buffer_init() in fs/buffer.c:

    void buffer_init(long buffer_end)
    {
        struct buffer_head *h = start_buffer;
        void *b;
        int i;

        /* If the buffer area ends at 1 MB, start from 640 KB instead, because
           640 KB to 1 MB is occupied by video memory and the BIOS. */
        if (buffer_end == 1<<20)
            b = (void *) (640*1024);
        else
            b = (void *) buffer_end;

        /* Initialize the buffers, build the circular free list and count the
           buffer blocks in the system. 1 KB buffer blocks are carved from the
           high end of the buffer area downward, while the buffer_head
           structures are built from the low end upward and linked into a
           doubly linked list. h points to the current buffer head and h+1 to
           the address just past it, so the block at b may only be used while
           b >= h+1, which guarantees room for one more buffer head. */
        while ((b -= BLOCK_SIZE) >= ((void *) (h+1))) {
            h->b_dev = 0;              /* device using this buffer (0 = none) */
            h->b_dirt = 0;             /* dirty flag: buffer modified */
            h->b_count = 0;            /* buffer reference count */
            h->b_lock = 0;             /* buffer lock flag */
            h->b_uptodate = 0;         /* update flag: data is valid */
            h->b_wait = NULL;          /* processes waiting for the unlock */
            h->b_next = NULL;          /* next buffer head with the same hash value */
            h->b_prev = NULL;          /* previous buffer head with the same hash value */
            h->b_data = (char *) b;    /* the corresponding data block (1024 bytes) */
            h->b_prev_free = h-1;      /* previous item in the free list */
            h->b_next_free = h+1;      /* next item in the free list */
            h++;                       /* move on to the next buffer head */
            nr_buffers++;              /* count the buffer blocks */
            if (b == (void *) 0x100000)    /* b has come down to 1 MB: */
                b = (void *) 0xA0000;      /* skip 384 KB, continue at 640 KB */
        }
        h--;                           /* h now points to the last valid buffer head */
        free_list = start_buffer;      /* the free list starts at the first buffer head */
        free_list->b_prev_free = h;    /* first item's predecessor is the last item */
        h->b_next_free = free_list;    /* last item's successor is the first: a ring */

        /* Initialize the hash table: every entry starts out as NULL. */
        for (i = 0; i < NR_HASH; i++)
            hash_table[i] = NULL;
    }
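The carving scheme used by buffer_init() can be tried out in user space on an ordinary memory region. The following is a sketch under assumptions: carve_buffers is a name invented here, the header is reduced to three fields, offsets are used instead of decrementing a pointer, and the 640 KB to 1 MB hole is not modelled.

```c
#include <stddef.h>

#define BLOCK_SIZE 1024

/* Reduced header: data pointer and free-list links only. */
struct buffer_head {
    char *b_data;
    struct buffer_head *b_prev_free;
    struct buffer_head *b_next_free;
};

/*
 * Carve the region [arena, arena + size) into buffer heads growing up from
 * the low end and 1 KB data blocks growing down from the high end.
 * Returns the number of buffers; *free_list receives the first header,
 * or NULL if not even one buffer fits. arena must be suitably aligned.
 */
int carve_buffers(char *arena, size_t size, struct buffer_head **free_list)
{
    struct buffer_head *h = (struct buffer_head *)arena;
    size_t data_off = size;    /* offset of the lowest data block so far */
    size_t n = 0;              /* number of buffers carved so far */
    size_t i;

    /* Continue while one more block still leaves room for one more header. */
    while (data_off >= BLOCK_SIZE &&
           data_off - BLOCK_SIZE >= (n + 1) * sizeof(struct buffer_head)) {
        data_off -= BLOCK_SIZE;
        h[n].b_data = arena + data_off;
        n++;
    }
    /* Link the headers into a circular doubly linked free list. */
    for (i = 0; i < n; i++) {
        h[i].b_next_free = &h[(i + 1) % n];
        h[i].b_prev_free = &h[(i + n - 1) % n];
    }
    *free_list = n ? h : NULL;
    return (int)n;
}
```

An 8 KB region yields 7 buffers rather than 8, because the lowest kilobyte is partly taken up by the headers themselves. The same effect in the kernel is why the number of buffers is counted at run time instead of being computed from the size of the area.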
