ATA Driver Framework

Source: Internet
Author: User
Tags: error handling, sleep

The ATA disks discussed here fall into two main classes: traditional parallel ATA (PATA), that is, the IDE interface, and the now-popular Serial ATA (SATA). Linux 2.6.28 still retains the legacy driver stack for IDE disks, under which an IDE disk can be driven either as a traditional hd device or as an sd device. For SATA devices, the standard Linux practice is to drive them as sd devices. This article compares the traditional ATA driver architecture with the ATA driver architecture now in common use.

In the traditional ATA driver framework, an ATA host on the PCI bus is enumerated during the PCI bus scan. When the scan finds an ATA host, the kernel loads the driver for that device, the ATA host driver, which is itself a PCI device driver. The host driver registers the ATA host with the IDE core layer, which creates an IDE bus. Once the ATA host has been initialized, the IDE core scans it and binds each device it finds to a suitable IDE device driver. If that driver is the IDE disk driver, the ATA disk is driven as an hd device. If the driver is ide-scsi, the ATA device is presented as a virtual SCSI host, and that host is added to the SCSI mid-level. By the same principle, the SCSI mid-level scans the virtual SCSI host and binds a driver to each device the scan finds, usually the SCSI disk driver. At that point a traditional IDE device is being driven as a SCSI device. The key component in this driver stack is the virtual SCSI device driver: it virtualizes the device, forms a virtual SCSI bus, and presents the device on that bus as a SCSI device. Following the same idea, device virtualization and bus cascading can be extended further.
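The cascade described above (scan a bus, bind a driver, and if that driver exposes a new bus, scan it in turn) can be sketched as a small user-space model. This is not kernel code: the struct layout, the table contents and the function name scan_cascade are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of bus cascading; not kernel code. */
enum bus_type { BUS_PCI, BUS_IDE, BUS_SCSI };

struct driver {
    const char *name;       /* label for the driver */
    enum bus_type bus;      /* bus this driver binds on */
    int creates_bus;        /* nonzero: binding exposes a child bus */
    enum bus_type child;    /* the child bus, when creates_bus is set */
    const char *blkdev;     /* block device prefix for leaf drivers */
};

/* Legacy stack ending in the IDE disk driver. */
static const struct driver legacy_hd[] = {
    { "ATA host driver (PCI)", BUS_PCI, 1, BUS_IDE, NULL },
    { "IDE disk driver",       BUS_IDE, 0, BUS_IDE, "hd" },
};

/* Legacy stack that goes through the virtual SCSI host. */
static const struct driver legacy_sd[] = {
    { "ATA host driver (PCI)", BUS_PCI,  1, BUS_IDE,  NULL },
    { "ide-scsi",              BUS_IDE,  1, BUS_SCSI, NULL },
    { "SCSI disk driver",      BUS_SCSI, 0, BUS_SCSI, "sd" },
};

/*
 * Start at the PCI bus, bind the first driver registered for the current
 * bus, and descend into the child bus it exposes, until a leaf driver is
 * reached. Returns the block device prefix, or NULL if no driver matches.
 * *depth receives the number of drivers bound along the way.
 */
static const char *scan_cascade(const struct driver *drv, size_t n, int *depth)
{
    enum bus_type bus = BUS_PCI;

    assert(depth != NULL);
    *depth = 0;
    for (;;) {
        const struct driver *found = NULL;
        for (size_t i = 0; i < n; i++) {
            if (drv[i].bus == bus) {
                found = &drv[i];
                break;
            }
        }
        if (found == NULL)
            return NULL;
        (*depth)++;
        if (!found->creates_bus)
            return found->blkdev;
        bus = found->child;
    }
}
```

In this model the same disk ends up as "hd" after two bindings or as "sd" after three, depending only on which driver is bound on the IDE bus.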

In the framework above, the IDE bus layer contributes little, so it can be dropped entirely in favor of a flatter driver framework. That flatter framework is the driver model now commonly used by SATA and other drivers.

In this driver model, the ATA host is enumerated in the same way as in the first model, but the ATA host driver registers the ATA host directly with the SCSI mid-level. Because the ATA protocol layer and the SCSI protocol layer differ, libata acts as a translation layer between the SCSI mid-level and the ATA host. This lets the ATA host plug directly into the SCSI driver stack, so that ATA devices are driven directly as SCSI devices. Compared with the first model, the driver stack is lighter and more efficient, and ATA drivers integrate seamlessly into the SCSI subsystem. libata is undoubtedly the biggest contributor to this model. Many SATA host drivers and PATA host drivers currently use it.
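To make the idea of a translation layer concrete, here is a minimal user-space sketch that converts one SCSI READ(10) command block into ATA taskfile register values for a 48-bit READ DMA EXT. It is a simplified illustration, not libata's actual code: the struct taskfile layout, the macro names and the function xlat_read10 are assumptions, and only this one opcode is handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCSI_OP_READ_10   0x28  /* SCSI READ(10) opcode */
#define ATA_OP_READ_EXT   0x25  /* ATA READ DMA EXT opcode */
#define ATA_DEVICE_LBA    0x40  /* LBA addressing bit in the device register */

/* Hypothetical, simplified set of ATA taskfile registers. */
struct taskfile {
    uint8_t command, device;
    uint8_t nsect, hob_nsect;            /* sector count, low and high byte */
    uint8_t lbal, lbam, lbah;            /* LBA bits 0..23 */
    uint8_t hob_lbal, hob_lbam, hob_lbah;/* LBA bits 24..47 */
};

/*
 * Translate a READ(10) CDB into a taskfile.
 * Returns 0 on success, 1 if the transfer length is zero (nothing to do),
 * and -1 if the CDB is not a READ(10).
 */
static int xlat_read10(const uint8_t cdb[10], struct taskfile *tf)
{
    assert(cdb != NULL && tf != NULL);
    if (cdb[0] != SCSI_OP_READ_10)
        return -1;

    /* READ(10): bytes 2..5 are the big-endian LBA, bytes 7..8 the length. */
    uint32_t lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
                   ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
    uint16_t n = (uint16_t)(((unsigned)cdb[7] << 8) | cdb[8]);

    /* A zero length means "no data" in SCSI but 65536 sectors in ATA,
     * so it must not be passed through unchanged. */
    if (n == 0)
        return 1;

    memset(tf, 0, sizeof(*tf));
    tf->command   = ATA_OP_READ_EXT;
    tf->device    = ATA_DEVICE_LBA;
    tf->nsect     = (uint8_t)(n & 0xff);
    tf->hob_nsect = (uint8_t)(n >> 8);
    tf->lbal      = (uint8_t)(lba & 0xff);
    tf->lbam      = (uint8_t)((lba >> 8) & 0xff);
    tf->lbah      = (uint8_t)((lba >> 16) & 0xff);
    tf->hob_lbal  = (uint8_t)((lba >> 24) & 0xff);
    /* hob_lbam and hob_lbah stay 0: READ(10) carries only a 32-bit LBA. */
    return 0;
}
```

The zero-length check shows why a translation layer is needed rather than a field-by-field copy: the two protocols give the same value different meanings.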

The SCSI disk driver submits requests to the SCSI host through the SCSI mid-level. The SCSI disk driver is a block device driver, so the block layer's unplug_fn function is called to process the SCSI disk request queue. While the queue is being processed, the scsi_request_fn function registered by the SCSI mid-level is invoked to do the actual work. scsi_request_fn takes a request from the I/O dispatch queue, converts it into a SCSI command, and finally calls scsi_dispatch_cmd to submit the command to the SCSI host. The interface between the SCSI host and the SCSI mid-level is the queuecommand function; each SCSI host driver registers its own queuecommand method with the mid-level. Because queuecommand runs in a context that must not sleep, it cannot do complex work. The usual approach is to place the received SCSI command on a processing queue maintained by the SCSI host. If the SCSI host is real hardware rather than a virtual host, the host layer can then transfer the SCSI command to the SCSI disk by DMA.

This completes the submission of one I/O request. For a device such as a disk, the characteristics of the storage medium and of the application's access pattern have to be taken into account along the way, so an I/O scheduling policy is applied to make reads and writes better suited to the medium. More advanced I/O management policies can also be implemented in the layers above the SCSI disk. Submitting an I/O request can be seen as the first half of the whole I/O process. The second half is the I/O completion callback path, whose implementation in Linux is analyzed next.
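The submission path above can be modeled in user space as follows. The sketch shows a request function that turns requests into commands and a host entry point that never waits: it either records the command in a fixed-size ring or reports that the host is busy. All type and function names here (struct host, host_queuecommand, request_fn and so on) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define HOST_QUEUE_DEPTH 4          /* hypothetical host queue depth */

struct scsi_cmd;
typedef void (*done_fn)(struct scsi_cmd *);

struct scsi_cmd {
    unsigned long lba;
    unsigned int nblocks;
    done_fn done;                   /* completion callback from the mid-level */
};

struct request {                    /* simplified block-layer request */
    unsigned long sector;
    unsigned int nr_sectors;
};

struct host {
    struct scsi_cmd *ring[HOST_QUEUE_DEPTH];
    unsigned int head, tail;        /* tail - head = commands queued */
};

static void host_init(struct host *h)
{
    for (size_t i = 0; i < HOST_QUEUE_DEPTH; i++)
        h->ring[i] = NULL;
    h->head = h->tail = 0;
}

/*
 * Host entry point. It must not sleep, so it does no waiting and no
 * allocation: it stores the command and the callback, or returns -1.
 */
static int host_queuecommand(struct host *h, struct scsi_cmd *cmd, done_fn done)
{
    assert(h != NULL && cmd != NULL);
    if (h->tail - h->head == HOST_QUEUE_DEPTH)
        return -1;                  /* busy: the request stays queued above */
    cmd->done = done;
    h->ring[h->tail % HOST_QUEUE_DEPTH] = cmd;
    h->tail++;
    return 0;
}

static void midlevel_done(struct scsi_cmd *cmd)
{
    (void)cmd;                      /* completion is outside this sketch */
}

/*
 * Mid-level side: convert requests to commands and dispatch them until
 * the host reports busy. Returns how many requests were dispatched.
 */
static size_t request_fn(struct host *h, const struct request *rq,
                         struct scsi_cmd *cmds, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        cmds[i].lba = rq[i].sector;
        cmds[i].nblocks = rq[i].nr_sectors;
        if (host_queuecommand(h, &cmds[i], midlevel_done) != 0)
            break;
    }
    return i;
}
```

With six pending requests and a queue depth of four, request_fn dispatches four and leaves the remaining two for a later pass, which is the non-blocking behavior the text describes.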


When an I/O operation completes, the SCSI disk notifies the SCSI host driver by means of an interrupt. When the SCSI host interrupt occurs, the CPU runs the host's interrupt service routine. A real SCSI host is usually a PCI device, so interrupt sharing has to be considered: the service routine must first determine whether the interrupt is its own, and then handle the specific interrupt event according to the SCSI host's status register. For read/write I/O requests, the DMA-complete interrupt is raised after the data has been transferred by DMA. Scatter-gather DMA can be used for the transfer, so no memory copy of the data is involved: throughout the read/write, the data stays in the bio's pages. On a write, the data in the pages is transferred to the disk directly by DMA, and on a read, the data on the disk is transferred directly into the bio's pages, which makes the mechanism efficient. When the host determines that a request has completed, it calls the SCSI mid-level's callback function, the well-known scsi_done, which was handed to the SCSI host layer during queuecommand. scsi_done calls blk_complete_request directly, which raises the block softirq through raise_softirq_irqoff(BLOCK_SOFTIRQ). Everything up to this point runs in the top half of the SCSI host's interrupt handling. The top half should not run for long, or interrupt events may be lost. Once the softirq has been raised, the top half can return, and the CPU is then handed to the pending softirq. Note that the softirq still runs in interrupt context, not in a context that can be scheduled.
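The split between the short top half and the deferred softirq work can be modeled in user space as below. The sketch covers the three steps the paragraph names: checking whether a shared interrupt belongs to this host, reading the status register, and handing the finished command to a done list while marking the softirq as pending. The status bits, types and function names are all assumptions for illustration, not a real controller's register layout or the kernel's API.

```c
#include <assert.h>
#include <stddef.h>

#define STAT_IRQ_PENDING 0x01   /* hypothetical: this host raised the IRQ */
#define STAT_DMA_DONE    0x02   /* hypothetical: DMA transfer has finished */

enum irq_result { IRQ_NOT_MINE, IRQ_WAS_HANDLED };

struct cmd {
    struct cmd *next;
    int completed;              /* set by the bottom half */
};

struct host {
    unsigned int status;        /* stand-in for the status register */
    struct cmd *active;         /* command currently on the hardware */
};

static struct cmd *done_list;   /* commands waiting for the bottom half */
static int softirq_pending;

/* Stand-in for the mid-level done callback: queue the command and raise
 * the softirq. It does no real completion work itself. */
static void cmd_done(struct cmd *c)
{
    assert(c != NULL);
    c->next = done_list;
    done_list = c;
    softirq_pending = 1;
}

/* Top half: decide ownership, acknowledge, hand off, return quickly. */
static enum irq_result host_isr(struct host *h)
{
    unsigned int st = h->status;

    if (!(st & STAT_IRQ_PENDING))
        return IRQ_NOT_MINE;    /* shared line: another device interrupted */

    h->status = 0;              /* acknowledge the interrupt */
    if ((st & STAT_DMA_DONE) && h->active != NULL) {
        struct cmd *c = h->active;
        h->active = NULL;
        cmd_done(c);
    }
    return IRQ_WAS_HANDLED;
}

/* Bottom half: runs after the top half returns. Returns the number of
 * commands it completed. */
static int run_softirq(void)
{
    int n = 0;

    if (!softirq_pending)
        return 0;
    softirq_pending = 0;
    while (done_list != NULL) {
        struct cmd *c = done_list;
        done_list = c->next;
        c->next = NULL;
        c->completed = 1;
        n++;
    }
    return n;
}
```

After host_isr returns, the command is on the done list but not yet completed; completion happens only when run_softirq is called, mirroring the hand-off from top half to softirq.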



The softirq handler is blk_done_softirq. Because the event was caused by a SCSI command, it invokes the scsi_softirq_done function, which was registered on the request queue beforehand, to carry out the bottom-half processing of the SCSI event. This function checks whether the SCSI command executed correctly. If the command failed, it can be requeued and retried. Once the retry limit is reached, the failed command is handed to the SCSI error-handling kernel thread, which makes the final decision. If the command succeeded, scsi_finish_command is called to finish it. scsi_finish_command calls scsi_io_completion to complete the block-level I/O request, which calls scsi_end_request, which calls blk_end_request, which finally reaches blk_end_io. blk_end_io ends every bio attached to the request, calling bio_endio for each one, and releases the request's resources once all of its bios have ended. At this point, a request carrying a bio has been serviced by the SCSI disk and fully processed through the top and bottom halves of the interrupt. It is important to note that the entire I/O callback path runs in interrupt context, so an I/O callback must not sleep. In particular, memory allocation may sleep and taking a semaphore may sleep, and sleeping in this context can crash the system.



To summarize the analysis above, the normal I/O callback path of a SCSI disk runs through the following functions: scsi_done -> blk_done_softirq -> scsi_softirq_done -> scsi_finish_command -> scsi_io_completion -> scsi_end_request -> blk_end_io -> bio_endio.
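The last step of this path, ending every bio on a request, together with the rule that a completion callback must not sleep, can be sketched in user space as follows. The callback here only records the result in a small structure that some other context can inspect later; it does not allocate or wait. The struct layouts and the names end_request, record_end_io and struct waiter are assumptions for illustration and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct bio;
typedef void (*bio_end_fn)(struct bio *bio, int error);

struct bio {
    struct bio *next;       /* next bio on the same request */
    bio_end_fn end_io;      /* completion callback set by the submitter */
    void *private_data;     /* submitter's context for the callback */
};

struct request {
    struct bio *bio;        /* head of the request's bio list */
};

/* Submitter-side record, checked later from a context that may sleep. */
struct waiter {
    int done;
    int error;
};

/*
 * Completion callback. It runs in what would be interrupt context, so it
 * only stores the outcome: no allocation, no locks that can block.
 */
static void record_end_io(struct bio *bio, int error)
{
    struct waiter *w = bio->private_data;

    w->error = error;
    w->done = 1;
}

/*
 * End every bio attached to the request, calling each callback once,
 * and leave the request with an empty bio list.
 * Returns the number of bios ended.
 */
static int end_request(struct request *rq, int error)
{
    int n = 0;

    assert(rq != NULL);
    while (rq->bio != NULL) {
        struct bio *b = rq->bio;
        rq->bio = b->next;
        b->next = NULL;
        if (b->end_io != NULL)
            b->end_io(b, error);
        n++;
    }
    return n;
}
```

Any work that might sleep, such as allocating memory or taking a semaphore, belongs in the code that later checks the waiter, not in the callback itself.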



The SCSI mid-level has changed significantly in recent Linux versions. This article is based on the newer Linux 2.6.28, whose I/O callback path differs somewhat from that of the earlier 2.6.18.
