5.4. Interaction Between Devices and Kernel


Directory: http://www.cnblogs.com/WuCountry/archive/2008/11/15/1333960.html
 
[Illustrations are not reproduced here. You are advised to consult the source book for the figures.]

Nearly all devices (including NICs) interact with the kernel in one of two ways:

Polling

Driven on the kernel side. The kernel checks the device status at regular intervals to see whether the device has anything to report.

Interrupt

Driven on the device side. The device sends a hardware signal (by generating an interrupt) to the kernel when it needs the kernel's attention.
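The difference between the two models can be illustrated with a small user-space simulation. This is only a sketch: sim_device, sim_poll, sim_frame_arrives, and the callback that stands in for an IRQ line are hypothetical names invented for the illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulated device; not a real kernel structure. */
struct sim_device {
    bool data_ready;                           /* status bit the kernel can poll */
    int  pending;                              /* frames waiting on the device */
    void (*irq_handler)(struct sim_device *);  /* stands in for an IRQ line */
    int  handled;                              /* frames consumed by the kernel side */
};

/* Kernel-side work shared by both models: consume one frame if present. */
static void sim_service(struct sim_device *dev)
{
    if (dev->pending > 0) {
        dev->pending--;
        dev->handled++;
    }
    dev->data_ready = dev->pending > 0;
}

/* Polling: kernel driven. Meant to be called at regular intervals.
 * Returns true if the device had something to report. */
static bool sim_poll(struct sim_device *dev)
{
    if (!dev->data_ready)
        return false;
    sim_service(dev);
    return true;
}

/* Interrupt: device driven. The device itself notifies the kernel
 * as soon as a frame arrives, if a handler is installed. */
static void sim_frame_arrives(struct sim_device *dev)
{
    dev->pending++;
    dev->data_ready = true;
    if (dev->irq_handler != NULL)
        dev->irq_handler(dev);   /* "raise the interrupt" */
}
```

With no handler installed, a frame stays on the device until the next sim_poll call; with sim_service installed as the handler, it is consumed at the moment it arrives.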

In Chapter 9, you can find a detailed discussion of NIC driver design alternatives as well as software interrupts. You will also see how Linux can use a combination of polling and interrupts to increase performance. In this chapter, we will look only at the interrupt-based case.

I won't go into detail on how interrupts are reported by the hardware, the difference between hardware exceptions and device interrupts, how the driver and bus kernel infrastructures are designed, etc.; you can refer to Linux Device Drivers and Understanding the Linux Kernel for those topics. But I'll give a brief overview of interrupts to help you understand how device drivers initialize and register the devices they are responsible for, with special attention to the networking aspect.

5.4.1. Hardware Interrupts
You do not need to know the low-level background about how hardware interrupts are handled. However, there are details worth mentioning because they can make it easier to understand how NIC device drivers are written, and therefore how they interact with the upper networking layers.

Every interrupt runs a function called an interrupt handler, which must be tailored to the device and therefore is installed by the device driver. Typically, when a device driver registers a NIC, it requests and assigns an IRQ. It then registers and (if the driver is unloaded) unregisters a handler for a given IRQ with the following two architecture-dependent functions. They are defined in kernel/irq/manage.c and are overridden by architecture-specific functions in arch/XXX/kernel/irq.c, where XXX is the architecture-specific directory:

int request_irq(unsigned int irq, void (*handler)(int, void *, struct pt_regs *), unsigned long irqflags, const char *devname, void *dev_id)

This function registers a handler, first making sure that the requested interrupt is a valid one, and that it is not already allocated to another device unless both devices understand shared IRQs (see the later section "Interrupt sharing").

void free_irq(unsigned int irq, void *dev_id)

Given the device identified by dev_id, this function removes the handler and disables the IRQ line if no more devices are registered for that IRQ. Note that to identify the handler, the kernel needs both the IRQ number and the device identifier. This is especially important with shared IRQs, as explained in the later section "Interrupt sharing."

When the kernel receives an interrupt notification, it uses the IRQ number to find the driver's handler and then executes this handler. To find handlers, the kernel stores the associations between IRQ numbers and function handlers in a global table. The association can be either one-to-one or one-to-many, because the Linux kernel allows multiple devices to use the same IRQ, a feature described in the later section "Interrupt sharing."
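The register, unregister, and dispatch steps just described can be sketched in user space. Everything prefixed with sim_ or SIM_ is a hypothetical name invented for this illustration; the real request_irq and free_irq take more arguments and perform validity and sharing checks that this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NR_IRQS     16   /* assumed table size, for illustration only */
#define SIM_MAX_ACTIONS 4    /* assumed per-IRQ limit, for illustration only */

typedef void (*sim_handler_t)(int irq, void *dev_id);

struct sim_action {
    sim_handler_t handler;   /* NULL marks a free slot */
    void *dev_id;
};

/* Global table: one row of (handler, dev_id) pairs per IRQ number. */
static struct sim_action sim_table[SIM_NR_IRQS][SIM_MAX_ACTIONS];

/* Register a handler for an IRQ; returns 0 on success, -1 on failure. */
static int sim_request_irq(int irq, sim_handler_t handler, void *dev_id)
{
    if (irq < 0 || irq >= SIM_NR_IRQS || handler == NULL)
        return -1;
    for (int i = 0; i < SIM_MAX_ACTIONS; i++) {
        if (sim_table[irq][i].handler == NULL) {
            sim_table[irq][i].handler = handler;
            sim_table[irq][i].dev_id = dev_id;
            return 0;
        }
    }
    return -1;   /* no free slot on this IRQ */
}

/* Remove the handler identified by the (irq, dev_id) pair. */
static void sim_free_irq(int irq, void *dev_id)
{
    if (irq < 0 || irq >= SIM_NR_IRQS)
        return;
    for (int i = 0; i < SIM_MAX_ACTIONS; i++) {
        if (sim_table[irq][i].handler != NULL &&
            sim_table[irq][i].dev_id == dev_id) {
            sim_table[irq][i].handler = NULL;
            sim_table[irq][i].dev_id = NULL;
            return;
        }
    }
}

/* Invoke every handler registered for irq; returns how many ran. */
static int sim_dispatch(int irq)
{
    int called = 0;
    if (irq < 0 || irq >= SIM_NR_IRQS)
        return 0;
    for (int i = 0; i < SIM_MAX_ACTIONS; i++) {
        if (sim_table[irq][i].handler != NULL) {
            sim_table[irq][i].handler(irq, sim_table[irq][i].dev_id);
            called++;
        }
    }
    return called;
}

/* Example handler: treats dev_id as a counter and increments it. */
static void sim_count_handler(int irq, void *dev_id)
{
    (void)irq;
    (*(int *)dev_id)++;
}
```

Because the same handler function may be registered for several devices, the IRQ number alone cannot identify an entry; the sketch removes entries by the (irq, dev_id) pair, as the text requires.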

In the following sections, you will see common examples of the information exchanged between devices and drivers by means of interrupts, and how an IRQ can be shared by multiple devices under some conditions.

5.4.1.1. Interrupt Types
With an interrupt, a NIC can tell its driver several different things. Among them are:

Reception of a frame

This is the most common and standard situation.

Transmission failure

This kind of notification is generated on Ethernet devices only after a feature called exponential binary backoff has failed (this feature is implemented at the hardware level by the NIC). Note that the driver will not relay this notification to higher network layers; they will come to know about the failure by other means (timer timeouts, negative ACKs, etc.).

DMA transfer has completed successfully

Given a frame to send, the buffer that holds it is released by the driver once the frame has been uploaded into the NIC's memory for transmission on the medium. With synchronous transmissions (no DMA), the driver knows right away when the frame has been uploaded on the NIC. But with DMA, which uses asynchronous transmissions, the device driver needs to wait for an explicit interrupt from the NIC. You can find an example of each case at points where dev_kfree_skb[*] is called within the driver code drivers/net/3c59x.c (DMA) and drivers/net/3c509.c (non-DMA).

[*] Chapter 11 describes this function in detail.

Device has enough memory to handle a new transmission

It is common for a NIC device driver to disable transmissions by stopping the egress queue when that queue does not have sufficient free space to hold a frame of maximum size (e.g., 1,536 bytes for an Ethernet NIC). The queue is then re-enabled when memory becomes available. The rest of this section goes into this case in more detail.

The final case in the previous list covers a sophisticated way of throttling transmissions in a manner that can improve efficiency if done properly. In this system, a device driver disables transmissions for lack of queuing space, asks the NIC to issue an interrupt when the available memory is bigger than a given amount (typically the device's maximum transmission unit, or MTU), and then re-enables transmissions when the interrupt comes.

A device driver can also disable the egress queue before a transmission (to prevent the kernel from generating another transmission request on the device), and re-enable it only if there is enough free memory on the NIC; if not, the device asks for an interrupt that allows it to resume transmission at a later time. Here is an example of this logic, taken from the el3_start_xmit routine, which the drivers/net/3c509.c driver installs as its hard_start_xmit function in its net_device structure:

The hard_start_xmit virtual function is described in Chapter 11.

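static int
el3_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
    .........
    netif_stop_queue(dev);    /* block further transmission requests */
    .........
    if (inw(ioaddr + TX_FREE) > 1536)
        netif_start_queue(dev);    /* room for another full-size frame */
    else
        /* ask the NIC to interrupt once 1536 bytes are free */
        outw(SetTxThreshold + 1536, ioaddr + EL3_CMD);
    .........
}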

The driver stops the device queue with netif_stop_queue, thus inhibiting the kernel from submitting further transmission requests. The driver then checks whether the device's memory has enough free space for a packet of 1,536 bytes. If so, the driver starts the queue to allow the kernel once again to submit transmission requests; otherwise, it instructs the device (by writing to a configuration register with an outw call) to generate an interrupt when that condition is met. An interrupt handler will then re-enable the device queue with netif_start_queue so that the kernel can restart transmissions.
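The stop, check, and restart-or-arm sequence can be modeled in user space as follows. This is a rough sketch under stated assumptions: sim_nic and its fields are hypothetical, the NIC memory is modeled as a simple byte counter, and each frame is assumed to fit in the free space at the time it is submitted.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_FRAME 1536   /* same size used in the driver excerpt */

/* Hypothetical NIC model; field names are illustrative, not from 3c509.c. */
struct sim_nic {
    int  tx_free;        /* free bytes in the NIC transmit memory */
    int  tx_threshold;   /* interrupt once tx_free exceeds this; 0 = not armed */
    bool queue_stopped;  /* true while the kernel must not submit frames */
};

/* Mirrors the logic of the excerpt: stop the queue, hand the frame to the
 * NIC, then either restart the queue or arm a threshold interrupt. */
static void sim_start_xmit(struct sim_nic *nic, int frame_len)
{
    nic->queue_stopped = true;              /* role of netif_stop_queue */
    nic->tx_free -= frame_len;              /* frame copied to NIC memory */

    if (nic->tx_free > SIM_MAX_FRAME)
        nic->queue_stopped = false;         /* role of netif_start_queue */
    else
        nic->tx_threshold = SIM_MAX_FRAME;  /* role of the outw() call */
}

/* The NIC has put some bytes on the wire. Returns true if the armed
 * threshold was crossed, i.e. the NIC raised its interrupt. */
static bool sim_tx_done(struct sim_nic *nic, int bytes)
{
    nic->tx_free += bytes;
    if (nic->tx_threshold != 0 && nic->tx_free > nic->tx_threshold) {
        nic->tx_threshold = 0;
        nic->queue_stopped = false;         /* interrupt handler restarts queue */
        return true;
    }
    return false;
}
```

The point of arming the threshold is that the driver never has to poll for free space: the queue stays stopped until the device itself reports that a full-size frame fits again.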

The netif_xxx_queue routines are described in the section "Enabling and Disabling Transmissions" in Chapter 11.

5.4.1.2. Interrupt Sharing
IRQ lines are a limited resource. A simple way to increase the number of devices a system can host is to allow multiple devices to share a common IRQ. Normally, each driver registers its own handler to the kernel for that IRQ. Instead of having the kernel receive the interrupt notification, find the right device, and invoke its handler, the kernel simply invokes all the handlers of those devices that registered for the same shared IRQ. It is up to the handlers to filter spurious invocations, such as by reading a register on their devices.

For a group of devices to share an IRQ line, all of them must have device drivers capable of handling shared IRQs. In other words, each time a device registers for an IRQ line, it needs to explicitly say whether it supports interrupt sharing. For example, the first device that registers for one IRQ, saying something like "assign me IRQ n and use this routine fn as the handler," must also specify whether it is willing to share the IRQ with other devices. When another device driver tries to register the same IRQ number, it is refused if either it, or the driver to which the IRQ is currently assigned, is incapable of sharing IRQs.
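The two rules above (registration is refused unless both parties accept sharing, and every handler on a shared line must filter out invocations not meant for its device) can be sketched as a user-space model. SIM_SHARED, sim_line, and sim_dev are hypothetical names; the irq_pending field merely stands in for a status register that a real handler would read from its device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_SHARED      0x1   /* hypothetical flag playing the role of SA_SHIRQ */
#define SIM_MAX_SHARERS 4     /* assumed limit, for illustration only */

struct sim_dev {
    bool irq_pending;   /* stands in for the device's interrupt status register */
    int  serviced;      /* interrupts this device's handler actually served */
};

/* One IRQ line and the devices registered on it. */
struct sim_line {
    struct sim_dev *devs[SIM_MAX_SHARERS];
    unsigned long   flags[SIM_MAX_SHARERS];
    int             count;
};

/* Returns 0 on success, -1 if the registration is refused. */
static int sim_line_register(struct sim_line *line, struct sim_dev *dev,
                             unsigned long flags)
{
    if (line->count == SIM_MAX_SHARERS)
        return -1;
    if (line->count > 0) {
        /* Refuse if the newcomer or the current owner cannot share.
         * Checking entry 0 is enough: a line with several entries can
         * only have been built from sharing-capable registrations. */
        if (!(flags & SIM_SHARED) || !(line->flags[0] & SIM_SHARED))
            return -1;
    }
    line->devs[line->count] = dev;
    line->flags[line->count] = flags;
    line->count++;
    return 0;
}

/* Per-device handler: returns false for a spurious invocation. */
static bool sim_dev_handler(struct sim_dev *dev)
{
    if (!dev->irq_pending)
        return false;          /* another device on the line raised the IRQ */
    dev->irq_pending = false;
    dev->serviced++;
    return true;
}

/* The line fires: every registered handler is invoked.
 * Returns how many devices claimed the interrupt. */
static int sim_line_fire(struct sim_line *line)
{
    int claimed = 0;
    for (int i = 0; i < line->count; i++)
        if (sim_dev_handler(line->devs[i]))
            claimed++;
    return claimed;
}
```

Invoking every handler keeps the dispatch path simple at the cost of a few wasted calls, which is why each handler's first step is a cheap check of its own device.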

5.4.1.3. Organization of IRQs to Handler Mappings
The mapping of IRQs to handlers is stored in a vector of lists, one list of handlers for each IRQ (see Figure 5-2). A list includes more than one element only when multiple devices share the same IRQ. The size of the vector (i.e., the number of possible IRQ numbers) is architecture dependent and can vary from 15 (on an x86) to more than 200. With the introduction of interrupt sharing, even more devices can be supported on a system at once.

The section "Hardware Interrupts" already introduced the two functions provided by the kernel to register and unregister a handler, respectively. Let's now see the data structure used to store the mappings.

Mappings are defined with irqaction data structures. The request_irq function introduced in the earlier section "Hardware Interrupts" is a wrapper around setup_irq, which takes an irqaction structure as input and inserts it into the global irq_desc vector. irq_desc is defined in kernel/irq/handle.c and can be overridden in the per-architecture files arch/XXX/kernel/irq.c. setup_irq is defined in kernel/irq/manage.c and can be overridden in the per-architecture files arch/XXX/kernel/irq.c.

The kernel function that handles interrupts and passes them to drivers is architecture dependent. It is called handle_IRQ_event on most architectures.

Figure 5-2 shows how irqaction instances are stored: there is an instance of irq_desc for each possible IRQ and an instance of irqaction for each successfully registered IRQ handler. The vector of irq_desc instances is called irq_desc as well, and its size is given by the architecture-dependent symbol NR_IRQS.

Note that when you have more than one irqaction instance for a given IRQ number (that is, for a given element of the irq_desc vector), interrupt sharing is required (each structure must have the SA_SHIRQ flag set).

Figure 5-2. Organization of IRQ handlers

Let's see now what information is stored about IRQ handlers in the fields of an irqaction data structure:

void (*handler)(int irq, void *dev_id, struct pt_regs *regs)

Function provided by the device driver to handle notifications of interrupts: whenever the kernel receives an interrupt on line irq, it invokes handler. Here are the function's input parameters:

int irq

IRQ number that generated the notification. Most of the time it is not used by the NICs' device drivers to accomplish their job; the device ID is sufficient.

void *dev_id

Device identifier. The same driver can be responsible for different devices at the same time, so it needs the device ID to process the notification correctly.

struct pt_regs *regs

Structure used to save the content of the processor's registers at the moment the interrupt interrupted the current process. It is normally not used by the interrupt handler.

unsigned long flags

Set of flags. The possible values SA_XXX are defined in include/asm-XXX/signal.h. Here are the main ones from the x86 architecture file:

SA_SHIRQ

When set, the device driver can handle shared IRQs.

SA_SAMPLE_RANDOM

When set, the device is making itself available as a source of random events. This can be useful to help the kernel generate random numbers for internal use, and is called contributing to system entropy. This is further described in the later section "Initializing the Device Handling Layer: net_dev_init."

SA_INTERRUPT

When set, the handler runs with interrupts disabled on the local processor. This should be specified only for handlers that can get done very quickly. See one of the handle_IRQ_event instances for an example (for instance, kernel/irq/handle.c).

There are other values, but they are either obsolete or used only by particular architectures.

void *dev_id

Pointer to the net_device data structure associated with the device. The reason it is declared void * is that NICs are not the only devices to use IRQs. Because various device types use different data structures to identify and represent device instances, a generic type declaration is used.

struct irqaction *next

All the devices sharing the same IRQ number are linked together in a list with this pointer.

const char *name

Device name. You can read it by dumping the contents of /proc/interrupts.
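To tie the fields together, here is a user-space mirror of the structures described in this section. It is a sketch only: the sim_ prefix marks hypothetical names, the flag value and vector size are illustrative, the field order follows the list above rather than the kernel source, and the real irqaction and irq_desc carry additional fields not covered here.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_SA_SHIRQ 0x1   /* illustrative value, not the kernel's */
#define SIM_NR_IRQS  16    /* stands in for the architecture's NR_IRQS */

struct sim_pt_regs;        /* opaque: saved processor registers */

/* One registered handler, with the fields listed in the text. */
struct sim_irqaction {
    void (*handler)(int irq, void *dev_id, struct sim_pt_regs *regs);
    unsigned long flags;           /* SA_XXX-style flags */
    void *dev_id;                  /* device identifier */
    struct sim_irqaction *next;    /* next handler sharing this IRQ */
    const char *name;              /* device name */
};

/* One entry per possible IRQ: the head of its handler list. */
struct sim_irq_desc {
    struct sim_irqaction *action;
};

/* The vector shares its name with the element type, as in the text. */
static struct sim_irq_desc sim_irq_desc[SIM_NR_IRQS];

/* Link an action at the head of the list for irq (no sharing checks). */
static void sim_add_action(int irq, struct sim_irqaction *act)
{
    assert(irq >= 0 && irq < SIM_NR_IRQS);
    act->next = sim_irq_desc[irq].action;
    sim_irq_desc[irq].action = act;
}

/* Number of handlers currently linked on irq. */
static int sim_count_actions(int irq)
{
    int n = 0;
    for (struct sim_irqaction *a = sim_irq_desc[irq].action; a; a = a->next)
        n++;
    return n;
}

/* Checks the invariant from the text: a list with more than one element
 * requires the sharing flag on every element. Returns 1 if it holds. */
static int sim_sharing_ok(int irq)
{
    if (sim_count_actions(irq) <= 1)
        return 1;
    for (struct sim_irqaction *a = sim_irq_desc[irq].action; a; a = a->next)
        if (!(a->flags & SIM_SA_SHIRQ))
            return 0;
    return 1;
}
```

sim_add_action deliberately skips the checks a real registration would perform, so that sim_sharing_ok can be used to show what a violated invariant looks like.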
