PCI device driver development
1. Introduction to PCI
PCI is a bus standard for connecting peripheral devices to the system. It is the most important bus in the PC and effectively serves as the interface through which the parts of the system interact. A 32-bit PCI bus running at 33 MHz has a peak transfer rate of 133 MB/s. In the current PC architecture, almost all peripheral devices sit on one of several interface buses, which connect to the PCI system through bridge circuits. In this PCI system, the host/PCI bridge is called the North Bridge; it connects the main processor bus to the primary PCI local bus. The bridge between PCI and the other buses is called the South Bridge, which usually contains the interrupt controller, IDE controller, USB controller, and DMA controller. The South Bridge and the North Bridge together form the motherboard chipset.
2. PCI configuration space
Each PCI device has its own configuration space, which supports plug-and-play by letting the system identify the device and assign resources to it. This section briefly introduces the PCI configuration space.
The configuration space is a 256-byte address space with a defined structure. It is divided into a header area and a device-dependent area. The header area is 64 bytes long, and every device must implement its registers; the fields in this area uniquely identify the device. The use of the remaining 192 bytes varies by device. Table 1 shows how the 64 bytes of the header area are used. To implement plug-and-play, the system allocates resources to each PCI device according to the hardware resources already in use, so a driver cannot assume fixed addresses. The main task when writing a device driver is therefore to obtain the contents of the base address registers and the interrupt line register. The configuration space contains six base address registers and one interrupt line register, used as follows:
PCI base address 0 register: The system uses this register to allocate a PCI address range for the configuration registers of the PCI interface chip. Through this range, the chip's configuration registers can be accessed as mapped memory.
PCI base address 1 register: The system uses this register to allocate a PCI address range for the configuration registers of the PCI interface chip. Through this range, the chip's configuration registers can be accessed through I/O ports.
PCI base address 2, 3, 4, and 5 registers: The system BIOS uses these registers to allocate PCI address ranges that give access to the local configuration registers 0, 1, 2, and 3 of the PCI interface chip.
In every base address register, bit 0 is read-only and indicates whether the range is mapped into memory space or I/O space: "1" means I/O space, and "0" means memory space.
Interrupt line register: Describes how the device's interrupt is connected. The value of this register is the standard 8259 IRQ number (0 to 15).
Table 1 PCI configuration space
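As a rough illustration of the header area and of the bit 0 rule for base address registers, the sketch below lays out the 64-byte header of an ordinary (type 0) PCI device as a C structure and decodes a base address register value. The structure, field, and function names are illustrative assumptions and are not taken from any Windows header; the sample register values used with it are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 64-byte header of an ordinary (type 0) PCI device.
   Names are illustrative; offsets are given in the comments. */
struct pci_header {
    uint16_t vendor_id;            /* 0x00 */
    uint16_t device_id;            /* 0x02 */
    uint16_t command;              /* 0x04 */
    uint16_t status;               /* 0x06 */
    uint8_t  revision_id;          /* 0x08 */
    uint8_t  prog_if;              /* 0x09 */
    uint8_t  sub_class;            /* 0x0A */
    uint8_t  base_class;           /* 0x0B */
    uint8_t  cache_line_size;      /* 0x0C */
    uint8_t  latency_timer;        /* 0x0D */
    uint8_t  header_type;          /* 0x0E */
    uint8_t  bist;                 /* 0x0F */
    uint32_t bar[6];               /* 0x10: base address registers 0..5 */
    uint32_t cardbus_cis;          /* 0x28 */
    uint16_t subsystem_vendor_id;  /* 0x2C */
    uint16_t subsystem_id;         /* 0x2E */
    uint32_t expansion_rom_base;   /* 0x30 */
    uint8_t  capabilities_ptr;     /* 0x34 */
    uint8_t  reserved[7];          /* 0x35 */
    uint8_t  interrupt_line;       /* 0x3C */
    uint8_t  interrupt_pin;        /* 0x3D */
    uint8_t  min_gnt;              /* 0x3E */
    uint8_t  max_lat;              /* 0x3F */
};

/* Compile-time checks that the layout matches the offsets above. */
static_assert(sizeof(struct pci_header) == 64, "header is 64 bytes");
static_assert(offsetof(struct pci_header, bar) == 0x10, "BAR0 at 0x10");
static_assert(offsetof(struct pci_header, interrupt_line) == 0x3C,
              "interrupt line at 0x3C");

/* Bit 0 of a base address register: 1 = I/O space, 0 = memory space. */
bool bar_is_io(uint32_t bar)
{
    return (bar & 0x1u) != 0;
}

/* Strips the flag bits and returns the base address held in the register.
   I/O registers keep the base in bits 31..2, memory registers in 31..4. */
uint32_t bar_base(uint32_t bar)
{
    return bar_is_io(bar) ? (bar & 0xFFFFFFFCu) : (bar & 0xFFFFFFF0u);
}
```

With these helpers, a register value of 0x0000E001 decodes as an I/O range based at 0xE000, while 0xFEBF0008 decodes as a memory range based at 0xFEBF0000.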
3. Device initialization
A PCI device driver must identify its PCI device, find the device's hardware resources, and service the device's interrupts. During driver initialization, the HalGetBusData() function is used to find the device: the driver traverses all devices on the bus, compares the vendor ID and device ID of each one, and records the bus number, device number, and function number of the device it is looking for. With this location information, the driver can look up the device's resource configuration list in the system.
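The traversal can be tried outside the kernel with a small simulation. The sketch below packs the device and function numbers into a single slot value (device in bits 0 to 4, function in bits 5 to 7, which is how the PCI_SLOT_NUMBER bit fields are laid out) and scans one bus for a matching vendor ID and device ID. The callback type, the function names, and demo_read_id with its arbitrary IDs are hypothetical stand-ins for HalGetBusData(), not part of any Windows API.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEVICES       32       /* device numbers 0..31 per bus */
#define MAX_FUNCTIONS     8        /* function numbers 0..7 per device */
#define INVALID_VENDOR_ID 0xFFFFu  /* value read back from an empty slot */

/* Packs device and function numbers into one slot value:
   bits 0..4 hold the device number, bits 5..7 the function number. */
uint32_t pci_slot_as_ulong(unsigned device, unsigned function)
{
    assert(device < MAX_DEVICES && function < MAX_FUNCTIONS);
    return (uint32_t)device | ((uint32_t)function << 5);
}

/* Hypothetical stand-in for HalGetBusData: returns the first config
   dword (device ID in the high half, vendor ID in the low half). */
typedef uint32_t (*read_id_fn)(unsigned bus, uint32_t slot);

/* Scans one bus and returns the slot value of the first function that
   matches vendor/device_id, or -1 when nothing matches. */
int64_t pci_find_on_bus(read_id_fn read_id, unsigned bus,
                        uint16_t vendor, uint16_t device_id)
{
    for (unsigned dev = 0; dev < MAX_DEVICES; dev++) {
        for (unsigned fn = 0; fn < MAX_FUNCTIONS; fn++) {
            uint32_t slot = pci_slot_as_ulong(dev, fn);
            uint32_t id = read_id(bus, slot);
            if ((id & 0xFFFFu) == INVALID_VENDOR_ID)
                continue;                 /* nothing at this function */
            if ((id & 0xFFFFu) == vendor && (id >> 16) == device_id)
                return (int64_t)slot;
        }
    }
    return -1;
}

/* Demo reader: pretends that one device with the arbitrary IDs
   vendor 0x1234 / device 0x5678 sits at bus 0, device 3, function 1. */
uint32_t demo_read_id(unsigned bus, uint32_t slot)
{
    if (bus == 0 && slot == pci_slot_as_ulong(3, 1))
        return (0x5678u << 16) | 0x1234u;
    return 0xFFFFFFFFu;                   /* empty: vendor ID is 0xFFFF */
}
```

Calling pci_find_on_bus(demo_read_id, 0, 0x1234, 0x5678) returns the slot value for device 3, function 1; the same search on bus 1 returns -1.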
Next, the driver needs to obtain the hardware parameters that were assigned through the configuration space. The interrupt number of the PCI device, the I/O port address range and its mode, and the memory address range and its mapping mode can all be read from the hardware resource list. In Windows NT, call the HalAssignSlotResources() function to obtain a pointer to the resource list of the specified device, then traverse all resource descriptors in the list to obtain the device's I/O port base address and length, interrupt level, interrupt vector and mode, memory base address and length, and other hardware resource data.
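The descriptor traversal amounts to a loop with a switch on the descriptor type. The following is a minimal user-mode sketch of that pattern; res_desc, hw_config, and the RES_* values are simplified stand-ins invented for this example and do not reproduce the real NT resource list structures, and the values in demo_config are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the NT resource descriptor types. */
enum res_type { RES_PORT, RES_INTERRUPT, RES_MEMORY, RES_OTHER };

struct res_desc {
    enum res_type type;
    uint32_t start;    /* port or memory base address */
    uint32_t length;   /* port or memory range length */
    uint32_t level;    /* interrupt level */
    uint32_t vector;   /* interrupt vector */
};

struct hw_config {
    uint32_t port_base, port_length;
    uint32_t mem_base, mem_length;
    uint32_t irq_level, irq_vector;
};

/* Walks the descriptor list and records each resource by type. A later
   descriptor of the same type overwrites an earlier one. Returns the
   number of descriptors that were recognized. */
size_t collect_resources(const struct res_desc *list, size_t count,
                         struct hw_config *out)
{
    size_t used = 0;
    assert(list != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        switch (list[i].type) {
        case RES_PORT:
            out->port_base = list[i].start;
            out->port_length = list[i].length;
            used++;
            break;
        case RES_INTERRUPT:
            out->irq_level = list[i].level;
            out->irq_vector = list[i].vector;
            used++;
            break;
        case RES_MEMORY:
            out->mem_base = list[i].start;
            out->mem_length = list[i].length;
            used++;
            break;
        default:
            break;  /* descriptor types this sketch does not model */
        }
    }
    return used;
}

/* Example list with arbitrary values: one I/O range, one interrupt,
   and one memory range. */
struct hw_config demo_config(void)
{
    static const struct res_desc list[] = {
        { RES_PORT,      0xE000u,     0x80u,   0,  0  },
        { RES_INTERRUPT, 0,           0,       11, 11 },
        { RES_MEMORY,    0xFEBF0000u, 0x1000u, 0,  0  },
    };
    struct hw_config cfg = { 0, 0, 0, 0, 0, 0 };
    collect_resources(list, sizeof list / sizeof list[0], &cfg);
    return cfg;
}
```

A real driver would apply the same switch to the partial resource descriptors in the list returned by HalAssignSlotResources() and store the results in its device extension.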
The DMA communication in our design uses bus master mode. During device initialization, the driver must set up the DMA adapter by calling HalGetAdapter() to obtain a pointer to the adapter object allocated by the operating system.
The sample code is as follows:
// Traverse the buses to find the bus number, device number, and function number of the specified device
for (busNumber = 0; busNumber < MAX_PCI_BUSES; busNumber++) {
    for (deviceNumber = 0; deviceNumber < PCI_MAX_DEVICES; deviceNumber++) {
        slotNumber.u.bits.DeviceNumber = deviceNumber;
        for (functionNumber = 0; functionNumber < PCI_MAX_FUNCTION; functionNumber++) {
            slotNumber.u.bits.FunctionNumber = functionNumber;
            // A return value of 0 means the bus does not exist: move on to the next bus
            if (!HalGetBusData(PCIConfiguration, busNumber, slotNumber.u.AsULONG,
                               &pciData, sizeof(ULONG))) {
                deviceNumber = PCI_MAX_DEVICES;
                break;
            }
            // No device at this slot and function
            if (pciData.VendorID == PCI_INVALID_VENDORID) {
                continue;
            }
            // Not the device we are looking for
            if ((vendorId != PCI_INVALID_VENDORID) &&
                (pciData.VendorID != vendorId || pciData.DeviceID != deviceId)) {
                continue;
            }
            // Record the location, then advance to the next list entry
            pPciDeviceLocation->BusNumber = busNumber;
            pPciDeviceLocation->SlotNumber = slotNumber;
            pPciDeviceLocation = &pPciDeviceList->List[++count];
            status = STATUS_SUCCESS;
        }
    }
}
// Obtain a pointer to the device's resource list
status = HalAssignSlotResources(RegistryPath,
                                &pDevExt->ClassUnicodeString,
                                DriverObject,
                                DeviceObject,
                                pDevExt->InterfaceType,
                                pDevExt->BusNumber,
                                pDevExt->SlotNumber,
                                &pCmResourceList);
4. I/O port access
On a PC, I/O addressing differs from memory addressing, so it is handled differently. The I/O space is a 64 KB address space. I/O addressing does not distinguish between real mode and protected mode; it works the same way in every mode. Windows NT does not allow ring 3 user programs or user-mode drivers to execute I/O instructions directly, so every I/O operation must be carried out by a kernel-mode driver. To access an I/O port, read and write with the READ_PORT_XXX and WRITE_PORT_XXX functions. The port base address is the I/O base address obtained from base address register 1 (PCI base address 1) in the configuration space.
The sample code is as follows:
regValue = READ_PORT_ULONG(pBaseAddr + regOffset);
WRITE_PORT_ULONG(pBaseAddr + regOffset, regValue);
5. Device memory access
Windows runs in 32-bit protected mode. The fundamental difference between protected mode and real mode lies in how the CPU forms addresses, and this is a problem that Windows driver design has to deal with. Windows uses segmentation and paging so that a program can run on computers with different amounts of physical memory and different configurations, and so that programmers can use virtual memory to write programs much larger than the physical memory actually installed.
Each virtual address consists of a 16-bit segment selector and a 32-bit segment offset. The segmentation mechanism turns the virtual address into a linear address, and the paging mechanism then turns the linear address into a physical address. A linear address is divided into three parts: the page directory index, the page table index, and the page offset. When a new Win32 process is created, the operating system allocates memory for it and creates its own page directory and page tables; the address of the page directory is stored as part of the process context. To translate an address, the system reads the page directory address from a CPU control register (CR3 on x86), uses the page directory to locate the page table, uses the page table to locate the page frame of the code or data page, and finally uses the page offset to reach the specific location.
Hardware devices read and write physical memory, while applications read and write virtual addresses, so physical memory addresses must be mapped to linear addresses that the program can use.
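To make the three-part split concrete, here is a small sketch that assumes classic 32-bit x86 paging with 4 KB pages (no PAE and no large pages), where the directory index, table index, and offset are 10, 10, and 12 bits wide. The type and function names are made up for this example, and the page frame address in the usage note is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Parts of a 32-bit linear address under 4 KB paging. */
struct linear_parts {
    uint32_t dir_index;    /* bits 31..22: entry in the page directory */
    uint32_t table_index;  /* bits 21..12: entry in the page table */
    uint32_t offset;       /* bits 11..0: byte within the page */
};

/* Splits a linear address into its three fields. */
struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.dir_index   = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.offset      = linear & 0xFFFu;
    return p;
}

/* Final step of translation: page frame base plus page offset. */
uint32_t physical_address(uint32_t page_frame_base, uint32_t offset)
{
    assert((page_frame_base & 0xFFFu) == 0 && offset < 0x1000u);
    return page_frame_base | offset;
}
```

For the linear address 0x00403025, the directory index is 1, the table index is 3, and the offset is 0x25. If the page table entry pointed at a page frame at 0x0012F000, the resulting physical address would be 0x0012F025.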
Mapping physical memory to linear addresses is the driver's job, and it can be done during driver initialization. After obtaining the base address of the device memory, call HalTranslateBusAddress() to convert the bus-relative memory address to a system physical address, then call MmMapIoSpace() to map that physical address into the linear address space. To access device memory, call the READ_REGISTER_XXX() and WRITE_REGISTER_XXX() functions, using the previously mapped linear address as the base. When the device is removed, call MmUnmapIoSpace() to release the mapping between the device memory and the linear address space.
The sample code is as follows:
// Convert the bus-relative address to a system physical address
HalTranslateBusAddress(interfaceType,
                       busNumber,
                       baseAddress->RangeStart,
                       &addressSpace,
                       &cardAddress);
// Device registers are normally mapped non-cached so that every access reaches the hardware
baseAddress->MappedRangeStart = MmMapIoSpace(cardAddress,
                                             baseAddress->RangeLength,
                                             MmNonCached);
......
regValue = READ_REGISTER_ULONG(pRegister);
WRITE_REGISTER_ULONG(pRegister, pInBuf->RegValue);
......
MmUnmapIoSpace(pBaseAddress->MappedRangeStart, pBaseAddress->RangeLength);
6. Interrupt handling
Interrupt setup, response, and servicing are all handled in the driver. Interrupt setup should be completed when the device is created: using the parameters extracted from the CmResourceTypeInterrupt descriptor, call HalGetInterruptVector() to convert the bus-relative interrupt vector to a system interrupt vector, then call IoConnectInterrupt() to register the function pointer of the interrupt service routine (ISR).
When the hardware device raises an interrupt, the system automatically calls the ISR to respond to it. The ISR runs at a high interrupt request level (IRQL), so it should mainly clear the device's interrupt and should not execute much code. Work such as transferring large data blocks must be deferred through the deferred procedure call (DPC) mechanism. For example, when a PCI device is used for DMA communication, the ISR determines whether the interrupt came from the device and clears it, and requests a DPC before it returns. The DPC routine then completes the DMA transfer and returns the data to the user program.
The sample code is as follows:
deviceExtension->InterruptLevel = partialData->u.Interrupt.Level;
deviceExtension->InterruptVector = partialData->u.Interrupt.Vector;
deviceExtension->InterruptAffinity = partialData->u.Interrupt.Affinity;
if (partialData->Flags & CM_RESOURCE_INTERRUPT_LATCHED) {
    deviceExtension->InterruptMode = Latched;
} else {
    deviceExtension->InterruptMode = LevelSensitive;
}
......
// Translate the bus-relative level and vector into a system vector, IRQL, and affinity
vector = HalGetInterruptVector(pDevExt->InterfaceType,
                               pDevExt->BusNumber,
                               pDevExt->InterruptLevel,
                               pDevExt->InterruptVector,
                               &irql,
                               &affinity);
// Register the ISR; the ninth argument (TRUE) marks the vector as shareable
status = IoConnectInterrupt(&pDevExt->InterruptObject,
                            (PKSERVICE_ROUTINE)PciDmaIsr,
                            DeviceObject,
                            NULL,
                            vector,
                            irql,
                            irql,
                            pDevExt->InterruptMode,
                            TRUE,
                            affinity,
                            FALSE);
7. DMA communication process
DMA communication is implemented in the driver. Several routines cooperate to complete one DMA transfer.
1) DriverEntry routine
Construct the DEVICE_DESCRIPTION structure, call HalGetAdapter to find the adapter object associated with the device, and save the address of the returned adapter object and the number of map registers in the device extension.
The sample code is as follows:
// Request an adapter object for DMA
deviceDescription.Version = DEVICE_DESCRIPTION_VERSION;
deviceDescription.Master = TRUE;
deviceDescription.ScatterGather = pDevExt->ScatterGather;
deviceDescription.DemandMode = FALSE;
deviceDescription.AutoInitialize = FALSE;
deviceDescription.Dma32BitAddresses = TRUE;
deviceDescription.BusNumber = pDevExt->BusNumber;
deviceDescription.InterfaceType = pDevExt->InterfaceType;
deviceDescription.MaximumLength = pDevExt->MaxTransferLength;
pDevExt->AdapterObject = HalGetAdapter(&deviceDescription,
                                       &numberOfMapRegisters);
......
2) StartIo routine
This routine requests ownership of the adapter object and leaves the rest of the work to the AdapterControl callback routine.
A) Call KeFlushIoBuffers to flush data from the CPU cache to physical memory, then calculate the number of map registers required from the size of the user buffer, and the number of bytes to transfer in the first device operation.
B) Call MmGetMdlVirtualAddress to recover the virtual address of the user buffer from the MDL and store it in the device extension.
C) Call IoAllocateAdapterChannel to request ownership of the adapter object. If the call succeeds, the AdapterControl routine performs the remaining setup. If the call fails, fail the current IRP and start processing the next one.
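The calculation in step A can be sketched in plain C. The example assumes 4 KB pages and one map register per page touched by the buffer; DMA_PAGE_SIZE, span_pages, and first_transfer_size are names made up for this illustration, although span_pages is intended to mirror what a page-span macro computes from a starting virtual address and a length.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_PAGE_SIZE 4096u   /* assumed page size (x86) */

/* Number of pages, and therefore map registers, touched by a buffer
   that starts at virtual address va and is length bytes long. */
uint32_t span_pages(uint32_t va, uint32_t length)
{
    assert(length > 0);
    uint64_t first_offset = va & (DMA_PAGE_SIZE - 1);
    return (uint32_t)((first_offset + length + DMA_PAGE_SIZE - 1)
                      / DMA_PAGE_SIZE);
}

/* Bytes to move in the first device operation when the adapter offers
   at most max_map_regs map registers. */
uint32_t first_transfer_size(uint32_t va, uint32_t length,
                             uint32_t max_map_regs)
{
    assert(max_map_regs > 0);
    if (span_pages(va, length) <= max_map_regs)
        return length;                    /* the whole buffer fits */
    /* Otherwise fill every map register, minus the unused start of
       the first page. */
    return max_map_regs * DMA_PAGE_SIZE - (va & (DMA_PAGE_SIZE - 1));
}
```

For example, a 64 KB buffer that starts 0x800 bytes into a page spans 17 pages. With only 4 map registers available, the first operation can move 4 * 4096 - 2048 = 14336 bytes, and the rest is left for later operations.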
3) AdapterControl routine
This routine sets up the DMA transfer and starts the device.
A) Call IoMapTransfer to load the map registers of the adapter object.
B) Send the appropriate commands to the device to start the transfer.
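C) Return DeallocateObjectKeepRegisters, the value a bus-master driver uses to keep its map registers for the duration of the transfer. (KeepObject is the corresponding value for slave DMA, where the driver retains ownership of the adapter object itself.)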
4) Interrupt service routine (ISR)
The system calls this routine when the device raises an interrupt.
A) Send an interrupt acknowledge command to the hardware device.
B) Call IoRequestDpc so that the driver's DpcForIsr routine continues processing the request.
C) Return TRUE to indicate that the interrupt has been serviced.
5) DpcForIsr routine
The ISR requests this routine at the end of each data transfer operation. It finishes that operation and, once no data remains, completes the current IRP.
A) Call IoFlushAdapterBuffers to flush any data remaining in the adapter object's cache.
B) Call IoFreeMapRegisters to release the map registers that were used.
C) Check whether any data remains to be transferred. If so, calculate the number of bytes to transfer in the next device operation, call IoMapTransfer to reload the map registers, and start the device. If not, complete the current IRP and start the next request.
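The division of work between the ISR and the DpcForIsr routine can be modelled as a small state machine. The sketch below is a user-mode simulation only: dma_device, start_chunk, isr, dpc_for_isr, and run_transfer are hypothetical names, the device "interrupts" as soon as a chunk is started, and chunk_limit stands for the largest number of bytes one device operation can move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device state, standing in for the device extension. */
struct dma_device {
    bool     irq_pending;      /* simulated device interrupt status */
    bool     dpc_queued;       /* set by the ISR, consumed by the DPC */
    uint32_t bytes_remaining;  /* data not yet handed to the device */
    uint32_t chunk_limit;      /* most bytes one operation can move */
    uint32_t operations;       /* device operations started so far */
    bool     irp_complete;     /* set when the whole request is done */
};

/* Starts one device operation for the next chunk of data. */
void start_chunk(struct dma_device *d)
{
    uint32_t n = d->bytes_remaining < d->chunk_limit
               ? d->bytes_remaining : d->chunk_limit;
    d->bytes_remaining -= n;
    d->operations++;
    d->irq_pending = true;     /* simulated: device interrupts when done */
}

/* ISR: acknowledge the interrupt and defer the rest of the work. */
bool isr(struct dma_device *d)
{
    if (!d->irq_pending)
        return false;          /* not this device's interrupt */
    d->irq_pending = false;    /* acknowledge at the device */
    d->dpc_queued = true;      /* a real driver calls IoRequestDpc here */
    return true;
}

/* DPC: start the next chunk, or complete the request. */
void dpc_for_isr(struct dma_device *d)
{
    assert(d->dpc_queued);
    d->dpc_queued = false;
    if (d->bytes_remaining > 0)
        start_chunk(d);
    else
        d->irp_complete = true;
}

/* Runs one request to completion and returns the number of device
   operations it took. */
uint32_t run_transfer(uint32_t total, uint32_t chunk_limit)
{
    struct dma_device d = { false, false, total, chunk_limit, 0, false };
    assert(total > 0 && chunk_limit > 0);
    start_chunk(&d);
    while (!d.irp_complete) {
        if (isr(&d))
            dpc_for_isr(&d);
    }
    return d.operations;
}
```

A 10000-byte request with a 4096-byte limit takes three device operations (4096, 4096, and 1808 bytes). The ISR does nothing beyond acknowledging the interrupt and requesting the DPC, which keeps the time spent at high IRQL short.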