Summary of PowerPC PCI-E debugging

Source: Internet
Author: User

Recently, an FPGA was added to our PowerPC board and connected over PCI-E, so the PCI-E link had to be brought up and debugged. Since VxWorks already ships the PCI-E driver, it can be called directly. A problem appeared quickly, though: at first I simply mapped the FPGA's bus region into the application's address space (the mmap approach) and read and wrote it directly. This approach is simple and avoids copying between the kernel and the application, so the efficiency should have been good. In practice, however, the transfer speed was quite slow. After searching online, I found the reason: with plain memory-mapped reads and writes, each PCI-E transaction carries only a four-byte payload. To transfer large packets and improve efficiency, DMA must be used; otherwise only four bytes move per transaction.

So I studied the DMA mode. PCI-E DMA uses descriptors whose contents are similar to those of traditional DMA; after a chain of DMA descriptors was configured, the transfer speed improved greatly.
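
To make the idea concrete, here is a minimal sketch of what such a descriptor chain can look like. The structure layout and field names below are hypothetical; the real format is dictated by the FPGA's DMA engine and must be taken from its documentation.

    /* Hypothetical PCIe DMA descriptor layout -- real field names, widths and
     * alignment are defined by the FPGA's DMA engine; this only illustrates
     * the idea of a linked descriptor chain. */
    #include <stdint.h>
    #include <stddef.h>

    struct dma_desc {
        uint64_t src_addr;   /* bus address to read from                        */
        uint64_t dst_addr;   /* bus address to write to                         */
        uint32_t length;     /* transfer length in bytes for this descriptor    */
        uint32_t control;    /* engine-specific flags (interrupt on completion) */
        uint64_t next_desc;  /* bus address of the next descriptor, 0 = end     */
    };

    /* Link an array of pre-filled descriptors into one chain.
     * desc_bus[] holds the bus address of each descriptor as seen by the device. */
    static void dma_chain_link(struct dma_desc *desc, const uint64_t *desc_bus,
                               size_t count)
    {
        size_t i;

        for (i = 0; i + 1 < count; i++)
            desc[i].next_desc = desc_bus[i + 1];

        desc[count - 1].next_desc = 0;   /* terminate the chain */
    }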

Below are some of the confusing concepts I ran into during debugging, listed here to deepen understanding. I am a newbie, so most of the content is compiled from online materials.

MSI interrupts and traditional interrupts

PCI-E supports two kinds of interrupts: the traditional INTx interrupt and the MSI interrupt. Comparing the two helps us understand how the PCI specifications evolved and also shows the technical direction in which PCI is developing.

MSI, short for Message Signaled Interrupt, is a mechanism by which a PCI-E device triggers a CPU interrupt by writing a specific message to a specific address. MSI has three main advantages over traditional interrupts:

1. No performance loss from shared interrupts:

A traditional interrupt pin is often shared by several devices. When the interrupt fires, the kernel must invoke the interrupt handler of each device on that line in turn, which inevitably hurts overall system performance. Each MSI interrupt belongs to a single device, so there is no performance loss from sharing.

2. A traditional interrupt can be raised before the data actually arrives:

A typical pattern is that a device writes data to main memory and then raises an interrupt to tell the CPU to process it. With a traditional interrupt, even this simple pattern can go wrong: the interrupt has been raised and the data has left the device, but it has not actually reached main memory yet, so the CPU cannot read the data it wants. In that case the driver must read a register on the device to check whether the data has really arrived; the PCI transaction-ordering rules guarantee that this register read completes only after the data has actually landed in memory. That is another drawback of traditional interrupts. An MSI interrupt travels along the same path as the data packets and is strictly ordered after them, so this problem does not arise, and there is no need to update or poll device registers, which saves considerable overhead.

3. A device has at most four traditional interrupt pins:

For a multi-function PCI device, each function has at most one interrupt pin, so the device driver has to query the device to find out which specific event occurred, which inevitably slows down interrupt handling. A device can support up to 32 MSI interrupts, each with its own specific meaning; for example, data transmission, data reception, and error handling can each have their own interrupt, which makes the driver's processing more efficient. (A Linux-side sketch of enabling MSI follows this list.)
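
For reference, this is roughly how an MSI vector is requested in a Linux PCI driver (the driver discussed above runs under VxWorks, so this is only an illustrative sketch). pci_alloc_irq_vectors() and pci_irq_vector() are standard Linux PCI-core calls; my_msi_handler and my_setup_msi are placeholder names.

    #include <linux/pci.h>
    #include <linux/interrupt.h>

    /* Placeholder handler; the real work depends on the device. */
    static irqreturn_t my_msi_handler(int irq, void *dev_id)
    {
        return IRQ_HANDLED;
    }

    static int my_setup_msi(struct pci_dev *pdev)
    {
        int nvec, irq;

        /* Ask for a single vector, MSI only (no fallback to legacy INTx). */
        nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
        if (nvec < 0)
            return nvec;

        irq = pci_irq_vector(pdev, 0);        /* Linux IRQ number of vector 0 */
        return request_irq(irq, my_msi_handler, 0, "my_fpga_msi", pdev);
    }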

 

Physical address and bus address

These two concepts come up mainly in DMA transfers, and the difference between them appears mainly once an IOMMU is introduced. During a DMA operation, the physical address of the memory is mapped through the IOMMU, and the translated address that the device uses on the bus is called the bus address. Without an IOMMU, the physical address and the bus address are identical.

Bus addresses are used between the peripheral bus and memory. They are usually the same as the physical addresses used by the processor (for example, on the x86 platform), but this is not required. Some computer architectures provide an I/O memory management unit (IOMMU) that remaps addresses between the bus and main memory.

On the PowerPC platform, when the Linux kernel function dma_alloc_coherent is used to allocate DMA memory, it returns both the bus address for the device to use for DMA and the kernel virtual address for the CPU. In other words, this function encapsulates the mapping between the bus address and the kernel virtual address, which makes DMA programming convenient during driver development.
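
A minimal sketch of how this looks in a Linux driver; the buffer size and function name are made up for illustration.

    #include <linux/dma-mapping.h>
    #include <linux/gfp.h>

    #define MY_DMA_BUF_SIZE  (64 * 1024)   /* example size, not from the article */

    /* dev is the struct device embedded in the PCI device (&pdev->dev). */
    static void *my_alloc_dma_buffer(struct device *dev, dma_addr_t *bus_addr)
    {
        /* Returns the kernel virtual address for the CPU and fills *bus_addr
         * with the bus (DMA) address that the device must be programmed with. */
        return dma_alloc_coherent(dev, MY_DMA_BUF_SIZE, bus_addr, GFP_KERNEL);
    }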

 

Kernel logical address and kernel virtual address

A kernel virtual address is an address that the CPU accesses in kernel mode through the MMU page tables. Its range is 3G-4G (with no special configuration), while user-space virtual addresses occupy 0-3G. Within the 3G-4G range there is a subset, from 3G to 3G + main_memory_size, that maps main memory; because this page-table mapping is a flat linear (fixed-offset) one, LDD gives it a special name: the kernel logical address.

In kernel code, the physical address corresponding to a kernel logical address can be obtained with a simple offset (subtracting 3 GB); the kernel's __pa() macro does exactly that.

For a general kernel virtual address (for example, one obtained from vmalloc), there is no fixed offset; the physical address has to be looked up through the page tables, as the sketch below shows for both cases.
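
A small sketch of both translations, assuming a Linux kernel context (the function name is made up; kmalloc returns a logical address, vmalloc returns a plain kernel virtual address).

    #include <linux/kernel.h>
    #include <linux/slab.h>
    #include <linux/vmalloc.h>
    #include <linux/mm.h>
    #include <linux/io.h>

    static void addr_translation_demo(void)
    {
        void *logical = kmalloc(128, GFP_KERNEL);   /* kernel logical address */
        void *virt    = vmalloc(128);               /* kernel virtual address */
        phys_addr_t pa_logical, pa_virt;

        if (!logical || !virt)
            goto out;

        /* Logical addresses sit at a fixed offset from physical memory. */
        pa_logical = __pa(logical);

        /* vmalloc addresses do not; the page tables must be consulted. */
        pa_virt = page_to_phys(vmalloc_to_page(virt)) + offset_in_page(virt);

        pr_info("logical -> %pa, vmalloc -> %pa\n", &pa_logical, &pa_virt);

    out:
        kfree(logical);
        vfree(virt);
    }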

 

PCI-E outbound and inbound

When a PCIe device and system memory access each other, outbound refers to the CPU → device direction, and inbound refers to the device → RC (CPU side) direction. Conceptually, the devices are all external; when the CPU reads or writes the RC's own registers, it stays within the on-chip system, so that is neither inbound nor outbound.

Simply put, when the CPU reads or writes the bus address behind a PCI BAR, that is outbound; when the device reads or writes the CPU's main memory, that is inbound.
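
A hedged Linux-side illustration of the two directions; the register offsets 0x0 and 0x4 are hypothetical and only stand in for whatever the FPGA actually exposes.

    #include <linux/pci.h>
    #include <linux/io.h>
    #include <linux/kernel.h>
    #include <linux/dma-mapping.h>

    static int my_map_example(struct pci_dev *pdev)
    {
        void __iomem *bar;   /* CPU-side mapping of the device BAR   */
        void *buf;           /* host memory the device will DMA into */
        dma_addr_t buf_bus;

        /* Outbound: the CPU reads/writes the device's BAR through this mapping. */
        bar = ioremap(pci_resource_start(pdev, 0), pci_resource_len(pdev, 0));
        if (!bar)
            return -ENOMEM;
        writel(0x1, bar + 0x0);                    /* hypothetical control register */

        /* Inbound: the device reads/writes this buffer in host main memory. */
        buf = dma_alloc_coherent(&pdev->dev, 4096, &buf_bus, GFP_KERNEL);
        if (!buf) {
            iounmap(bar);
            return -ENOMEM;
        }
        writel(lower_32_bits(buf_bus), bar + 0x4); /* hypothetical DMA address reg */

        return 0;
    }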

 

Local Memory Map

Local memory map refers to the 36-bit address space that the CPU sees when it accesses memory and I/O space. Even if the memory or I/O space is physically located on a remote device, after memory mapping it is still local to the CPU, so "local memory map" is a simple, concise and rather apt term.
On PowerPC, the local memory map is defined by LAWs (local access windows).

Local access window
There are 10 local access windows in the MPC83xx system. Each window maps a memory region to a specific target controller, such as the eLBC, DDR, or PCI controller. Here I am concerned with the PCI and PCI-E controllers.
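
As a sketch only: a LAW is programmed through a base-address register and an attribute register in the IMMR block. The offsets and bit layout below are assumptions for illustration; check the reference manual of the specific MPC83xx part before relying on them.

    #include <linux/io.h>

    /* Offsets of the first PCI local access window inside the IMMR block.
     * These values are assumed for illustration -- verify them against the
     * reference manual of the exact MPC83xx part. */
    #define PCILAWBAR0       0x60
    #define PCILAWAR0        0x64

    #define LAWAR_EN         0x80000000   /* window enable bit (assumed layout)   */
    #define LAWAR_SIZE_256M  0x0000001B   /* size field = log2(size) - 1 (assumed) */

    /* Map a 256 MB region starting at pci_mem_base to the PCI controller,
     * using the PowerPC big-endian accessor out_be32(). */
    static void setup_pci_law(void __iomem *immr, u32 pci_mem_base)
    {
        out_be32(immr + PCILAWBAR0, pci_mem_base);
        out_be32(immr + PCILAWAR0, LAWAR_EN | LAWAR_SIZE_256M);
    }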

 

Hardware Interrupt number and software interrupt number

This was already discussed in an earlier article. For more information, see the blog post: Linux PowerPC interrupt principles.
