NTB Debugging FAQs Guide
As a key device for transferring data between different PCI domains, and even across nodes, the NTB (non-transparent bridge) plays an important role in servers and storage systems, where it is used to implement dual-controller designs and memory exchange. It appears as a virtual port on one side and, when scanned over PCI from the connected node, as a link port on the other, and it also performs address translation and forwarding, so problems of various kinds are inevitable in real projects. Drawing on the author's recent work, this article shares common problems encountered while debugging NTB, along with their solutions.
In terms of symptoms, the specific FAQs include:
The NTB device cannot be found;
The NTB mailbox cannot transmit data;
The ReqID cannot be detected;
The NTB BAR size is not large enough;
Data transfer errors.
Depending on the PCIe hardware or software level at which they occur, these problems fall into the following categories:
Hardware failure;
Firmware failure;
PCIe configuration error;
Program error.
The symptoms listed above are analyzed and discussed in turn below:
1. NTB device not found
In this case, when you run the application, it reports that the device cannot be found, and the program fails or exits. First, use lspci to check whether the NTB device can be scanned. If it does not appear, the system has not found the NTB hardware; check whether NTB has been enabled in the NTB EEPROM and whether the board has a jumper for disabling/enabling NTB. If it does, also confirm that the jumper has not been set to disable. If the device exists and shows up in lspci but the application still reports that it cannot see the device, check whether the device driver loaded successfully. In that case, reloading the NTB device driver usually fixes the problem.
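The two cases above (device not enumerated, or enumerated but without a driver) can be told apart from Linux sysfs. The following is a minimal sketch rather than a vendor tool. It assumes that the NTB endpoint reports the PCI class "other bridge" (0x0680xx), which should be confirmed against the lspci output for the actual device; the function names `scan_ntb` and `diagnose` are hypothetical.

```python
import os

# Assumption: the NTB endpoint reports PCI class 0x0680xx ("other bridge").
# Confirm this against lspci for the device at hand.
NTB_CLASS_PREFIX = "0x0680"


def scan_ntb(sysfs_devices="/sys/bus/pci/devices", class_prefix=NTB_CLASS_PREFIX):
    """Return {bdf: driver_name_or_None} for every device whose class matches."""
    found = {}
    if not os.path.isdir(sysfs_devices):
        return found
    for bdf in sorted(os.listdir(sysfs_devices)):
        dev_dir = os.path.join(sysfs_devices, bdf)
        try:
            with open(os.path.join(dev_dir, "class")) as f:
                pci_class = f.read().strip().lower()
        except OSError:
            continue  # not a device directory, or not readable
        if not pci_class.startswith(class_prefix):
            continue
        driver_link = os.path.join(dev_dir, "driver")
        # The "driver" symlink exists only while a driver is bound.
        if os.path.islink(driver_link):
            found[bdf] = os.path.basename(os.readlink(driver_link))
        else:
            found[bdf] = None
    return found


def diagnose(found):
    """Map the scan result onto the cases described in the text."""
    if not found:
        return "no NTB device enumerated: check EEPROM and jumper settings"
    unbound = [bdf for bdf, drv in found.items() if drv is None]
    if unbound:
        return "device present but no driver bound: reload the NTB driver"
    return "device present and driver bound"


if __name__ == "__main__":
    print(diagnose(scan_ntb()))
```

Taking the sysfs path as a parameter keeps the check usable against a copied directory tree as well as the live system.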
2. NTB mailbox registers cannot transmit data
According to the NTB usage instructions, the NTB mailbox and doorbell registers are generally used to pass information between nodes for upper-level synchronization. If the values read back from the doorbell/mailbox registers are 0xFFFFFFFF, check whether BAR0/1, which maps the doorbell/mailbox registers, is set correctly. To do so, read the value of BAR0/1 with lspci and check whether it matches the physical address assigned to it by the BIOS.
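The BAR comparison described above can be sketched as follows. The code decodes a memory BAR from raw PCI configuration-space bytes using the standard BAR layout (flag bits in the low nibble, upper 32 bits in the next BAR for 64-bit BARs) and compares it with the assigned address. The helper names are hypothetical. As an assumption about the environment, the raw bytes could come from the sysfs `config` file of the device and the assigned address from its `resource` file.

```python
import struct


def decode_bar(config, index):
    """Decode memory BAR `index` (0-5) from raw PCI config-space bytes.

    Returns (address, is_64bit, is_prefetchable).
    """
    offset = 0x10 + 4 * index              # BARs start at config offset 0x10
    (low,) = struct.unpack_from("<I", config, offset)
    if low & 0x1:
        raise ValueError("BAR %d is an I/O BAR, not a memory BAR" % index)
    is_64bit = ((low >> 1) & 0x3) == 0x2   # type field, bits [2:1]
    prefetchable = bool(low & 0x8)         # bit 3
    address = low & 0xFFFFFFF0             # low 4 bits are flags
    if is_64bit:
        # A 64-bit BAR keeps its upper half in the next BAR slot.
        (high,) = struct.unpack_from("<I", config, offset + 4)
        address |= high << 32
    return address, is_64bit, prefetchable


def reads_as_all_ones(value):
    """0xFFFFFFFF is the symptom described above for a bad register mapping."""
    return value == 0xFFFFFFFF


def bar_matches(config, index, assigned_start):
    """True if the BAR holds the address the BIOS/OS assigned to it."""
    address, _, _ = decode_bar(config, index)
    return address == assigned_start


if __name__ == "__main__":
    # Synthetic config space: BAR2 is a 64-bit prefetchable BAR.
    cfg = bytearray(64)
    struct.pack_into("<II", cfg, 0x18, 0x0000000C, 0x0000383E)
    print(hex(decode_bar(cfg, 2)[0]), bar_matches(cfg, 2, 0x383E00000000))
```

A BAR that decodes to address 0, or to a value different from the assigned range, points at the configuration problem described above rather than at the mailbox logic itself.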
3. ReqID cannot be detected
The symptom shows up in output such as the following:
Communicating from: virtual side
Determine NT Connect Type: standard (NTV <---> NTL)
Get BAR 2 Properties: ok (size: 2048 KB)
Map BAR 2 to User Space: ok (va: 0x7f5c1801d000)
Probe for write ReqID: ERROR: Unable to probe ReqID, Auto-add 0,0,0
Add write Req ID to LUT: ERROR: Unable to add LUT entry
Allocate PCI Buffer: ok (pci: 3638a000 size: 1000 B)
Map PCI Buffer: ok (va: 0x7f5c18d01000)
The ReqID (requester ID) records the B:D:F (bus:device:function) of the device that issued a PCIe TLP request. If the access is initiated by the CPU, it usually carries the B:D:F of the root complex in the north bridge; if the access is issued by a DMA engine, it should carry the B:D:F of the DMA engine that initiated it. In the application, the B:D:F is obtained by sending a special TLP and then extracting its ReqID according to the packet protocol. When the ReqID cannot be detected, check whether the base address register of BAR2/BAR3 or BAR4/BAR5 is set correctly, that is, whether the value of that base address register matches the address assigned by the BIOS.
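As an illustration of what a ReqID contains, the sketch below packs and unpacks the 16-bit requester ID in the conventional PCIe layout: bus in bits 15:8, device in bits 7:3, function in bits 2:0. The function names are hypothetical, and the code only shows the encoding; it does not probe hardware or touch the NTB lookup table.

```python
def encode_reqid(bus, device, function):
    """Pack B:D:F into a 16-bit requester ID."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("B:D:F out of range")
    return (bus << 8) | (device << 3) | function


def decode_reqid(reqid):
    """Unpack a 16-bit requester ID into (bus, device, function)."""
    return (reqid >> 8) & 0xFF, (reqid >> 3) & 0x1F, reqid & 0x7


def format_bdf(reqid):
    """Render a requester ID the way lspci prints B:D.F."""
    bus, device, function = decode_reqid(reqid)
    return f"{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    rid = encode_reqid(0x06, 0x00, 0x0)
    print(hex(rid), format_bdf(rid))
```

In these terms, the "Auto-add 0,0,0" fallback in the output above corresponds to a requester ID of 0x0000, which will not match a requester that sits on any other bus, device, or function.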
4. The BAR size used for address translation is not large enough
Because of BIOS and EEPROM settings, the size of the BAR used for address translation is fixed, and the address window may be too small for applications that share the whole system memory or map large address ranges to each other. In that case, the address window needs to be enlarged.
First, the BIOS must be able to allocate a sufficiently large address range to PCI devices, so make sure the relevant BIOS settings are enabled. On the BIOS at hand, for example, more than 56T of PCI address space has to be enabled, as shown:
[Figure: BIOS high MMIO size setting (High_Mmio_size_setting.png)]
Second, you also need to modify the Setup register of the BAR used for address translation. Consult the device manual and, following the bitmap and mask definitions of that register, set a sufficiently large address space. Note that this size must not exceed the maximum address space the BIOS can support; otherwise the system may hang during PCI enumeration because it cannot allocate enough address space. If the window cannot be enlarged on one set of address translation registers, try a different address window. For example, on the board at hand the BAR2/BAR3 window is only 1 MB, but the output of /proc/iomem shows that the BAR4/BAR5 window is a full 8 GB:
380000000000-383fffffffff : PCI Bus 0000:00
383c00000000-383e001fffff : PCI Bus 0000:04
383c00000000-383e001fffff : PCI Bus 0000:05
383c00000000-383e001fffff : PCI Bus 0000:06
383c00000000-383dffffffff : 0000:06:00.0
383e00000000-383e000fffff : 0000:06:00.0
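The window sizes can be read off such a listing mechanically, since each size is simply end - start + 1. The following sketch parses text in the /proc/iomem layout and reports the size of each range; the helper names are hypothetical, and the parser tolerates the separator with or without surrounding spaces.

```python
import re

# "start-end : name", with optional whitespace around the separator.
_IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def parse_iomem(text):
    """Return a list of (name, start, size_in_bytes) for each range line."""
    regions = []
    for line in text.splitlines():
        match = _IOMEM_LINE.match(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            regions.append((match.group(3), start, end - start + 1))
    return regions


def human_size(num_bytes):
    """Format a byte count using binary units."""
    value = num_bytes
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if value < 1024 or unit == "TiB":
            return f"{value:g} {unit}"
        value /= 1024


if __name__ == "__main__":
    sample = (
        "383c00000000-383dffffffff : 0000:06:00.0\n"
        "383e00000000-383e000fffff : 0000:06:00.0\n"
    )
    for name, start, size in parse_iomem(sample):
        print(f"{name} @ {start:#x}: {human_size(size)}")
```

On a live system the same function can be fed the contents of /proc/iomem; note that the kernel may show zeroed addresses to unprivileged readers, so the file is best read as root.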
After loading the corresponding NTB driver, you can see this large window as well:
[86764.073933] Lpc6500_nt:resource 01
[86764.073935] LPC6500_NT:Type:Memory
[86764.074004] Lpc6500_nt:pci BAR 2:383e0000000c
[86764.074006] Lpc6500_nt:phys addr:383e00000000
[86764.074008] lpc6500_nt:size:200000 (2048 KB)
[86764.074010] LPC6500_NT:Property:Prefetchable 64-bit
[86764.074206] Lpc6500_nt:kernel va:ffffc90017700000
[86764.074208] Lpc6500_nt:resource 02
[86764.074209] LPC6500_NT:Type:Memory
[86764.074279] Lpc6500_nt:pci BAR 4:383c0000000c
[86764.074281] Lpc6500_nt:phys addr:383c00000000
[86764.074283] lpc6500_nt:size:200000000 (8388608 KB)
[86764.074285] LPC6500_NT:Property:Prefetchable 64-bit
[86764.487186] Lpc6500_nt:kernel va:ffffc90017e81000
[86764.487189] Lpc6500_nt:using PCI BAR 0 (va=ffffc90016c80000) ==> PLX regs
As the analysis above shows, all sorts of strange problems can come up while debugging NTB, but they all trace back to the same fundamentals: as long as you understand the principles of NTB address translation and data transfer, it is not hard to analyze a problem layer by layer, find its root cause, and arrive at a solution.
This article is from the "Storage Chef" blog; please contact the author before reproducing it.