Intel 64 and IA-32 Architectures Optimization Guide: 3.9 Maximizing PCIe Performance

Source: Internet
Author: User

3.9 Maximizing PCIe Performance

PCIe performance can be significantly affected by the size and alignment of upstream reads and writes, that is, read and write transactions issued by a PCIe agent to host memory. As a general rule, the best performance in terms of both bandwidth and latency is obtained by aligning the start address of each upstream read or write on a 64-byte boundary and making the request size a multiple of 64 bytes. Larger multiples (128, 192, 256 bytes) yield modest further increases in bandwidth. In particular, a partial write may delay the request that follows it (read or write).
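The general rule above can be expressed as a simple check on a request's start address and length. The following is a minimal sketch, not part of the guide; the name `is_aligned_request` is hypothetical, and the 64-byte line size is taken from the text.

```python
CACHE_LINE = 64  # bytes; the boundary and size unit the text's rule is stated in


def is_aligned_request(addr: int, length: int) -> bool:
    """Return True if an upstream request starts on a 64-byte boundary
    and its length is a non-zero multiple of 64 bytes."""
    return length > 0 and addr % CACHE_LINE == 0 and length % CACHE_LINE == 0
```

For instance, `is_aligned_request(0x200, 128)` is true, while a 128-byte request at 0x70 or a 96-byte request at 0x200 fails the check.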

The second rule is to avoid multiple concurrently outstanding accesses to a single cache line. Such accesses can conflict, which in turn can serialize accesses that would otherwise be pipelined, resulting in higher latency and/or lower bandwidth.

Patterns that violate this rule include sequential accesses (reads or writes) whose size is not a multiple of 64 bytes, as well as explicit accesses to the same cache line address. Overlapping requests, those with different start addresses but with lengths that make the requests overlap, have the same effect. For example, a 96-byte read of address 0x00000200 followed by a 64-byte read of address 0x00000240 causes a conflict, and likely a delay, for the second read. [Note: the two reads overlap in the 32 bytes from 0x00000240 up to, but not including, 0x00000260.]
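To see which cache lines two requests have in common, the address arithmetic can be sketched as follows. This is an illustration only; `lines_touched` and `shared_lines` are hypothetical helper names, and each request is assumed to be described by an `(address, length)` pair.

```python
CACHE_LINE = 64  # bytes, as in the text


def lines_touched(addr: int, length: int) -> range:
    """Indices of the 64-byte cache lines covered by a request."""
    first = addr // CACHE_LINE
    last = (addr + length - 1) // CACHE_LINE
    return range(first, last + 1)


def shared_lines(req_a: tuple, req_b: tuple) -> list:
    """Base addresses of cache lines touched by both (addr, length) requests."""
    common = set(lines_touched(*req_a)) & set(lines_touched(*req_b))
    return sorted(line * CACHE_LINE for line in common)
```

With the reads from the example, `shared_lines((0x200, 96), (0x240, 64))` returns `[0x240]`: both requests touch the line at 0x00000240, so they may conflict if outstanding at the same time. Two back-to-back 64-byte reads at 0x200 and 0x240 share no line and return an empty list.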

An upstream write whose size is a multiple of 64 bytes but whose start address is not aligned performs like a series of partial and full sequential writes. For example, a 128-byte write to address 0x00000070 performs similarly to three sequential writes of 16, 64, and 48 bytes to addresses 0x00000070, 0x00000080, and 0x000000C0, respectively.
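The decomposition in this example can be reproduced by cutting a write at every 64-byte boundary. The sketch below is a simplified model with a hypothetical function name, `split_write`; it treats each cache line as its own segment, which matches the three-way split in the example but is not a statement about how any particular hardware groups the full lines of a longer write.

```python
CACHE_LINE = 64  # bytes, as in the text


def split_write(addr: int, length: int) -> list:
    """Split a write into (address, length) segments at cache-line boundaries."""
    segments = []
    end = addr + length
    while addr < end:
        # Start of the next cache line after the current address.
        next_boundary = (addr // CACHE_LINE + 1) * CACHE_LINE
        seg_end = min(next_boundary, end)
        segments.append((addr, seg_end - addr))
        addr = seg_end
    return segments
```

Calling `split_write(0x70, 128)` yields `[(0x70, 16), (0x80, 64), (0xC0, 48)]`: a partial write, a full-line write, and another partial write.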

For PCIe cards that implement multi-function devices, such as dual-port or quad-port network interface cards (NICs) or dual-GPU graphics cards, it is important to note that non-optimal behavior by one of those devices can affect the bandwidth and/or latency observed by the other devices on the card. With respect to the behavior described in this section, all traffic on a given PCIe port is treated as if it originated from a single device and function.

For the best PCIe bandwidth:

1. Align the start addresses of upstream reads and writes on 64-byte boundaries.

2. Use read and write requests that are multiples of 64 bytes.

3. Eliminate or avoid sequential and random partial-line upstream writes.

4. Eliminate or avoid conflicting upstream reads, including sequential partial-line reads.

Techniques for avoiding these performance pitfalls include aligning all descriptors and data buffers to cache lines, padding descriptors that are written upstream to 64-byte alignment, buffering incoming data to achieve larger upstream write payloads, and allocating data structures so that the PCIe device reads them sequentially in 64-byte units (or multiples of 64 bytes). The negative impact of unoptimized reads and writes depends on the specific workload and on the microarchitecture on which the product is based.
