Multi-function PCIe Switch, Part VI: Read-Write Optimization Based on NTB Nodes



1. Characteristics of applications that read and write across nodes through NTB

NTB is often used to move data across nodes in applications that demand high performance and reliability, for example as a virtual network card or as a cross-node data synchronization channel. In these cases the goal is to take full advantage of the high-speed PCIe transport underneath NTB and maximize system performance.



2. Two ways to implement cross-node reads and writes through NTB

Once address translation is configured and an NTB channel is established, there are two ways to transfer data across nodes through NTB:

CPU-based data transfer

NTB DMA-based data transfer

The former relies on the CPU to move the data. This consumes CPU cycles, but it is well suited to multithreaded applications. The latter relies on independent DMA hardware to move the data and consumes almost no CPU, but in a multithreaded environment concurrent access to the DMA hardware needs extra consideration. In terms of speed, when the CPU path is not run concurrently, the latter is generally much faster than the former. For example, in the author's system, moving data with the CPU reached only about 100 MB/s, while DMA came close to 1000 MB/s, and this was measured without any tuning of the DMA or PCIe settings.
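The concurrency point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the NTB window is modeled as an ordinary bytearray, and SimulatedDmaEngine, cpu_write and run_writers are hypothetical names invented for this example, not part of any real NTB or DMA driver. It shows why the CPU path needs no coordination when each thread writes its own region, while a single shared DMA channel has to be serialized.

```python
import threading


class SimulatedDmaEngine:
    """Hypothetical stand-in for one shared NTB DMA channel (not a real driver API)."""

    def __init__(self, window: bytearray):
        self._window = window
        self._lock = threading.Lock()  # one channel, so submissions are serialized
        self.transfers = 0

    def write(self, offset: int, data: bytes) -> None:
        # Only one thread at a time may program the (simulated) channel.
        with self._lock:
            self._window[offset:offset + len(data)] = data
            self.transfers += 1


def cpu_write(window: bytearray, offset: int, data: bytes) -> None:
    # CPU path: each thread stores directly into its own slice of the window,
    # so there is no shared hardware state to protect.
    window[offset:offset + len(data)] = data


def run_writers(write_fn, chunk: int, threads: int) -> None:
    # Start one writer per thread; thread i fills its own chunk with byte i + 1.
    workers = [
        threading.Thread(target=write_fn, args=(i * chunk, bytes([i + 1]) * chunk))
        for i in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


CHUNK, THREADS = 4096, 4
cpu_window = bytearray(CHUNK * THREADS)  # stands in for the mapped remote memory
dma_window = bytearray(CHUNK * THREADS)
engine = SimulatedDmaEngine(dma_window)

run_writers(lambda off, data: cpu_write(cpu_window, off, data), CHUNK, THREADS)
run_writers(engine.write, CHUNK, THREADS)

print("windows identical:", cpu_window == dma_window)
print("DMA transfers submitted:", engine.transfers)
```

In a real system the lock would more likely be a per-channel queue or one channel per thread, depending on what the DMA hardware offers; the simulation only shows where the extra coordination has to live.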


3. Characteristics common to both implementations

Whether the CPU or DMA is used to move data across nodes, the underlying mechanism is PCIe transactions. Writing data from the local node to the remote node relies on PCIe posted write transactions, while reading data from the remote node into the local node is implemented with PCIe non-posted read transactions. A posted request needs no completion from the receiver, whereas a non-posted request must wait for one, so posted operations are generally faster than non-posted operations. The test data from the author's system bears this out: CPU writes to the other node are faster than CPU reads from it, and DMA writes to the other node are faster than DMA reads from it.
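A toy timing model makes the posted versus non-posted difference concrete. It is a deliberately simplified sketch: the function names, the parameters (issue time, one-way latency, number of outstanding requests) and all the numbers below are assumptions chosen for illustration, not measurements from the author's system or values from the PCIe specification.

```python
import math


def posted_write_time_us(n_requests: int, issue_us: float, one_way_us: float) -> float:
    # Posted writes need no completion: requests are issued back to back,
    # and only the last one still has to cross the link.
    return n_requests * issue_us + one_way_us


def non_posted_read_time_us(
    n_requests: int, issue_us: float, one_way_us: float, max_outstanding: int
) -> float:
    # Each read must wait for its completion, which costs a full round trip.
    # Only max_outstanding requests can be in flight at once, so the round
    # trip is paid once per batch of outstanding requests.
    round_trip_us = 2 * one_way_us
    batches = math.ceil(n_requests / max_outstanding)
    return n_requests * issue_us + batches * round_trip_us


# Made-up parameters, for illustration only.
N, ISSUE_US, ONE_WAY_US = 1024, 0.01, 0.5

write_us = posted_write_time_us(N, ISSUE_US, ONE_WAY_US)
# A CPU load from the remote window typically stalls until the data returns,
# which is modeled here as one outstanding request.
cpu_read_us = non_posted_read_time_us(N, ISSUE_US, ONE_WAY_US, max_outstanding=1)
# A DMA engine can usually keep several reads in flight (32 is an assumption).
dma_read_us = non_posted_read_time_us(N, ISSUE_US, ONE_WAY_US, max_outstanding=32)

print(f"posted writes:        {write_us:.2f} us")
print(f"reads, 1 in flight:   {cpu_read_us:.2f} us")
print(f"reads, 32 in flight:  {dma_read_us:.2f} us")
```

Under this model reads are always slower than writes for the same number of requests, and the gap shrinks as more reads are kept in flight, which is consistent with the direction of the author's observations but says nothing about their actual magnitudes.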


4. Summary

Whichever way cross-node transfer is implemented, it is necessary to look past the particular PCIe application to how the underlying PCIe transactions are transmitted and processed. Only then can the performance differences between applications be understood as a whole, and trade-offs and optimizations made as needed.


This article is from the "Store Chef" blog; please retain this source attribution: http://xiamachao.blog.51cto.com/10580956/1882433
