Multi-function PCIe switch, part three: data migration and cache coherence

Source: Internet
Author: User

About PCIe non-transparent bridge cache coherence
The PCIe non-transparent bridge provides two mechanisms for migrating data from the local node to a remote node: address mapping and an embedded DMA engine. On the remote node, the CPU may not be aware that data has arrived, so cache coherence (consistency) has to be ensured. On the local node, the CPU is likewise not notified when DMA transfers data into its own memory, so cache coherence has to be considered there as well.

Platforms implement cache coherence differently: ARM platforms typically require software participation, while on Intel x86 platforms the hardware maintains cache coherence automatically. x86 also provides different levels of cache control, and some special applications may need a custom cache management strategy.


1. Intel x86 cache coherence levels
Intel defines different levels of cache coherence for different application requirements:

[Figure: cache_Level.png]

2. Three levels of Intel x86 cache management

Intel manages caching at three granularities:

    • The CD and NW bits of the CR0 register in the processor core: enable or disable caching for the entire system;

    • The PCD and PWT bits of CR3 and the PCD and PWT attribute bits of page-directory and page-table entries: the CR3 bits control caching of the top-level paging structure (the page directory in 32-bit paging), and the entry bits control caching of a specific page table or page, respectively;

    • MTRRs (Memory Type Range Registers): specify whether a physical address range is cached or uncached.
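
The three granularities above can be sketched as plain bit tests. The following is a minimal Python illustration, not code from the original article: the function names are hypothetical, the bit positions (CR0.NW = bit 29, CR0.CD = bit 30, PWT = bit 3, PCD = bit 4) and the base/mask matching rule for variable-range MTRRs follow the Intel manual, and the register values in the usage lines are made-up sample inputs.

```python
# Bit positions as documented in the Intel SDM, Volume 3A.
CR0_NW = 1 << 29   # Not Write-through
CR0_CD = 1 << 30   # Cache Disable
PWT = 1 << 3       # Page-level Write-Through (CR3 and paging-structure entries)
PCD = 1 << 4       # Page-level Cache Disable (CR3 and paging-structure entries)


def cr0_cache_mode(cr0: int) -> str:
    """Classify the system-wide caching mode from the CD and NW bits."""
    cd = bool(cr0 & CR0_CD)
    nw = bool(cr0 & CR0_NW)
    if not cd and not nw:
        return "caching enabled"
    if cd and not nw:
        return "no-fill mode (no new cache lines are allocated)"
    if cd and nw:
        return "caching disabled, coherence not maintained"
    return "invalid (CD=0, NW=1)"


def page_cache_bits(entry: int) -> dict:
    """Extract the PCD and PWT attribute bits from CR3 or a paging entry."""
    return {"PCD": bool(entry & PCD), "PWT": bool(entry & PWT)}


def mtrr_matches(addr: int, phys_base: int, phys_mask: int) -> bool:
    """Variable-range MTRR rule: addr is in range if the masked bits agree.

    phys_base and phys_mask are assumed to be the address fields only,
    with the type and valid bits already stripped (a simplification).
    """
    return (addr & phys_mask) == (phys_base & phys_mask)


# Sample inputs (hypothetical values chosen for illustration).
print(cr0_cache_mode(0x80050033))            # CD and NW both clear
print(page_cache_bits(0x101B))               # entry with PCD and PWT set
# A 256 MiB range at 0xC0000000 with 36-bit physical addresses.
print(mtrr_matches(0xC8000000, 0xC0000000, 0xFF0000000))
```

The sketch only decodes values; reading or writing the real registers requires ring-0 code and is outside what this example attempts.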


3. Cache coherence and DMA

DMA buffer and DMA memory

DMA buffer: a region of physical memory that holds data received from DMA or data to be sent by DMA
DMA memory: storage on a physical peripheral, such as the dedicated memory of a video card, I/O space, or PCI memory space


Coherent DMA and streaming DMA

DMA moves data between the DMA buffer and DMA memory. The coherence requirement is that data read from the buffer after a DMA transfer is the latest data, and that data to be sent by DMA is up to date in memory when the DMA engine reads it.

Coherent DMA: the DMA buffer is physically contiguous, that is, it consists of consecutive physical pages, so a single DMA operation can transfer the data as long as the transfer length fits in the buffer. Its advantage is speed; its drawback is that a contiguous block of physical pages has to be found. The buffer must also be cache coherent. The kernel's dma_alloc_coherent() function meets both requirements. On x86 the hardware already keeps the DMA buffer cache coherent, so it is enough to find a block of physically contiguous pages. If the hardware does not guarantee cache coherence, the buffer's physical addresses have to be mapped uncached.

Streaming DMA: the DMA buffer is contiguous in virtual address space, but its physical pages are not guaranteed to be contiguous. Because a DMA descriptor requires a physically contiguous range, all physical page frames behind the virtual range have to be looked up and each one transferred with its own DMA operation. The advantage is that there is no restriction on the buffer address; the driver and the kernel hide the details of splitting the buffer into page frames, and each mapping is set up on demand for its transfer. In this mode, the cache coherence handling depends on the transfer direction:

From memory to DMA: write the cache contents for each page frame back to memory

From DMA to memory: invalidate the cache for each page frame, so that subsequent accesses read from memory

On x86 the hardware already manages cache coherence, so the write-back and invalidation steps above are not required.
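
To see why the two directions need different handling on hardware without coherent DMA, here is a toy Python model. It is an illustrative sketch only: the ToySystem class and its methods are hypothetical, the cache is modelled per address rather than per cache line, and the DMA engine is assumed to access memory directly without snooping the CPU cache. In a Linux driver this work is normally hidden behind the streaming mapping calls such as dma_map_single() and dma_sync_single_for_cpu().

```python
class ToySystem:
    """Memory plus a write-back CPU cache that the DMA engine cannot see."""

    def __init__(self, size: int):
        self.memory = [0] * size   # what the DMA engine reads and writes
        self.cache = {}            # addr -> (value, dirty)

    def cpu_write(self, addr: int, value: int) -> None:
        # Write-back policy: the value stays in the cache until written back.
        self.cache[addr] = (value, True)

    def cpu_read(self, addr: int) -> int:
        if addr not in self.cache:             # miss: fill from memory
            self.cache[addr] = (self.memory[addr], False)
        return self.cache[addr][0]

    def writeback(self, start: int, length: int) -> None:
        """Needed before a memory-to-DMA transfer."""
        for addr in range(start, start + length):
            value, dirty = self.cache.get(addr, (None, False))
            if dirty:
                self.memory[addr] = value
                self.cache[addr] = (value, False)

    def invalidate(self, start: int, length: int) -> None:
        """Needed after a DMA-to-memory transfer."""
        for addr in range(start, start + length):
            self.cache.pop(addr, None)

    def dma_to_device(self, start: int, length: int) -> list:
        return self.memory[start:start + length]

    def dma_from_device(self, start: int, data: list) -> None:
        self.memory[start:start + len(data)] = data


node = ToySystem(8)

# Memory to DMA: without a write-back the device sees stale memory.
node.cpu_write(0, 11)
node.cpu_write(1, 22)
print(node.dma_to_device(0, 2))   # [0, 0]
node.writeback(0, 2)
print(node.dma_to_device(0, 2))   # [11, 22]

# DMA to memory: without an invalidate the CPU sees a stale cached value.
node.cpu_read(4)                  # caches the old value 0
node.dma_from_device(4, [99])
print(node.cpu_read(4))           # 0
node.invalidate(4, 1)
print(node.cpu_read(4))           # 99
```

On a platform with hardware-coherent DMA, as described for x86 above, the DMA engine would observe and update the cache itself, so neither writeback() nor invalidate() would be needed in this model.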

4. PCIe non-transparent bridge cache coherence considerations

Leaving special cases aside, the analysis above shows that when data is transferred by DMA, cache coherence of the local DMA buffer is always guaranteed. Because of the address translation performed by the PCIe non-transparent bridge, in a real deployment the local DMA memory actually maps to the DMA buffer in the remote node's local memory, so its cache coherence is also guaranteed by the x86 hardware. If the non-transparent bridge also has to support non-volatile storage, where data loss must be prevented, then in addition to cache coherence the following is required:

    1. All write accesses go directly to memory;

    2. All reads also come from memory (reading from the cache may be allowed for performance reasons).
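
The two requirements above amount to a write-through (or uncached) policy. The following toy Python comparison is a sketch under stated assumptions, not the author's implementation: PolicyCache and power_loss() are hypothetical names, and a power failure is modelled simply as losing the cache contents while memory survives.

```python
class PolicyCache:
    """Toy cache in front of memory with a selectable write policy."""

    def __init__(self, size: int, write_through: bool):
        self.memory = [0] * size
        self.cache = {}
        self.write_through = write_through

    def write(self, addr: int, value: int) -> None:
        self.cache[addr] = value
        if self.write_through:
            # Requirement 1: every write reaches memory immediately.
            self.memory[addr] = value

    def read(self, addr: int, from_cache: bool = True) -> int:
        # Requirement 2: reads come from memory; the cache is an
        # optional shortcut when performance matters.
        if from_cache and addr in self.cache:
            return self.cache[addr]
        return self.memory[addr]

    def power_loss(self) -> list:
        """Drop the volatile cache and return what survived in memory."""
        self.cache.clear()
        return list(self.memory)


write_back = PolicyCache(4, write_through=False)
write_back.write(1, 7)
print(write_back.power_loss())     # [0, 0, 0, 0]  the write was lost

write_through = PolicyCache(4, write_through=True)
write_through.write(1, 7)
print(write_through.power_loss())  # [0, 7, 0, 0]  the write survived
```

With write-through enabled, a read served from the cache returns the same value as a read from memory, which is why the second requirement can be relaxed for performance.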


Resources:
1. Intel, ia32_dev_3a.pdf
2. Chen Cossong, "Deep Linux kernel device driving mechanism"


This article is from the "Storage Chef" blog. Please contact the author before reproducing it.
