Pageable vs. Pinned
Host memory allocated in the usual way (for example with malloc) is pageable. The alternative is pinned (page-locked) memory: the allocation is kept resident in physical memory and is never paged out, so the GPU can transfer it directly, which improves transfer efficiency. Pinned memory is allocated with cudaHostAlloc and released with cudaFreeHost.
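The allocate/release pairing described above can be sketched as follows. This is a minimal example, not a definitive implementation: the CHECK macro and the pinnedRoundTrip helper are hypothetical names introduced here for illustration, and device 0 is assumed to be a usable GPU.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Copies n floats from a pinned host buffer to the device and back.
// Returns true if the round trip preserved the data.
bool pinnedRoundTrip(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *h_buf = nullptr;
    float *d_buf = nullptr;

    // Pinned host allocation (instead of malloc) and a device buffer.
    CHECK(cudaHostAlloc((void **)&h_buf, bytes, cudaHostAllocDefault));
    CHECK(cudaMalloc((void **)&d_buf, bytes));

    for (size_t i = 0; i < n; ++i) h_buf[i] = (float)i;
    CHECK(cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice));

    // Clear the host copy so the check below really tests the transfer.
    for (size_t i = 0; i < n; ++i) h_buf[i] = 0.0f;
    CHECK(cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost));

    bool ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (h_buf[i] != (float)i) { ok = false; break; }
    }

    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_buf));  // pinned memory is released with cudaFreeHost, not free
    return ok;
}

int main() {
    printf("pinned round trip %s\n", pinnedRoundTrip(1 << 20) ? "ok" : "failed");
    return 0;
}
```

The only change compared with a pageable version is the allocation and release calls; the cudaMemcpy calls stay the same.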
Advantages
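1. Higher bandwidth for transfers between host and device.
2. Kernel execution and memory copies can be performed simultaneously.
3. The memory can be mapped into the device address space (memory mapping).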
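Advantage 2 can be illustrated with asynchronous copies issued into several streams. The sketch below is only one possible arrangement: the scale kernel, the overlappedScale helper, and the CHECK macro are hypothetical names, and whether copies and kernels really overlap depends on the device (for example, on its number of copy engines).

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Multiplies every element by factor.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Splits the work into chunks, one per stream, so that the copy of one
// chunk can overlap with the kernel working on another chunk.
bool overlappedScale(int n, int numStreams) {
    const int chunk = (n + numStreams - 1) / numStreams;
    float *h = nullptr;
    float *d = nullptr;
    CHECK(cudaHostAlloc((void **)&h, n * sizeof(float), cudaHostAllocDefault));
    CHECK(cudaMalloc((void **)&d, n * sizeof(float)));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    std::vector<cudaStream_t> streams(numStreams);
    for (int s = 0; s < numStreams; ++s) CHECK(cudaStreamCreate(&streams[s]));

    for (int s = 0; s < numStreams; ++s) {
        const int off = s * chunk;
        if (off >= n) break;
        const int count = (n - off < chunk) ? (n - off) : chunk;
        const size_t bytes = count * sizeof(float);
        // cudaMemcpyAsync needs pinned host memory to be truly asynchronous.
        CHECK(cudaMemcpyAsync(d + off, h + off, bytes, cudaMemcpyHostToDevice, streams[s]));
        scale<<<(count + 255) / 256, 256, 0, streams[s]>>>(d + off, count, 2.0f);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h + off, d + off, bytes, cudaMemcpyDeviceToHost, streams[s]));
    }
    CHECK(cudaDeviceSynchronize());  // wait for all streams before reading h

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (h[i] != 2.0f) { ok = false; break; }
    }

    for (int s = 0; s < numStreams; ++s) CHECK(cudaStreamDestroy(streams[s]));
    CHECK(cudaFree(d));
    CHECK(cudaFreeHost(h));
    return ok;
}

int main() {
    printf("overlapped scale %s\n", overlappedScale(1 << 20, 4) ? "ok" : "failed");
    return 0;
}
```

With a pageable host buffer the same calls still work, but the copies can no longer overlap with kernel execution in this way.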
Disadvantages
Pinned memory reduces the amount of physical memory available to the operating system and other processes, so allocating too much of it can degrade overall system performance.
Pinned memory can be allocated with the following attributes:
Write-combining
By default, pinned memory is cacheable. Passing the cudaHostAllocWriteCombined flag to cudaHostAlloc allocates it as write-combining memory instead.
Advantages
1. It does not occupy the host's L1 and L2 caches, leaving more cache available to the rest of the application.
2. Write-combining memory is not snooped during transfers across the PCI Express bus, which can improve transfer performance by up to 40%.
Disadvantages
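Reading write-combining memory from the host is very slow, so it should be used only for buffers that the host writes and never reads.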
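A minimal sketch of a write-combining upload buffer is shown below. The host only writes to the buffer, and the result is verified through an ordinary host vector rather than by reading the write-combining memory back. The writeCombinedUpload helper and the CHECK macro are hypothetical names used for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Fills a write-combining pinned buffer on the host, uploads it, and
// checks the device contents through a separate, ordinary host buffer.
bool writeCombinedUpload(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *h_wc = nullptr;
    float *d_buf = nullptr;
    CHECK(cudaHostAlloc((void **)&h_wc, bytes, cudaHostAllocWriteCombined));
    CHECK(cudaMalloc((void **)&d_buf, bytes));

    // The host only writes to h_wc, in sequential order, and never reads it.
    for (size_t i = 0; i < n; ++i) h_wc[i] = 0.5f * (float)i;
    CHECK(cudaMemcpy(d_buf, h_wc, bytes, cudaMemcpyHostToDevice));

    // Read the data back into normal memory for verification.
    std::vector<float> check(n);
    CHECK(cudaMemcpy(check.data(), d_buf, bytes, cudaMemcpyDeviceToHost));

    bool ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (check[i] != 0.5f * (float)i) { ok = false; break; }
    }

    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_wc));
    return ok;
}

int main() {
    printf("write-combined upload %s\n", writeCombinedUpload(1 << 20) ? "ok" : "failed");
    return 0;
}
```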
Portable
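In a multithreaded program, the benefits of pinned memory are by default available only to the host thread that allocated it. To share them with other threads, the memory must be allocated with the cudaHostAllocPortable flag.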
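A sketch of the portable case: one host thread allocates the buffer with cudaHostAllocPortable and a second thread uses it for a transfer. The worker and portableAcrossThreads functions and the CHECK macro are hypothetical names, and device 0 is assumed. On recent CUDA versions with unified virtual addressing, pinned allocations are generally portable already, so the flag here mainly documents the intent.

```cuda
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Runs in a second host thread: uploads the shared pinned buffer,
// reads it back into ordinary memory, and reports whether it matched.
void worker(const float *h_shared, size_t n, bool *ok) {
    const size_t bytes = n * sizeof(float);
    float *d_buf = nullptr;
    std::vector<float> back(n);

    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc((void **)&d_buf, bytes));
    CHECK(cudaMemcpy(d_buf, h_shared, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(back.data(), d_buf, bytes, cudaMemcpyDeviceToHost));

    *ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (back[i] != h_shared[i]) { *ok = false; break; }
    }
    CHECK(cudaFree(d_buf));
}

// Allocates portable pinned memory in the calling thread and hands it
// to another thread for the actual transfer.
bool portableAcrossThreads(size_t n) {
    float *h_shared = nullptr;
    CHECK(cudaHostAlloc((void **)&h_shared, n * sizeof(float), cudaHostAllocPortable));
    for (size_t i = 0; i < n; ++i) h_shared[i] = (float)i;

    bool ok = false;
    std::thread t(worker, h_shared, n, &ok);
    t.join();  // the buffer must stay alive until the worker is done

    CHECK(cudaFreeHost(h_shared));
    return ok;
}

int main() {
    printf("portable pinned buffer %s\n", portableAcrossThreads(1 << 20) ? "ok" : "failed");
    return 0;
}
```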
Mapped
Passing the cudaHostAllocMapped flag maps the pinned host allocation into the device address space (only some devices support this). The device and the host then share the same memory. On the host, call cudaHostGetDevicePointer to obtain the device pointer and pass that pointer to the kernel function. Different host threads may get different device pointers.
Advantages
1. There is no need to allocate device memory or copy data explicitly. Data is transferred implicitly when the kernel accesses it.
2. Streams are not required to overlap transfers with execution. Transfers that originate from the kernel automatically overlap with kernel execution.
Disadvantages
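Because the memory is shared between the host and the device, accesses must be synchronized (for example with streams or events) so that one side does not read data the other side is still writing.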
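The mapped workflow, including the synchronization it requires, can be sketched as follows. The increment kernel, the mappedIncrement helper, and the CHECK macro are hypothetical names, and device 0 is assumed. Older toolkits also required cudaSetDeviceFlags(cudaDeviceMapHost) before any other CUDA call; the sketch assumes a runtime where this is implicit.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Adds 1 to every element; here data points into mapped host memory.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Lets a kernel update host memory in place, with no cudaMalloc or cudaMemcpy.
bool mappedIncrement(int n) {
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    if (!prop.canMapHostMemory) {
        fprintf(stderr, "device 0 cannot map host memory\n");
        return false;
    }

    int *h_data = nullptr;
    int *d_alias = nullptr;
    CHECK(cudaHostAlloc((void **)&h_data, n * sizeof(int), cudaHostAllocMapped));
    for (int i = 0; i < n; ++i) h_data[i] = i;

    // Obtained on the host, then passed to the kernel as an argument.
    CHECK(cudaHostGetDevicePointer((void **)&d_alias, h_data, 0));

    increment<<<(n + 255) / 256, 256>>>(d_alias, n);
    CHECK(cudaGetLastError());
    // The host must wait for the kernel before reading the shared memory.
    CHECK(cudaDeviceSynchronize());

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (h_data[i] != i + 1) { ok = false; break; }
    }
    CHECK(cudaFreeHost(h_data));
    return ok;
}

int main() {
    printf("mapped increment %s\n", mappedIncrement(1 << 20) ? "ok" : "failed");
    return 0;
}
```

Removing the cudaDeviceSynchronize call would let the host read h_data while the kernel may still be writing it, which is exactly the hazard described above.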