A question about CUDA zero-copy memory

Source: Internet
Author: User

Today I was thinking about CUDA zero-copy memory. It looked useful for a program I am about to design, so I looked up the relevant information.

The following are some helpful links:

Zero-copy usage in CUDA: two-dimensional pointers

Zero-copy usage in CUDA: one-dimensional pointers

Zero-copy usage in CUDA: two-dimensional struct pointers

Discussion on CUDA zero-copy memory

After some investigation, I found that zero-copy is best suited to compute-intensive work in which each piece of data is transferred only a few times, such as a vector dot product or a sum.
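As an illustration of that kind of workload, here is a minimal sketch of a dot product whose inputs live in zero-copy (mapped, pinned) host memory. The names `check`, `dotKernel` and `zeroCopyDot` are made up for this example, and the launch configuration is arbitrary. The result accumulator is kept in ordinary device memory, which is a choice made here for simplicity and not something the linked articles prescribe.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Grid-stride dot product: every input element is read exactly once,
// directly from mapped host memory.
__global__ void dotKernel(const float *a, const float *b, float *result, int n)
{
    float partial = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x)
        partial += a[i] * b[i];
    atomicAdd(result, partial);  // result lives in ordinary device memory
}

float zeroCopyDot(int n)
{
    float *hA, *hB, *dA, *dB, *dResult;
    size_t bytes = (size_t)n * sizeof(float);

    // Pinned host memory that is also mapped into the GPU's address space.
    check(cudaHostAlloc((void **)&hA, bytes, cudaHostAllocMapped), "cudaHostAlloc");
    check(cudaHostAlloc((void **)&hB, bytes, cudaHostAllocMapped), "cudaHostAlloc");
    for (int i = 0; i < n; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    // Device-side aliases of the host buffers; the inputs are never cudaMemcpy'd.
    check(cudaHostGetDevicePointer((void **)&dA, hA, 0), "cudaHostGetDevicePointer");
    check(cudaHostGetDevicePointer((void **)&dB, hB, 0), "cudaHostGetDevicePointer");

    check(cudaMalloc((void **)&dResult, sizeof(float)), "cudaMalloc");
    check(cudaMemset(dResult, 0, sizeof(float)), "cudaMemset");

    dotKernel<<<64, 256>>>(dA, dB, dResult, n);
    check(cudaGetLastError(), "kernel launch");

    // Copying back the single result also waits for the kernel to finish.
    float result = 0.0f;
    check(cudaMemcpy(&result, dResult, sizeof(float), cudaMemcpyDeviceToHost),
          "cudaMemcpy");

    cudaFree(dResult);
    cudaFreeHost(hA);
    cudaFreeHost(hB);
    return result;
}

int main()
{
    // Request host-memory mapping before any other CUDA call creates a context.
    check(cudaSetDeviceFlags(cudaDeviceMapHost), "cudaSetDeviceFlags");
    printf("dot = %.1f\n", zeroCopyDot(1 << 20));
    return 0;
}
```

Each input element crosses the bus once and is never reused, which is the access pattern where skipping the explicit copy tends to pay off. A kernel that rereads the same data many times would probably be better served by copying it to device memory first.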

Because zero-copy memory is allocated in host (CPU) memory and the GPU accesses that memory directly, I had a question: "If the buffer allocated on the host is larger than the memory available on the GPU, will the GPU run out of memory?"

Specifically:

Suppose the GPU has 1 GB of memory, 999 MB of it is already in use, and only 1 MB is free. If I allocate a 10 MB zero-copy buffer on the host and ask the GPU to operate on it, will the GPU's memory overflow?

After some investigation, the conclusion is that it will not overflow.

On the CSDN forum, someone asked: "During the mapping process, does it matter whether the GPU's memory is large enough? Doesn't that need to be considered?"

Someone replied: "You can allocate more memory than the GPU has, as long as the host memory is large enough."

The reply continued: "You can write a program to try it yourself. Use the API mentioned above to allocate a buffer larger than the GPU's memory, then get the device pointer and operate on it. My GPU has 6 GB of memory and the host has 32 GB. In my experiment I requested 16 GB; the allocation succeeded and the kernel produced the correct result."
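The experiment described in that reply can be sketched roughly as follows. This is not the forum poster's program: the function name `mappedBufferLargerThanFreeVram`, the "ballast" allocation used to shrink free device memory, and the 256 MB / 128 MB sizes are all assumptions chosen so the sketch can run on a small machine. It first occupies most of the free device memory, then maps a host buffer larger than what is left and lets a kernel fill it.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Write each element's own index into the buffer (grid-stride loop).
__global__ void fillKernel(unsigned int *buf, size_t n)
{
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        buf[i] = (unsigned int)i;
}

bool mappedBufferLargerThanFreeVram(size_t hostBytes, size_t headroomBytes)
{
    size_t freeB = 0, totalB = 0;
    check(cudaMemGetInfo(&freeB, &totalB), "cudaMemGetInfo");

    // Occupy device memory so that roughly headroomBytes remain free.
    // If this allocation fails, carry on without it.
    void *ballast = nullptr;
    if (freeB > headroomBytes &&
        cudaMalloc(&ballast, freeB - headroomBytes) != cudaSuccess) {
        cudaGetLastError();  // clear the recorded error
        ballast = nullptr;
    }
    check(cudaMemGetInfo(&freeB, &totalB), "cudaMemGetInfo");
    printf("free device memory: %zu MB, mapped host buffer: %zu MB\n",
           freeB >> 20, hostBytes >> 20);

    // The buffer lives in pinned host memory; the GPU only receives a pointer to it.
    unsigned int *hBuf, *dBuf;
    check(cudaHostAlloc((void **)&hBuf, hostBytes, cudaHostAllocMapped), "cudaHostAlloc");
    check(cudaHostGetDevicePointer((void **)&dBuf, hBuf, 0), "cudaHostGetDevicePointer");

    size_t n = hostBytes / sizeof(unsigned int);
    fillKernel<<<256, 256>>>(dBuf, n);
    check(cudaGetLastError(), "kernel launch");
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    // Verify on the host that the kernel wrote every element.
    bool ok = true;
    for (size_t i = 0; i < n; ++i)
        if (hBuf[i] != (unsigned int)i) { ok = false; break; }

    check(cudaFreeHost(hBuf), "cudaFreeHost");
    if (ballast) check(cudaFree(ballast), "cudaFree");
    return ok;
}

int main()
{
    // Request host-memory mapping before any other CUDA call creates a context.
    check(cudaSetDeviceFlags(cudaDeviceMapHost), "cudaSetDeviceFlags");
    bool ok = mappedBufferLargerThanFreeVram((size_t)256 << 20, (size_t)128 << 20);
    printf("kernel result %s\n", ok ? "correct" : "WRONG");
    return 0;
}
```

To mirror the forum experiment more closely, `hostBytes` can be raised above the GPU's total memory. Keep in mind that pinned memory cannot be paged out, so a very large request may fail or slow down the rest of the system depending on how much host RAM is available.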

That answers the question. The conclusion is that zero-copy allocates the entire buffer in host memory; the GPU reads from and writes to that host memory as the kernel accesses it, and does not load the whole block into device memory.

P.S. Some people report the following problem: "Zero-copy does not seem to support complex operations; make_float4() is not supported, and I get an error when I use it." I have not verified this and will check it in future use. I do not know whether the same problem occurs in later CUDA versions.
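One way to check that report on a particular GPU and toolkit is a small test like the sketch below, which reads `float4` values from mapped host memory and writes back values built with `make_float4()`. The names `float4Kernel` and `float4OnMappedMemoryWorks` are invented for this example, and the outcome is not claimed here; it has to be run to find out.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Read a float4 from mapped memory and write back one built with make_float4().
__global__ void float4Kernel(const float4 *in, float4 *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float4 v = in[i];
        out[i] = make_float4(v.x + 1.0f, v.y + 2.0f, v.z + 3.0f, v.w + 4.0f);
    }
}

bool float4OnMappedMemoryWorks(int n)
{
    float4 *hIn, *hOut, *dIn, *dOut;
    size_t bytes = (size_t)n * sizeof(float4);

    // Both input and output are zero-copy buffers in pinned host memory.
    check(cudaHostAlloc((void **)&hIn, bytes, cudaHostAllocMapped), "cudaHostAlloc");
    check(cudaHostAlloc((void **)&hOut, bytes, cudaHostAllocMapped), "cudaHostAlloc");
    for (int i = 0; i < n; ++i) {
        float f = (float)i;
        hIn[i] = make_float4(f, f, f, f);
    }

    check(cudaHostGetDevicePointer((void **)&dIn, hIn, 0), "cudaHostGetDevicePointer");
    check(cudaHostGetDevicePointer((void **)&dOut, hOut, 0), "cudaHostGetDevicePointer");

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    float4Kernel<<<blocks, threads>>>(dIn, dOut, n);
    check(cudaGetLastError(), "kernel launch");
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    // Compare every output element with the value the kernel should have written.
    bool ok = true;
    for (int i = 0; i < n && ok; ++i) {
        float f = (float)i;
        ok = hOut[i].x == f + 1.0f && hOut[i].y == f + 2.0f &&
             hOut[i].z == f + 3.0f && hOut[i].w == f + 4.0f;
    }

    check(cudaFreeHost(hIn), "cudaFreeHost");
    check(cudaFreeHost(hOut), "cudaFreeHost");
    return ok;
}

int main()
{
    // Request host-memory mapping before any other CUDA call creates a context.
    check(cudaSetDeviceFlags(cudaDeviceMapHost), "cudaSetDeviceFlags");
    printf("float4 on mapped memory: %s\n",
           float4OnMappedMemoryWorks(1024) ? "ok" : "mismatch");
    return 0;
}
```

If this compiles and reports "ok" on a given setup, the reported error is more likely tied to a specific toolkit version or to the poster's code than to zero-copy memory in general, but that is only an inference from one test.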
