CUDA Global Device Memory Variable

Source: Internet
Author: User
The title is a bit roundabout. What I actually want to talk about is a global __device__ unsigned char data[64] variable defined when using CUDA.


While parallelizing a class of algorithms, I found that it is not necessary to copy the data to the GPU on every loop iteration; it is enough to copy it once at initialization. So I defined a global __device__ variable, and all the kernels do is store their results into data[]. Then the problem appeared: when the computation finishes, the values in data cannot be copied back from the GPU. cudaMemcpy returns error code 11, and the reported effect is a pitch out-of-bounds error. My guess is that cudaMemcpy first takes the array's address with &, and if it then applies an offset of 64 to that address, the offset may not actually be 64 bytes. That would be the difference from an unsigned char* data pointer, where each unit of offset is exactly one byte.
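To make the setup concrete, here is a minimal sketch of the pattern described above. The kernel name, launch configuration, and the placeholder computation are my own illustrative choices, not from the original post. The commented-out line shows the direct cudaMemcpy call that fails for me as described; cudaMemcpyFromSymbol is the CUDA runtime API documented for copying from a __device__ symbol, shown here only for comparison.

#include <cstdio>
#include <cuda_runtime.h>

// Global device array, as in the post: results stay resident on the GPU
// across kernel launches instead of being copied in every iteration.
__device__ unsigned char data[64];

// Illustrative kernel: each thread writes one result byte into data[].
__global__ void compute(void)
{
    int i = threadIdx.x;
    if (i < 64)
        data[i] = static_cast<unsigned char>(i * 2);  // stand-in for the real computation
}

int main(void)
{
    compute<<<1, 64>>>();
    cudaDeviceSynchronize();

    unsigned char host[64];

    // Copying straight from the symbol with cudaMemcpy, as described above, fails:
    // the host-side address of a __device__ symbol is not a valid device pointer.
    // cudaError_t bad = cudaMemcpy(host, data, sizeof(host), cudaMemcpyDeviceToHost);

    // The runtime documents cudaMemcpyFromSymbol for copies from a __device__ symbol.
    cudaError_t err = cudaMemcpyFromSymbol(host, data, sizeof(host), 0,
                                           cudaMemcpyDeviceToHost);
    printf("copy status: %s, data[1] = %u\n",
           cudaGetErrorString(err), (unsigned)host[1]);
    return 0;
}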


The above is only my own conjecture, written down so that I will not be at a loss the next time I run into a similar problem. If anyone has encountered the same issue, feel free to get in touch and compare notes.
