Summarizing Your GPU-Based Heterogeneous Program Development Process

Last Update: 2018-12-04 Source: Internet

Author: User

The characteristics of heterogeneous program development make it different from traditional development methods. For this project, this chapter lists several important points worth noting and walks through the whole development process, from initial development to optimization, with the aim of ensuring program quality. The process is described briefly, both for heterogeneous program development in general and for your own business development.

The process is described as follows:

Process 1: Data Preparation

Prepare the raw data of the business to be processed. The data source may be MySQL, an app, MongoDB, or something else. For testing, I usually write a function that generates random floating-point numbers to simulate my project's data.
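A test-data function of the kind described above could look like the following minimal sketch. The name make_test_data, its parameters, and the fixed default seed are assumptions made for illustration; they do not come from the original project.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (name and parameters are illustrative assumptions):
// fills a vector with `count` pseudo-random floats between lo and hi.
// A fixed default seed keeps test runs reproducible.
std::vector<float> make_test_data(std::size_t count, float lo, float hi,
                                  unsigned seed = 42) {
    std::mt19937 engine(seed);
    std::uniform_real_distribution<float> dist(lo, hi);
    std::vector<float> data(count);
    for (float& value : data) {
        value = dist(engine);
    }
    return data;
}
```

Calling make_test_data(1000, 0.0f, 1.0f) twice with the same seed returns the same values, which makes it easier to compare the output of different program versions later on.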

Process 2: Business Logic Design

Based on the functions the business requires, design the business-layer classes and the composite classes. In general there are four functions, and they depend directly on one another. The deliverable of this process is a class diagram.

Process 3: Business Logic Implementation

This refers to the interface that is implemented on the CPU and can be called by other apps. I suggest encapsulating both the parallel and the non-parallel business logic in this service class; if there is a parallel processing module, it is handled in a later process. The deliverables of this process are the .h and .cpp files of the class. I always remind myself not to rush into writing the kernel program of the parallel module.
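As a rough illustration of such a service class, the sketch below keeps a sequential method and a parallel candidate behind one CPU-side interface. The class name PricingService and its methods are hypothetical, and the class is shown in a single block for brevity, whereas the article splits it into a .h and a .cpp file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical service class; all names here are illustrative assumptions.
class PricingService {
public:
    // Parallel candidate: each output element depends only on the input
    // element at the same index, so this loop could later move to the GPU.
    // For now it is a plain CPU implementation.
    std::vector<float> apply_rate(const std::vector<float>& prices,
                                  float rate) const {
        std::vector<float> out(prices.size());
        for (std::size_t i = 0; i < prices.size(); ++i) {
            out[i] = prices[i] * rate;
        }
        return out;
    }

    // Non-parallel logic: a simple sequential sum that stays on the CPU.
    float total(const std::vector<float>& values) const {
        float sum = 0.0f;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }
};
```

Callers only see apply_rate and total, so replacing the body of apply_rate with a GPU call later does not change the interface other apps depend on.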

Process 4: Data Dictionary Design

Strictly speaking, this is the wrong place for this process. Data is read from the database at the start, and the computed results are written back to the database at the end, so the data dictionary concerns the entire process and should not be confined to this step. As shown in the figure, the data dictionary runs through the whole process.

However, placing it here makes some sense. A data block has to be copied to the GPU for parallel computing, and the results have to be copied back from the device, so well-chosen data types matter a great deal for the bandwidth and memory used by the device and the host. Simply put, no one wants to copy a set of strings that mean nothing to the GPU and serve only as IDs identifying a computing result. Data dictionary design is therefore also an iterative process: data dictionary problems found during development should be optimized as far as possible.

The principle of data dictionary design is that the device (the GPU) is the object being served, so the design should favor the device.

The key point: data dictionary design is very important in heterogeneous development. We do not aim to get it right in one step, but to keep improving it.
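One way to keep strings off the device, sketched here under assumptions, is to map each string ID to a compact integer on the host and send only fixed-size records to the GPU. The names DeviceRecord and IdDictionary are hypothetical and do not come from the original project.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <vector>

// Hypothetical device-friendly record: fixed size, no pointers, no strings.
struct DeviceRecord {
    std::uint32_t id;  // compact stand-in for a string key
    float value;       // the number the GPU actually computes on
};

// Trivially copyable records can be copied to a device as raw bytes.
static_assert(std::is_trivially_copyable<DeviceRecord>::value,
              "DeviceRecord must be copyable as raw bytes");

// Hypothetical host-side dictionary: strings stay on the host, and only
// the compact integer IDs travel with the data.
class IdDictionary {
public:
    // Returns the existing ID for a key, or assigns the next free one.
    std::uint32_t intern(const std::string& key) {
        auto it = ids_.find(key);
        if (it != ids_.end()) {
            return it->second;
        }
        auto id = static_cast<std::uint32_t>(keys_.size());
        ids_.emplace(key, id);
        keys_.push_back(key);
        return id;
    }

    // Translates an ID back to its string once results return to the host.
    const std::string& key_of(std::uint32_t id) const {
        return keys_.at(id);
    }

private:
    std::unordered_map<std::string, std::uint32_t> ids_;
    std::vector<std::string> keys_;
};
```

With this layout each record is a few bytes of plain data, and the string lookup happens only on the host before and after the computation.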

Process 5: Kernel Program Design

The kernel program is a parallel computing program developed on the GPU.

If a function module from Process 3 is found to have a large parallel granularity, we can start on the really meaningful work.

To keep the program architecture clear:

First, create a .cuh file that declares the functional modules for parallel computing. Note that the business functions in Process 3 only need to include this .cuh file to call the encapsulated parallel computing module.

Next, create the .cu file. Note that all kernel operators (for example, the <<<...>>> launch syntax) must be implemented in the .cu file. We implement the kernel functions in the .cu file to process data in parallel.

We should also avoid pulling too many header files into the kernel program. This helps the program architecture and the engineering a great deal.
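The declaration/implementation split can be sketched as follows. This is a plain C++ stand-in, not CUDA: the file names scale.cuh and scale.cu and the function names are hypothetical, and a CPU loop takes the place of the kernel launch so that the sketch runs without a GPU.

```cpp
#include <cstddef>

// ---- would live in a header such as scale.cuh (hypothetical name) ----
// A plain signature: business code that includes this header needs no
// knowledge of how the work is executed.
void scale_in_parallel(const float* in, float* out, std::size_t n,
                       float factor);

// ---- would live in the matching scale.cu (hypothetical name) ----
// Stand-in for a kernel body: handles exactly one element. The bounds
// guard mirrors the check a real kernel needs when more threads are
// launched than there are elements.
static void scale_element(const float* in, float* out, std::size_t n,
                          float factor, std::size_t i) {
    if (i < n) {
        out[i] = in[i] * factor;
    }
}

void scale_in_parallel(const float* in, float* out, std::size_t n,
                       float factor) {
    // In a GPU build a kernel launch would replace this loop; here the
    // CPU visits each index in turn.
    for (std::size_t i = 0; i < n; ++i) {
        scale_element(in, out, n, factor, i);
    }
}
```

Because callers depend only on the declaration, the CPU loop can later be swapped for a real kernel without touching the business code.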

Process 6: Iterative Optimization

There are two kinds of optimization:

First, never forget to ask whether the business logic itself can be optimized further.

Second, and the one we look forward to most, is optimizing the algorithms of the kernel program.

The second probably brings more challenges. As a very simple example: my sorting algorithm is more efficient and faster than yours, or my program uses less memory than yours while handling a large number of instructions. When developing a kernel program, we should try not to waste kernel resources, and we should rule out out-of-bounds memory access and memory leaks. The consequence of such errors may be not just a software crash but a system blue screen!

One last point, which is not shown in the figure: for every version, we keep records and analyze its efficiency, as a staged deliverable of our own optimization work. I try to finalize each version on the way from CPU to GPU, which makes the whole process interesting. Seeing each version improve on the last is something to be proud of.

I want to remind myself: develop heterogeneous programs step by step, one version at a time, comparing each version with the last, and steadily improving efficiency and quality!
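Keeping a record per version could be done with a small timing helper like the sketch below. The VersionRecord struct and the measure function are assumptions made for illustration; the article does not say how its records were stored.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record kept for each finalized version.
struct VersionRecord {
    std::string name;       // label such as "v1-cpu-loop"
    double milliseconds;    // wall-clock time of one run
    bool matches_baseline;  // whether the output equals the reference
};

// Hypothetical helper: runs one version on the input, times it, and
// compares its output with the expected baseline result. Exact comparison
// is used for simplicity; a tolerance may be needed when a version
// reorders floating-point operations.
VersionRecord measure(
    const std::string& name,
    const std::function<std::vector<float>(const std::vector<float>&)>& version,
    const std::vector<float>& input,
    const std::vector<float>& expected) {
    auto start = std::chrono::steady_clock::now();
    std::vector<float> result = version(input);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {name, elapsed.count(), result == expected};
}
```

Calling measure once per version with the same input and collecting the returned records gives a simple table for comparing speed and correctness across versions.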

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
