OpenCV Study Notes (44): On the GPU

Last Update: 2018-07-20 Source: Internet

Author: User

Tags cuda toolkit


It has been a long time since my last update. Partly I felt I had nothing especially worth sharing, and partly I was lazy: I never even wrote a summary post after finishing TLD. This time I want to share an OpenCV module that few people touch: gpu. I have little experience with it myself, so this is only a brief discussion based on the tutorial. If any expert has practical experience with it, criticism and corrections are very welcome.

OpenCV's GPU module supports only NVIDIA graphics cards, because it is built on NVIDIA's CUDA and NPP libraries. Its advantage is that you do not need to install the CUDA tools or learn GPU programming to use it, because you write no GPU-specific code yourself. If you want to recompile the OpenCV GPU module, however, you still need the CUDA Toolkit.

Because of the way the GPU module was designed, most of its functions are used much like their CPU counterparts. First, link the GPU module into your project and include the required header, opencv2/gpu/gpu.hpp. Second, the GPU module's data structures and functions live in the gpu namespace (cv::gpu) rather than directly in cv, so you can qualify names with gpu:: and cv:: to avoid confusion.

Third, the matrix type in the GPU module is GpuMat rather than Mat. The other function names are the same as in the CPU module; the difference is that their arguments are GpuMat objects instead of Mat objects.

Another issue is that the 2.0 GPU module does not support multi-channel functions well, so it is best used on grayscale images. In some cases the GPU module runs no faster than the CPU module, which suggests that it is still relatively immature and needs further optimization. One important reason is that memory management and data conversion take up a large share of the GPU module's running time.
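To make the point about overhead concrete, here is a minimal sketch of the arithmetic, assuming the conversion cost is dominated by copying data to and from the device. The struct GpuCost and the functions totalGpuMs and gpuPaysOff are hypothetical names of my own, not part of OpenCV, and any timings you feed in have to come from your own measurements.

```cpp
#include <cassert>

// Hypothetical cost model; all times are in milliseconds and must be measured by you.
struct GpuCost {
    double uploadMs;   // copying the input from host memory to the device
    double kernelMs;   // the actual computation on the device
    double downloadMs; // copying the result back to host memory
};

// Total wall-clock cost of one GPU call, including both copies.
double totalGpuMs(const GpuCost& c) {
    assert(c.uploadMs >= 0.0 && c.kernelMs >= 0.0 && c.downloadMs >= 0.0);
    return c.uploadMs + c.kernelMs + c.downloadMs;
}

// The GPU only wins when the whole round trip beats the CPU time.
bool gpuPaysOff(const GpuCost& c, double cpuMs) {
    return totalGpuMs(c) < cpuMs;
}
```

With made-up timings of 2 ms upload, 0.5 ms kernel and 2 ms download, the GPU path loses to a CPU routine that takes 3 ms, even though the kernel alone is six times faster. Keeping data on the device across several operations spreads the copy cost over more work.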

Note that before calling any function from the GPU module, it is best to call gpu::getCudaEnabledDeviceCount. It returns 0 if OpenCV was compiled without GPU support; otherwise it returns the number of installed CUDA devices.
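One way to act on that return value is to choose a code path once at startup. The sketch below is only an illustration: Backend and chooseBackend are hypothetical names, and the device count is passed in as a plain int so the snippet compiles without OpenCV. In a real program the argument would be the result of cv::gpu::getCudaEnabledDeviceCount().

```cpp
#include <cassert>

// Hypothetical selector, not an OpenCV type.
enum class Backend { Cpu, Gpu };

// cudaDeviceCount is assumed to be the value returned by
// cv::gpu::getCudaEnabledDeviceCount(): 0 means no GPU support.
Backend chooseBackend(int cudaDeviceCount) {
    return cudaDeviceCount > 0 ? Backend::Gpu : Backend::Cpu;
}
```

Falling back to the CPU path when the count is 0 lets the same binary run on machines without a CUDA device.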

Also, to use the GPU module you need to build OpenCV with CMake with the WITH_CUDA and WITH_TBB options turned ON.

Since I am not familiar with the GPU module, I will use one of the sample programs, which transposes a matrix, as an example. The code is as follows:

#include <iostream>
#include "cvconfig.h"
#include "opencv2/core/core.hpp"
#include "opencv2/gpu/gpu.hpp"
#include "opencv2/core/internal.hpp" // for TBB wrappers

using namespace std;
using namespace cv;
using namespace cv::gpu;

struct Worker { void operator()(int device_id) const; };

int main() {
    int num_devices = getCudaEnabledDeviceCount();
    if (num_devices < 2) {
        std::cout << "Two or more GPUs are required\n";
        return -1;
    }
    for (int i = 0; i < num_devices; ++i) {
        DeviceInfo dev_info(i);
        if (!dev_info.isCompatible()) {
            std::cout << "GPU module isn't built for GPU #" << i << " ("
                      << dev_info.name() << ", CC " << dev_info.majorVersion()
                      << dev_info.minorVersion() << ")\n";
            return -1;
        }
    }

    // Execute the calculation in two threads, one per GPU
    int devices[] = {0, 1};
    parallel_do(devices, devices + 2, Worker());
    return 0;
}

void Worker::operator()(int device_id) const {
    setDevice(device_id);

    // The matrix size is garbled in the source listing; 1000 x 1000 is an assumption.
    Mat src(1000, 1000, CV_32F);
    Mat dst;
    RNG rng(0);
    rng.fill(src, RNG::UNIFORM, 0, 1);

    // CPU works
    transpose(src, dst);

    // GPU works
    GpuMat d_src(src);
    GpuMat d_dst;
    transpose(d_src, d_dst);

    // Check results
    bool passed = norm(dst - Mat(d_dst), NORM_INF) < 1e-3;
    std::cout << "GPU #" << device_id << " (" << DeviceInfo().name() << "): "
              << (passed ? "passed" : "FAILED") << endl;

    // Deallocate data here, otherwise deallocation will be performed
    // after the context is extracted from the stack
    d_src.release();
    d_dst.release();
}
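The check at the end of the sample compares the CPU and GPU results through the infinity norm of their difference, that is, the largest absolute element-wise difference. As a rough illustration of what that test measures, here is a plain C++ sketch that needs neither OpenCV nor a GPU. The names transposeRef and maxAbsDiff are my own, and the row-major std::vector layout is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference transpose of a rows x cols matrix stored row-major.
std::vector<float> transposeRef(const std::vector<float>& src,
                                std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    std::vector<float> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
    return dst;
}

// Largest absolute element-wise difference between two equally sized buffers.
float maxAbsDiff(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

A result would count as passed in the same sense as the sample when maxAbsDiff(cpuResult, gpuResult) is below 1e-3. A tolerance is used rather than exact equality because floating-point results from two different devices are not guaranteed to match bit for bit in general.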

These notes are superficial and also rather disorganized. I hope experienced readers will point out the mistakes; for readers who are as unfamiliar with the topic as I am, please treat them as a reference only.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
