A method for modifying ncnn's OpenMP parallel processing, with C++ sample code


Shortly after ncnn was released, I tried to compile it for iOS and ran into OpenMP compilation problems.

After looking for solutions without success, I took matters into my own hands and replaced OpenMP with std::thread.

ncnn project address:

https://github.com/Tencent/ncnn

Later, I asked the author of ncnn and learned how to compile it for iOS.

At that point, the interim workaround of replacing OpenMP with std::thread was no longer strictly necessary.

Even so, it may still be the better fit in some specific cases, and it is convenient to be able to switch between the two and verify the results.

So I took some time to write a sample project.

Project Address:

https://github.com/cpuimage/ParallelFor

The full code is pasted below:

#include <stdio.h>
#include <stdlib.h>
#include <iostream>

#if defined(_OPENMP)
// compile with: /openmp (MSVC) or -fopenmp (GCC/Clang)
#include <omp.h>

auto const epoch = omp_get_wtime();
double now()
{
    return omp_get_wtime() - epoch;
}
#else
#include <chrono>

auto const epoch = std::chrono::steady_clock::now();
double now()
{
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               std::chrono::steady_clock::now() - epoch).count() / 1000.0;
}
#endif

// measure how long fn() takes, in seconds
template<typename FN>
double bench(const FN &fn)
{
    auto took = -now();
    return (fn(), took + now());
}

#include <functional>
#if defined(_OPENMP)
#include <omp.h>
#else
#include <thread>
#include <vector>
#endif

#ifdef _OPENMP
static int processorCount = static_cast<int>(omp_get_num_procs());
#else
static int processorCount = static_cast<int>(std::thread::hardware_concurrency());
#endif

// run func(i) for every i in [inclusiveFrom, exclusiveTo),
// using OpenMP if available, otherwise std::thread
static void ParallelFor(int inclusiveFrom, int exclusiveTo, std::function<void(size_t)> func)
{
#if defined(_OPENMP)
#pragma omp parallel for num_threads(processorCount)
    for (int i = inclusiveFrom; i < exclusiveTo; ++i)
    {
        func(i);
    }
    return;
#else
    if (inclusiveFrom >= exclusiveTo)
        return;
    static size_t thread_cnt = 0;
    if (thread_cnt == 0)
    {
        thread_cnt = std::thread::hardware_concurrency();
    }
    size_t entry_per_thread = (exclusiveTo - inclusiveFrom) / thread_cnt;
    if (entry_per_thread < 1)
    {
        // fewer entries than threads: just run serially
        for (int i = inclusiveFrom; i < exclusiveTo; ++i)
        {
            func(i);
        }
        return;
    }
    std::vector<std::thread> threads;
    int start_idx, end_idx;
    for (start_idx = inclusiveFrom; start_idx < exclusiveTo; start_idx += entry_per_thread)
    {
        end_idx = start_idx + entry_per_thread;
        if (end_idx > exclusiveTo)
            end_idx = exclusiveTo;
        threads.emplace_back([&](size_t from, size_t to)
        {
            for (size_t entry_idx = from; entry_idx < to; ++entry_idx)
                func(entry_idx);
        }, start_idx, end_idx);
    }
    for (auto &t : threads)
    {
        t.join();
    }
#endif
}

void test_scale(int i, double *a, double *b)
{
    a[i] = 4 * b[i];
}

int main()
{
    int n = 10000;
    double *a2 = (double *) calloc(n, sizeof(double));
    double *a1 = (double *) calloc(n, sizeof(double));
    double *b = (double *) calloc(n, sizeof(double));
    if (a1 == NULL || a2 == NULL || b == NULL)
    {
        if (a1) { free(a1); }
        if (a2) { free(a2); }
        if (b) { free(b); }
        return -1;
    }
    for (int i = 0; i < n; i++)
    {
        a1[i] = i;
        a2[i] = i;
        b[i] = i;
    }
    double beforeTime = bench([&] {
        for (int i = 0; i < n; i++)
        {
            test_scale(i, a1, b);
        }
    });
    std::cout << "\nbefore: " << int(beforeTime * 1000) << " ms" << std::endl;
    double afterTime = bench([&] {
        ParallelFor(0, n, [a2, b](size_t i) {
            test_scale(i, a2, b);
        });
    });
    std::cout << "\nafter: " << int(afterTime * 1000) << " ms" << std::endl;
    for (int i = 0; i < n; i++)
    {
        if (a1[i] != a2[i])
        {
            printf("error %f : %f \t", a1[i], a2[i]);
            getchar();
        }
    }
    free(a1);
    free(a2);
    free(b);
    getchar();
    return 0;
}
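
A quick note on what to expect: the program prints the time of the plain serial loop as "before" and the time of the ParallelFor version as "after", then verifies that both produce identical results. With a body as cheap as test_scale and only 10000 elements, thread-creation overhead in the std::thread path can easily outweigh the parallel speedup, so a slower "after" time on small inputs is not necessarily a bug.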

To use OpenMP, enable it with a compile option (/openmp for MSVC, -fopenmp for GCC/Clang); when OpenMP is enabled, the compiler defines _OPENMP, which is what the #if defined(_OPENMP) branches check.

Compiling as C++11 or later is recommended.
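
For example, assuming GCC or Clang on the command line (the file name parallel_for.cpp is just a placeholder for wherever you saved the code), the two variants can be built roughly like this:

    # OpenMP path: -fopenmp enables OpenMP and makes the compiler define _OPENMP
    # (with MSVC the equivalent is: cl /openmp /EHsc parallel_for.cpp)
    g++ -std=c++11 -fopenmp parallel_for.cpp -o parallel_for_omp

    # std::thread fallback path: without -fopenmp, _OPENMP is not defined,
    # so the code falls back to <thread>; link the threads library
    g++ -std=c++11 -pthread parallel_for.cpp -o parallel_for_threads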

The sample code is relatively straightforward.

An example of the corresponding ncnn code modification is as follows:

    #pragma omp parallel for
    for (int q = 0; q < channels; q++)
    {
        const Mat m = src.channel(q);
        Mat borderm = dst.channel(q);
        copy_make_border_image(m, borderm, top, left, type, v);
    }

is changed to:

    ParallelFor(0, channels, [&](int q) {
        const Mat m = src.channel(q);
        Mat borderm = dst.channel(q);
        copy_make_border_image(m, borderm, top, left, type, v);
    });
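
Note that the loop variable q simply becomes the lambda's parameter, and the capture-by-reference [&] is what keeps the surrounding variables (src, dst, top, left, type, v) visible inside the lambda, just as they were inside the original loop body. Because ParallelFor may run the body on several threads at once, each iteration still has to be independent; that holds here since iteration q only reads and writes channel q.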

I originally planned to take some time to convert the whole of ncnn and publish a modified version.

On reflection, I decided to just post the approach here for anyone who needs it.

Apply it yourself as needed.

If you have other related questions or needs, you can contact me by e-mail to discuss them.

E-mail address:
[Email protected]
