Shortly after NCNN was released, I tried to compile it for iOS
and ran into problems compiling OpenMP.
After trying a variety of solutions without success, I worked around it myself
by replacing OpenMP with std::thread.
NCNN Project Address:
https://github.com/Tencent/ncnn
Later I asked the author of ncnn and learned how to compile it for iOS.
By that point, the interim solution of replacing OpenMP with std::thread was already in place.
Since std::thread may still be the better fit in some specific cases, I wanted a convenient way to switch between the two and verify the results.
So I took the time to write a sample project.
Project Address:
https://github.com/cpuimage/ParallelFor
The full code is pasted below:
#include <stdio.h>
#include <stdlib.h>
#include <iostream>

#if defined(_OPENMP)
// compile with: /openmp
#include <omp.h>
auto const epoch = omp_get_wtime();
// Seconds elapsed since program start (OpenMP wall clock).
double now() {
    return omp_get_wtime() - epoch;
}
#else
#include <chrono>
auto const epoch = std::chrono::steady_clock::now();
// Seconds elapsed since program start (steady clock, millisecond resolution).
double now() {
    return std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - epoch).count() / 1000.0;
}
#endif

// Runs fn once and returns the elapsed time in seconds.
template<typename FN>
double bench(const FN &fn) {
    auto took = -now();
    return (fn(), took + now());
}

#include <functional>

#if defined(_OPENMP)
#include <omp.h>
#else
#include <thread>
#include <vector>
#endif

#ifdef _OPENMP
static int processorCount = static_cast<int>(omp_get_num_procs());
#else
static int processorCount = static_cast<int>(std::thread::hardware_concurrency());
#endif

static void ParallelFor(int inclusiveFrom, int exclusiveTo, std::function<void(size_t)> func)
{
#if defined(_OPENMP)
#pragma omp parallel for num_threads(processorCount)
    for (int i = inclusiveFrom; i < exclusiveTo; ++i)
    {
        func(i);
    }
    return;
#else
    if (inclusiveFrom >= exclusiveTo)
        return;

    static size_t thread_cnt = 0;
    if (thread_cnt == 0)
    {
        thread_cnt = std::thread::hardware_concurrency();
        if (thread_cnt == 0)
            thread_cnt = 1; // hardware_concurrency() may return 0 when the count is unknown
    }
    size_t entry_per_thread = (exclusiveTo - inclusiveFrom) / thread_cnt;

    // Fewer entries than threads: run serially.
    if (entry_per_thread < 1)
    {
        for (int i = inclusiveFrom; i < exclusiveTo; ++i)
        {
            func(i);
        }
        return;
    }
    std::vector<std::thread> threads;
    int start_idx, end_idx;

    // One thread per chunk of entry_per_thread entries; the last chunk may be shorter.
    for (start_idx = inclusiveFrom; start_idx < exclusiveTo; start_idx += entry_per_thread)
    {
        end_idx = start_idx + entry_per_thread;
        if (end_idx > exclusiveTo)
            end_idx = exclusiveTo;

        threads.emplace_back([&](size_t from, size_t to)
        {
            for (size_t entry_idx = from; entry_idx < to; ++entry_idx)
                func(entry_idx);
        }, start_idx, end_idx);
    }

    for (auto& t : threads)
    {
        t.join();
    }
#endif
}

void test_scale(int i, double* a, double* b) {
    a[i] = 4 * b[i];
}

int main()
{
    int N = 10000;
    double* a2 = (double*)calloc(N, sizeof(double));
    double* a1 = (double*)calloc(N, sizeof(double));
    double* b = (double*)calloc(N, sizeof(double));
    if (a1 == NULL || a2 == NULL || b == NULL)
    {
        if (a1) { free(a1); }
        if (a2) { free(a2); }
        if (b) { free(b); }
        return -1;
    }
    for (int i = 0; i < N; i++)
    {
        a1[i] = i;
        a2[i] = i;
        b[i] = i;
    }
    // Serial baseline.
    double beforeTime = bench([&] {
        for (int i = 0; i < N; i++)
        {
            test_scale(i, a1, b);
        }
    });
    std::cout << "\nbefore: " << int(beforeTime * 1000) << "ms" << std::endl;

    // Same work through ParallelFor.
    double afterTime = bench([&] {
        ParallelFor(0, N, [a2, b](size_t i)
        {
            test_scale(i, a2, b);
        });
    });
    std::cout << "\nafter: " << int(afterTime * 1000) << "ms" << std::endl;

    // Both runs must produce identical results.
    for (int i = 0; i < N; i++)
    {
        if (a1[i] != a2[i])
        {
            printf("error %f : %f \t", a1[i], a2[i]);
            getchar();
        }
    }
    free(a1);
    free(a2);
    free(b);
    getchar();
    return 0;
}
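To use OpenMP, enable it with the compiler option (/openmp for MSVC, -fopenmp for GCC and Clang); the compiler then defines _OPENMP, which selects the OpenMP branch of the code.
Compiling as C++11 or later is recommended.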
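To confirm which branch a given build actually selects, the `_OPENMP` macro can be inspected directly. The following is only a minimal sketch: the helper names `openmp_enabled`, `openmp_version_macro` and `print_backend` are made up for illustration and are not part of the sample project. It relies solely on the standard `_OPENMP` macro, which the compiler defines when OpenMP support is switched on.

```cpp
#include <iostream>

// Hypothetical helper: true when this translation unit was compiled
// with OpenMP enabled (the compiler defines _OPENMP in that case).
bool openmp_enabled() {
#if defined(_OPENMP)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: the value of _OPENMP (a yyyymm date identifying
// the supported OpenMP specification), or 0 when OpenMP is off.
long openmp_version_macro() {
#if defined(_OPENMP)
    return _OPENMP;
#else
    return 0;
#endif
}

// Hypothetical helper: prints which ParallelFor branch this build would use.
void print_backend() {
    if (openmp_enabled())
        std::cout << "OpenMP backend, _OPENMP = " << openmp_version_macro() << std::endl;
    else
        std::cout << "std::thread backend" << std::endl;
}
```

Calling `print_backend()` at the start of `main` is a quick way to check that the compiler option took effect before comparing timings.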
The sample code is relatively straightforward.
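The part most worth a second look is how the std::thread branch divides the index range among threads. The sketch below isolates that arithmetic under stated assumptions: `split_range` is a hypothetical helper, not a function from the sample project, and it only returns the `[from, to)` ranges instead of launching threads.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper mirroring the chunking arithmetic of the
// std::thread branch of ParallelFor. Returns the [from, to) range
// that each worker thread would receive.
std::vector<std::pair<int, int>> split_range(int inclusiveFrom, int exclusiveTo,
                                             std::size_t thread_cnt) {
    std::vector<std::pair<int, int>> chunks;
    if (inclusiveFrom >= exclusiveTo || thread_cnt == 0)
        return chunks;  // empty range, or no usable thread count

    const std::size_t total = static_cast<std::size_t>(exclusiveTo - inclusiveFrom);
    const std::size_t per_thread = total / thread_cnt;  // integer division, rounds down

    if (per_thread < 1) {
        // Fewer entries than threads: the sample code runs serially,
        // which corresponds to a single chunk covering everything.
        chunks.emplace_back(inclusiveFrom, exclusiveTo);
        return chunks;
    }

    const int step = static_cast<int>(per_thread);
    for (int start = inclusiveFrom; start < exclusiveTo; start += step) {
        int end = start + step;
        if (end > exclusiveTo)
            end = exclusiveTo;  // clamp the last chunk
        chunks.emplace_back(start, end);
    }
    return chunks;
}
```

One consequence of rounding down: when the range does not divide evenly, the remainder becomes an extra chunk. For example, `split_range(0, 10, 4)` yields five chunks of two entries each, so the corresponding `ParallelFor` call would start five threads rather than four.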
An example of modifying the ncnn code is as follows:
#pragma omp parallel for
for (int q=0; q<channels; q++)
{
    const Mat m = src.channel(q);
    Mat borderm = dst.channel(q);
    copy_make_border_image(m, borderm, top, left, type, v);
}
is changed to:
ParallelFor(0, channels, [&](int q)
{
    const Mat m = src.channel(q);
    Mat borderm = dst.channel(q);
    copy_make_border_image(m, borderm, top, left, type, v);
});
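The same substitution can be tried outside ncnn. The sketch below is an illustration under assumptions, not ncnn code: `Channel`, `scale_channel` and `scale_all_channels` are hypothetical stand-ins for ncnn's `Mat` channels and per-channel work, and the `ParallelFor` here is a compact std::thread-only variant of the one in the sample project. The pattern only works when each iteration touches its own channel, which is the same condition `#pragma omp parallel for` requires.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Compact std::thread-only variant of ParallelFor (an assumption for
// this sketch; the sample project's version also has an OpenMP branch).
static void ParallelFor(int inclusiveFrom, int exclusiveTo,
                        const std::function<void(std::size_t)>& func) {
    if (inclusiveFrom >= exclusiveTo)
        return;
    std::size_t thread_cnt = std::thread::hardware_concurrency();
    if (thread_cnt == 0)
        thread_cnt = 1;  // the count may be unknown
    const int total = exclusiveTo - inclusiveFrom;
    const int per_thread = std::max(1, total / static_cast<int>(thread_cnt));

    std::vector<std::thread> threads;
    for (int start = inclusiveFrom; start < exclusiveTo; start += per_thread) {
        const int end = std::min(start + per_thread, exclusiveTo);
        // Capture the bounds by value so each thread keeps its own range.
        threads.emplace_back([&func, start, end] {
            for (int i = start; i < end; ++i)
                func(static_cast<std::size_t>(i));
        });
    }
    for (auto& t : threads)
        t.join();
}

// Hypothetical stand-in for one channel of an image.
using Channel = std::vector<float>;

// Hypothetical per-channel operation: multiplies every value by factor.
void scale_channel(Channel& channel, float factor) {
    for (float& value : channel)
        value *= factor;
}

// The loop "for (q = 0; q < channels; q++)" expressed with ParallelFor.
// Each iteration writes only to channels[q], so no locking is needed.
void scale_all_channels(std::vector<Channel>& channels, float factor) {
    ParallelFor(0, static_cast<int>(channels.size()), [&](std::size_t q) {
        scale_channel(channels[q], factor);
    });
}
```

Capturing by reference with `[&]` is safe here because `ParallelFor` joins every thread before it returns, so the captured variables outlive all workers.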
I originally planned to spend some time converting all of ncnn and releasing a modified version.
On reflection, I decided to post the approach here instead, for anyone who needs it.
You can apply the changes yourself.
If you have other related questions or requests, feel free to contact me by e-mail.
My e-mail address is:
[Email protected]
A method for modifying ncnn's OpenMP asynchronous processing, with C++ sample code