How to use VS's code optimization and OpenMP parallel computing to improve program run speed


I have previously used multithreading to speed up computation-heavy programs, and I know from experience that thread synchronization and debugging are major pitfalls. Recently, for a laboratory project, I learned about Visual Studio's code-optimization settings and about using OpenMP to accelerate computations. These are basic ways to improve program speed: small changes to the code and project settings are enough. Measured with the clock() function, multithreading made my program run up to 60% faster. Details below:

Code optimization:
    • Properties -> Configuration Properties -> C/C++ -> Code Generation: enable an enhanced instruction set, such as Streaming SIMD Extensions 2 (/arch:SSE2), and set the floating-point model to Fast (/fp:fast) to speed up floating-point operations.
    • Properties -> Configuration Properties -> C/C++ -> Optimization: choose Maximize Speed (/O2). Whole Program Optimization (/GL) cannot be enabled in a Debug build; it must be set in the Release configuration.
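For readers who build from the command line, the project-property settings above correspond (to the best of my knowledge) to the following cl.exe flags; the file name main.cpp is a placeholder. Note that /GL requires /LTCG at link time.

```shell
# Hypothetical Release command line matching the settings above
cl /O2 /GL /arch:SSE2 /fp:fast /openmp main.cpp /link /LTCG
```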
OpenMP parallel computing: under VS2012, go to Project Properties -> C/C++ -> Language and enable OpenMP Support, then include the header file <omp.h>. OpenMP is a good choice for multithreaded programming based on data parallelism.
    • Common OpenMP directives
      • parallel: placed before a block of code to indicate that the block will be executed in parallel by multiple threads
      • for: placed before a for loop inside a parallel region so that the loop iterations are distributed among multiple threads; you must ensure that the iterations are independent of one another
      • parallel for: the combination of parallel and for, also placed before a for loop, indicating that the loop iterations are executed in parallel by multiple threads
      • sections: placed before code blocks that may be executed in parallel
      • parallel sections: the combination of parallel and sections
      • critical: placed before a critical section of code
      • single: placed before a block of code that should be executed by only one thread, i.e. the following block runs single-threaded
      • barrier: synchronizes threads within a parallel region; every thread stops when it reaches the barrier and continues only after all threads have reached it
      • atomic: specifies that a memory location is updated atomically
      • master: specifies that a block of code is executed by the master thread
      • ordered: specifies that iterations of a parallelized loop are executed in sequential order
      • threadprivate: specifies that a variable is private to each thread
    • Besides the directives above, OpenMP also provides library functions; several commonly used ones are listed below:
      • omp_get_num_procs: returns the number of processors available to the program
      • omp_get_num_threads: returns the number of active threads in the current parallel region
      • omp_get_thread_num: returns the calling thread's number
      • omp_set_num_threads: sets the number of threads used when executing code in parallel
      • omp_init_lock: initializes a simple lock
      • omp_set_lock: acquires a lock
      • omp_unset_lock: releases a lock; must be paired with omp_set_lock
      • omp_destroy_lock: destroys a lock; must be paired with omp_init_lock
    • Usage of the parallel directive
#pragma omp parallel num_threads(8)
{
    printf("Hello, world! ThreadId = %d\n", omp_get_thread_num());
}

 
 
The printf call above is executed by 8 threads, and the order in which the threads run is not deterministic. Compared with a traditional thread-creation API, OpenMP is equivalent to calling the create-thread function in a loop and then waiting for every thread to finish. If num_threads(8) is removed from the code above, the number of threads created defaults to the actual number of CPU cores.
    • Usage of the for directive
#pragma omp parallel for
for (int j = 0; j < 4; j++)
{
    printf("j = %d, ThreadId = %d\n", j, omp_get_thread_num());
}

 

Each iteration of the for loop is assigned to a separate thread and executed concurrently. Note that without the parallel keyword all four iterations would run in the same thread. The iterations must be independent of one another, and loop variables are best declared inside the loop.
    • Usage of the parallel sections directive
#pragma omp parallel sections
{
    #pragma omp section
    printf("section 1 ThreadId = %d\n", omp_get_thread_num());
    #pragma omp section
    printf("section 2 ThreadId = %d\n", omp_get_thread_num());
    #pragma omp section
    printf("section 3 ThreadId = %d\n", omp_get_thread_num());
    #pragma omp section
    printf("section 4 ThreadId = %d\n", omp_get_thread_num());
}

The code inside each section is executed in parallel, with each section assigned to a different thread. When using sections, try to make the execution time of each section roughly equal; if one section takes much longer than the others, the benefit of parallel execution is lost.




Copyright notice: this is the author's original article; please do not reproduce it without the author's permission.

