Introduction to multi-core development (CSDN Blog)

I. Why do I need multi-core development?

The answer is simple. Current chip manufacturing technology has hit a limit on CPU clock-speed improvements, so scaling performance vertically is no longer practical. Instead, you scale the program horizontally. This is similar to balancing load across multiple servers (horizontal scaling) rather than upgrading a single server to a minicomputer to gain processing power (vertical scaling).

Although the concept of multi-core parallel computing has existed for decades, it only forced itself on programmers' attention once multi-core CPUs became common in PCs.

In essence, multi-core development means writing multithreaded programs. The algorithms we write when studying data structures and algorithms are all single-threaded. The goal of multi-core development is to restructure these algorithms to run on multiple threads, and then to spread those threads evenly across the processor cores at run time to speed up execution.

 

II. How to develop for multiple cores

If you are familiar with POSIX threads (pthreads) or WinAPI threads, you can write the threading code yourself.
If you would rather not deal with low-level thread operations, choose a concurrency platform that automatically coordinates, schedules, and manages multi-core resources. Concurrency platforms include:
Thread-pool libraries, such as the .NET ThreadPool class and Java's java.util.concurrent package
Message-passing environments, such as MPI
Data-parallel programming environments, such as NESL, RapidMind, and ZPL
Task-parallel programming environments, such as Intel's Threading Building Blocks (TBB) and Microsoft's Task Parallel Library (TPL)
Dynamic multithreading environments, such as Cilk or Cilk++, or the industry-standard OpenMP
These platforms support multi-core development through language abstractions, annotations (such as compiler directives), or library functions.
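To make the "annotation" style concrete, here is a minimal sketch in C with OpenMP. The function name sum_array is made up for illustration. The only parallel-specific line is the pragma: a compiler with OpenMP enabled (for example, GCC with -fopenmp) splits the loop across threads, while a compiler without it ignores the pragma and runs the same loop serially.

```c
#include <stddef.h>

/* Sum an array of ints.
 * With OpenMP enabled, the iterations are divided among threads and the
 * per-thread partial sums are combined by the reduction clause.
 * Without OpenMP, the pragma is ignored and the result is the same. */
long sum_array(const int *data, size_t n)
{
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        total += data[i];
    }
    return total;
}
```

Calling sum_array(values, count) looks exactly like calling a serial function, which is the point: the threading details stay out of the caller's code.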

 

III. Advantages of using a concurrency platform

Consider the following aspects.

The three most important factors in software development are:
Program performance (multi-core development exists to improve it)
Development time
Program reliability

Three factors affect development time:

Scalability: if you write the threading code yourself, you must consider whether the user has a dual-core, quad-core, or eight-core machine, how to adapt the number of threads to the number of cores automatically, and how to balance the threads' load across the cores.
Concise code: code that drives a low-level thread library directly is very complicated.
Modularity: using a low-level thread library directly also undermines the modularity of the code.

IV. A concrete example

The following example uses the Fibonacci sequence, whose recursive algorithm is often used to illustrate multi-core development.

In the single-core era, we would write the Fibonacci code as follows:

    #include <stdio.h>
    #include <stdlib.h>

    /* Naive recursive Fibonacci: fib(n) = fib(n-1) + fib(n-2). */
    int fib(int n)
    {
        if (n < 2) return n;
        else {
            int x = fib(n - 1);
            int y = fib(n - 2);
            return x + y;
        }
    }

    int main(int argc, char *argv[])
    {
        if (argc < 2) return 1;  /* expects n as the first argument */
        int n = atoi(argv[1]);
        int result = fib(n);
        printf("Fibonacci of %d is %d.\n", n, result);
        return 0;
    }

The core of this algorithm is f(n) = f(n-1) + f(n-2). When n is large, we would like to know whether the two subproblems, f(n-1) and f(n-2), can be computed simultaneously on a dual-core processor.

 

The code that uses POSIX threads (pthreads) directly is as follows:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    int fib(int n)
    {
        if (n < 2) return n;
        else {
            int x = fib(n - 1);
            int y = fib(n - 2);
            return x + y;
        }
    }

    /* Argument and result slot passed to the worker thread. */
    typedef struct {
        int input;
        int output;
    } thread_args;

    void *thread_func(void *ptr)
    {
        int i = ((thread_args *) ptr)->input;
        ((thread_args *) ptr)->output = fib(i);
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        pthread_t thread;
        thread_args args;
        int status;
        int result;

        if (argc < 2) return 1;
        int n = atoi(argv[1]);
        if (n < 30) result = fib(n);
        else {
            args.input = n - 1;
            status = pthread_create(&thread, NULL, thread_func, (void *) &args);
            if (status != 0) return 1;  /* the thread could not be created */
            // Main can continue executing while the thread executes.
            result = fib(n - 2);
            // Wait for the thread to terminate.
            pthread_join(thread, NULL);
            result += args.output;
        }
        printf("Fibonacci of %d is %d.\n", n, result);
        return 0;
    }

Note the test if (n < 30) in main: when n is less than 30 the computation is already very fast, so multithreading is not needed. When n is 30 or greater, we create a thread to compute f(n-1) while the main thread goes on to compute f(n-2). Once both have finished (pthread_join(thread, NULL);), we add their results.

This example shows the disadvantages of managing threads yourself:

1. This example works by assigning two threads to two cores. If a task needed 16 threads running at the same time and we did not know how many cores the customer's CPU has, deciding how to distribute the work would become a problem.

2. The code is not concise.

3. The extra structure and function break up the algorithm.
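To give a feel for the first point, here is a rough sketch of what adapting to an unknown number of cores looks like with raw pthreads. It sums an array rather than computing Fibonacci, because an array is easy to split evenly. All names (sum_job, sum_worker, default_workers, parallel_sum) are hypothetical, and sysconf(_SC_NPROCESSORS_ONLN) is a widely available extension on Linux and macOS rather than something every system guarantees.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Work description for one thread: sum data[begin..end). */
typedef struct {
    const int *data;
    size_t begin, end;
    long partial;
    int started;   /* nonzero if a thread was created for this job */
} sum_job;

static void *sum_worker(void *ptr)
{
    sum_job *job = ptr;
    long s = 0;
    for (size_t i = job->begin; i < job->end; i++)
        s += job->data[i];
    job->partial = s;
    return NULL;
}

/* Number of online cores, or 1 if the system cannot report it. */
size_t default_workers(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (size_t)n : 1;
}

/* Sum n ints using at most `workers` threads, one contiguous chunk each. */
long parallel_sum(const int *data, size_t n, size_t workers)
{
    long total = 0;
    if (workers == 0) workers = 1;
    if (workers > n) workers = n ? n : 1;   /* never more threads than items */

    pthread_t *threads = malloc(workers * sizeof *threads);
    sum_job *jobs = malloc(workers * sizeof *jobs);
    if (threads == NULL || jobs == NULL) {
        /* Out of memory: fall back to a plain serial loop. */
        for (size_t i = 0; i < n; i++) total += data[i];
        free(threads);
        free(jobs);
        return total;
    }

    for (size_t t = 0; t < workers; t++) {
        jobs[t].data = data;
        jobs[t].begin = n * t / workers;        /* even split of [0, n) */
        jobs[t].end = n * (t + 1) / workers;
        jobs[t].partial = 0;
        jobs[t].started =
            pthread_create(&threads[t], NULL, sum_worker, &jobs[t]) == 0;
        if (!jobs[t].started)
            sum_worker(&jobs[t]);   /* could not create a thread: do it here */
    }
    for (size_t t = 0; t < workers; t++) {
        if (jobs[t].started)
            pthread_join(threads[t], NULL);
        total += jobs[t].partial;
    }
    free(threads);
    free(jobs);
    return total;
}
```

A caller would write parallel_sum(data, n, default_workers()). Even for a loop this simple, the bookkeeping is several times longer than the computation itself, which is the cost the list above describes.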

 

The following examples use concurrency platforms instead.

Using OpenMP:

 

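    int fib(int n) {
        int i, j;
        if (n < 2)
            return n;
        else {
            /* Each recursive call becomes a task. i and j must be shared
               so that the tasks write to this call's variables. */
            #pragma omp task shared(i)
            i = fib(n - 1);
            #pragma omp task shared(j)
            j = fib(n - 2);
            /* Wait for both child tasks before reading i and j. */
            #pragma omp taskwait
            return i + j;
        }
    }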
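The OpenMP listing shows only the recursive function. For the tasks to actually run on several threads, the first call has to come from inside a parallel region, and normally from a single thread within it. The following is a minimal sketch of such a driver; the names fib_task and fib_parallel are made up for illustration.

```c
/* Task-parallel Fibonacci, following the OpenMP listing above. */
static int fib_task(int n)
{
    int i, j;
    if (n < 2)
        return n;
    #pragma omp task shared(i)
    i = fib_task(n - 1);
    #pragma omp task shared(j)
    j = fib_task(n - 2);
    #pragma omp taskwait
    return i + j;
}

/* Entry point: create the thread team once, let one thread start the
 * recursion, and let the other threads pick up the generated tasks. */
int fib_parallel(int n)
{
    int result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

If the code is compiled without OpenMP support, the pragmas are ignored and fib_parallel simply computes the result serially.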

Using Cilk++:

    #include <stdio.h>
    #include <stdlib.h>

    int fib(int n)
    {
        if (n < 2) return n;
        else {
            int x = cilk_spawn fib(n - 1);  /* may run in parallel with the next line */
            int y = fib(n - 2);
            cilk_sync;                      /* wait for the spawned call to finish */
            return x + y;
        }
    }

    int main(int argc, char *argv[])
    {
        int n = atoi(argv[1]);
        int result = fib(n);
        printf("Fibonacci of %d is %d.\n", n, result);
        return 0;
    }

Using the .NET Task Parallel Library (Visual Basic):

    ' Future is the type name from early previews of the library.
    Private Function FiboFullParallel(ByVal N As Long) As Long
        If N <= 0 Then Return 0
        If N = 1 Then Return 1
        Dim t1 As Tasks.Future(Of Long) = Tasks.Future(Of Long).Create(Function() FiboFullParallel(N - 1))
        Dim t2 As Tasks.Future(Of Long) = Tasks.Future(Of Long).Create(Function() FiboFullParallel(N - 2))
        Return t1.Value + t2.Value
    End Function

As you can see, whichever concurrency platform is used, the code stays concise and the original algorithm's structure is preserved. A simple transformation is enough to get automatic task assignment.

V. When should you use multi-core programming?

If a task takes only about 10 milliseconds to run, there is no point in spreading it across cores: splitting the work among threads on multiple cores and then collecting the results costs roughly 100 milliseconds of overhead (depending, of course, on the machine and the compiler you are using), and it consumes extra memory as well.
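One common way to act on this is a size cutoff: below some threshold the work is done serially, and only larger pieces are handed to the scheduler. The sketch below applies that idea to the OpenMP Fibonacci example. SERIAL_CUTOFF, fib_serial, and fib_cutoff are hypothetical names, and the value 20 is only a placeholder that would need to be tuned by measurement on the target machine.

```c
/* Hypothetical threshold: subproblems smaller than this run serially. */
#define SERIAL_CUTOFF 20

static int fib_serial(int n)
{
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Creates tasks only for subproblems large enough to be worth the
 * scheduling overhead. To run in parallel, call it from inside an
 * OpenMP parallel region (typically from a single thread). */
int fib_cutoff(int n)
{
    int x, y;
    if (n < SERIAL_CUTOFF)
        return fib_serial(n);
    #pragma omp task shared(x)
    x = fib_cutoff(n - 1);
    #pragma omp task shared(y)
    y = fib_cutoff(n - 2);
    #pragma omp taskwait
    return x + y;
}
```

The pthreads example earlier makes the same kind of decision with its if (n < 30) test in main; here the test is applied at every level of the recursion.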

In OpenMP, the if clause attaches a condition to parallel execution. In the following code, the loop runs serially when n is 100000 or less and runs in parallel only when n is larger:

    #pragma omp parallel for if (n > 100000)
    for (i = 0; i < n; i++) {
        ...
    }
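For reference, here is the same directive with a loop body filled in. This is only an illustrative sketch: the function scale and its parameters are made up, and the 100000 threshold is taken from the fragment above rather than from any measurement.

```c
/* Multiply every element of a[0..n) by k.
 * The if clause makes the loop run in parallel only for large n;
 * for small n it runs on the calling thread, avoiding the cost of
 * starting and synchronizing a thread team. */
void scale(double *a, int n, double k)
{
    int i;
    #pragma omp parallel for if (n > 100000)
    for (i = 0; i < n; i++) {
        a[i] *= k;
    }
}
```

The caller does not need to know which path was taken: scale(a, 10, 2.0) and scale(a, 1000000, 2.0) are written the same way.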

VI. Postscript

This article aims to show why multi-core development is needed and to give a brief look at how concurrency platforms are used. Real multi-core development is considerably more complex. Today's multi-core PCs are shared-memory systems, even though each core has its own level-1 cache, so threads running on different cores compete for resources at run time. If an application performs I/O (disk or network), the same problem arises there. The difficulty of multi-core design therefore lies in analyzing each case on its own terms to find the bottleneck of the multi-core application, and then eliminating or reducing that bottleneck by improving the data structures or algorithms.

From http://www.360doc.com/content/08/1123/14/7635_1984923.shtml
