.NET Parallel Programming - 6. Common Optimization Strategies


This is the sixth article in the .NET parallel programming series. Today I introduce some common optimization strategies drawn from my practical projects.

One. Avoid sharing data between threads

The main reason to avoid sharing data between threads is locking: whatever the granularity of the lock, the best form of synchronization between threads is no lock at all. The key measure here is to work out which data actually needs to be shared and which does not, and then design the program so that threads share as little data as possible.

In a project I worked on previously, the initial design was as follows: all data was put into a single shared queue, and the queue notified multiple threads to process it, using a mutex to keep the threads synchronized. The result was that the threads context-switched so frequently that CPU time was wasted on the switches; the queue became congested and could not process data in time, and the program eventually ran out of memory.

The first idea for solving this problem was a finer-grained lock, such as atomic operations. But using atomic operations requires a back-off mechanism to guard against livelock and to avoid wasting CPU time, among a series of other problems; by the time that work is done you have effectively reimplemented a mutex such as lock. So we went back over the business requirements, and the revised design was as follows:

After data arrives, a dedicated data forwarder distributes it across separate queues, which avoids competition between the threads. The forwarder is in effect the load balancer we put in front of web sites: it can choose a queue based on each queue's backlog, or schedule dynamically based on CPU and memory utilization. The improved design met the performance requirements.
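The redesign above can be sketched as follows. This is a minimal illustration, not the original project's code: a dispatcher distributes incoming items round-robin across per-worker queues, so workers consume from their own queue and never contend on one shared lock.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Illustrative sketch: one forwarder, one queue per worker.
class Dispatcher
{
    private readonly BlockingCollection<int>[] _queues;
    private int _next;

    public Dispatcher(int workers)
    {
        _queues = new BlockingCollection<int>[workers];
        for (int i = 0; i < workers; i++)
            _queues[i] = new BlockingCollection<int>();
    }

    // Round-robin forwarding; a smarter policy could weigh queue
    // backlog or CPU/memory utilization, as described above.
    public void Forward(int item)
    {
        _queues[_next].Add(item);
        _next = (_next + 1) % _queues.Length;
    }

    // Each worker drains only its own queue: no cross-thread contention.
    public Task StartWorker(int index, Action<int> handle) => Task.Run(() =>
    {
        foreach (var item in _queues[index].GetConsumingEnumerable())
            handle(item);
    });

    public void Complete() { foreach (var q in _queues) q.CompleteAdding(); }
}
```

`Forward` is called only from the single dispatcher thread, so `_next` needs no synchronization; the per-worker `BlockingCollection` handles the handoff.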

So in multithreaded development, try to avoid locks; where a lock is unavoidable, choose an appropriate one. My principle for choosing the type of lock is: as long as the performance requirements are met, do not deliberately pursue fine-grained locking. Coarse-grained locks are slower but easy to use and understand; fine-grained locks are faster but hard to use and understand. For operating-system locks, refer to the thread synchronization sections of Windows via C/C++ (Windows Core Programming); on the .NET platform, refer to the threading chapters of CLR via C#.
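As a small illustration of the granularity trade-off (both versions are correct; this is a sketch, not a benchmark): a coarse-grained `lock` is simple to reason about, while `Interlocked` avoids taking a monitor for a tiny critical section.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Illustrative comparison of coarse-grained and fine-grained
// synchronization for one shared counter.
static class Counters
{
    private static readonly object Gate = new object();
    private static long _coarse, _fine;

    // Coarse-grained: easy to use and understand.
    public static long CoarseIncrementMany(int n)
    {
        Parallel.For(0, n, _ => { lock (Gate) _coarse++; });
        return Interlocked.Read(ref _coarse);
    }

    // Fine-grained: a single atomic instruction, no monitor.
    public static long FineIncrementMany(int n)
    {
        Parallel.For(0, n, _ => Interlocked.Increment(ref _fine));
        return Interlocked.Read(ref _fine);
    }
}
```

If the coarse version already meets the performance target, per the principle above, there is no need to switch to the atomic one.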

Two. Watch for CPU cache invalidation and avoid frequent context switches

Caching is often the key to performance when developing multicore programs; the cache here means the CPU caches (L1, L2, L3). Used properly, the cache can often improve performance by a factor of two or more.

1. Causes of CPU cache invalidation:

(1) Frequent modification of in-memory data.

(2) Synchronization mechanisms such as atomic operations and lock.

(3) Thread context switches.

(4) False sharing, which causes cache lines to be flushed repeatedly.
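Cause (4), false sharing, occurs when counters updated by different threads happen to sit on the same cache line, forcing that line to bounce between cores. A common mitigation is to pad each counter onto its own cache line; the sketch below assumes a 64-byte line (typical on x86, but hardware-dependent):

```csharp
using System.Runtime.InteropServices;
using System.Threading.Tasks;

// Pad the counter so that two adjacent array elements can never
// share a cache line (assumes a 64-byte line; adjust per CPU).
[StructLayout(LayoutKind.Explicit, Size = 128)]
struct PaddedCounter
{
    [FieldOffset(64)] public long Value; // 64 bytes of space on each side
}

static class FalseSharingDemo
{
    // Two threads each hammer their own counter; with padding, the
    // writes never invalidate the other thread's cache line.
    public static long[] Run(int perThread)
    {
        var counters = new PaddedCounter[2];
        Parallel.For(0, 2, i =>
        {
            for (int n = 0; n < perThread; n++) counters[i].Value++;
        });
        return new[] { counters[0].Value, counters[1].Value };
    }
}
```

Without the padding the code is still correct, just slower under contention, which is what makes false sharing easy to miss.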

2. Causes of frequent context switching:

(1) The program's own threads compete for CPU resources, so the CPU cannot schedule other threads in time.

(2) Many threads wait for a mutex that only one of them can acquire at a time, so the other threads constantly switch between waking and sleeping.

3. Solutions to the above problems (for reference only; different projects call for different approaches):

(1) Avoid using any type of lock.

(2) While still meeting the performance requirements, use the fewest threads to do the least work.

(3) Redesign to avoid modifying data. For example, in a real-time computation program I developed earlier, all data was read-only; when a modification was needed, a new copy was created to replace the old one, much like programming in Erlang.
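Solution (3) can be sketched with a copy-on-write store (illustrative code, not the original program): writers build a new snapshot and atomically swap the reference, so readers always see a complete array that is never mutated.

```csharp
using System.Threading;

// Copy-on-write: data is never modified in place, only replaced.
class SnapshotStore
{
    private int[] _data = new int[0];

    // Readers get a consistent snapshot with no lock at all.
    public int[] Read() => Volatile.Read(ref _data);

    public void Append(int value)
    {
        while (true)
        {
            var old = Volatile.Read(ref _data);
            var next = new int[old.Length + 1];   // build a new copy...
            old.CopyTo(next, 0);
            next[old.Length] = value;
            // ...and publish it atomically; retry if a writer raced us.
            if (Interlocked.CompareExchange(ref _data, next, old) == old)
                return;
        }
    }
}
```

The cost is extra allocation on every write, so this pattern suits read-heavy data, which matches the read-only real-time data described above.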

For the fundamentals of caching, see Chapter 6 of Computer Systems: A Programmer's Perspective.

Three. Thread pool usage scenarios and caveats:

(1) Tasks with short execution times should go to the thread pool rather than a newly created thread; tasks that need lengthy processing should be handled by a dedicated thread.

(2) Do not hand file-writing tasks to the thread pool: thread pool threads are background threads, so if the application exits unexpectedly they are terminated and unflushed data is lost.

(3) Never develop a thread pool on your own. It takes months of work and a great deal of testing to produce a product-grade thread pool; otherwise you will run into problems when it is too late.
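Point (1) can be expressed with the Task API (a minimal sketch): `Task.Run` queues short work to the thread pool, while `TaskCreationOptions.LongRunning` hints that the work deserves its own thread rather than tying up a pool thread.

```csharp
using System;
using System.Threading.Tasks;

static class WorkScheduling
{
    // Short-lived work: let the thread pool amortize thread creation.
    public static Task<int> ShortWork(int x) =>
        Task.Run(() => x * 2);

    // Lengthy work: LongRunning asks the scheduler for a dedicated
    // thread so the pool is not starved.
    public static Task LongWork(Action body) =>
        Task.Factory.StartNew(body, TaskCreationOptions.LongRunning);
}
```

Note that both still run on background threads, so caveat (2) about data loss on unexpected exit applies either way; critical file writes need a foreground thread or explicit flushing.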

Four. Optimization on NUMA machines

(1) Modern servers are mostly NUMA machines. On these machines it is best to enable the .NET server garbage collection mode, so that objects are allocated in memory local to the CPU the thread recently ran on.
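On the .NET Framework, server GC is enabled in the application configuration file; this is the standard `gcServer` switch:

```xml
<configuration>
  <runtime>
    <!-- Server GC: one GC heap and thread per core, so allocations
         stay local to the executing CPU on NUMA machines. -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```

On modern .NET (Core and later) the equivalent knob is the `System.GC.Server` property in `runtimeconfig.json` or the `ServerGarbageCollection` MSBuild property.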

(2) Do not bind threads to specific cores or raise thread priorities in .NET programs: if a bound or boosted thread ends up competing with the garbage collection threads, performance gets worse, not better.

Five. Choosing the right programming model

Parallel programs are written with a fixed set of programming models; essentially all other models are free combinations of these. The common models are:

(1) Data parallelism

(2) Task parallelism

(3) Pipeline parallelism
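The three models above can each be sketched in a few lines of .NET code (illustrative examples only):

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class Models
{
    // (1) Data parallelism: the same operation over partitions of one data set.
    public static long SumOfSquares(int[] data) =>
        data.AsParallel().Select(x => (long)x * x).Sum();

    // (2) Task parallelism: different independent operations run concurrently.
    public static (int Min, int Max) MinAndMax(int[] data)
    {
        var min = Task.Run(() => data.Min());
        var max = Task.Run(() => data.Max());
        return (min.Result, max.Result);
    }

    // (3) Pipeline parallelism: stages connected by a queue, with the
    // producer stage and consumer stage running at the same time.
    public static int[] PipelineDouble(int[] data)
    {
        var stage = new BlockingCollection<int>();
        var producer = Task.Run(() =>
        {
            foreach (var x in data) stage.Add(x);
            stage.CompleteAdding();
        });
        var result = stage.GetConsumingEnumerable()
                          .Select(x => x * 2).ToArray();
        producer.Wait();
        return result;
    }
}
```

Real programs usually combine these, for example running a data-parallel step inside one stage of a pipeline.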

Six. Should threads pull data from the queue, or should the queue push it?

In general, a multithreaded program puts incoming data into a queue and returns from the call immediately, leaving other threads to process it, so the client gets a fast response. This raises the question of whether the queue should send a signal to notify a processing thread (push), or the threads should fetch data from the queue on a timer (pull). With pure push, a lost signal means some data is never processed and stays in the queue; with pure pull, the sleep interval is hard to tune: sleep too long and data is processed slowly, sleep too little and CPU is wasted. So I generally combine the two approaches; for details see the fourth article in this series, .NET Parallel Programming - 4. Implementing a High-Performance Asynchronous Queue.
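The combined approach can be sketched like this (an illustration, not the code from article 4): the consumer blocks until new data wakes it (push), but also wakes on a timeout and drains whatever is queued (pull), so a lost signal cannot strand data and no CPU is burned spinning.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class HybridQueue<T>
{
    private readonly BlockingCollection<T> _items = new BlockingCollection<T>();

    public void Enqueue(T item) => _items.Add(item); // also wakes the consumer
    public void Complete() => _items.CompleteAdding();

    public Task Consume(Action<T> handle, int pollMs = 100) => Task.Run(() =>
    {
        while (!_items.IsCompleted)
        {
            // Woken immediately by Enqueue (push), or after pollMs
            // as a safety-net poll (pull); false just means "re-check".
            if (_items.TryTake(out var item, pollMs))
                handle(item);
        }
    });
}
```

`BlockingCollection` does the signalling internally; the timeout on `TryTake` is what turns pure push into the hybrid scheme.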

Seven. Asynchronous IO or synchronous IO?

Asynchronous IO solves the thread-blocking problem of synchronous IO (there are two kinds of IO: disk and network). Essentially all web servers use asynchronous network IO, but for disks it is best not to use asynchronous IO, except on machines with SSDs, because asynchronous reads and writes can fragment the data stored on disk: an operation that could have been a sequential write may end up as random writes.
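In .NET, asynchronous IO is opted into per stream; the sketch below shows both forms for a file (illustrative only, and note the article's advice applies per device type, not per API):

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class IoSketch
{
    // Asynchronous write: FileOptions.Asynchronous requests overlapped IO
    // so the calling thread is free during the wait.
    public static async Task WriteAsync(string path, string text)
    {
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write,
                                       FileShare.None, 4096,
                                       FileOptions.Asynchronous))
        {
            var bytes = Encoding.UTF8.GetBytes(text);
            await fs.WriteAsync(bytes, 0, bytes.Length);
        }
    }

    // Synchronous sequential write: the article's preference for
    // spinning disks, where write order matters.
    public static void WriteSync(string path, string text) =>
        File.WriteAllText(path, text);
}
```

The same `async`/`await` pattern over `NetworkStream` is the asynchronous network IO that web servers rely on.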

Conclusion:

The designs in this article cover only a small part of parallel program optimization; the rest we have to accumulate through practice.

What I set out to write was ".NET parallelism", but the result turned out not to be specific to .NET; of course, this kind of basic knowledge has little to do with any particular language.

The content of this article is only a suggestion. We should not be dogmatic when writing programs; whatever is reasonable is best. Some principles do not suit every situation, and what we do is keep exploring and adapting to change in order to improve program performance.
