.NET Parallel Programming 5: The Pipeline Model in Practice

Source: Internet
Author: User

I have a long list of topics in Excel that I want to write about, but I have been busy lately (in fact, that is just an excuse)...

The previous article, ".NET Parallel Programming 4: Implementing a High-Performance Asynchronous Queue", described the implementation of an asynchronous queue. This article describes a multithreading problem I ran into in real-world work, and a solution that uses that asynchronous queue as its underlying data structure.

The requirements are as follows:

1. Provide a data write service for upper-layer applications to call. Its throughput must reach 60w/s, that is, 600,000 records per second: users send 600,000 records per second, and the service writes them to the database (a real-time database developed in-house by the company).

2. Keep the service as simple as possible for the upper-layer applications that call it.

First, analysis of the performance bottlenecks:

1. The real-time database requires every incoming record to carry an ID (similar to an auto-increment primary key in a relational database), but users pass only a name. The service has to look up the corresponding ID in the real-time database by name, so this frequent interaction between the local process and the database is the biggest bottleneck.

2. The database's original .NET API is a secondary development on top of the underlying C++ API, so data has to be converted between the .NET and C++ programming models. For example, .NET stores the data in a List while C++ uses an array, so the .NET List has to be converted into a C++ array object. This conversion takes a considerable amount of time, so it is also one of the performance bottlenecks.

Second, the solution:

1. Bottleneck 1 can be addressed with a local cache that maintains a direct mapping between name and database ID, so an ID lookup only needs to consult the local cache. Because the cache is read far more often than it is written, it uses an ordinary Dictionary plus a read-write lock: reading data takes only the read lock, and adding new data takes the write lock. The lookup process was designed around this.
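As a rough illustration of the cache described above, here is a minimal sketch in Python (chosen only for brevity; in .NET the natural counterparts would be a Dictionary guarded by ReaderWriterLockSlim). The names ReadWriteLock, NameIdCache, and fetch_id are hypothetical, and fetch_id stands in for the real database lookup, which the article does not show.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Minimal readers-writer lock: many readers or one writer at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


class NameIdCache:
    """Local name -> database ID cache (hypothetical class, for illustration)."""

    def __init__(self, fetch_id):
        self._fetch_id = fetch_id      # assumed stand-in for the database lookup
        self._ids = {}
        self._lock = ReadWriteLock()

    def get_id(self, name):
        with self._lock.read():        # common case: shared read lock only
            if name in self._ids:
                return self._ids[name]
        with self._lock.write():       # cache miss: exclusive write lock
            if name not in self._ids:  # re-check, another thread may have filled it
                self._ids[name] = self._fetch_id(name)
            return self._ids[name]
```

One simplification to be aware of: the sketch holds the write lock while calling the database, so readers wait during a miss. Whether the original service does the same is not stated.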

2. Bottleneck 2 can be addressed with parallelism. Parallel models fall into three kinds: data parallelism, task parallelism, and pipeline parallelism. For a basic introduction to the three, see the second chapter, on parallel thinking, of the Intel Threading Building Blocks programming guide. With that theory as support, I designed a pipeline model.

In the pipeline model, each operation runs in its own thread, and queues decouple one operation from the next. Because the pipeline model is a classic trade of space for time, a large amount of data accumulates between operations, and the data in the queues is lost if the program crashes, so the system also has to be highly fault tolerant. The queue from the previous article was written with exception-handling details in mind from the start, which greatly reduces the complexity of exception handling and multithreaded programming here. The number of threads per operation is also adjustable. (Ideally, load would be balanced dynamically based on each operation's processing power and CPU utilization, but a configured thread count already met the requirements, so dynamic load balancing was not developed.)
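The shape of such a pipeline can be sketched as follows. This is a minimal Python illustration, not the article's .NET implementation: it uses the standard queue.Queue in place of the asynchronous queue from the previous article, and the names STOP, start_stage, build_pipeline, and the three stage bodies (lookup, conversion, write) are assumptions made up for the example. The bounded queue capacity is also an assumption; it limits how much data can pile up between operations.

```python
import queue
import threading

STOP = object()  # sentinel: tells a stage to finish and pass the signal on


def start_stage(work, inbox, outbox, failed):
    """Run one pipeline operation in its own thread, fed by its own queue."""
    def loop():
        while True:
            item = inbox.get()
            if item is STOP:
                if outbox is not None:
                    outbox.put(STOP)
                return
            try:
                result = work(item)
            except Exception as exc:   # a bad record must not kill the stage
                failed.append((item, exc))
                continue
            if outbox is not None:
                outbox.put(result)

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def build_pipeline(name_to_id, written, failed, capacity=1000):
    """Wire three hypothetical stages: name -> ID lookup, conversion, write."""
    q_in, q_ids, q_rows = (queue.Queue(maxsize=capacity) for _ in range(3))
    threads = [
        start_stage(lambda r: (name_to_id[r[0]], r[1]), q_in, q_ids, failed),
        start_stage(lambda r: [r[0], float(r[1])], q_ids, q_rows, failed),
        start_stage(written.append, q_rows, None, failed),  # stand-in for the DB write
    ]
    return q_in, threads
```

To use it, put (name, value) records on the returned input queue, then put STOP and join the threads. With one thread per stage, records leave the pipeline in the order they entered. The sketch keeps queued data only in memory, so it does not address the crash-safety concern raised above.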

Third, scalability:

If the data volume grows in the future, processing capacity has to grow with it. The simplest solution is to dynamically increase the number of threads per operation. However, because the operations communicate through queues, locking is unavoidable, and many threads enqueuing and dequeuing at the same time cause frequent context switches and degrade performance. To avoid this contention between threads, multiple pipelines can be used, each running the same process. Load can be balanced across the pipelines with algorithms such as weighted round-robin or hashing.
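The two balancing strategies named above can be sketched in a few lines of Python. All names here (pick_by_hash, weighted_round_robin, make_inboxes, dispatch) are hypothetical, and the weighted round-robin is the simplest possible variant, which repeats each pipeline index as many times as its weight. CRC32 is used for hashing only because it is stable across runs; any stable hash would do.

```python
import itertools
import queue
import zlib


def pick_by_hash(name, pipeline_count):
    """Map a name to a pipeline index; the same name always gets the same one."""
    return zlib.crc32(name.encode("utf-8")) % pipeline_count


def weighted_round_robin(weights):
    """Yield pipeline indexes forever, in proportion to integer weights."""
    slots = [i for i, w in enumerate(weights) for _ in range(w)]
    return itertools.cycle(slots)


def make_inboxes(pipeline_count):
    """One input queue per pipeline."""
    return [queue.Queue() for _ in range(pipeline_count)]


def dispatch(records, inboxes):
    """Route each (name, value) record to the pipeline chosen by hash."""
    for name, value in records:
        inboxes[pick_by_hash(name, len(inboxes))].put((name, value))
```

Hashing keeps all records for one name on the same pipeline, which preserves their relative order; weighted round-robin spreads load more evenly when pipelines differ in capacity but gives up that per-name ordering.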


The overall time of the pipeline model depends on its most time-consuming operation, so that part deserves the most attention during design and coding: when using a pipeline model, keep the duration of the slowest operation as short as possible. In this case, the most time-consuming part is writing to the database.
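To make the point about the slowest operation concrete, here is a small back-of-the-envelope calculation in Python. The function name pipeline_throughput and all stage timings are made-up numbers for illustration, not measurements from the article, and the model assumes idealized linear scaling with thread count, which ignores the queue contention discussed earlier.

```python
def pipeline_throughput(stages):
    """stages maps name -> (microseconds_per_item, thread_count).

    Returns (bottleneck_stage, items_per_second), assuming each stage
    scales linearly with its thread count.
    """
    rates = {
        name: threads * 1_000_000 / micros
        for name, (micros, threads) in stages.items()
    }
    bottleneck = min(rates, key=rates.get)
    return bottleneck, rates[bottleneck]


# Hypothetical timings: the write stage is the slowest.
stages = {"lookup": (1, 1), "convert": (2, 2), "write": (5, 2)}
print(pipeline_throughput(stages))
```

With these assumed numbers the write stage caps the whole pipeline at 400,000 items per second, below the 60w/s (600,000 per second) requirement, no matter how fast the other stages are. Giving the write stage a third thread, or making each write faster, is what moves the overall figure.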
