Several difficulties in multi-core programming and their countermeasures (Challenge 1)

Http://blog.csdn.net/woshiqianlong125/article/details/6159671

Related articles: Load Balancing in Multi-core Programming; Lock Contention in Multi-core Programming; OpenMP Parallel Program Design (I); OpenMP Parallel Program Design (II); Quicksort Efficiency on Dual-core CPUs

With the advent of multi-core CPUs, the problems of multi-core programming have been placed on programmers' agendas. Many veteran programmers think that multi-CPU machines have existed for a long time, that the industry has accumulated plenty of experience programming them, and that programming a multi-core CPU should be much the same: it is enough to draw on existing experience with multitasking, parallel programming, and parallel algorithms.

What I want to say is that multi-core machines are very different from earlier multi-CPU machines. Earlier multi-CPU machines were used in specific fields, such as servers or fields that lend themselves to large-scale parallel computing, and those fields can easily take advantage of multiple CPUs. Multi-core machines, by contrast, are used by ordinary users everywhere. Client machines in particular now have multi-core CPUs, yet much client software, unlike server software or software for specific fields, has no large-scale parallel computation with which to exploit them.

When I talked with Mr. Meng about multi-core programming at the CSDN conference, he was very pessimistic about its future; his view had changed completely since I saw him last year. Mr. Meng has a deep understanding of multi-core programming, but for lack of time I was unable to discuss it with him in depth. On the way back I thought again about the difficulties of multi-core programming, and today I quickly wrote them up to share with everyone.
Problem 1: Difficulties caused by serialization

1) Speedup

A commonly used metric for the performance of a multiprocessor system is the speedup (acceleration coefficient), defined as follows:

S(p) = execution time on a single processor (using the best sequential algorithm) / execution time on p processors

2) Amdahl's Law

In parallel processing there is Amdahl's law, expressed by the following equation:

S(p) = p / (1 + (p - 1) * f)

where S(p) is the speedup, p is the number of processors, and f is the proportion of the whole program's execution time taken by the serial part.

When f = 5% and p = 20, S(p) = 10.256.
When f = 5% and p = 100, S(p) is about 16.8.

In other words, with a serial part of only 5%, increasing the number of processors from 20 to 100 raises the speedup only from 10.256 to about 16.8: five times as many processors yield only a little over 60% more speed. Even with an infinite number of processors, the maximum speedup is only 1/f = 20.

Taken at face value, Amdahl's law leaves multi-core systems almost no room to develop. Even if only 1% of the software cannot be parallelized, the maximum speedup is 100, and adding more CPUs beyond that brings no further improvement. By this law, multi-core CPUs can keep Moore's law going for only a few more years before it hits a limit.

3) Gustafson's Law

Gustafson proposed assumptions different from those of Amdahl's law to show that the speedup can exceed Amdahl's bound. Gustafson assumed that the serial part of the software is fixed and does not grow as the problem scale grows, and that the execution time of the parallel part is also fixed (this may be the case for server software).
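The Amdahl's law figures quoted in 2) can be checked with a short calculation. The following is only a minimal sketch; the function names amdahl_speedup and amdahl_limit are hypothetical and chosen for illustration.

```python
def amdahl_speedup(p: int, f: float) -> float:
    """Speedup S(p) = p / (1 + (p - 1) * f) for p processors and serial fraction f."""
    return p / (1 + (p - 1) * f)


def amdahl_limit(f: float) -> float:
    """Upper bound on the speedup as p grows without bound: 1 / f."""
    return 1 / f


if __name__ == "__main__":
    # Serial fraction of 5%, as in the text; 1000 processors added for comparison.
    for p in (20, 100, 1000):
        print(p, round(amdahl_speedup(p, 0.05), 3))
    print("limit:", amdahl_limit(0.05))
```

Running it with f = 0.05 shows the speedup creeping toward, but never reaching, the bound of 20 as p grows.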
Gustafson's law is stated as follows:

S(p) = p + (1 - p) * f_ts

where f_ts is the proportion of serial execution.

If the serial proportion is 5% and the number of processors is 20, the speedup is 20 + (1 - 20) * 5% = 19.05.
If the serial proportion is 5% and the number of processors is 100, the speedup is 100 + (1 - 100) * 5% = 95.05.

Under Gustafson's law the speedup is almost proportional to the number of processors. If reality matches its assumptions, software performance will keep increasing as processors are added.

4) Serialization analysis in practice

The results of Amdahl's law and Gustafson's law differ so greatly; which law does reality follow? I personally think reality is neither as pessimistic as Amdahl's law nor as optimistic as Gustafson's law. Why? A simple analysis follows.

To estimate the proportion of the serial part, we first need to determine which parts of the software cannot be parallelized. In the 1960s, Bernstein gave three conditions under which two computations, C1 and C2, cannot be executed in parallel:

Condition 1: C1 writes a storage unit, and C2 then reads that unit. This is called a "read-after-write" race.
Condition 2: C1 reads a storage unit, and C2 then writes that unit. This is called a "write-after-read" race.
Condition 3: C1 writes a storage unit, and C2 then writes that unit. This is called a "write-after-write" race.

If any one of these three conditions holds, the two computations cannot be executed in parallel. Unfortunately, a great deal of real software meets these conditions; this is the familiar situation of shared data that must be protected by a lock.

With a fixed number of tasks, the proportion of serialization decreases as the software scale grows; unfortunately, it increases as the number of tasks grows.
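Bernstein's three conditions can be expressed as a check on the sets of storage units that each computation reads and writes. The sketch below is illustrative only: the Task class, the conflicts and can_run_in_parallel functions, and the string labels (the conventional names for the three cases) are assumptions made for this example, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Task:
    """A unit of work described only by the storage units it touches (hypothetical model)."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def conflicts(c1: Task, c2: Task) -> list:
    """Return which of the three conditions hold when C1 is followed by C2."""
    found = []
    if c1.writes & c2.reads:      # Condition 1: C1 writes a unit, C2 reads it
        found.append("read-after-write")
    if c1.reads & c2.writes:      # Condition 2: C1 reads a unit, C2 writes it
        found.append("write-after-read")
    if c1.writes & c2.writes:     # Condition 3: C1 writes a unit, C2 writes it
        found.append("write-after-write")
    return found


def can_run_in_parallel(c1: Task, c2: Task) -> bool:
    """Two tasks may run in parallel only if none of the three conditions holds."""
    return not conflicts(c1, c2)


if __name__ == "__main__":
    producer = Task("producer", reads=frozenset({"a"}), writes=frozenset({"x"}))
    consumer = Task("consumer", reads=frozenset({"x"}), writes=frozenset({"y"}))
    independent = Task("independent", reads=frozenset({"a"}), writes=frozenset({"z"}))
    print(conflicts(producer, consumer))
    print(can_run_in_parallel(producer, independent))
```

Two tasks that only read the same unit produce no conflict, which is why read-only shared data needs no lock.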
The more processors there are, the more severe the serialization caused by lock contention, so the proportion of serialization rises sharply as the number of processors increases. (I will explain how lock contention intensifies serialization in another article.) Serialization is therefore a major challenge for multi-core programming.

5) Possible solutions

The first solution to the serialization problem is to use fewer locks, or even to adopt lock-free programming. For ordinary programmers, however, this is almost impossible: lock-free algorithms are complex and easy to get wrong. Many lock-free algorithms published in professional journals were later shown to be incorrect, which gives an idea of how difficult this is.

The second solution is to replace locks with atomic operations. Atomic operations do not fundamentally solve the serialization problem, but they make the serialized section much faster and so greatly reduce the time spent in serialized execution. The atomic operations currently offered by chip manufacturers are limited, however, and apply in only a few situations. Chip manufacturers may need to keep working on this and provide more, and more powerful, atomic operations so that locks can be avoided in more places.

The third solution is to reduce the proportion of serialization at the design and algorithm level. We may need to find practical parallel design patterns that reduce the use of locks. The industry has already accumulated some experience here, such as the task decomposition, data decomposition, and data sharing patterns. I believe that as multi-core CPUs come into large-scale use, more effective parallel design patterns and algorithms will emerge.

The fourth solution lies in chip design. Since I know nothing about chip design, this solution may be no more than wishful thinking on my part.
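As an illustration of the third solution above, the sketch below contrasts a sum that takes a shared lock for every element with one that uses data decomposition, where each worker sums a private chunk and the partial results are combined once at the end. The function names are hypothetical. Note also that in CPython the global interpreter lock prevents threads from running Python bytecode in parallel, so this shows only the structure of the pattern (how many serialized sections there are), not a measured speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def sum_with_shared_lock(values, workers=4):
    """Every update to the shared total takes the lock: one acquisition per element."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for v in chunk:
            with lock:  # serialized section, entered len(values) times in total
                total += v

    # Strided slices give each worker roughly the same number of elements.
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, chunks))  # list() waits for all workers to finish
    return total


def sum_with_data_decomposition(values, workers=4):
    """Each worker sums a private chunk; no shared state is touched, so no lock is needed."""
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)  # the only serial step: combining `workers` partial sums


if __name__ == "__main__":
    data = list(range(10_000))
    print(sum_with_shared_lock(data), sum_with_data_decomposition(data))
```

Both functions return the same result; the difference is that the first enters a serialized section once per element, while the second has a single short serial step whose cost depends on the number of workers rather than on the size of the data.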
The main idea of this fourth solution is to design new instructions at the chip level. Unlike the instructions of earlier single-core CPUs, each of which is executed by a single CPU, these would be parallel instructions carried out cooperatively by several CPUs. Programmers could then call these parallel instructions as if they were writing a serial program while still making full use of multiple CPUs.

About the author: Zhou Weiming, a freelancer, has worked in the software industry for more than 10 years. He currently focuses on software testing, multi-core programming, software design, and other fundamentals. He has written the book "Data Structures and Algorithms under Multi-tasking", is writing the book "Software Testing Practices", and plans to write a book on multi-core programming in the near future.
