The concept and usage of thread parallelization

Source: Internet
Author: User


Many movies have explored the amazing things humans could do if they used 100% of their brain's cognitive capacity. Although the notion that we use only 10% of our brains persists, the truth is that everyday activities engage almost every part of the human brain. [1] In fact, although the brain accounts for only 3% of the body's weight, it consumes 20% of the body's energy.

The idea that human beings have a large untapped potential is very attractive, and such potential may indeed exist. Interestingly, when it comes to high-performance computing (HPC) and using the full power of current hardware, the same metaphor applies: dormant processing power.

Now, if you're reading this article and thinking about how to unleash the full potential of modern HPC hardware, consider updating your code. For optimal performance and long-term sustainability, consider a three-level approach to parallel programming: multithreading, distributed parallelism, and vectorization.

Scale on your current and future hardware

If you use multi-level parallel algorithms to take advantage of the parallel features of modern hardware, your code will scale effectively on both current and future hardware.

"Implementing code modernization and better parallelization will clearly become an important investment area for the future." James Reinders, chief evangelist and director at Intel Corporation

Improve software performance with multithreading

Multithreading (or thread parallelism) gives developers a good place to start and can significantly improve software performance on multicore processors. With thread parallelism, you create threads and distribute them across cores: the threads cooperate within a single process, communicate through shared memory, and work together on a large task. In particular, if you can structure your code so that its calculations are independent and the work fits within a single node, threading will help you improve execution speed while keeping the code maintainable over time.

The easiest way to do this is to add OpenMP* directives (assuming the code is written in C++ or Fortran*) so that the marked parts of the code run in parallel. The most common case is a loop whose iterations are independent computations. The program then spawns threads that run on multiple cores of the system. To share data between threads, simply write to shared memory and read from it. (Note, however, that this must be done with care to ensure that the answer is correct and that race conditions are avoided; a race condition occurs, for example, when more than one thread tries to access and change shared data at the same time.)
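As a minimal sketch of the OpenMP approach described above, the following C++ functions parallelize two loops. The function names `scale_in_place` and `parallel_sum` are hypothetical and chosen only for illustration. The first loop has fully independent iterations; the second updates a shared accumulator, so it uses a `reduction` clause to avoid the race condition mentioned above. The code assumes an OpenMP-capable compiler (for example, the `-fopenmp` flag with GCC or Clang); without that flag the pragmas are ignored and the loops simply run serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiplies every element by `factor`. Each iteration touches only its own
// element, so the iterations are independent and can be split across threads.
void scale_in_place(std::vector<double>& data, double factor) {
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Sums all elements. `total` is shared by every iteration, so an unprotected
// `total += ...` from several threads would be a race condition. The
// reduction clause gives each thread a private partial sum and combines the
// partial sums when the loop finishes.
double parallel_sum(const std::vector<double>& data) {
    const long n = static_cast<long>(data.size());
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Calling `scale_in_place(v, 2.0)` followed by `parallel_sum(v)` gives the same result whether the loops run on one thread or many, which is the property to check first when adding threading to existing code.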

Distributed parallelization for large datasets on multiple machines

Threading is a relatively simple starting point compared to distributed parallelization (or multi-node optimization). In distributed parallelization, the same code runs independently on different machines and shares data through message passing. Sharing data through messages is part of the algorithm design and must be constructed carefully, so that no data is lost and no process waits for a message that will never arrive. Distributed parallelization is ideal when a dataset is too large for a single machine and the computation can be split so that each machine handles a subset of the data.
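The data-splitting idea can be sketched without a message-passing library. The example below is an illustration only: the names `chunk_bounds`, `local_sum`, and `simulated_global_sum` are hypothetical, and the "ranks" (processes) are simulated by an ordinary loop in one process. In a real distributed program, each process would typically obtain its own rank from a library such as MPI (for example via `MPI_Comm_rank`), compute only its local part, and combine the partial results with a message-passing call such as `MPI_Reduce`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open index range [begin, end) of the data subset owned by
// `rank` when `n` items are split as evenly as possible over `num_ranks`.
// The first (n % num_ranks) ranks receive one extra item each.
std::pair<std::size_t, std::size_t> chunk_bounds(std::size_t rank,
                                                 std::size_t num_ranks,
                                                 std::size_t n) {
    assert(num_ranks > 0 && rank < num_ranks);
    const std::size_t base = n / num_ranks;
    const std::size_t extra = n % num_ranks;
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return std::make_pair(begin, end);
}

// The work one rank would do: sum only the subset of the data it owns.
double local_sum(const std::vector<double>& data, std::size_t rank,
                 std::size_t num_ranks) {
    const std::pair<std::size_t, std::size_t> range =
        chunk_bounds(rank, num_ranks, data.size());
    double sum = 0.0;
    for (std::size_t i = range.first; i < range.second; ++i) {
        sum += data[i];
    }
    return sum;
}

// Stand-in for the combining step. Here a loop plays the role of all ranks;
// in a real program each partial sum would arrive as a message.
double simulated_global_sum(const std::vector<double>& data,
                            std::size_t num_ranks) {
    double total = 0.0;
    for (std::size_t rank = 0; rank < num_ranks; ++rank) {
        total += local_sum(data, rank, num_ranks);
    }
    return total;
}
```

Because every index belongs to exactly one rank, no data is lost or counted twice, which is the correctness property the text above asks the algorithm design to guarantee.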

Vectorization for compute-intensive workloads

When dealing with compute-intensive workloads, such as genomics applications or large numerical computations, you might consider vectorization. In this process, the same instruction operates on multiple data elements at once within a single core (called SIMD: single instruction, multiple data). This technique can increase computational performance by a factor of two, four, or eight, depending on the size of the vector registers on the core. This performance improvement comes on top of any gains from threading.
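The dependence on register size can be made concrete with a short sketch. The helper `simd_lanes` and the function `saxpy` below are hypothetical names used for illustration. The number of elements processed per instruction is the register width divided by the element width, so a 256-bit register holds eight 32-bit floats or four 64-bit doubles. The loop uses the OpenMP `simd` directive as a hint that its iterations are independent; this assumes a compiler with OpenMP SIMD support (for example, `-fopenmp` or `-fopenmp-simd` with GCC or Clang). Whether and how the loop is actually vectorized is up to the compiler and the target hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements that fit in one vector register.
constexpr std::size_t simd_lanes(std::size_t register_bits,
                                 std::size_t element_bits) {
    return register_bits / element_bits;
}

// Computes y[i] = a * x[i] + y[i] for every element. The same multiply-add
// is applied to each element and no iteration depends on another, which is
// the pattern SIMD hardware is built for.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xp = x.data();
    float* yp = y.data();
    const long n = static_cast<long>(x.size());
    #pragma omp simd
    for (long i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The `simd` directive can also be combined with thread-level parallelism on the same loop, which is one way the two levels of performance gain can stack.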

Multithreading can significantly improve the responsiveness of GUI applications. CADEX Ltd. successfully improved performance on multicore systems using multithreaded algorithms.

Save time by learning vectorization best practices from Robert Geva, chief engineer and manager of the Financial Services Design division.

Get the tools and resources you need to build modern code

The Intel Quick Reference Guide provides more detailed information on how to develop multithreaded applications.

Intel also offers a number of useful resources to help you build modern code that takes advantage of parallelism at all levels. These resources include training, code samples, case studies, developer kits, access to hands-on labs, webinars, and more. We'll help you get the most out of your code and master coding strategies that take advantage of Intel®-based architectures. Learn more about the Intel Modern Code program.

[1] "All You Need to Know About the 10 Percent Brain Myth, in 60 Seconds." wired.com/2014/07/everything-you-need-to-know-about-the-10-brain-myth-explained-in-60-seconds/
