In layman's Java Concurrency (40): Concurrent Summary Part 4 performance and scalability [go]


Performance and Scalability

Threading is often introduced to improve performance. Multithreading lets a program take advantage of otherwise idle resources, improves resource utilization, and allows tasks to be processed in parallel, improving the system's responsiveness. But introducing threads also introduces complexity, and system performance does not always increase with the number of threads.

Performance and Scalability

Improving performance usually means doing more with fewer resources: CPU cycles, memory, network bandwidth, disk I/O, databases, Web services, and so on. Multithreading can exploit multiple cores, hide the latency caused by blocking I/O, and reduce the impact of network overhead, thus improving how much work gets done per unit of time.

To improve performance, we need to use existing processing resources efficiently and bring new resources online when they become available. For the CPU, ideally we want it working at full capacity, where "full capacity" means doing useful work rather than spinning in pointless loops or waiting. Once the CPU is saturated with useful work, we are making full use of its computing power. For I/O (memory, disk, network, and so on), utilization rises as usage approaches the available bandwidth. Ideally, when all resources are fully used, the system's performance reaches its maximum.

To measure system performance, there are indicators for both qualitative and quantitative analysis: service time, wait time, throughput, efficiency, scalability, capacity, and so on. Service time and wait time measure the system's speed, that is, how fast it is. Throughput and capacity measure the system's volume, that is, how much data it can process. In addition, effective service time and downtime measure the system's reliability and stability.

Scalability means that as computational resources increase, throughput and output improve correspondingly. From an algorithmic standpoint, performance is usually measured in terms of complexity, such as time complexity and space complexity.

Amdahl's Law

For tasks that can be parallelized, adding resources clearly improves performance; but for the serial portion of a task, adding resources does not yield a proportional gain. Amdahl's law describes how a system's speedup grows as processor resources are added. Suppose F is the fraction of the program that must execute serially and N is the number of processors; then the speedup as N increases is:

    Speedup = 1 / (F + (1 − F) / N)

Theoretically, as N approaches infinity, the speedup approaches its maximum value of 1/F. This means that if 50% of a program is serial, the maximum speedup after parallelization is 2×.

Besides measuring acceleration itself, the speedup can also be used to measure CPU utilization. If each CPU is 100% utilized, then doubling the CPU resources should double the speedup. In fact, in a system with 10 processors, if 10% of the program is serial, the maximum speedup is 1/(0.1 + (1 − 0.1)/10) ≈ 5.3×; in other words, CPU utilization is only 5.3/10 = 53%. If the number of processors increases to 100, the speedup is only about 9.2×, which means CPU utilization is just 9.2%.
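The arithmetic above is easy to reproduce. A minimal sketch (class and method names are my own) that evaluates Amdahl's formula for the figures in the text:

```java
// A small sketch of Amdahl's Law: speedup(F, N) = 1 / (F + (1 - F) / N),
// where F is the serial fraction and N is the number of processors.
public class Amdahl {
    static double speedup(double serialFraction, int processors) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }

    public static void main(String[] args) {
        // Reproduces the numbers from the text: 10% serial code.
        System.out.printf("N=10:  speedup=%.1f, utilization=%.0f%%%n",
                speedup(0.1, 10), 100 * speedup(0.1, 10) / 10);    // ~5.3x, ~53%
        System.out.printf("N=100: speedup=%.1f, utilization=%.1f%%%n",
                speedup(0.1, 100), 100 * speedup(0.1, 100) / 100); // ~9.2x, ~9.2%
    }
}
```

Note that even with a million processors, a 50% serial fraction caps the speedup below 2×.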

Clearly, increasing the number of CPUs alone does not improve CPU utilization. [Figure omitted: speedup versus number of CPUs for systems with different serial fractions.]

Obviously, the larger the serial fraction, the less noticeable the effect of adding CPU resources.

Performance improvements

Performance can be analyzed and improved along the following dimensions.

Resource utilization of the system platform

A program's resource utilization on a system platform is the fraction of total time a device is busy serving that program, similar to a duty cycle in physics. Simply put: resource utilization = busy time / total elapsed time.

In other words, make the device do useful work as much as possible while extracting its maximum value. A useless loop can drive CPU usage to 100% without doing any effective work. Effectiveness is often hard to measure; usually it can only be judged subjectively, or by observing whether the behavior of the optimized program has actually improved.

Latency

Latency describes the time it takes to complete a task; it is sometimes called response time. If multiple operations run in parallel, the overall latency depends on the most time-consuming task.
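That last point can be demonstrated directly. In this sketch (the task durations and names are illustrative), three tasks of 100, 200, and 400 ms run in parallel, and the measured wall time tracks the slowest one rather than the 700 ms sum:

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: when tasks run in parallel, total latency is governed by the
// slowest task, not by the sum of all task times.
public class ParallelLatency {
    static Callable<String> sleeper(String name, long millis) {
        return () -> { Thread.sleep(millis); return name; };
    }

    // Runs three tasks in parallel and returns the elapsed wall time in ms.
    static long measureMs() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Callable<String>> tasks = List.of(
                sleeper("fast", 100), sleeper("medium", 200), sleeper("slow", 400));
        long start = System.nanoTime();
        pool.invokeAll(tasks); // blocks until every task has completed
        pool.shutdown();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // Roughly 400 ms (the slowest task), not 700 ms (the sum).
        System.out.println("elapsed ~" + measureMs() + " ms");
    }
}
```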

Multi-processing

Multiprocessing refers to the ability to execute multiple processes or multiple programs concurrently on a single system. Its benefit is increased throughput, and it can effectively utilize the resources of multicore CPUs.

Multithreading

Multithreading describes executing multiple threads concurrently within the same address space. These threads have different execution paths and separate stacks. The concurrency discussed in this series is mostly thread-level concurrency.
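The "same address space, separate stacks" distinction can be seen in a few lines. In this sketch (class and field names are mine), a static field lives on the shared heap and is visible to both threads, while each thread's `local` counter lives on its own stack:

```java
// Sketch: threads in one process share the heap (the same address space),
// but each thread has its own stack, so local variables are per-thread.
public class SharedHeapDemo {
    static int shared = 0; // on the heap: visible to every thread

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            int local = 0;                // on this thread's own stack
            for (int i = 0; i < 1000; i++) local++;
            // Synchronize the update to the shared field to avoid a lost update.
            synchronized (SharedHeapDemo.class) { shared += local; }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println("shared = " + shared); // 2000: both threads contributed
    }
}
```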

Concurrency

Executing multiple programs or tasks at the same time is called concurrency. Multitasking within a single program, or multitasking across multiple programs, both count as concurrency.

Throughput

Throughput measures the total amount of work a system can complete per unit of time. For hardware, throughput is bounded by the physical medium. Before that physical limit is reached, improving throughput can still significantly improve performance. Throughput is one of the key performance indicators.
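As a simple illustration of the definition (names here are my own), throughput is just completed operations divided by elapsed time:

```java
// Sketch: throughput = operations completed per unit of time.
public class Throughput {
    static double opsPerSecond(long operations, long elapsedNanos) {
        return operations / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long ops = 0, sink = 0;
        while (ops < 1_000_000) { sink += ops++; } // some measurable work
        double rate = opsPerSecond(ops, System.nanoTime() - start);
        System.out.printf("~%.0f ops/s (sink=%d)%n", rate, sink);
    }
}
```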

Bottleneck

A bottleneck is the worst-performing part of a program at run time. Typically serial I/O, disk I/O, memory allocation, network I/O, and so on cause bottlenecks. A frequently executed algorithm can also become a bottleneck.

Scalability

Scalability here mainly refers to the ability of a program or system to increase its performance as the resources available to it increase.

Thread Overhead

Does introducing multithreading for a computation guarantee a significant performance improvement? In fact, introducing multithreading also introduces additional overhead.

Context Switching

If the number of runnable threads exceeds the number of CPU cores, the OS will preempt running threads according to its scheduling algorithm so that other threads can use CPU cycles.

Switching threads causes context switches. Thread scheduling makes the CPU spend more time slices inside the operating system and between processes, reducing the time actually spent executing the application. In addition, a context switch hurts cache locality: a freshly scheduled thread may run slower because the data it needs is no longer in the cache, leading to more memory and I/O overhead.

Memory synchronization

When data is shared between threads, the visibility guarantees provided by synchronized and volatile force caches to be invalidated: data in each thread's working memory is synchronized with main memory, which carries a small cost. If multiple threads synchronize on the same data at the same time, some of them may also block.
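The visibility guarantee described above can be sketched with a flag shared between a writer and a reader (class and method names are mine). With `volatile`, the writer's update is guaranteed to become visible to the spinning reader; without it, the reader could in principle spin on a stale cached value indefinitely:

```java
// Sketch of volatile visibility: the reader spins until the writer's update
// is made visible through main memory. Removing volatile removes that
// guarantee, and the reader may never observe the write.
public class VisibilityDemo {
    private static volatile boolean stop = false;

    // Returns true if the reader thread saw the write and exited in time.
    static boolean demo() throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!stop) { /* spin until the write becomes visible */ }
        });
        reader.start();
        Thread.sleep(100);
        stop = true;         // volatile write: published to main memory
        reader.join(1000);   // with volatile, the reader exits promptly
        return !reader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader observed the write: " + demo());
    }
}
```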

Blocking

When threads race for a lock, the losing threads block. A blocked thread may spin-wait inside the JVM or be suspended by the operating system. Spin-waiting wastes CPU cycles, while suspension by the operating system causes more context switching.
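One way to observe contention without paying for an unbounded block is `ReentrantLock.tryLock` with a timeout. In this sketch (the helper name is mine), the main thread holds the lock, so a contender's bounded acquisition attempt fails and it backs off instead of being suspended indefinitely:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of lock contention: while one thread holds the lock, a contender
// uses a bounded tryLock and backs off instead of blocking indefinitely.
public class ContentionDemo {
    static final ReentrantLock lock = new ReentrantLock();

    // Returns whether the contender managed to acquire the held lock.
    static boolean contendWhileHeld() throws InterruptedException {
        lock.lock(); // this thread holds the lock for the whole attempt
        try {
            boolean[] acquired = new boolean[1];
            Thread contender = new Thread(() -> {
                try {
                    // Give up after 50 ms rather than blocking without bound.
                    acquired[0] = lock.tryLock(50, TimeUnit.MILLISECONDS);
                    if (acquired[0]) lock.unlock();
                } catch (InterruptedException ignored) { }
            });
            contender.start();
            contender.join();
            return acquired[0];
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("contender acquired the lock: " + contendWhileHeld());
    }
}
```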

Knowing these avenues for performance improvement, and understanding their costs, an application should make trade-offs based on its actual scenario. There is no once-and-for-all optimization scheme: continuous, small-scale improvement and adjustment is an effective way to improve performance, although occasionally a large architectural change can yield a bigger gain.

The simple principle: first make the logic correct on a small scale, then find the performance bottlenecks and improve and optimize step by step.

Resources
    • Amdahl's Law: http://en.wikipedia.org/wiki/amdahl%27s_law
    • Gustafson's Law: http://en.wikipedia.org/wiki/gustafson%27s_law
    • Sun–Ni Law: http://en.wikipedia.org/wiki/sun-ni_law
    • Speedup analysis of three typical lock contention patterns in multicore systems: http://blog.csdn.net/drzhouweiming/article/details/1800319
    • The equivalence of Amdahl's Law and Gustafson's Law: http://book.51cto.com/art/201004/197506.htm
