9 Myths about Java performance

Source: Internet
Author: User
Tags: benchmark

Java performance has a reputation for being something of a black art. This is partly due to the complexity of the Java platform and partly because performance problems are often hard to locate. Historically, though, people have tended to approach Java performance with intuition and experience rather than with statistics and empirical reasoning. In this article, I want to debunk some of the most egregious technical myths.

1. Java is slow

Of the many fallacies about Java performance, this is the most outdated and probably the most obvious.

It is true that Java was sometimes slow in the 1990s and early 2000s.

Since then, however, virtual machines and JIT technology have had more than 10 years of improvement, and Java's overall performance is now very good.

In 6 independent web performance benchmarks, Java frameworks took 22 of the 24 top-four places.

Although the JVM uses profiling to optimize only the commonly used code paths, the effect of that optimization is significant. In many cases JIT-compiled Java code is as fast as C++, and the number of such cases is growing.

Still, some people believe the Java platform is slow, perhaps because of a historical prejudice held by those who experienced its earlier versions.

Before drawing conclusions, we recommend staying objective and evaluating up-to-date performance results.

2. A single line of Java code can be viewed in isolation

Consider this short line of code:

MyObject obj = new MyObject();

To a Java developer, it seems obvious that this line must allocate an object and invoke the appropriate constructor.

From this we might try to derive performance bounds: we assume the line must cause a certain amount of work, and on that presumption we try to calculate its performance impact.

In fact, this view is wrong. It rests on the preconception that the work, whatever it is, will be carried out under all circumstances.

Both javac and the JIT compiler can optimize away dead code. The JIT compiler can even do so speculatively, based on profiling data. In such cases this line of code does not run at all, so it has no effect on performance.

In addition, in some JVMs, such as JRockit, the JIT compiler can even decompose operations on objects, so that the allocation can be avoided even when the code path is live.

The implication is that context matters a great deal when dealing with Java performance, and premature optimization can produce counterintuitive results. So it is best not to optimize prematurely. Instead, always build the code first, then use performance tuning techniques to locate the hotspots and improve them.
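To make the point about context concrete, here is a minimal sketch of two situations in which a "new" expression may cost nothing at run time. The names Point, AllocationInContext, sumOfSquares and unusedAllocation are hypothetical and chosen only for illustration. Whether a given JVM actually removes these allocations depends on its version, its settings and how hot the code is, so treat the comments as what a JIT compiler is permitted to do, not what it is guaranteed to do.

```java
// Hypothetical example: whether "new Point(...)" costs anything depends on context.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class AllocationInContext {

    // The Point created here never escapes the method. A JIT compiler that
    // analyzes object escape is free to (but not required to) replace it
    // with two local ints and skip the heap allocation entirely.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    // The object allocated here is never used. Because the Point constructor
    // has no side effects, the allocation is dead code and may be removed.
    static int unusedAllocation(int value) {
        Point ignored = new Point(value, value);
        return value * 2;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
        System.out.println(unusedAllocation(21));
    }
}
```

No timing is attempted here on purpose: reading the source tells you what the program computes, not how much work the JVM ends up doing for it.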

3. A micro-benchmark measures what you think it does

As we saw above, examining a small piece of code in isolation is less accurate than analyzing the overall performance of the application.

Nevertheless, developers love to write micro-benchmarks. Tinkering with some low-level aspect of the platform seems to be endlessly fun.

Richard Feynman once said: "The first principle is that you must not fool yourself, and you are the easiest person to fool." Nowhere is this truer than when writing Java micro-benchmarks.

Writing good micro-benchmarks is extremely difficult. The Java platform is very complex, and many micro-benchmarks only succeed in measuring transient effects or other unintended aspects of the platform.

For example, a naively written micro-benchmark will often end up measuring the timing subsystem or garbage collection rather than the effect it was meant to capture.

Only developers and teams with a real need should write micro-benchmarks. These benchmarks should be published in full (including source code), be reproducible, and be subject to peer review and further scrutiny.

Many of the Java platform's optimizations mean that the statistics of individual runs, and the variation between them, matter. To get a truly reliable answer, a single benchmark must be run many times and the results aggregated.
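As a rough illustration of "run many times and aggregate", the sketch below repeats a workload, discards some warm-up runs, and reports the mean and sample standard deviation of the remaining timings. The class RepeatedRuns, its methods and the toy workload are all hypothetical. It only shows the aggregation step and is in no way a substitute for an established benchmarking harness or for proper statistical analysis.

```java
import java.util.function.LongSupplier;

final class RepeatedRuns {

    // Written to on every run so the workload's result is observably used.
    static volatile long sink;

    // Arithmetic mean of the samples.
    static double mean(double[] samples) {
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Sample standard deviation (n - 1 in the denominator); needs at least 2 samples.
    static double sampleStdDev(double[] samples) {
        double m = mean(samples);
        double squares = 0;
        for (double s : samples) {
            squares += (s - m) * (s - m);
        }
        return Math.sqrt(squares / (samples.length - 1));
    }

    // Runs the workload `warmups` times without recording anything, then
    // `runs` times while recording each run's elapsed time in nanoseconds.
    static double[] measure(LongSupplier workload, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink += workload.getAsLong();
        }
        double[] nanos = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += workload.getAsLong();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Toy workload, made up for this example.
        LongSupplier workload = () -> {
            long total = 0;
            for (int i = 0; i < 1_000_000; i++) {
                total += i % 7;
            }
            return total;
        };
        double[] nanos = measure(workload, 20, 30);
        System.out.printf("mean = %.0f ns, std dev = %.0f ns over %d runs%n",
                mean(nanos), sampleStdDev(nanos), nanos.length);
    }
}
```

A large standard deviation relative to the mean is a hint that a single run would have told you very little.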

If you feel the need to write micro-benchmarks, the paper "Statistically Rigorous Java Performance Evaluation" by Georges, Buytaert, and Eeckhout is a good place to start. Without proper statistical analysis, it is easy to be misled.

There are well-developed tools, with communities around them (Google's Caliper, for example). If you really must write micro-benchmarks, do not do it on your own: you need the advice and experience of your peers.

4. Slow algorithms are the most common cause of performance problems

A common cognitive error among developers (and people in general) is to assume that the part of the system they control must be the important part.

This error shows up in discussions of Java performance too: Java developers tend to believe that algorithm quality is the main cause of performance problems. Developers think about code, so they are naturally biased toward thinking about their own algorithms.

In practice, when dealing with a range of real-world performance problems, algorithm design was found to be the underlying issue less than 10% of the time.

Instead, garbage collection, database access, and misconfiguration are all much more likely than algorithms to slow an application down.

Most applications handle relatively small amounts of data, so even a seriously inefficient primary algorithm does not usually cause severe performance problems. Our algorithms are certainly not optimal, but the performance problems they cause are small; far more performance problems come from other parts of the application stack.

So our best advice is to use real production data to uncover the true causes of performance problems. Measure, don't guess!

5. Caching can solve all problems

"All problems in computer science can be solved by another level of indirection."

This programmer's maxim, attributed to David Wheeler (and, thanks to the Internet, to at least two other computer scientists), is surprisingly common, especially among web developers.

The "caching can solve all problems" fallacy usually arises when analysis has stalled in the face of an existing architecture that is not well understood.

To the developer, rather than dealing with an intimidating existing system, it seems better to hide it behind a cache and hope for the best. Of course, this approach only makes the overall architecture more complex, and makes things worse for the next developer who tries to understand the current state of the system.

Large, poorly designed systems are often written without any overall design, one line of code and one subsystem at a time. In many cases, however, simplifying and refactoring the architecture leads to better performance and is almost always easier to understand.

So when evaluating whether a cache is really necessary, plan to collect basic usage statistics (such as hit rate and miss rate) to prove the real value of the caching layer.
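Collecting those statistics does not require much machinery. The sketch below wraps a plain map and counts hits and misses; CountingCache, its loader function and the sample request sequence are hypothetical names and data used only for illustration, and the class is deliberately single-threaded with no eviction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical wrapper that counts hits and misses for a simple in-memory cache.
final class CountingCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader; // the slow path that the cache fronts
    private long hits;
    private long misses;

    CountingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Not thread-safe and never evicts; a sketch for measurement only.
    V get(K key) {
        if (entries.containsKey(key)) {
            hits++;
            return entries.get(key);
        }
        misses++;
        V value = loader.apply(key);
        entries.put(key, value);
        return value;
    }

    long hits() {
        return hits;
    }

    long misses() {
        return misses;
    }

    // Fraction of lookups served from the cache; 0.0 if there were no lookups.
    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        CountingCache<Integer, String> cache = new CountingCache<>(id -> "user-" + id);
        int[] requests = {1, 2, 1, 3, 1, 2}; // made-up request sequence
        for (int id : requests) {
            cache.get(id);
        }
        System.out.printf("hits=%d misses=%d hitRate=%.2f%n",
                cache.hits(), cache.misses(), cache.hitRate());
    }
}
```

If the hit rate measured against realistic traffic turns out to be low, the cache is adding complexity without adding much value.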

6. All applications need to be concerned about stop-the-world pauses

One fact of the Java platform cannot be changed: to run garbage collection, all application threads must periodically stop. This is sometimes held up as a serious weakness of Java, even in the absence of any real evidence.

Empirical research shows that people cannot normally perceive changes in numerical data (such as price movements) that occur more often than once every 200 milliseconds.

Applications are mostly for human use, so this gives us a useful rule of thumb: a stop-the-world (STW) pause of 200 milliseconds or less usually has no noticeable impact. Some applications (such as streaming media) may have stricter requirements, but many GUI applications do not.

A few applications, such as low-latency trading or mechanical control systems, cannot tolerate a 200-millisecond pause. Unless you are writing that type of application, users are unlikely to notice any impact from the garbage collector.

It is also worth mentioning that in any system with more application threads than physical cores, the operating system must time-slice access to the CPU. Stop-the-world sounds terrible, but in practice every application, whether or not it runs on a JVM, has to contend for scarce computing resources.

Without measurement, it is unclear what additional effect the JVM has on application performance.

In summary, turn on GC logging to determine whether pause times really affect the application. Analyze the logs, by hand or with a script or tool, to determine the pause times. Then decide whether they really pose a problem for the application. Most importantly, ask yourself one key question: have any users actually complained?
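Once pause times have been extracted from the log, checking them against the 200-millisecond rule of thumb is simple arithmetic. The sketch below assumes the extraction has already been done (log formats differ between JVM versions and collectors, so that step is not shown); PauseSummary and the sample pause values are hypothetical.

```java
// Hypothetical helper: summarizes pause durations that have already been
// extracted from a GC log. The extraction step itself is not shown.
final class PauseSummary {
    final int count;
    final double maxMillis;
    final double totalMillis;
    final int overThreshold;

    private PauseSummary(int count, double maxMillis, double totalMillis, int overThreshold) {
        this.count = count;
        this.maxMillis = maxMillis;
        this.totalMillis = totalMillis;
        this.overThreshold = overThreshold;
    }

    // Counts the pauses, finds the longest one, sums them, and counts how
    // many are strictly longer than the given threshold.
    static PauseSummary of(double[] pausesMillis, double thresholdMillis) {
        double max = 0;
        double total = 0;
        int over = 0;
        for (double pause : pausesMillis) {
            max = Math.max(max, pause);
            total += pause;
            if (pause > thresholdMillis) {
                over++;
            }
        }
        return new PauseSummary(pausesMillis.length, max, total, over);
    }

    public static void main(String[] args) {
        double[] pauses = {12.5, 48.0, 230.0, 9.1, 310.4}; // made-up numbers
        PauseSummary summary = PauseSummary.of(pauses, 200.0);
        System.out.printf("%d pauses, max %.1f ms, total %.1f ms, %d over 200 ms%n",
                summary.count, summary.maxMillis, summary.totalMillis, summary.overThreshold);
    }
}
```

The numbers only say how long the pauses were; whether they matter is still a question about the application and its users.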

7. Hand-rolled object pools are appropriate for a wide range of applications

Believing that stop-the-world pauses are somehow bad, application teams commonly respond by implementing their own memory management inside the Java heap. This often boils down to implementing an object pool (or even full reference counting) and requiring any code that uses the domain objects to take part.

This technique is almost always misguided. It is rooted in the past, when object allocation was expensive and mutating an existing object was much cheaper. The situation is completely different now.

Modern hardware is highly efficient at allocation: recent desktop or server hardware has a memory bandwidth of at least 2 to 3 GB per second. This is a big number, and outside of specially written applications it is not easy to use that much bandwidth.

In general, object pools are very difficult to implement correctly (especially when multiple threads are involved), and they impose several negative requirements that make the technique a poor general choice:

    • All developers who touch the object pool code must understand the pool and handle it correctly
    • The boundary between pool-aware and non-pool-aware code must be known and documented
    • All of this additional complexity must be kept up to date and reviewed regularly
    • If any of these requirements is not met, the risk of problems (similar to pointer reuse in C) returns.
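The pointer-reuse style of problem is easy to reproduce. The following sketch uses a deliberately naive, single-threaded pool; Order, NaiveOrderPool and StaleReferenceDemo are hypothetical names invented for this example and do not come from any library. One piece of code keeps a reference after releasing the object, and then silently observes data written by the next user of the same instance.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical mutable domain object managed by a pool.
final class Order {
    String customer;
    int quantity;
}

// Deliberately naive, single-threaded pool used only to show the hazard.
final class NaiveOrderPool {
    private final Deque<Order> free = new ArrayDeque<>();

    // Hands out a previously released instance if one exists, else a new one.
    Order acquire() {
        Order order = free.poll();
        return order != null ? order : new Order();
    }

    // Returns an instance to the pool without clearing or invalidating it.
    void release(Order order) {
        free.push(order);
    }
}

final class StaleReferenceDemo {

    // Returns the customer seen through a reference that was kept after release.
    static String customerSeenThroughStaleReference() {
        NaiveOrderPool pool = new NaiveOrderPool();

        Order first = pool.acquire();
        first.customer = "alice";
        pool.release(first);           // released, but `first` is still held

        Order second = pool.acquire(); // the pool hands out the same instance
        second.customer = "bob";

        return first.customer;         // the stale reference now sees "bob"
    }

    public static void main(String[] args) {
        System.out.println(customerSeenThroughStaleReference());
    }
}
```

Nothing fails and no exception is thrown, which is exactly why every developer who touches pooled objects has to know the rules.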

In summary, object pooling should be used only when GC pauses are unacceptable and tuning and refactoring have failed to reduce them to an acceptable level.

8. In garbage collection, CMS is always a better choice than Parallel Old

By default, the Oracle JDK uses a parallel, stop-the-world collector for the old generation: the Parallel Old collector.

Concurrent Mark Sweep (CMS) is an alternative that allows application threads to continue running for most of the garbage collection cycle, but this comes at a cost and with some caveats.

Allowing application threads to run alongside garbage collection threads inevitably creates a problem: application threads modify the object graph, which can affect the liveness of objects. This has to be cleaned up after the fact, so CMS actually has two STW phases (usually very short).

This has several consequences:

    1. All application threads must be brought to a safepoint and paused twice per full GC;
    2. Although garbage collection runs concurrently with the application, application throughput is reduced (usually by 50%);
    3. When CMS is used, the JVM does much more bookkeeping (and uses more CPU cycles) than with the parallel collectors.

Whether these costs are worth paying depends on the application. There is no free lunch, though. The CMS collector is a remarkable piece of engineering, but it is not a panacea.

So before deciding that CMS is the right garbage collection strategy, first confirm that the STW pauses from Parallel Old really are unacceptable and cannot be tuned. Finally, I would stress that all measurements must be taken on a system equivalent to production.

9. Increasing the heap size will solve your memory problem

When an application is in trouble and GC is suspected, many application teams respond by increasing the heap size. In some cases this produces a quick win and buys time to consider a more thorough fix. However, without a full understanding of the causes of the performance problem, this strategy can make things worse.

Consider a badly coded application that is producing a large number of domain objects with a typical lifetime of, say, 2 to 3 seconds. If the allocation rate is high enough, garbage collections will occur so frequently that the domain objects are promoted to the old (tenured) generation. Almost as soon as they arrive there, their lifetime ends and they die, but they are not reclaimed until the next full GC.

If we increase the application's heap size, all we are doing is adding space for relatively short-lived objects to be promoted into and die. This makes the stop-the-world pauses longer and brings no benefit to the application.

It is essential to understand the dynamics of object allocation and lifetime before changing the heap size or tuning other parameters. Acting blindly without measuring performance data will only make matters worse. The tenuring distribution reported by the garbage collector is especially important here.
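The scenario above can be put into rough numbers. The sketch below is a back-of-the-envelope model and not a description of how any particular collector behaves: it assumes young collections happen each time a fixed-size allocation area fills up, and that an object is promoted once it has survived a fixed number of young collections. The class PromotionEstimate, its parameters and the sample figures are all hypothetical.

```java
// Back-of-the-envelope model, under the simplifying assumptions stated above.
final class PromotionEstimate {

    // Approximate time between young collections: the allocation area fills
    // up at the allocation rate, and a collection happens when it is full.
    static double youngGcIntervalSeconds(double allocationAreaMegabytes,
                                         double allocationMegabytesPerSecond) {
        return allocationAreaMegabytes / allocationMegabytesPerSecond;
    }

    // In this model an object is promoted if it lives long enough to survive
    // `survivalsBeforePromotion` consecutive young collections.
    static boolean likelyPromoted(double lifetimeSeconds,
                                  double gcIntervalSeconds,
                                  int survivalsBeforePromotion) {
        return lifetimeSeconds > gcIntervalSeconds * survivalsBeforePromotion;
    }

    public static void main(String[] args) {
        double lifetime = 2.5; // seconds, as in the 2 to 3 second example

        // Made-up figures: a 512 MB allocation area and two allocation rates.
        double fast = youngGcIntervalSeconds(512, 1024);
        double slow = youngGcIntervalSeconds(512, 128);

        System.out.println("high allocation rate promoted: " + likelyPromoted(lifetime, fast, 4));
        System.out.println("low allocation rate promoted:  " + likelyPromoted(lifetime, slow, 4));
    }
}
```

The same objects are short-lived garbage at one allocation rate and old-generation residents at another, which is why measured allocation and tenuring data should come before any change to the heap size.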

Conclusion

When it comes to Java performance tuning, intuition is often misleading. We need empirical data and tools to help us visualize and understand the platform's behavior.

Garbage collection is perhaps the best example. The GC subsystem has enormous potential for tuning and for producing data to guide tuning, but for production applications it is very hard to make sense of that data without tools.

By default, any Java process, in development or in production, should always be run with at least the following flags:

-verbose:gc (print GC log)
-Xloggc: (more comprehensive GC log)
-XX:+PrintGCDetails (more detailed output)
-XX:+PrintTenuringDistribution (shows the age threshold the JVM uses to promote objects to the old generation)

Then use a tool to analyze the logs: handwritten scripts and some generated graphs, or a visualization tool such as GCViewer (open source) or jClarity Censum.
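One way to confirm that a running process was actually started with these flags is to inspect its own JVM arguments. The sketch below uses the standard java.lang.management API for that; the class GcLoggingFlagCheck and its method are hypothetical. The flag spellings are the ones listed above, which belong to older HotSpot releases, so on a newer JDK that configures GC logging differently the list of expected flags is an assumption you would need to adapt.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcLoggingFlagCheck {

    // Flag prefixes taken from the list above. -Xloggc: is followed by a
    // file path, so every flag is matched by prefix, not by equality.
    static final String[] EXPECTED = {
        "-verbose:gc",
        "-Xloggc:",
        "-XX:+PrintGCDetails",
        "-XX:+PrintTenuringDistribution"
    };

    // Returns the expected flags that do not appear in the given JVM arguments.
    static List<String> missing(List<String> jvmArgs) {
        List<String> result = new ArrayList<>();
        for (String expected : EXPECTED) {
            boolean found = false;
            for (String arg : jvmArgs) {
                if (arg.startsWith(expected)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                result.add(expected);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Missing GC logging flags: " + missing(jvmArgs));
    }
}
```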
