Java performance has a reputation as something of a black art. This is partly because the Java platform is highly complex and problems are often hard to pin down. There is also a historical tendency at work: people have studied Java performance by relying on folk wisdom and experience rather than applied statistics and empirical reasoning. In this article, I want to debunk some of the most egregious technical myths.
1. Java is slow
There are many misconceptions about Java performance. This one is the most outdated, and probably the most obviously wrong.
It is true that Java was sometimes slow in the 1990s and the early 2000s.
Since then, however, virtual machine and JIT technology have been improving for more than 10 years, and Java's overall performance is now very good.
Across six independent web performance benchmarks, Java frameworks took 22 of the 24 top-four positions.
Although the JVM uses profiling to optimize only the commonly used code paths, this optimization is very effective. In many cases JIT-compiled Java code is as fast as C++, and the number of such cases keeps growing.
Even so, some people still think the Java platform is slow, perhaps because of a historical bias among those who experienced its early versions.
Before drawing conclusions, we recommend keeping an objective attitude and evaluating up-to-date performance results.
2. A single line of Java code can be considered in isolation.
Consider the following short code:
MyObject obj = new MyObject();
For Java developers, it seems obvious that this line of code will allocate an object and call an appropriate constructor.
We might try to derive performance bounds from this: we assume the line must cause a certain amount of work, and on that basis we try to calculate its performance impact.
In fact, this view is wrong. It presupposes that whatever work the line represents will be done under all circumstances.
In reality, both javac and the JIT compiler can optimize away dead code. The JIT compiler can do so based on profiling data, or even speculatively. In such cases the line of code never runs at all, so it has no performance impact.
In addition, in some JVMs (JRockit, for example) the JIT compiler can even decompose operations on objects, so that the allocation can be avoided even when the code path is still live.
The moral is that context matters a great deal when dealing with Java performance, and premature optimization can produce counterintuitive results. So it is best not to optimize prematurely. Instead, always build the code first, then use performance tuning techniques to locate the hot spots and improve them.
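To make the point concrete, here is a minimal sketch of two allocations that a compiler is free to remove. The class and method names (AllocationSketch, Point, distanceSquared, ignoresItsAllocation) are hypothetical and chosen only for illustration. Whether a particular JVM actually eliminates either allocation depends on the implementation, its settings, and the profile gathered at run time.

```java
// Illustrative sketch only: all names here are hypothetical.
final class AllocationSketch {

    // A small value holder that never leaves distanceSquared().
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // The Point does not escape this method, so a JIT compiler that
    // analyzes object lifetimes may avoid the heap allocation entirely.
    // Nothing in the source code guarantees that it will.
    static int distanceSquared(int x, int y) {
        Point p = new Point(x, y);
        return p.x * p.x + p.y * p.y;
    }

    // The allocated object is never used, so the allocation is dead code
    // that a compiler may drop.
    static int ignoresItsAllocation(int value) {
        Object unused = new Object();
        return value + 1;
    }

    public static void main(String[] args) {
        System.out.println(distanceSquared(3, 4));
        System.out.println(ignoresItsAllocation(41));
    }
}
```

Reading either method line by line suggests one allocation per call, yet the cost actually paid at run time may be zero. That is why the estimate has to come from measurement rather than from the source text.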
3. A microbenchmark means what you think it does.
As we can see above, checking a small piece of code is not as accurate as analyzing the overall performance of the application.
Despite this, developers love to write microbenchmarks. Tinkering with the low-level aspects of the platform seems to be an endless source of fun.
Richard Feynman once said, "The first principle is that you must not fool yourself, and you are the easiest person to fool." Nowhere is this more apt than in writing Java microbenchmarks.
Writing a good microbenchmark is extremely difficult. The Java platform is very complex, and many microbenchmarks only manage to measure transient effects or other unintended aspects of the platform.
For example, a naive microbenchmark often ends up measuring the timing subsystem or garbage collection rather than the effect it was meant to capture.
Only developers and teams with a real need should write microbenchmarks. These benchmarks should be fully public (including source code), reproducible, and subject to peer review and further scrutiny.
The many optimizations in the Java platform mean that the statistics of individual runs matter a great deal. To get a true and reliable answer, run a single benchmark many times and aggregate the results.
If you feel you must write a microbenchmark, a good place to start is the paper "Statistically Rigorous Java Performance Evaluation" by Georges, Buytaert, and Eeckhout. Without proper statistical analysis, it is very easy to be misled.
There are well-developed tools, with communities around them (Google's Caliper, for example). If you really do need to write a microbenchmark, do not write it on your own; you need the comments and experience of your peers.
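As a rough illustration of "run many times and aggregate", the sketch below times a task repeatedly after some warm-up runs and reports the mean and sample standard deviation. Everything in it (the RepeatedRuns class, the measureMillis helper, the warm-up and run counts, the toy workload) is an assumption made for this example. It is exactly the kind of hand-written harness the text warns about, so treat it as a way to see the statistics, not as a substitute for a reviewed benchmarking tool.

```java
// Illustrative sketch only: a naive timing loop with basic statistics.
final class RepeatedRuns {

    // Written to from the workload so that its result is observably used.
    static volatile long sink;

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sample standard deviation (divides by n - 1); needs at least two values.
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    // Runs the task warmupRuns times untimed, then returns one wall-clock
    // duration in milliseconds per measured run.
    static double[] measureMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        double[] millis = new double[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Hypothetical workload; the counts below are arbitrary.
        Runnable task = () -> {
            long acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += i % 7;
            }
            sink = acc;
        };
        double[] t = measureMillis(task, 20, 30);
        System.out.printf("mean=%.3f ms, stddev=%.3f ms over %d runs%n",
                mean(t), sampleStdDev(t), t.length);
    }
}
```

A large standard deviation relative to the mean is a hint that a single run would have told you very little.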
4. Slow algorithms are the most common cause of performance problems.
There is a cognitive error common among developers (and among people in general): the belief that the part of the system they control is the part that matters most.
The same error shows up when discussing Java performance: Java developers believe that algorithm quality is the main cause of performance problems. Developers think about code, so they naturally tend to think about their own algorithms.
In fact, when people work through a range of real-world performance problems, they find that algorithm design is the root cause less than 10% of the time.
By contrast, garbage collection, database access, and misconfiguration are all more likely than algorithms to make an application slow.
Most applications process relatively small amounts of data, so even a fairly inefficient core algorithm does not usually cause serious performance problems. We can be sure our algorithms are not optimal. Even so, the performance problems they cause are small, and far more problems come from other parts of the application stack.
Therefore, our best advice is to use actual production data to uncover the real cause of performance problems. Measure performance data instead of making guesses!
5. Caching can solve all problems
"All problems in computer science can be solved by another level of indirection."
This programmer's maxim, attributed to David Wheeler (and, on the Internet, to at least two other computer scientists), is surprisingly common, especially among web developers.
The fallacy that "caching can solve all problems" tends to surface when a team does not fully understand the existing architecture and its analysis has stalled.
From the developer's point of view, it seems easier to put a cache in front of the existing system, hide it, and hope for the best. Of course, this only makes the overall architecture more complex, and the situation is even worse for the next developer who takes over and tries to understand the current state of the system.
Large, poorly designed systems often lack an overall design and are written one line of code and one subsystem at a time. In many cases, however, simplifying and restructuring the architecture yields better performance, and it is almost always easier to understand.
So when evaluating whether a cache is really necessary, plan first to collect some basic usage statistics (such as hit rate and miss rate) to demonstrate the real value the caching layer provides.
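One way to gather those statistics is to have the cache count its own hits and misses. The sketch below is a minimal, single-threaded example built on java.util.LinkedHashMap in access order. The CountingCache class, its capacity parameter, and the loader function are all hypothetical names introduced for illustration, and the class assumes the loader never returns null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: a small LRU cache that records hits and misses.
// Not thread-safe; assumes the loader never returns null.
final class CountingCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private long hits;
    private long misses;

    CountingCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true makes iteration order least-recently-used first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key);
        entries.put(key, value);
        return value;
    }

    long hits() {
        return hits;
    }

    long misses() {
        return misses;
    }

    // Fraction of lookups served from the cache; 0.0 before any lookup.
    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        CountingCache<Integer, String> cache = new CountingCache<>(2, k -> "value-" + k);
        cache.get(1);
        cache.get(1);
        cache.get(2);
        cache.get(1);
        System.out.println("hits=" + cache.hits() + " misses=" + cache.misses()
                + " hitRate=" + cache.hitRate());
    }
}
```

If the measured hit rate under production-like traffic is low, the cache adds complexity without adding much value, which is the case the text asks you to rule out first.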
6. All applications need to pay attention to the stop-the-world issue.
The Java platform has one unavoidable fact of life: all application threads must periodically pause so that garbage collection can run. This is sometimes cited as a serious weakness of Java, even without any real evidence.
Empirical research shows that people cannot normally perceive changes in numeric data (such as price movements) that occur more frequently than once every 200 milliseconds.
Applications are mainly intended for humans, which gives us a useful rule of thumb: a stop-the-world (STW) pause of 200 milliseconds or less usually has no noticeable effect. Some applications have stricter requirements (streaming media, for example), but many GUI applications do not.
A minority of applications (such as low-latency trading or mechanical control systems) cannot tolerate a 200 ms pause. Unless you are writing that type of application, your users are unlikely to notice any impact from the garbage collector.
It is worth mentioning that in any system where there are more application threads than physical cores, the operating system must time-slice access to the CPUs. Stop-the-world sounds terrible, but in practice every application (JVM-based or not) has to contend for scarce compute resources.
If no measurement is performed, it is unclear what additional impact the JVM has on the application performance.
In short, turn on GC logging to determine whether pause times really affect the application. Analyze the logs, by hand or with scripts or tools, to determine the pause times, and then decide whether they actually cause a problem for the application. Most importantly, ask yourself a key question: have users actually complained?
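Once the pause durations have been extracted from the GC log, checking them against the 200 ms rule of thumb is simple. The sketch below assumes the pauses are already available as an array of milliseconds; the PauseReport class and the sample values in main are hypothetical and are not taken from any real log.

```java
// Illustrative sketch only: compares pause durations with a threshold.
final class PauseReport {

    // Rule-of-thumb threshold from the text, in milliseconds.
    static final double THRESHOLD_MS = 200.0;

    // Number of pauses strictly longer than the threshold.
    static int countOver(double[] pausesMs, double thresholdMs) {
        int count = 0;
        for (double pause : pausesMs) {
            if (pause > thresholdMs) {
                count++;
            }
        }
        return count;
    }

    // Longest pause, or 0.0 for an empty array.
    static double longest(double[] pausesMs) {
        double max = 0.0;
        for (double pause : pausesMs) {
            max = Math.max(max, pause);
        }
        return max;
    }

    public static void main(String[] args) {
        // Hypothetical pause durations in milliseconds.
        double[] pauses = {12.5, 48.0, 230.0, 95.0, 310.5};
        System.out.println("pauses over threshold: " + countOver(pauses, THRESHOLD_MS));
        System.out.println("longest pause (ms): " + longest(pauses));
    }
}
```

The numbers only say where to look. Whether a pause over the threshold matters still depends on whether users notice it.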
7. Hand-written object pooling is appropriate for a wide range of applications.
A common reaction to the feeling that stop-the-world pauses are somehow bad is for application teams to implement their own memory management techniques within the Java heap. This often comes down to implementing an object pool (or even full reference counting) and requiring any code that uses domain objects to participate.
This technique is almost always misguided. It is rooted in the past, when object allocation was very expensive and mutating objects was much cheaper. The situation today is completely different.
Modern hardware is very efficient at allocation; on recent desktop or server hardware, memory bandwidth is at least 2 to 3 GB per second. This is a huge number, and unless an application is written specifically to do so, it is not easy to make full use of that much bandwidth.
In general, object pooling is very difficult to implement correctly (especially when multiple threads are involved), and it brings several negative requirements that make it a poor general-purpose choice:
- Every developer who touches code that uses the object pool must understand pooling and handle it correctly.
- The boundary between code that is aware of the pool and code that is not must be known and documented.
- All of this additional complexity must be kept up to date and reviewed regularly.
- If any of these requirements is not met, the risk of problems similar to pointer reuse in C returns.
In short, object pooling should be used only when GC pauses are unacceptable and tuning and refactoring have failed to reduce them to an acceptable level.
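The reuse hazard in the list above can be shown with a deliberately naive pool. In this sketch the names (NaivePoolHazard, Pool, Buffer, staleRead) are hypothetical. A caller releases a buffer but keeps its reference, and the next caller receives the same instance, so the stale reference silently observes someone else's data.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: a naive, single-threaded pool and the
// stale-reference problem it makes possible.
final class NaivePoolHazard {

    static final class Buffer {
        final StringBuilder data = new StringBuilder();
    }

    static final class Pool {
        private final Deque<Buffer> free = new ArrayDeque<>();

        // Hands out a previously released buffer if one exists.
        Buffer acquire() {
            Buffer buffer = free.poll();
            return buffer != null ? buffer : new Buffer();
        }

        // Clears the buffer and makes it available for reuse.
        void release(Buffer buffer) {
            buffer.data.setLength(0);
            free.push(buffer);
        }
    }

    // Returns what the first caller sees through a reference it kept
    // after releasing the buffer.
    static String staleRead() {
        Pool pool = new Pool();

        Buffer first = pool.acquire();
        first.data.append("order-1");
        pool.release(first);            // released, but the reference is kept

        Buffer second = pool.acquire(); // the same instance comes back
        second.data.append("order-2");

        return first.data.toString();   // now holds the second caller's data
    }

    public static void main(String[] args) {
        System.out.println(staleRead());
    }
}
```

With ordinary allocation this mistake cannot happen, because the garbage collector never recycles an object that is still reachable. The pool reintroduces the problem, and only discipline across the whole code base prevents it.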
8. In garbage collection, CMS is always a better choice than Parallel Old.
By default, the Oracle JDK uses a parallel, stop-the-world collector for the old generation: the Parallel Old collector.
Concurrent Mark Sweep (CMS) is an alternative. It allows application threads to keep running for most of the garbage collection cycle, but this comes at a cost and with some caveats.
Allowing application threads to run alongside garbage collection threads inevitably means that application threads modify the object graph in ways that can affect object liveness. This has to be cleaned up afterwards, so CMS actually has two STW phases (usually very short).
This has several consequences:
- All application threads must be brought to a safepoint and paused twice during each full collection;
- Although garbage collection runs concurrently with the application, application throughput is reduced (usually by 50%);
- The bookkeeping (and CPU cycles) the JVM spends on garbage collection with CMS is much higher than with the other parallel collectors.
Whether these costs are worth paying depends on the application. But there is no free lunch. The CMS collector is an admirable piece of engineering, but it is not a cure-all.
So before deciding that CMS is the right garbage collection strategy, first confirm that the STW pauses of Parallel Old really are unacceptable and cannot be tuned away. Finally, I want to emphasize that all metrics must be obtained from a system equivalent to production.
9. Increasing the heap size can solve the memory problem.
When an application is in trouble and GC is suspected, many teams respond by increasing the heap size. In some cases this brings a quick win and buys time to consider a more thorough fix. However, if the cause of the performance problem is not fully understood, this strategy can make things worse.
Consider a very poorly coded application that produces a large number of domain objects (with a typical lifetime of, say, 2-3 seconds). If the allocation rate is high enough, garbage collections will occur so frequently that domain objects are promoted to the old generation. The domain objects die almost immediately after being promoted, but they are not reclaimed until the next full GC.
If we increase the application's heap size, all we really do is add more space for relatively short-lived objects to be promoted into and die. This makes the stop-the-world pauses longer, which does the application no good.
Before changing the heap size or adjusting other parameters, it is essential to understand the dynamics of object allocation and object lifetime. Acting blindly, without measuring performance data, will only make the situation worse. The tenuring distribution reported by the garbage collector is particularly important here.
Conclusion
Intuition is often misleading when it comes to Java performance tuning. We need experimental data and tools to help us visualize and enhance our understanding of the platform's behavior.
Garbage collection is the best example. The GC subsystem has enormous potential for tuning and for producing data to guide tuning. For production applications, however, it is hard to make sense of the generated data without tools.
By default, any Java process (in both development and production environments) should be run with the following flags:
-verbose:gc (print GC logs)
-Xloggc: (more comprehensive GC logs)
-XX:+PrintGCDetails (more detailed output)
-XX:+PrintTenuringDistribution (displays the tenuring threshold the JVM applies when promoting objects to the old generation)
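As a small aid, the flag list above can be checked from inside a running process. The sketch below reads the JVM's input arguments through the standard java.lang.management API and reports which of the four flags are absent. The GcLoggingCheck class and the missingFlags helper are hypothetical names. The check matches by prefix so that -Xloggc: is recognized whatever follows it, and it assumes a JVM of the generation this article describes; newer HotSpot releases configure GC logging differently.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: reports which of the recommended GC logging
// flags were not passed to the JVM.
final class GcLoggingCheck {

    private static final List<String> RECOMMENDED = Arrays.asList(
            "-verbose:gc",
            "-Xloggc:",
            "-XX:+PrintGCDetails",
            "-XX:+PrintTenuringDistribution");

    // Returns the recommended flags that no argument starts with.
    static List<String> missingFlags(List<String> jvmArgs) {
        List<String> missing = new ArrayList<>();
        for (String flag : RECOMMENDED) {
            boolean present = false;
            for (String arg : jvmArgs) {
                if (arg.startsWith(flag)) {
                    present = true;
                    break;
                }
            }
            if (!present) {
                missing.add(flag);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Arguments given to the JVM itself, not to the main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Missing GC logging flags: " + missingFlags(jvmArgs));
    }
}
```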
Then use a tool to analyze the logs. You can use a hand-written script that turns the logs into graphs, or a visualization tool such as GCViewer (open source) or jClarity Censum.