http://www.ibm.com/developerworks/cn/aix/library/es-Javaperf/es-Javaperf1.html
Maximizing Java performance on AIX, Part 1: Basics
This five-part series presents tips and techniques that are commonly used to tune Java™ applications for optimal performance on AIX®, along with a discussion of when each technique applies. With these tips, you should be able to quickly tune your Java environment to suit the needs of your application.
Introduction
Several performance tuning tools are available for the IBM eServer pSeries platform running AIX. IBM's Java implementation on AIX also offers several tuning options, most of which are fairly well documented. However, the IBM support team still sees a number of situations in which performance tuning suffers from the disconnect between these two sets of tools.
This series shows you how to use AIX and Java tools together to maximize the performance of Java applications on AIX. Part 1 ("Basics") discusses the prerequisites for successful optimization and outlines the tools that can assist in the work. We strongly recommend that you read this article in full, as it will save you trouble later.
The next three articles in the series each focus on a specific aspect of performance optimization. Part 2 ("Speed Requirements") discusses how to improve execution speed and throughput. Part 3 ("More is Better") looks at sizing and shows how to use memory to your advantage. Part 4 ("Monitor traffic") examines network and disk I/O as targets for optimization. These three articles are written for quick lookup, so you are welcome to use them as a reference rather than reading them from start to finish. Each one also discusses general techniques and adjustments that we have found useful in practice.
Part 5 ("References and conclusions") concludes the series by discussing additional sources of information on this topic.
Please note that this article is limited to J2SE; for Java EE-specific optimizations, refer to the documentation for the Java EE component in question. If you are using a Java-based application, consult the application's documentation to make sure that the changes you make do not affect its functionality.
Before you begin
Any optimization effort must weigh a number of considerations, some of which conflict with one another. While each effort is unique, some steps are common to all and almost always helpful. Here is a brief list of things to have in place before you consider performance optimization. If any of these steps cannot be completed before tuning begins, there should be a good reason for that.
Migrating to the latest version
For a list of the latest Java versions available on AIX, see IBM developer kits for AIX, Java Technology Edition. Newer versions of Java often contain significant performance enhancements. For example, Java 1.4 has a much stronger garbage collection implementation than earlier versions.
Note: This series was written in 2004, so most of its examples run on JDK 1.4 and JDK 1.3.1. Please keep that in mind as you read on.
You should migrate to the latest available service refresh (SR), even if you are restricted to a specific Java version. This is a quick way to benefit from functional and performance fixes. In addition, if you run into a problem of any kind, you will need to be on the latest SR to get support. Another reason to move to the latest SR is that switches such as -Xdisableexplicitgc were added after Java 1.3.1 was first released, so you cannot use them unless you migrate to a level that includes them.
This article uses Java 1.4 and Java 1.3.1 environments as examples, because older versions are either out of service or soon will be. For the same reason, we use AIX 5L (specifically AIX 5.1 and AIX 5.2) as the example environment, not AIX 4.3.3 or earlier.
If your application's configuration allows it, you may also want to consider migrating to a 64-bit version of Java. This topic is beyond the scope of this series; unless otherwise noted, we limit the discussion to 32-bit Java. Stay tuned for an article on 64-bit Java, coming soon to ESDD.
Fix the problem before tuning
If you are experiencing problems such as crashes, memory leaks, or application hangs, you are not yet ready to tune. This includes situations where you have disabled some or all of the Just-In-Time (JIT) compiler. If the JAVA_COMPILER or JITC_COMPILEOPT environment variable is set, the application may already be paying a performance penalty (although JITC_COMPILEOPT can also be used to improve application performance). Before you begin performance tuning, you should have the problem fixed, whether by upgrading to the latest service refresh or by working with the IBM Java service team. The Java Diagnostics Guide, available from IBM developer kits - diagnosis documentation, offers excellent advice on how to debug problems encountered with IBM Java.
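A quick way to spot leftover JIT overrides before tuning is to look for the two variables in the environment that launches Java. The following is a minimal sketch rather than an IBM-supplied tool: the function name check_jit_env is our own, and it only reports what it finds without judging whether a setting is intentional.

```shell
#!/bin/sh
# check_jit_env: hypothetical helper (not part of AIX or the IBM SDK).
# Prints one line for each JIT-related variable that is set in the current
# shell, and returns non-zero if at least one of them is set.
check_jit_env() {
    found=0
    for name in JAVA_COMPILER JITC_COMPILEOPT; do
        # Indirect lookup that works in POSIX sh and ksh. The "+x" form
        # expands to x only when the variable is set, even if it is empty.
        eval "is_set=\${$name+x}"
        if [ -n "$is_set" ]; then
            eval "value=\${$name}"
            echo "$name is set to '$value'"
            found=1
        fi
    done
    return "$found"
}

check_jit_env || echo "Review the settings above before you start tuning."
```

Run it in the same shell, or from the same startup script, that launches the application, so that it sees the environment the JVM will inherit.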
Verify your environment
You typically do not need to worry about specific environment settings, because the Java launcher sets up the environment itself. Notable exceptions are the JIT settings mentioned in the previous section and the LDR_CNTRL setting needed to change the heap size. However, if your application uses JNI, you must set up the correct environment for the JNI component. The SDK Guide or Readme file that accompanies each Java version lists the variables that must be set to specific values; a mismatch can cause performance or functional problems.
Is Java what you need to optimize?
While it is a good idea to study the various techniques and tune the environment of your Java application, you will only see the effect once other bottlenecks have been removed. For example, if Java does not appear among the top five memory-consuming processes, or if other processes on the system keep CPU usage at 100%, the changes you make to Java are unlikely to help.
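One way to run this sanity check is to list the heaviest processes and see whether Java is among them. The sketch below assumes input in the style of `ps -eo pcpu,pid,comm` (a percentage first, the command name last). The helper name java_in_top is ours, the cutoff of five simply mirrors the example above, and the live example uses CPU percentage; feeding it a memory percentage column instead gives the memory view.

```shell
#!/bin/sh
# java_in_top: hypothetical helper. Reads lines of the form
#   <percent> <pid> <command>
# on standard input, keeps the N largest by percentage (default 5), and
# succeeds only if a command named "java" is among them.
java_in_top() {
    limit=${1:-5}
    sort -rn | head -n "$limit" |
        awk '$3 == "java" { hit = 1 } END { exit (hit ? 0 : 1) }'
}

# Example against the live system; "sed 1d" drops the ps header line.
if ps -eo pcpu,pid,comm | sed 1d | java_in_top 5; then
    echo "java is among the top 5 CPU consumers"
else
    echo "java is not among the top 5 CPU consumers; look elsewhere first"
fi
```

Because the helper only reads standard input, you can also feed it figures captured earlier, which makes it easy to compare the picture before and after a change.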
Can it run faster?
Performance is a multifaceted term, and there are trade-offs at every turn. To take the guesswork out of these decisions, you need a thorough understanding of the behavior you expect, so that you can back each decision up. You must also have a well-defined mechanism for determining the impact of any change you make. Whatever the scale of the tuning effort, the steps described here will make it more efficient.
Understand your application's characteristics
Before you begin to optimize your application, you must understand how it is expected to work. Beyond broad classifications such as client-server topology or graphical user interface (GUI) application, you should also understand how the application code works internally to accomplish its tasks. For example, we can show you how to calculate the percentage of CPU used during a particular run, but the figure you observe is unique to each application. You should be able to tell "normal" observations from "abnormal" ones in order to decide what to optimize.
Note that you do not need access to the Java source code to perform most of the optimizations described in these articles. That does not mean you should treat your application as a black box. Is your application designed to finish its work quickly and exit, or does it keep running for a long time (for example, as a server)? Does it allocate a large amount of memory during startup, or only a small amount? Does it create and discard many objects, or does it hold on to the objects it allocates? Questions like these will shape your optimization effort.
Select a good test suite
We strongly recommend that you have some way to quantify the gains you achieve. A prerequisite for using any of these tools is a repeatable, verifiable, and reliable test suite that exercises as many of the application's features as possible. Remember that in some cases, tuning one aspect of run-time behavior can adversely affect another. Only a good test suite lets you understand trade-offs such as the one between run-time performance and memory footprint.
Establish benchmarks
Before making any changes to the system, take the time to establish a clear and unambiguous way to determine the effect of each change. The method can be as simple as using the "time" command, or as elaborate as a script that simulates 1000 users to measure response time. Either way, the workload used to measure tuning changes should be varied enough to cover as many different scenarios as possible. In addition to any external measurement of this kind, we recommend using the AIX Performance PMR data collection tool (PERFPMR).
PERFPMR is in fact the data collection tool used by the AIX service team and other service teams. It takes a snapshot of the system over a specified period, giving a clear picture of what the system was "doing" during that time. Instead of running each AIX tool separately, you can have PERFPMR run those tools for you. Because the method for creating these snapshots is well defined, you can take multiple snapshots at different points in the tuning cycle to track your progress. Please note that the examples in this series use the individual tools rather than PERFPMR.
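Even with the simple "time" approach, a single run is rarely enough to serve as a benchmark, so it helps to record several runs and summarize them. The following is a minimal sketch under our own assumptions: the helper name summarize_runs is hypothetical, and the input is taken to be one elapsed time in seconds per line, collected however suits your test suite.

```shell
#!/bin/sh
# summarize_runs: hypothetical helper. Reads one elapsed time in seconds per
# line on standard input and prints the run count, mean, minimum and maximum.
summarize_runs() {
    awk 'NF {
             t = $1 + 0                      # force numeric interpretation
             n++; sum += t
             if (n == 1 || t < min) min = t
             if (n == 1 || t > max) max = t
         }
         END {
             if (n) printf "runs=%d mean=%.2f min=%.2f max=%.2f\n", n, sum / n, min, max
         }'
}

# Three made-up timings standing in for a recorded baseline.
printf '12.4\n11.9\n12.1\n' | summarize_runs
```

Keeping one file of timings for the baseline and one for each tuning change lets you compare the summaries side by side, and a wide gap between minimum and maximum is a hint that the workload is not yet repeatable enough.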
Tools for optimization
A rich set of tools is available for monitoring all aspects of an AIX system. Table 1 provides a brief overview of these tools. They can be used to monitor either a single process or the entire system. The AIX 5L Performance Tools Handbook and Understanding IBM eServer pSeries Performance and Sizing, along with their references, are good starting points for learning about the tools mentioned below.
Table 1: System monitoring tools on AIX
Subsystem | Tools
CPU | vmstat, iostat, ps, sar, tprof
Memory subsystem | vmstat, lsps, svmon, filemon
I/O subsystem | iostat, vmstat, lsps, filemon, lvmstat
Network subsystem | netstat, ifconfig, tcpdump
Each of these tools is briefly described in a subsequent article. There are also tools that cannot be grouped into a particular category, and they are briefly discussed below.
topas
The topas tool is a useful graphical interface that gives you an immediate picture of what is happening on your system. When you run it without any command-line arguments, the screen looks like this:
Topas Monitor for host:    aix4prt              EVENTS/QUEUES    FILE/TTY
Mon Apr 16 16:16:50 2001   Interval:  2         Cswitch    5984  Readch     4864
                                                Syscall   15776  Writech   34280
Kernel   63.1   |##################          |  Reads         8  Rawin         0
User     36.8   |##########                  |  Writes     2469  Ttyout        0
Wait      0.0   |                            |  Forks         0  Igets         0
Idle      0.0   |                            |  Execs         0  Namei         4
                                                Runqueue   11.5  Dirblk        0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Waitqueue   0.0
lo0     213.9   2154.2  2153.7   107.0   106.9
tr0      34.7     16.9    34.4     0.9    33.8  PAGING           MEMORY
                                                Faults     3862  Real,MB    1023
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  Steals     1580  % Comp     27.0
hdisk0    0.0      0.0     0.0     0.0     0.0  PgspIn        0  % Noncomp  73.9
                                                PgspOut       0  % Client    0.5
Name            PID  CPU%  PgSp Owner           PageIn        0
java          16684  83.6  35.1 root            PageOut       0  PAGING SPACE
java          12192  12.7  86.2 root            Sios          0  Size,MB     512
lrud           1032   2.7   0.0 root                             % Used      1.2
aixterm       19502   0.5   0.7 root            NFS (calls/sec)  % Free     98.7
topas          6908   0.5   0.8 root            ServerV2       0
ksh           18148   0.0   0.7 root            ClientV2       0   Press:
gil            1806   0.0   0.0 root            ServerV3       0   "h" for help
The lower left of the screen shows the most active processes; here, the busiest Java process is consuming 83.6% of the CPU. The area on the right shows total physical memory (1 GB in this case) and paging space, both in megabytes, along with how much of each is in use. You therefore get a clear overview of what the system is doing on a single screen, and can then choose the areas to focus on based on what you see.
trace
The trace facility captures a sequential flow of time-stamped system events, which makes it a valuable tool for observing system and application execution. Many other tools provide summary information such as CPU and I/O utilization, whereas trace helps you see in detail where an event occurred, which process was responsible for it, when it occurred, and how it affected the system. Two post-processing tools that extract information from a trace are utld (in AIX 4) and curt (in AIX 5). These tools provide statistics on CPU utilization and process/thread activity. A third post-processing tool is splat, which stands for Simple Performance Lock Analysis Tool. It is used to analyze lock activity for simple locks in the AIX kernel and kernel extensions.
nmon
The nmon tool is free software that provides much of the same information as topas, but can save it in files formatted for Lotus 1-2-3 and Excel. It can be downloaded from http://www.ibm.com/developerworks/eserver/articles/analyze_aix/. The information collected includes CPU, disk, network, and adapter statistics, kernel counters, memory, and the "busiest" processes.
Platform-agnostic Java performance monitoring
The Java Virtual Machine Profiling Interface (JVMPI) is supported by IBM Java and is useful for all aspects of performance monitoring. You can use a third-party profiler or the IBM Java profiling interface to monitor Java performance. For more detailed information, see the description of -Xhprof in the Java Diagnostics Guide, available from IBM developer kits - diagnosis documentation. For information specific to profiling on AIX, you should also refer to the Readme file or SDK Guide available from IBM developer kits for AIX, Java Technology Edition. For example, thread CPU time is not reported during profiling unless you set the environment variable AIXTHREAD_ENRUSG=ON. This is documented in the Readme file or SDK Guide for every Java version.
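Because AIXTHREAD_ENRUSG matters only while you are profiling, it can be convenient to set it for the profiled run alone rather than exporting it for the whole session. The following is a minimal sketch; the wrapper name with_thread_cpu_time is our own, and the command it runs is whatever profiling command line you already use.

```shell
#!/bin/sh
# with_thread_cpu_time: hypothetical wrapper. Runs the given command with
# AIXTHREAD_ENRUSG=ON in that command's environment only, leaving the
# calling shell untouched.
with_thread_cpu_time() {
    env AIXTHREAD_ENRUSG=ON "$@"
}

# Demonstration with a stand-in command; in practice the arguments would be
# your usual profiled java command line.
with_thread_cpu_time sh -c 'echo "AIXTHREAD_ENRUSG=$AIXTHREAD_ENRUSG"'
```

Scoping the variable this way keeps ordinary, unprofiled runs of the application on the default settings.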
Perhaps the most common form of Java performance monitoring is the verbosegc log. Details on how to interpret the verbosegc log can be found in Fine-tuning Java garbage collection performance, which also describes how to tune based on verbosegc output. Although enabling verbosegc tracing does have a small performance impact, the benefits of the resulting analysis far outweigh the cost.
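Collecting a verbosegc log usually means adding one switch to the launch command and keeping the output. The sketch below assumes the standard -verbose:gc switch and assumes that the JVM writes the trace to standard error, so check the SDK Guide for your level before relying on it. The wrapper name run_with_verbosegc and the class name MyServer are placeholders of ours.

```shell
#!/bin/sh
# run_with_verbosegc: hypothetical wrapper.
#   $1     file that receives standard error (where we assume the trace goes)
#   $2     the java launcher to use
#   $3...  the remaining java arguments
run_with_verbosegc() {
    gc_log=$1
    launcher=$2
    shift 2
    # The switch is placed first so that it precedes the class or jar name.
    "$launcher" -verbose:gc "$@" 2>"$gc_log"
}

# Example (MyServer is a placeholder class name):
#   run_with_verbosegc /tmp/gc.log java -Xmx256m MyServer
```

Anything else the application prints to standard error ends up in the same file, so you may need to filter it before analysis.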
Default Java behavior on AIX
This section describes the state of each setting at the time of writing. Most of these settings will change over time. The Readme and SDK Guide that accompany the SDK are always the most up-to-date resources for this information.
Java uses the following environment settings:
- AIXTHREAD_SCOPE=S
This setting ensures that each Java thread maps one-to-one to a kernel thread. The advantages of this approach show up in a number of situations. A notable example is the way Java takes advantage of dynamic logical partitioning (DLPAR): when a new CPU is added to the partition, a Java thread can be dispatched to that CPU. You should not change this setting under normal circumstances.
- AIXTHREAD_COND_DEBUG, AIXTHREAD_MUTEX_DEBUG and AIXTHREAD_RWLOCK_DEBUG
These flags are used for kernel debugging purposes. They may sometimes already be set to OFF. If they are not, turning them off provides a good performance boost.
- LDR_CNTRL=MAXDATA=0x80000000
This is the default setting for Java 1.3.1 and controls how large the Java heap is allowed to grow. Java 1.4 determines the LDR_CNTRL setting based on the heap size requested. For more information on how to manipulate this variable, see Getting more memory in AIX for your Java applications.
- JAVA_COMPILER
This setting determines which Just-In-Time compiler is used. The default is jitc, which points to the IBM JIT compiler. You can change it to jitcg, the debug version of the JIT compiler, or to NONE, which turns the JIT compiler off (in most cases this is certain to give the worst performance).
- IBM_MIXED_MODE_THRESHOLD
This setting determines how many times a method is invoked before the JIT compiles it. The value varies by platform and version; for Java 1.3.1 on AIX, for example, it is 600.
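Of the settings in the list above, the three thread debug flags are the easiest to verify from a shell. The following is a minimal sketch; the helper name report_thread_debug is our own. It looks only at the calling shell's environment, so a variable reported as not set may still be given a value by the Java launcher or by a system default.

```shell
#!/bin/sh
# report_thread_debug: hypothetical helper. For each of the three AIX thread
# debug variables, reports whether it is unset, OFF, or something else.
report_thread_debug() {
    for name in AIXTHREAD_COND_DEBUG AIXTHREAD_MUTEX_DEBUG AIXTHREAD_RWLOCK_DEBUG; do
        eval "is_set=\${$name+x}"    # x when set, empty when unset
        eval "value=\${$name-}"      # current value, empty when unset
        if [ -z "$is_set" ]; then
            echo "$name is not set in this shell"
        elif [ "$value" = "OFF" ]; then
            echo "$name=OFF"
        else
            echo "$name=$value (consider OFF)"
        fi
    done
}

report_thread_debug
```

A line flagged with "consider OFF" is only a prompt to check whether the debug setting is still needed, not proof of a problem.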
Note that none of these defaults overrides a setting that already exists. For example, if you set LDR_CNTRL=MAXDATA to some other value, the value you specify is used instead of the default mentioned above.
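When choosing your own MAXDATA value, it helps to translate it into memory segments first. The sketch below relies on the fact that a 32-bit AIX process addresses memory in 256 MB (0x10000000-byte) segments; the helper name maxdata_segments and the class name MyApp are placeholders of ours, and the override value in the final comment is purely illustrative, not a recommendation.

```shell
#!/bin/sh
# maxdata_segments: hypothetical helper. Converts a MAXDATA value such as
# 0x80000000 into the number of 256 MB segments it represents.
maxdata_segments() {
    bytes=$(printf '%d' "$1")          # printf accepts C-style hex constants
    echo $(( bytes / 268435456 ))      # 268435456 = 0x10000000 = 256 MB
}

maxdata_segments 0x80000000    # the Java 1.3.1 default quoted above

# Overriding the default for a single launch only:
#   LDR_CNTRL=MAXDATA=0x40000000 java MyApp
```

Setting the variable on the command line, as in the final comment, limits the override to that one launch and leaves the rest of the session on the default.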
The Readme file or SDK Guide that accompanies the IBM Java SDK describes the environment settings that any Java Native Interface (JNI) library must have. If you modify any of the environment settings listed there, you must make sure that the JNI library is also built with the appropriate settings.
Summary
This article described some basic steps that can serve as a checklist for starting an optimization effort. The next three articles look at CPU, memory, and network and disk I/O optimization.