Maximizing Java performance on AIX, Part 4: monitoring traffic


This five-part series provides techniques commonly used to optimize Java™ applications for best performance on AIX®. This installment discusses scenarios in which disk I/O and the network can become bottlenecks.


Amit Mathur ([email protected]), Senior Technical Consultant and Solution Implementation Manager, IBM

January 03, 2008



This is the fourth article in a five-part series on Java performance tuning on AIX. If you have not already done so, it is strongly recommended that you read Part 1 of this series before proceeding.

This article discusses two other areas that can become performance bottlenecks:

    • Network
    • Disk I/O

These two areas often surface as AIX-specific issues and need to be tuned independently of the Java application. Therefore, instead of following the format used in Parts 2 and 3, this article focuses on finding the information you need to complete the tuning effort. As a result it offers only a handful of tips, but it combines them with an overview of the relevant performance tools to give you enough information to get started.


I/O and network bottlenecks

The purpose of this article is to discuss situations where I/O or networks can become bottlenecks.

If you've read each of the previous articles in this series, you should be starting to see how the smaller pieces fit into the whole. We have tried to classify the techniques by the areas in which they are commonly applied, but the classification is by no means mutually exclusive. With network and disk I/O you will not see the actual cause of a problem as easily, but you will eventually feel its impact on your application. Only an adequate understanding of the application will guide you to the root cause of the problem. For example, earlier in this series we discussed the importance of ensuring that the heap is never paged out: the maximum heap size specified with the -Xmx switch should be less than the total amount of physical memory installed on the system (as shown by bootinfo -r or lsattr -El sys0 -a realmem; for more such commands, see "AIX commands you should not leave home without").

Tools such as topas and iostat can show the utilization of individual disks, but in most cases the root cause is either a GC cycle or a known functional component; if you know your application, it should be fairly straightforward to determine the root cause of the problem. Tools such as filemon can even tell you which files are being accessed, taking the guesswork out of the tuning effort. If your Java application's performance is suffering because the system is misconfigured, it is time to shift focus and consider system-level tuning instead. For example, the solution to a disk bottleneck might be to distribute the data more efficiently or to move to faster disks. That topic is beyond the scope of this article; for more information, see Redbooks such as Understanding IBM eServer pSeries Performance and Sizing.

Configuring network buffers and tuning other network parameters can have a significant impact on network-intensive applications. A good reference for the network tuning parameters is the Tunable Parameters section of the Performance Management Guide. Some of the popular tweaks involve thewall, sockthresh, sbmax, somaxconn, tcp_sendspace, tcp_recvspace, rfc1323, and so on. This information is not Java-specific, but for network-intensive applications this should be the first step in performance tuning.

The remainder of this section briefly describes some common tools and how to detect Java-specific issues with them. For more detail, see the AIX 5L Performance Tools Handbook and Understanding IBM eServer pSeries Performance and Sizing.


vmstat

The multi-purpose vmstat command should already be your good friend. For I/O work, look at the wa (I/O wait) value in the CPU section. If this value is very high, there may be a disk bottleneck, and you can then use iostat to view disk usage in more detail.


iostat

iostat is the ideal tool for determining whether the system has an I/O bottleneck. It shows the read and write rates for all disks, which makes it ideal for deciding whether you need to "spread" the disk workload across multiple disks. The tool also reports the same CPU activity as vmstat.

While your application is running, start with a simple iostat -s to determine what the system is doing overall. This command prints output such as the following:

    tty:      tin         tout    avg-cpu:  % user    % sys    % idle   % iowait
              0.3        232.9               13.8     19.1      27.4     39.6

    Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
    hdisk0          28.7     291.4      35.0     176503   2744795
    hdisk1           0.0       0.4       0.0       3537         0
    hdisk7           1.7      34.9       9.8       8920    341112
    hdisk14         24.5    1206.1      36.2    1188404  10904509
    hdisk18          0.0       1.2       0.1      10052      2046
    hdisk8           2.1      36.8      10.5      10808    357910

Review the %iowait figure to determine whether the system is spending too much time waiting for I/O to complete. If the system is paging, this is also the figure to watch. Note, however, that this figure alone is not sufficient to determine what is happening on the system; for example, if your application writes a sequential file, a higher %iowait value is normal.

%tm_act shows the percentage of time a specific disk was active. The trace above shows a very interesting scenario: %iowait is close to 40%, but %tm_act is nowhere near 100%, hovering below 30%. The system on which this trace was taken has Fibre Channel-attached storage, and the bottleneck turned out to be the path to the SAN storage. Once identified, it looks obvious!

You can also use iostat -at <interval> <count> or iostat -sat ...; both report the adapter's tps and Kbps values (as well as read and write rates). The -s flag adds overall system statistics.


netstat

For network tuning, netstat is the ideal tool. netstat -m can be used to view mbuf memory usage, which tells you about socket and network memory use. If you set no -o extendednetstats=1, netstat -m displays more detail, but this has a performance impact and should be used only for diagnosis. When you run netstat -m, the relevant information appears at the top of the output, as follows:

      mbufs in use:
      mbuf cluster pages in use
      272 Kbytes allocated to mbufs
      0 requests for mbufs denied
      0 calls to protocol drain routines
      0 sockets not created because sockthresh was reached

and at the bottom of the output, as follows:

      Streams mblk statistic failures:
      0 high priority mblk failures
      0 medium priority mblk failures
      0 low priority mblk failures

If you see failures in the netstat -m output, the AIX 5L Performance Tools Handbook provides a clear description of which parameters to adjust. You may also want to try netstat -i x (replace x with the interval at which to collect data) to watch network usage and any dropped packets. For network-intensive applications, this is the first step in checking whether "everything is OK".


netpmon

netpmon uses the trace facility to capture the details of network activity during a time interval. It also displays CPU statistics per process, showing:

    • Total CPU time used by the process
    • CPU usage of the process (percent of total time)
    • The total time that the process spent executing network-related code

To start your tuning effort, you can try the following command:

netpmon -o /tmp/netpmon.log; sleep 20; trcstop

This command line runs netpmon for 20 seconds, then stops the trace with trcstop and writes the output to /tmp/netpmon.log. Looking at the generated data, you can see that the example we selected is well suited to an article on Java performance tuning:

      Process CPU Usage Statistics:
      -----------------------------
                                                    Network
      Process (top)             PID  CPU Time   CPU%   CPU%
      ----------------------------------------------------------
      java                    12192    2.0277   5.061   1.370
      UNKNOWN                 13758    0.8588   2.144   0.000
      gil                      1806    0.0699   0.174   0.174
      UNKNOWN                 18136    0.0635   0.159   0.000
      dtgreet                  3678    0.0376   0.094   0.000
      swapper                     0    0.0138   0.034   0.000
      trcstop                 18460    0.0121   0.030   0.000
      sleep                   18458    0.0061   0.015   0.000

Another useful part of the trace is the adapter usage:

                              ----------- Xmit -----------   -------- Recv ---------
      Device                   Pkts/s  Bytes/s  Util  QLen   Pkts/s  Bytes/s  Demux
      ------------------------------------------------------------------------------
      token ring 0             288.95    22678  0.0% 518.498 552.84    36761  0.0222
      ...
      DEVICE: token ring 0
      recv packets:          11074
        recv sizes (bytes):  avg 66.5     min         max 1514     sdev 15.1
        recv times (msec):   avg 0.008    min 0.005   max 0.029    sdev 0.001
        demux times (msec):  avg 0.040    min 0.009   max 0.650    sdev 0.028
      xmit packets:          5788
        xmit sizes (bytes):  avg 78.5     min         max 1514     sdev 32.0
        xmit times (msec):   avg 1794.434 min 0.083   max 6443.266 sdev 2013.966

Suppose you find that this is too much information, or you want to see something more specific. Try the following command:

netpmon -O so -o /tmp/netpmon_so.txt; sleep 20; trcstop

The "-O so" option makes netpmon focus on socket-level traffic. Now we can drill down into the Java process information:

    PROCESS: java   PID: 12192
    reads:                 2700
      read sizes (bytes):  avg 8192.0   min 8192    max 8192     sdev 0.0
      read times (msec):   avg 184.061  min 12.430  max 2137.371 sdev 259.156
    writes:                3000
      write sizes (bytes): avg 21.3     min 5       max          sdev 17.6
      write times (msec):  avg 0.081    min 0.054   max 11.426   sdev 0.211

Useful, isn't it? Let's go a step further and look at thread-level activity. Add "-t" to the command, as follows:

netpmon -O so -t -o /tmp/netpmon_so_thread.txt; sleep 20; trcstop

The generated output now contains thread-specific information, as follows:

    THREAD  TID: 114559
    reads:                 9
      read sizes (bytes):  avg 8192.0   min 8192    max 8192     sdev 0.0
      read times (msec):   avg 988.850  min 19.082  max 2106.933 sdev 810.518
    writes:
      write sizes (bytes): avg 21.3     min 5       max          sdev 17.6
      write times (msec):  avg 0.389    min 0.059   max 3.321    sdev 0.977

You can now take a Java thread dump, see what each thread is doing, and determine whether it is behaving as expected. Especially for applications with many network connections, netpmon lets you capture a comprehensive view of the activity.
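To connect the thread-level numbers back to Java code, you need to see what each thread is doing. A javacore (kill -3 <pid> on AIX) gives the full picture, including native thread IDs; as a minimal illustrative sketch (not from the original article), the same kind of snapshot can also be taken programmatically:

```java
import java.util.Map;

public class ThreadSnapshot {
    public static void main(String[] args) {
        // Programmatic alternative to a full javacore (kill -3 <pid> on AIX):
        // dump the stack of every live thread. Note that only the javacore shows
        // the native thread IDs that map onto netpmon's TID column.
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            System.out.println(e.getKey().getName() + " (" + e.getKey().getState() + ")");
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

Running this inside the application under load shows which threads are parked in socket reads versus doing real work.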


filemon

filemon can be used to determine which files are being actively used. This tool provides a very comprehensive view of file access and is useful for deeper analysis after vmstat/iostat confirm that the disk is a bottleneck. It also uses the trace facility, so it is invoked like netpmon:

filemon -o /tmp/filemon.log; sleep 60; trcstop

The resulting log file is quite large. Some of the areas that may be useful include:

    Most Active Files
    ------------------------------------------------------------------------
      #MBs  #opns   #rds   #wrs  file                 volume:inode
    ------------------------------------------------------------------------
      25.7           6589      0  unix                 /dev/hd2:147514
      16.3      1    4175      0  vxe102               /dev/mailv1:581
      16.3      1       0   4173  .vxe102.pop          /dev/poboxv:62
      15.8      1       1   4044  tst1                 /dev/mailt1:904
       8.3   2117    2327      0  passwd               /dev/hd4:8205
       3.2    182     810      1  services             /dev/hd4:8652
    ...
    ------------------------------------------------------------------------
    Detailed File Stats
    ------------------------------------------------------------------------
    FILE: /var/spool/mail/v/vxe102  volume: /dev/mailv1 (/var/spool2/mail/v)  inode: 581
    opens:                  1
    total bytes xfrd:       17100800
    reads:                  4175    (0 errs)
      read sizes (bytes):   avg 4096.0  min 4096    max 4096    sdev 0.0
      read times (msec):    avg 0.543   min 0.011   max 78.060  sdev 2.753
    ...

This tool is described in the references cited earlier; a more detailed study is beyond the scope of this article.

Java-specific Tips

On the Java side, the usual techniques for avoiding I/O and network bottlenecks come down to good design and are clearly documented in several places. But do take a look at tips NI004 and NI005.


Feature-Based Optimization techniques

Let's look at the different characteristics of typical applications. Identify the behaviors that resemble your application's, whether by design or by observation, and apply the appropriate techniques.


Network-intensive applications

For network-intensive applications, you should use netstat to make sure there are no dropped packets and the like. The netstat and netpmon sections of the AIX 5L Performance Tools Handbook describe the various adjustments to make when failures are observed during monitoring, so they are not repeated here.

If you suspect that network throughput is the bottleneck, tip NI001 is useful for determining whether there is a problem. In addition, if you do not use IPv6 at all, you can apply tip NI002.

If you are investigating a performance difference between AIX and another platform and suspect that the difference is due to socket options you set, look at tip NI004.

RMI applications

If the application is an RMI client or server, you may observe lines in the verbosegc output that are otherwise unexplained. For example, the following is excerpted from the verbosegc output of an RMI application:

    <GC(4057): GC cycle started Thu Apr 11:14:28 2004
    <GC(4057): freed 254510616 bytes, 55% free (453352000/810154496), in 1189 ms>
    <GC(4057): mark: 991 ms, sweep: 198 ms, compact: 0 ms>
    <GC(4057): refs: soft 0 (age >= 32), weak 2, final, phantom 0>
    <GC(4057): stop threads time: 10, start threads time: 260>
    <GC(4058): GC cycle started Thu Apr 11:15:29 2004
    <GC(4058): freed 267996504 bytes, 56% free (454445800/810154496), in 1243 ms>
    <GC(4058): mark: 1041 ms, sweep: 202 ms, compact: 0 ms>
    <GC(4058): refs: soft 0 (age >= 32), weak 0, final 253, phantom 0>
    <GC(4059): GC cycle started Thu Apr 11:16:31 2004
    <GC(4059): freed 248113752 bytes, 56% free (455754152/810154496), in 1386 ms>
    <GC(4059): mark: 1095 ms, sweep: 291 ms, compact: 0 ms>
    <GC(4059): refs: soft 0 (age >= 32), weak 0, final 263, phantom 0>

These GC cycles are triggered almost exactly 60 seconds apart and are not triggered by an allocation failure. Tip NI003 may apply here, once you have made sure that the application does not call System.gc() directly.

For RMI-intensive applications, you should also consider tip NI005, but be aware of the caveats mentioned in that tip.


Disk-intensive applications

With iostat and filemon, you should be able to identify the root cause of the bottleneck. The solution is typically either to adjust the application design so that it depends less on disk access, or to tune the system to optimize disk access. Since both kinds of adjustment are beyond the scope of this article, we recommend that you become familiar with iostat and filemon; the information in the previous section should get you started.
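As a sketch of the first kind of fix, reducing how hard the application leans on the disk (the class name and sizes here are illustrative, not from the article), batching many small writes through a BufferedOutputStream cuts the number of system calls and physical I/O operations dramatically:

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedWriteDemo {
    // Issues 'count' single-byte writes through the given stream.
    static void writeBytes(OutputStream out, int count) throws IOException {
        for (int i = 0; i < count; i++) {
            out.write('x');   // one byte per call: expensive if unbuffered
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buffered-demo", ".dat");
        // BufferedOutputStream batches the small writes into 8 KB chunks,
        // turning 100,000 write() calls into a handful of disk writes.
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(tmp.toFile()))) {
            writeBytes(out, 100_000);
        }
        System.out.println(Files.size(tmp));
        Files.delete(tmp);
    }
}
```

The same data reaches the file either way; only the number of I/O operations, and hence the %iowait the application generates, changes.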


General tips

The following tips refer to Java command-line arguments (specified before the class/JAR file name) as "switches". For example, the command line java -mx2g Hello has a single switch, -mx2g.

Tip NI001: Check the speed of your network connection

Establish an FTP session between the two systems whose connection speed you want to analyze, and execute the following FTP command:

    ftp> put "|dd if=/dev/zero bs=32k count=1000" /dev/null
    200 PORT command successful.
    150 Opening data connection for /dev/null.
    1000+0 records in.
    1000+0 records out.
    226 Transfer complete.
    32768000 bytes sent in 130.4 seconds (245.4 Kbytes/s)
    local: |dd if=/dev/zero bs=32k count=1000 remote: /dev/null

This quick test transfers 1000 blocks of zeroes, each 32 KB in size, and provides a simple way to measure the throughput of the connection between two AIX machines. The example above shows a throughput of 245.4 KB/s, which points to a network problem, because both AIX machines were using 100 Mbps full-duplex network adapters. Had the test shown something like 1.140e+04 Kbytes/s, that would have been a good hint to focus on the application instead. You can vary the block size and count to model your application's behavior more closely.
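The same measurement can be modeled from inside Java. The sketch below (illustrative, not part of the original tip) pushes 1000 32 KB blocks of zeroes through a loopback socket and reports throughput in the same form as the ftp test; pointed at a remote host, it would exercise the real network path:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class LoopbackThroughput {
    public static void main(String[] args) throws Exception {
        final int blockSize = 32 * 1024;   // mirror the 32 KB dd block size
        final int count = 1000;            // 1000 blocks of zeroes, as in the ftp test

        try (ServerSocket server = new ServerSocket(0)) {
            // Receiver thread: drain and discard the data, like /dev/null.
            Thread sink = new Thread(() -> {
                try (Socket s = server.accept(); InputStream in = s.getInputStream()) {
                    byte[] buf = new byte[blockSize];
                    while (in.read(buf) != -1) { /* discard */ }
                } catch (Exception e) { throw new RuntimeException(e); }
            });
            sink.start();

            long total = 0;
            long start = System.nanoTime();
            try (Socket s = new Socket("127.0.0.1", server.getLocalPort());
                 OutputStream out = s.getOutputStream()) {
                byte[] zeros = new byte[blockSize];
                for (int i = 0; i < count; i++) {
                    out.write(zeros);
                    total += zeros.length;
                }
            }
            sink.join();
            double secs = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d bytes sent in %.1f seconds (%.1f KB/s)%n",
                    total, secs, total / 1024.0 / secs);
        }
    }
}
```

On loopback the number mostly measures the TCP stack and JVM overhead, which makes it a useful baseline to compare against the figure seen over the real network.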

Tip NI002: Use the IPv4 stack

If your application does not need IPv6, you can set the java.net.preferIPv4Stack property to true, as follows:

java -Djava.net.preferIPv4Stack=true <classname>
Tip NI003: Remote GC

If your application is an RMI client or server, you can also use the sun.rmi.dgc.client.gcInterval and/or sun.rmi.dgc.server.gcInterval properties, defined for the Sun JDK, with IBM Java. Both properties default to 60 seconds and, depending on the needs of the application, can be increased to reduce the performance impact of the extra GC cycles.

Note that the warnings given with those property definitions, and the risks associated with not releasing distributed objects, also apply to IBM Java.
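A minimal sketch of applying these properties (the one-hour value and class name are illustrative; the property names are the ones quoted above). They must be set before any RMI machinery initializes, so the command line is the usual place, but setting them at the top of main() also works:

```java
public class DgcIntervalConfig {
    public static void main(String[] args) {
        // Raise the distributed GC interval from the 60-second default to 1 hour
        // (values are in milliseconds). Equivalent to passing
        // -Dsun.rmi.dgc.client.gcInterval=3600000 on the java command line.
        System.setProperty("sun.rmi.dgc.client.gcInterval", "3600000");
        System.setProperty("sun.rmi.dgc.server.gcInterval", "3600000");
        System.out.println(System.getProperty("sun.rmi.dgc.client.gcInterval"));
    }
}
```

With the interval raised, the 60-second GC cycles seen in the verbosegc excerpt above should disappear, at the cost of distributed objects being reclaimed less promptly.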

Tip NI004: Socket buffer size

If you set the send and receive buffer sizes, be aware that a call to setSendBufferSize(int) is treated only as a hint. So if you observe a performance difference between platforms, add a call to getSendBufferSize() and check whether the hint was actually picked up on the platform in question. In a recently reported AIX performance problem, the application called setSendBufferSize(4096) from its code. AIX honored the hint and set the buffer size as requested, while the other platform ignored the call; as a result, the perceived performance on AIX was bad! Removing the call from the code increased the application's performance on AIX more than fourfold.

In general, you may want to omit calls that adjust the TCP/IP stack from your application, because the AIX network stack is pre-tuned.
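To check whether a platform actually honors the hint, a small probe along these lines (illustrative, not from the article) can be run on each platform and the read-back values compared:

```java
import java.net.ServerSocket;
import java.net.Socket;

public class SendBufferHintCheck {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0);
             Socket s = new Socket("127.0.0.1", server.getLocalPort())) {
            int before = s.getSendBufferSize();
            s.setSendBufferSize(4096);           // only a hint to the TCP stack
            int after = s.getSendBufferSize();   // read back what was actually applied
            System.out.println("requested 4096, default was " + before + ", got " + after);
        }
    }
}
```

A platform that applies the hint will report a value near 4096; one that ignores or adjusts it will report something else, which is exactly the cross-platform difference the tip describes.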

Tip NI005: Connection pooling

For RMI-intensive applications, enabling the connection pool allows existing connections to be reused instead of a new connection being created for each RMI call. To enable the connection pool, set the following property:

java -Dsun.rmi.transport.tcp.connectionpool=true <classname>

You can also disable the connection pool explicitly, as follows:

java -Dsun.rmi.transport.tcp.noconnectionpool=true <classname>

Note: It is best to use the connection pool only for RMI-intensive applications. Recent versions of Java on AIX (1.3.1 SR7 and later, and 1.4.1 SR2 and later) disable connection pooling by default.



Conclusion

This article described common tools and techniques for dealing with network and disk I/O bottlenecks.

The next article ends the series with some general observations and links to useful references.


Resources
    • Other parts of this series:
      • Part 1
      • Part 3
      • Part 4
      • Part 5
