4. Memory Utilization:
The properties related to memory utilization include paging (page swapping), locking, voluntary and preemptive context switching, and thread migration.
Memory paging occurs when the memory an application needs exceeds the available physical memory; an application shows noticeable performance problems while the system is paging, i.e. using virtual memory. To handle this situation, swap space is usually configured for the system, generally on a separate disk partition. When an application exhausts physical memory, the operating system moves the least recently used parts of the application out to the swap space on disk; when the application later accesses that swapped-out data, it must be paged back from disk into physical memory. This can have a significant impact on performance, especially on the application's responsiveness and throughput.
A voluntary context switch occurs when the running thread releases the CPU of its own accord, for example while waiting on a lock or for I/O to complete. By contrast, a preemptive context switch occurs when a thread is forced off the CPU because its allotted time slice is exhausted or a higher-priority thread preempts it.
Since Java 5, the JVM has included a spin-lock optimization: a thread first tries to acquire a lock by spinning in a busy loop, and if it still has not succeeded after a number of iterations, the thread is suspended and must later be woken up to try to acquire the lock again. Suspending and waking a thread causes the operating system to perform voluntary context switches, so applications with heavy lock contention exhibit a large number of voluntary context switches. The cost of a voluntary context switch in CPU clock cycles is very high (typically up to about 80,000 cycles). A general rule of thumb for lock-contention monitoring: if voluntary context switches consume 5% or more of the available clock cycles, the application is suffering from lock contention.
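As a rough illustration of this spin-then-suspend pattern, here is a toy lock sketch. This is not HotSpot's actual implementation; the class name, spin budget, and queue-based wakeup are all invented for this example:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Toy spin-then-park lock: spin briefly hoping the holder releases soon,
// then park the thread -- parking is what causes a voluntary context switch.
class SpinThenParkLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final ConcurrentLinkedQueue<Thread> waiters = new ConcurrentLinkedQueue<>();
    private static final int SPIN_LIMIT = 100; // arbitrary spin budget for this sketch

    public void lock() {
        // Phase 1: busy-loop spinning, no context switch.
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (locked.compareAndSet(false, true)) {
                return; // acquired without giving up the CPU
            }
        }
        // Phase 2: give up the CPU until the holder wakes us.
        Thread current = Thread.currentThread();
        waiters.add(current);
        while (!locked.compareAndSet(false, true)) {
            LockSupport.park(this); // the voluntary context switch happens here
        }
        waiters.remove(current);
    }

    public void unlock() {
        locked.set(false);
        Thread next = waiters.peek();
        if (next != null) {
            LockSupport.unpark(next); // wake one parked waiter to retry
        }
    }

    // Demo: `threads` threads each take the lock `perThread` times.
    public static int demo(int threads, int perThread) {
        SpinThenParkLock lock = new SpinThenParkLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    counter[0]++; // protected by the lock
                    lock.unlock();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 5000)); // expect 20000
    }
}
```

Each park/unpark pair in phase 2 is what surfaces as a voluntary context switch in operating system statistics, which is why heavily contended locks drive that counter up.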
Thread migration is the movement of a running thread between processors. Most operating system schedulers try to assign a ready thread to the virtual processor it last ran on; if that virtual processor is busy, the scheduler migrates the thread to another available virtual processor. Thread migration can degrade application performance because the new virtual processor's cache may not hold the data or state the thread needs. Applications running on multicore systems can experience a large number of thread migrations; a strategy for reducing migration is to create processor groups and assign the application to those processor groups.
(1). Windows Memory Usage Monitoring:
The typeperf command can monitor memory usage; the following command outputs the available memory and paging activity every 5 seconds:
typeperf -si 5 "\Memory\Available MBytes" "\Memory\Pages/sec"
Lock contention is difficult to monitor with Windows built-in tools. Windows performance counters can monitor context switches but cannot distinguish voluntary from preemptive ones, so external tools such as Intel VTune or AMD CodeAnalyst are required.
(2). Linux Memory Utilization Monitoring:
On Linux, the vmstat command can monitor memory usage: in vmstat output, si is the amount of memory swapped in from disk, so is the amount of memory swapped out to disk, and free is the available idle memory.
On Linux, the pidstat command from the sysstat package can monitor lock contention. The following command monitors voluntary context switches for all tasks:
pidstat -w
In the output of pidstat -w, the cswch/s column is the rate of voluntary context switches.
If the processor is a 3.0 GHz CPU and pidstat -w reports a total of 1750 voluntary context switches per second, the fraction of clock cycles wasted on voluntary context switches is 1750 × 80,000 / 3,000,000,000 ≈ 4.7%.
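That back-of-the-envelope calculation can be written out directly (the class and method names are made up for illustration):

```java
// Estimate the clock cycles lost to voluntary context switches, as above.
class SwitchCost {
    // switchesPerSec: voluntary context switches per second (pidstat cswch/s)
    // cyclesPerSwitch: assumed cost of one switch, ~80,000 cycles per the text
    // clockHz: processor clock rate in Hz
    static double wastedCycleRatio(double switchesPerSec, long cyclesPerSwitch, double clockHz) {
        return switchesPerSec * cyclesPerSwitch / clockHz;
    }

    public static void main(String[] args) {
        // 1750 voluntary switches/s on a 3.0 GHz processor
        double ratio = wastedCycleRatio(1750, 80_000, 3_000_000_000.0);
        System.out.printf("wasted cycles: %.1f%%%n", ratio * 100); // prints: wasted cycles: 4.7%
    }
}
```

Since 4.7% is near the 5% rule of thumb above, this application is close to the threshold for a lock-contention problem.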
5. Network I/O Utilization:
The performance and scalability of a distributed application can be limited by network bandwidth or network I/O. For example, if messages arrive at the system's network interface hardware faster than it can process them, they queue up in operating system buffers, causing application delays.
(1). Linux network I/O monitoring:
A. netstat or sysstat:
These can report the number of packets sent and received per second, including error and collision counts, but they do not report network utilization.
B. nicstat:
The source code can be downloaded from http://sourceforge.net/projects/nicstat/files/ and must be compiled before use. The command format is:
nicstat [-hnsz] [-i interface[,...]] [interval [count]]
Here -h displays help information, -n shows only non-local interfaces, -s displays a summary, -z skips zero values, and -i interface names the network interface device; interval is how often a report is output and count is the number of report samples. The %util column of the output is the network utilization.
(2). Windows network I/O monitoring:
Monitoring network I/O on Windows requires knowing the bandwidth of the monitored network interface and the amount of data that passes through it.
The bytes transferred per second by a network interface can be obtained with typeperf -si 5 "\Network Interface(*)\Bytes Total/sec", in bytes/s.
The network bandwidth can be obtained with typeperf -si 5 "\Network Interface(*)\Current Bandwidth", in bit/s.
Network utilization = Bytes Total/sec / (Current Bandwidth / 8) × 100
or equivalently
Network utilization = (Bytes Total/sec × 8) / Current Bandwidth × 100
A strategy for improving network utilization is to use non-blocking network I/O instead of blocking network I/O, and with non-blocking I/O to read as much data as possible on each read and write as much data as possible on each write.
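A minimal sketch of the "read as much as possible" side of this advice with Java NIO (the class and method names are invented, and an in-process Pipe stands in for a real socket so the example is self-contained):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

// Drain a non-blocking channel completely in one pass: keep calling read()
// until it returns 0 (nothing currently available) or -1 (end of stream),
// instead of doing a single read per readiness event.
class NonBlockingDrain {
    static String drain(Pipe.SourceChannel source) throws IOException {
        source.configureBlocking(false);
        ByteBuffer buf = ByteBuffer.allocate(64); // deliberately small to force several reads
        StringBuilder out = new StringBuilder();
        while (source.read(buf) > 0) {
            buf.flip();
            out.append(StandardCharsets.UTF_8.decode(buf));
            buf.clear();
        }
        return out.toString();
    }

    // Demo using an in-process Pipe in place of a real socket.
    static String demo(String message) {
        try {
            Pipe pipe = Pipe.open();
            ByteBuffer data = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
            while (data.hasRemaining()) {
                pipe.sink().write(data); // write as much as possible per call
            }
            pipe.sink().close();
            return drain(pipe.source());
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("non-blocking reads should drain the channel"));
    }
}
```

A real server would apply the same drain loop to a SocketChannel registered with a Selector; draining fully per readiness event reduces the number of selector wakeups and system calls per byte transferred.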
6. Disk I/O utilization
Disk I/O has a critical impact on the performance of applications that perform frequent disk I/O, such as databases and logging.
Linux disk I/O monitoring:
After installing the sysstat package on Linux, the iostat command can monitor disks, e.g. iostat -xm; the %util column of the output is the disk I/O utilization.
Strategies for improving disk I/O utilization:
(1). Use faster storage devices.
(2). Spread the file system across multiple disks.
(3). Use operating system caching and enable the disk cache.
"Java Performance" note of the performance analysis basis 2