0. IntroductionGarbage collector long Pause, performance on the Web page may be the page response code 500 and other server error problems, if it is a payment process may lead to payment failure, will cause the company's direct economic loss, the programmer should try to avoid or reduce the occurrence of such cases.
I. Concurrency mode failure (concurrent mode failure)
concurrency mode failure log:
2016-02-21t13:53:07.974+0800:171467.254: [GC [1 cms-initial-mark:1436199k (1560576K)] 1512927K (2504320K), 0.0618140 secs] [times:user=0.06 sys=0.00, real=0.07 secs]2016-02-21t13:53:08.036+0800:171467.316: [CMS-concurrent-mark-start ]2016-02-21t13:53:08.377+0800:171467.657: [cms-concurrent-mark:0.340/0.340 secs] [times:user=2.11 sys=0.15, real= 0.34 secs]2016-02-21t13:53:08.377+0800:171467.657: [cms-concurrent-preclean-start]2016-02-21t13:53:08.385+0800: 171467.665: [cms-concurrent-preclean:0.008/0.008 secs] [times:user=0.01 sys=0.00, real=0.01 secs]2016-02-21T13 : 53:08.385+0800:171467.665: [Cms-concurrent-abortable-preclean-start]{heap before GC invocations=88667 (full): par New Generation total 943744K, used 914399K [0x0000000757000000, 0x0000000797000000, 0x0000000797000000) Eden Space 8389 12K, 100% used [0x0000000757000000, 0x000000078a340000, 0x000000078a340000) from space 104832K, 72% used [0x000000078a34 0000, 0x000000078ecf7d98, 0x00000007909a0000) to space 104832K, 0% used [0x00000007909a0000, 0x00000007909a0000, 0x0000000797000000) concurrent mark-sweep generation Total 1560576 K, used 1436199K [0x0000000797000000, 0x00000007f6400000, 0x00000007f6400000) concurrent-mark-sweep Perm Gen Total 159744K, used 78413K [0x00000007f6400000, 0x0000000800000000, 0x0000000800000000) 2016-02-21t13:53:09.535+0800: 171468.815: [gc2016-02-21t13:53:09.535+0800:171468.815: [parnewdesired Survivor Size 53673984 bytes, new threshold 6 (MA X 6)-age 1:7100568 Bytes, 7100568 total-age 2:4676456 bytes, 11777024 total-age 3:8773736 bytes, 20550760 total-age 4:7709744 bytes, 28260504 total-age 5:10891960 bytes, 39152464 total-age 6:11 735032 bytes, 50887496 total:914399k->75616k (943744K), 0.0414860 secs] 2350599k->1517225k (2504320K), 0.0417710 s ECS] [times:user=0.32 sys=0.05, real=0.05 secs]heap after GC invocations=88668 (full): par new Generation total 9437 44K, used 75616K [0x0000000757000000, 0x0000000797000000, 0x0000000797000000) Eden Space 838912K, 0% used [0x0000000757000000, 0x0000000757000000, 0x000000078a3400 XX) from space 104832K, 72% used [0x00000007909a0000, 0x0000000795378128, 0x0000000797000000] to space 104832K, 0% Used [0x000000078a340000, 0x000000078a340000, 0x00000007909a0000) concurrent mark-sweep generation Total 1560576K, used 1441609K [0x0000000797000000, 0x00000007f6400000, 0x00000007f6400000) concurrent-mark-sweep Perm Gen Total 159744K, Used 78413K [0x00000007f6400000, 0x0000000800000000, 0x0000000800000000)}2016-02-21t13:53:10.202+0800:171469.482: [ cms-concurrent-abortable-preclean:1.772/1.817 secs] [times:user=4.94 sys=0.06, real=1.81 secs]2016-02-21T13 : 53:10.204+0800:171469.483: [Gc[yg occupancy:497152 K (943744 k)]2016-02-21t13:53:10.204+0800:171469.483: [Rescan ( parallel), 1.3691900 secs]2016-02-21t13:53:11.573+0800:171470.853: [Weak refs processing, 0.1009300 secs] 2016-02-21t13:53:11.674+0800:171470.954: [Class unloading, 0.0153470secs]2016-02-21t13:53:11.689+0800:171470.969: [Scrub symbol table, 0.0110770 secs]2016-02-21t13:53:11.700+0800: 171470.980: [Scrub string table, 0.0016360 secs] [1 cms-remark:1441609k (1560576K)] 1938761K (2504320K), 1.5079530 secs] [T imes:user=13.01 sys=0.08, real=1.51 secs]2016-02-21t13:53:11.712+0800:171470.992: [CMS-concurrent-sweep-start]{ Heap before GC invocations=88668: par new generation total 943744K, used 914528K [0x0000000757000000, 0x000000 0797000000, 0x0000000797000000) Eden Space 838912K, 100% used [0x0000000757000000, 0x000000078a340000, 0x000000078a340000) from space 104832K, 72% used [0x00000007909a0000, 0x0000000795378128, 0x0000000797000000] to SPAC E 104832K, 0% used [0x000000078a340000, 0x000000078a340000, 0x00000007909a0000) concurrent mark-sweep generation Total 1 560576K, used 1441606K [0x0000000797000000, 0x00000007f6400000, 0x00000007f6400000) Concurrent-mark-sweep Perm Gen Total 159744K, used 78413K [0x00000007f6400000, 0x0000000800000000, 0x0000000800000000) 2016-02-21t13:53:11.775+0800:171471.055: [gc2016-02-21t13:53:11.775+0800:171471.055: [ParNew ( Promotion failed) desired survivor size 53673984 bytes, new threshold 6 (max 6)-age 1:4030872 Bytes, 4030872 tota L-age 2:5959704 bytes, 9990576 total-age 3:4628680 bytes, 14619256 total-age 4:8773080 bytes, 23392336 total-age 5:7707144 bytes, 31099480 total-age 6:10890224 bytes, 41989704 total:914528k->9073 44K (943744K), 1.0312010 secs]2016-02-21t13:53:12.807+0800:171472.086: [cms2016-02-21t13:53:14.455+0800:171473.735 : [cms-concurrent-sweep:1.684/2.743 secs] [times:user=3.69 sys=0.36, real=2.74 secs] (Concurrent mode failure): 1451903K ->475795k (1560576K), 3.9644230 secs] 2356134k->475795k (2504320K), [CMS perm:78413k->78413k (159744K)], 4.9959570 secs] [times:user=5.65 sys=0.34, real=5.00 secs]heap after GC invocations=88669 (full): par new generation Total 943744K, used 0K [0x0000000757000000, 0x0000000797000000, 0x0000000797000000) Eden Space 838912K, 0% used [0x0000000757000000, 0x0000000757000000, 0x000000078 a340000) from space 104832K, 0% used [0x000000078a340000, 0x000000078a340000, 0x00000007909a0000] to space 104832K, 0% used [0x00000007909a0000, 0x00000007909a0000, 0x0000000797000000) concurrent mark-sweep generation Total 1560576K, US Ed 475795K [0x0000000797000000, 0x00000007f6400000, 0x00000007f6400000) concurrent-mark-sweep Perm Gen Total 159744K, Used 78413K [0x00000007f6400000, 0x0000000800000000, 0x0000000800000000)}
Two reasons:
- In the process of CMS startup , the new generation speed is too fast, the old age collection speed cannot catch up with the new generation speed
- In the process of CMS startup, the old age fragmentation is serious, unable to accommodate the new generation of large objects lifted up
In this case, the application thread will stop all (equivalent to the time the site is unable to respond to user requests) for compressed garbage collection (fallback to the Serial old algorithm)
Workaround:
- The new generation of rapid Ascension Problem: (1) If the frequency is too fast, indicating that the space is insufficient, first of all, you can try to adjust the new generation of space and promotion threshold. (2) If memory is limited, you can set the CMS garbage collection to start to reduce the frequency of the problem when the age-to-age ratio is reached (the sooner the startup problem occurs, the lower the throughput, the more specific adjustments to find the balance), the parameters are as follows: If there is no second parameter, the CMS is dynamically adjusted with the JVM. Start time
-xx:cmsinitiatingoccupancyfraction=68 (default is-xx:+usecmsinitiatingoccupancyonly)
- The old age of debris serious problems: (1) If the frequency is too fast or the full GC after the space release is not enough, the space is insufficient, first of all, you can try to adjust the old age space (2) If the memory is not enough, you can set the N-th CMS after a compressed full GC, the parameters are as follows:
-xx:+usecmscompactatfullcollection: Enable compressed gc-xx:cmsfullgcbeforecompaction=n when full GC is performed n-times, after a CMS, a compressed full GC To reduce the fragmentation generated by the CMS
two. Upgrade failure (promotion failed)in the Minor GC process,Survivor Unusedmay not be sufficient to accommodate the surviving objects in Eden and another Survivor, then the excess will be moved to the old age, calledPremature Ascension(Premature Promotion). This can lead to the growth of short-lived objects in the old age and may cause serious performance problems. Further, if the old age is full, the Minor GC will be followed by the full GC, which will result in traversing the entire heap, calledelevation Failure(Promotion Failure).
Elevation Failure log:
2016-01-07t18:54:26.948+0800:18782.967: [gc2016-04-07t18:54:26.948+0800:18782.967: [ParNew (Promotion failed) Desired survivor size 117833728 bytes, new threshold (max)-age 1:6141680 Bytes, 6141680 total-age 2: 6337936 bytes, 12479616 total-age 3:549120 bytes, 13028736 total-age 4:87768 bytes, 13116504 total -Age 5:221384 Bytes, 13337888 total-age 6:934168 bytes, 14272056 total-age 7:146072 bytes, 1 4418128 total-age 8:626064 bytes, 15044192 total-age 9:398000 bytes, 15442192 total-age 10:4,296 Bytes, 15871808 total:1969227k->1929200k (2071808K), 0.7452140 secs]2016-01-07t18:54:27.693+0800:18783.713: [ cms:1394703k->632845k (2097152K), 4.0993640 secs] 3301676k->632845k (4168960K), [CMS perm:77485k->77473k ( 159744K)], 4.8450240 secs] [times:user=5.18 sys=0.56, real=4.84 secs]heap after GC invocations=5847 (full 7): par new Gen Eration total 2071808K, used 0K [0x00000006e9c00000, 0x0000000776400000, 0x0000000776400000) Eden Space 1841664K, 0% used [0x00000006e9c00000, 0x00000006e9c00000, 0x000000075a280000) from space 230144K, 0% used [0x0000000768340000, 0x0000000768340000, 0x0000000776400000] to SPAC E 230144K, 0% used [0x000000075a280000, 0x000000075a280000, 0x0000000768340000) concurrent Mark-sweep Generation Total 2 097152K, used 632845K [0x0000000776400000, 0x00000007f6400000, 0x00000007f6400000) concurrent-mark-sweep Perm Gen Total 159744K, used 77473K [0x00000007f6400000, 0x0000000800000000, 0x0000000800000000)}
Promotion failure Reason: Minor GC when found Survivor space, and the old age of idle is not enough
- The new generation is lifting too fast
- There are too many fragments in the old age, not enough to lift the large object (the old age still has a lot of space but, there is promotion failed)
Workaround: Two and above concurrent mode failure as another, because Survivor Unused insufficient, then you can try to adjust the Survivor to try
three. Other system activity impacts during GCThere are times when system activities such as memory swap out (VMSTAT), network activity (netstat), I/O (IOSTAT) occur during GC, which can cause GC time to grow longer.
If you have a swap area on your server (as you can see with top, vmstat, and so on) for swapping in memory, the operating system may swap inactive memory pages in the JVM to swap areas to free up memory for thread use (which also reveals that memory is not starting to work). In-memory swap-out is a costly disk operation that is several orders of magnitude slower than memory access.
Watch a GC log: 29.47 seconds
{heap before GC invocations=132 (full 0):p ar new Generation total 2696384K, used 2696384K [0xfffffffc20010000, 0XFFFFFFFC e0010000, 0xfffffffce0010000) Eden Space 2247040K, 100% used [0xfffffffc20010000, 0xfffffffca9270000, 0xfffffffca9270000) from space 449344K, 100% used [0xfffffffca9270000, 0xfffffffcc4940000, 0xfffffffcc4940000] to space 449344K, 0% used [0xfffffffcc4940000, 0xfffffffcc4940000, 0xfffffffce0010000) concurrent mark-sweep generation Total 9437184K, used 1860619K [0xfffffffce0010000, 0xffffffff20010000, 0xffffffff20010000) Concurrent-mark-sweep Perm Gen Total 1310720K, used 511451K [0xffffffff20010000, 0xffffffff70010000, 0xffffffff70010000) 2013-07-17t03 : 58:06.601-0700:51522.120: [GC before GC:: 2696384k->449344k (2696384K), 29.4779282 secs] 4557003k->2326821k ( 12133568K), 29.4795222 secs] [times:user=915.56 sys=6.35, real=29.48 secs]
Then look at the vmstat command in the SI, so column values, if the value of a large change in the swap out serious, this is the performance of low memory.
WORKAROUND: Reduce the number of threads, which will reduce memory swap out, increase memory, and if the JVM memory settings are too large to cause the thread to run out of memory, adjust the-xmx and-xms appropriately.
Four. Bugs in the JVMThis is based on the JDK version you use to check, if it is a JVM bug, upgrade to a resolved version will generally be resolved.
Five. Summary
Troubleshooting and solving long-time pauses need certain information and methodology first:
- Verbose GC Log
- Monitor the system with commands such as Iostat, Vmstat, netstat, mpstat, etc. under the Linux platform
- Using Gchisto this GC graphical user interface tool, you can count the Minor GC and full GC frequency distribution in a timely manner, reference to: http://blog.csdn.net/wenniuwuren/article/details/50760259
- Check the GC log for the typical memory anomalies described above (promotion failed, concurrent mode failure), in general, the above two typical memory anomalies are controlled at an acceptable occurrence frequency, the CMS The problem of fragmentation is not likely to be eliminated, but only by G1.
- is not caused by a bug in the JVM itself
- If the program is not a problem, the parameters are not resolved a few times, it may indicate that the flow is too large, need to add the machine to spread the pressure on more JVMs
six. ReferencesThe authoritative guide to Java performance optimization (https://blogs.oracle.com/poonam/entry/troubleshooting_long_gc_pauses)
———————————— If the article helps you, click on "Top" to let the article help more people ——————————————
Long-time pause problem of JVM tuning--GC and its solving method