Directory
- I. Background knowledge
- II. Introduction to JVM memory
- III. Garbage collection
- IV. Some changes brought by Java 7 and 8
For a Java programmer, it is true that in most cases there is no need to think much about allocating and releasing memory, or about how the JVM works internally. In practice, however, programs often run into memory problems that are not easily detected, and when they do occur they are hard to locate and fix quickly. Understanding and mastering Java's memory management is therefore a necessary skill for a qualified Java programmer; only then can you write better programs and properly tune their performance.
I. Background knowledge
Based on information that can be found online and what the author has heard, the language choices of several well-known large internet companies at home and abroad can be summarized as follows:
- Google: C/C++, Go, Python, Java, JavaScript. It is worth mentioning that the Guava library Google contributed to the Java community is of very high quality and well worth learning and using.
- YouTube, Douban: Python
- Facebook, Yahoo, Flickr, Sina: PHP (with optimized PHP VMs)
- NetEase, Alibaba, Sohu: Java, PHP, Node.js
- Twitter: Ruby -> Java. The reason is that Ruby's runtime is very slow compared with the JVM, even though Ruby code is somewhat more concise than Java. More recently, Twitter has been moving toward Scala.
As you can see, although in recent years many people have claimed that Java is dead or dying, Java's share of language usage remains high. Compared with high-performance C++, Java has a GC mechanism and none of those daunting pointers, so the barrier to entry is relatively low; compared with scripting languages such as PHP and Ruby, which are even cheaper to get started with, Java has a clear performance advantage (temporarily ignoring Facebook's own HHVM).
Java ultimately runs as bytecode on a JVM. The common JVMs at present are:
- Sun HotSpot
- BEA JRockit
- IBM J9
- Dalvik (Android)
Among these, HotSpot is the most widely used. The latest Sun JDK version is now 8, but since the newer JDKs are not yet widely adopted, this article only covers the HotSpot virtual machine in JDK 6.
II. Introduction to JVM Memory
2.1 Java runtime memory areas
The runtime memory composition of Java is as follows:
Some of these parts are thread-private and others are thread-shared.
The thread-private areas are:
Program counter
Records the line number of the bytecode instruction being executed by the current thread.
Java Virtual Machine stack
The memory model for Java method execution: each method invocation creates a stack frame that stores the local variable table, operand stack, dynamic links, method exit information, and so on.
- Each thread has its own independent stack space
- The thread stack stores only primitive types and object references (addresses)
- A method's local variables live in the thread's stack space
Native method stack
Serves native methods. In the HotSpot virtual machine it is merged into the Java virtual machine stack.
The thread-shared areas are:
Java heap
Holds object instances; almost all object instances and their fields are allocated memory here.
Method area
Stores data such as class information loaded by the virtual machine, constants, static variables, and JIT-compiled code.
Runtime constant pool
Part of the method area; it holds the various literals and symbolic references generated at compile time.
Direct Memory
Off-heap memory allocated directly by NIO through native functions. This memory is referenced by DirectByteBuffer objects on the heap.
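As a minimal sketch (the class name and buffer size are my own, not from the original text), the following allocates a direct buffer; the backing memory lives off-heap and is capped by -XX:MaxDirectMemorySize rather than by -Xmx:

```java
import java.nio.ByteBuffer;

// Illustrative only: allocate off-heap memory via NIO.
public class DirectMemoryDemo {
    public static void main(String[] args) {
        // The 64 MB block lives outside the Java heap; only the small DirectByteBuffer
        // object that references it sits on the heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        System.out.println("direct buffer capacity = " + buf.capacity() + " bytes");
        // The native block is released only after the referencing buffer object
        // is itself garbage collected.
    }
}
```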
2.2 Object Access
Java is an object-oriented language, so how are objects accessed through references? There are generally two ways:
- Access through a handle
- Access through a direct pointer
The latter (direct pointers) is the approach used by the HotSpot virtual machine.
2.3 Memory Overflow
When requesting memory from the JVM, you may run into situations where not enough memory can be obtained, causing a memory overflow. This generally happens in the following cases:
- Virtual machine stack and local method stack Overflow
- StackOverflowError: the stack depth requested by a thread exceeds the maximum depth the virtual machine allows (e.g. runaway recursion)
- OutOfMemoryError: the virtual machine cannot obtain enough memory while extending a stack; this is usually caused by continuously creating threads
- Java heap overflow: OutOfMemoryError is thrown when a large number of objects are created and their lifetimes are very long
- Runtime constant pool overflow: OutOfMemoryError: PermGen space. A typical example is String.intern(); calling intern() on a large number of strings triggers this overflow
- Method area overflow: the method area holds metadata such as class information, so loading a large number of classes (for example via CGLIB) raises OutOfMemoryError: PermGen space; this is easy to hit with frameworks such as Hibernate. Minimal sketches of the stack and heap cases are shown after this list.
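Minimal sketches of the first two cases, with class and method names of my own choosing; run each separately (the heap case is easiest to see with a small heap such as -Xmx16m):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: two ways to provoke the overflows described above.
public class OverflowDemo {

    // 1) StackOverflowError: unbounded recursion exceeds the thread's allowed stack depth.
    static void recurse() {
        recurse();
    }

    // 2) OutOfMemoryError: Java heap space - keep strong references so nothing is collectable.
    static void exhaustHeap() {
        List<byte[]> hold = new ArrayList<byte[]>();
        while (true) {
            hold.add(new byte[1024 * 1024]); // 1 MB per iteration, never released
        }
    }

    public static void main(String[] args) {
        recurse();       // throws java.lang.StackOverflowError
        // exhaustHeap(); // throws java.lang.OutOfMemoryError: Java heap space
    }
}
```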
III. Garbage Collection
3.1 Theoretical basis
Generally speaking, we master Java memory management in order to deal with slow site/service responses, and the slowness usually comes down to one of the following:
- Memory: garbage collection consuming CPU, too much data held in memory, or memory leaks (Java has this problem too)
- Thread deadlock
- I/O speed is too slow
- Other service dependencies are too slow to respond
- Slow response due to complex business logic or algorithms
Among them, the effect of garbage collection on performance is generally as follows:
- Memory leaks
- Program paused
- Program throughput drops significantly
- Slow response time
Some basic concepts of garbage collection
- Concurrent collector: collection can run at the same time as the other worker threads
- Parallel collector: uses multiple CPUs to perform garbage collection
- Stop-the-world (STW): all other worker threads must be paused while collection runs
- Sticky reference count: for GCs that use a reference-counting algorithm, if an object's counter overflows, the object can no longer be marked as garbage. This is called the sticky-reference-count problem; it is usually mitigated by giving the counter more bits, at the cost of more space. In general, if the GC can reclaim objects quickly, the problem rarely appears.
- Mutator: in GC terminology, the part of the running program (the JVM application code) that updates object state, i.e. makes objects "mutate", for example into garbage.
- On-the-fly: describes a type of collector, e.g. an on-the-fly reference-counting garbage collector, which uses reference counts rather than marking to identify garbage.
- Generational GC: a more advanced approach than the traditional "mark-sweep" technique. It divides objects into different generations, typically young and old. This kind of GC exploits a common property of programs, namely that "younger objects die more easily": the longer an object has already survived, the better its chance of surviving further.
Throughput vs. Response time
When discussing garbage collection, it also helps to understand what throughput and response time mean:
- Throughput measures the amount of work done per unit of time, e.g. the number of requests a web server handles per minute.
- Response time is the time between submitting a request and receiving its response, e.g. the time taken to load a web page.
The relationship between throughput and response time can be complex: sometimes higher throughput comes at the cost of response time, sometimes better response time comes at the cost of throughput, and in other cases a single change improves both. Typically a shorter average response time allows greater system throughput, but the reverse does not always hold: pushing throughput higher (for example, without adding any hardware) can mean sacrificing some average response time in order to handle more requests over the same period.
For Java garbage collection, different collectors affect these two metrics to different degrees. For example, a parallel collector guarantees throughput while sacrificing response time to some extent, whereas a concurrent collector primarily guarantees request response time.
Process of GC
- Find the Living object in the heap
- Releasing resources occupied by dead objects
- Adjust the position of live objects regularly
GC algorithm
- Mark-sweep (mark and clear)
- Mark-sweep-compact (mark and compact)
- Copying collector (copying algorithm)
Mark
Scanning starts from the "GC roots" (which include thread stacks, static references, constants, and so on); objects reachable from the roots are marked "live", so that eventually all reachable objects are marked live while unreachable objects are "dead". Efficiency is linear in the number of live objects.
Sweep
Scan the heap, locate all "dead" objects, and clear them away. Efficiency is linear in the size of the heap.
Compact
After clearing, some memory fragmentation remains and needs to be compacted. This includes relocate (moving the surviving objects together to free up contiguous memory) and remap (updating all object references to point at the new addresses). Efficiency is linear in the number of live objects.
Copy
Memory is divided into a "from" and a "to" region; at collection time the surviving objects in the from region are copied wholesale into the to region. Efficiency is linear in the number of live objects.
Comparing copy with mark-sweep:
- Memory consumption: copying needs up to twice the memory of the maximum live set, while mark-sweep works in place and needs little extra memory.
- Efficiency: copying is linear in the size of the live set and efficient; mark-sweep is linear in the heap size and less efficient.
Generational collection
Generational collection is now the more advanced garbage collection scheme. Several related ideas:
- Generational hypothesis: most objects have very short lifespans ("born in the morning, dead by evening"), so collection focuses on young-generation objects, and the young generation usually occupies only a small part of the space.
- Long-lived objects are moved from the young generation into the old generation.
- The old generation is collected only when it is full.
- Collection efficiency is noticeably higher than with non-generational collection.
HotSpot's generational collection divides the heap into one Eden area, two Survivor areas, and the old (tenured) generation; Eden and the Survivor spaces together form the new generation (young space). Collection of the new generation is usually called a minor GC, while collection of the old generation is called a major GC; because a major GC (except with concurrent collectors) scans and collects the entire heap plus the permanent generation, it is also referred to as a full GC.
- The Eden area is where new objects are allocated.
- The Survivor areas hold the objects that survive a minor (young) GC.
- The tenured area stores long-lived objects (a small sizing sketch follows this list).
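To make the Eden/Survivor split concrete, here is one possible sizing (all flag values and the class name are illustrative, not taken from the original text); -XX:SurvivorRatio=8 divides the young generation into Eden:S0:S1 = 8:1:1, and the program simply reports what the runtime actually received:

```java
// Run, for example, with: java -Xms512m -Xmx512m -Xmn192m -XX:SurvivorRatio=8 HeapRegions
public class HeapRegions {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("max heap   = " + rt.maxMemory() / mb + " MB");   // bounded by -Xmx
        System.out.println("total heap = " + rt.totalMemory() / mb + " MB"); // currently committed
        System.out.println("free heap  = " + rt.freeMemory() / mb + " MB");
    }
}
```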
The typical garbage collection algorithm combinations in generational collections are described below:
- The young generation is usually collected with the copying algorithm, which stops the world
- The old generation is generally collected with mark-sweep-compact; this may stop the world, or run concurrently or partly concurrently
So when does a minor GC happen, and when a major GC? The general flow is as follows:
- Objects are first allocated in Eden space.
- When Eden is full and a new object needs to be created, a minor GC is triggered on the new generation (Eden + S0 or Eden + S1) because the allocation request cannot be satisfied.
- During the minor GC, objects in Eden that cannot be reclaimed are moved into the empty Survivor space (S0 or S1; Eden is always emptied), and unreclaimable objects from the other Survivor space are moved into it as well, so that one Survivor space is always kept empty.
- In the previous step, if the Survivor space fills up, objects are copied into the old space; even if it does not fill up, objects that are old enough are promoted into the old space.
- When the old space fills up, a full GC occurs.
The exact behaviour depends on which GC scheme the JVM is using; a minimal allocation sketch follows.
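A minimal sketch of this allocation-driven behaviour (class name and sizes are illustrative): run it with something like java -verbose:gc -Xmx64m -Xmn16m AllocationPressure, and the log shows repeated minor GCs as Eden fills up, while the long-lived array survives the collections and is eventually promoted:

```java
// Illustrative only: short-lived garbage fills Eden and triggers minor GCs,
// while one long-lived object stays referenced throughout.
public class AllocationPressure {
    public static void main(String[] args) {
        byte[] longLived = new byte[2 * 1024 * 1024]; // survives every minor GC
        for (int i = 0; i < 10000; i++) {
            byte[] shortLived = new byte[512 * 1024]; // becomes garbage each iteration
            shortLived[0] = (byte) i;
        }
        System.out.println("long-lived object size = " + longLived.length);
    }
}
```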
The new-generation GCs come in three types:
- Serial
- Parallel Scavenge
- ParNew
For all three, a GC is triggered when an allocation in Eden cannot be satisfied.
There are four GC types in the old generation:
For Serial Old and Parallel Old, the triggers are:
- Not enough space in the old generation
- Not enough space in the permanent generation
- The pessimistic strategy during a minor GC
- Allocation in Eden still fails after a minor GC
- When a heap dump is executed
- An external call to System.gc(), which can be disabled with -XX:+DisableExplicitGC. Note that disabling System.gc() can lead to OOM when using NIO (direct memory is normally reclaimed during full GC), so use this option with care. See: http://hllvm.group.iteye.com/group/topic/27945.
For CMS, the triggering mechanism is:
- When old generation usage reaches a certain ratio; the default is 92% in HotSpot 1.6. The current value can be viewed with PrintCMSInitiationStatistics (not available in 1.5) and forced with CMSInitiatingOccupancyFraction. The default is computed as ((100 - MinHeapFreeRatio) + CMSTriggerRatio * MinHeapFreeRatio / 100.0) / 100.0, where MinHeapFreeRatio defaults to 40 and CMSTriggerRatio defaults to 80 (the worked arithmetic after this list shows how this yields 92%).
- When the permanent generation is collected by CMS and its usage reaches a certain ratio. Collecting the permanent generation with CMS requires -XX:+CMSClassUnloadingEnabled. The default is again 92% in HotSpot 1.6 and can be forced with CMSInitiatingPermOccupancyFraction. It is computed with the same formula, ((100 - MinHeapFreeRatio) + CMSTriggerPermRatio * MinHeapFreeRatio / 100.0) / 100.0, where MinHeapFreeRatio defaults to 40 and CMSTriggerPermRatio defaults to 80.
- HotSpot decides whether a CMS GC is needed based on cost estimation; this dynamic policy can be switched off with -XX:+UseCMSInitiatingOccupancyOnly.
- An external call to System.gc() while ExplicitGCInvokesConcurrent or ExplicitGCInvokesConcurrentAndUnloadsClasses is set.
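As a worked check of the formula quoted above (a sketch using the defaults from the text: MinHeapFreeRatio = 40, CMSTriggerRatio = 80; the variable names simply mirror those flags):

```java
// Illustrative arithmetic only.
public class CmsThresholdCheck {
    public static void main(String[] args) {
        double minHeapFreeRatio = 40;
        double cmsTriggerRatio = 80;
        double threshold =
                ((100 - minHeapFreeRatio) + (cmsTriggerRatio * minHeapFreeRatio) / 100.0) / 100.0;
        System.out.println(threshold); // 0.92 -> the 92% default occupancy trigger
    }
}
```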
3.2 HotSpot garbage collectors
These are the garbage collectors available in the HotSpot virtual machine.
Serial collector
- The -XX:+UseSerialGC parameter enables this collector
- Default new-generation collector in client mode
- Relatively long stop-the-world pauses
- Simple and efficient
The workflow of this collector is illustrated by the before/after-collection heap diagrams (figures omitted).
ParNew Collector
- -XX:+UseParNewGC enables it
- Turned on by default when -XX:+UseConcMarkSweepGC is enabled
- A multi-threaded version of the Serial collector
- The default number of GC threads equals the number of CPUs
- -XX:ParallelGCThreads specifies the number of threads
Comparison with the Serial collector (figure omitted).
Parallel Scavenge Collector
- A parallel new-generation collector
- Uses the copying algorithm
- Its main goal is a controllable throughput ("throughput first")
- -XX:MaxGCPauseMillis and -XX:GCTimeRatio allow precise control over throughput
- -XX:+UseAdaptiveSizePolicy enables the GC's adaptive sizing policy
- Default new-generation collector in server mode
Serial Old Collector
- Old-generation version of the Serial collector
- Default old-generation collector in client mode
- Backup plan for the CMS collector, used when a Concurrent Mode Failure occurs
- -XX:+UseSerialGC enables this collector
Parallel Old Collector
- -XX:+UseParallelGC -XX:+UseParallelOldGC enable this collector
- Default old-generation collector in server mode
- Old-generation version of Parallel Scavenge: multi-threaded, using a "mark-compact" algorithm
- Suited to situations that focus on throughput or are CPU-resource sensitive
- Parallel Scavenge + Parallel Old is the usual combination for maximum guaranteed throughput
CMS collector
A concurrent low-pause collector.
- -XX:+UseConcMarkSweepGC enables the CMS collector (by default ParNew is used as the young-generation collector, and Serial Old as the fallback when CMS collection fails)
- A collector whose goal is the shortest pause time; it emphasizes response speed and minimal pauses, which suits internet-facing applications
Four steps:
- Initial mark (stop the world): only marks objects directly reachable from GC roots; fast
- Concurrent mark: GC roots tracing, running concurrently with user threads
- Remark (stop the world): fixes up the mark records for objects whose references changed while the program kept running during concurrent marking
- Concurrent sweep
Comparison with the Serial Old collector (figure omitted).
CMS has the following disadvantages:
- CMS is the only common collector that does not compact: when it frees the memory occupied by garbage objects, it does not move the live objects toward one end of the old generation.
- Very sensitive to CPU resources: it does not pause application threads, but it slows the program down and reduces total throughput. The more CPU cores there are, the less noticeable this is.
- Cannot handle floating garbage. A "Concurrent Mode Failure" may occur, triggering another full GC; -XX:CMSInitiatingOccupancyFraction can be adjusted to control how much memory usage triggers the GC.
- Produces a large amount of space fragmentation. This can be mitigated with -XX:+UseCMSCompactAtFullCollection (whether to compact during a full GC) and -XX:CMSFullGCsBeforeCompaction (how many uncompacted full GCs run before a compaction). An illustrative flag combination is sketched after this list.
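Putting the CMS-related flags from this section together, one illustrative combination might look like the following (the occupancy fraction of 75 and the jar name are assumptions of mine, not values from the original text):

```
java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
     -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly \
     -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=5 \
     -jar app.jar
```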
G1 Collector
The G1 collector was still experimental in Java 6 and was formally introduced in Java 7, but it has not yet seen wide use in production environments. Its features:
- Based overall on a mark-compact style algorithm
- Does not produce fragmentation
- Predictable pause times
- Region-based: divides the whole Java heap into multiple independent regions of equal size
- -XX:+UseG1GC enables this garbage collector
- -XX:MaxGCPauseMillis=200 sets a target maximum GC pause time; the JVM cannot guarantee it will be met, it only tries
3.3 Tuning Experience
- Turn on GC logging and learn to read the GC log: -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:$CATALINA_BASE/logs/gc.log
- The ideal state is to have only young GCs, which means avoiding objects with long life cycles
- Starting from the young GC, try to give the young generation more memory and avoid full GCs
- Pay attention to Survivor sizing
- Watch out for the memory wall: around 4 GB to 5 GB
Introduction to GC Logs
1403682.561: [GC [PSYoungGen: 1375104K->11376K(1386176K)] 4145665K->2782002K(4182400K), 0.0174410 secs] [Times: user=0.27 sys=0.00, real=0.02 secs]
- 1403682.561: when the GC occurred, as the number of seconds the JVM has been running; it can also be formatted as an absolute timestamp with -XX:+PrintGCDateStamps
- PSYoungGen: the type of GC that occurred; here it indicates a young-generation GC
- 1375104K: young-generation usage before collection
- 11376K: young-generation usage after collection
- 1386176K: total size of the young generation
- 4145665K: total heap usage before collection
- 2782002K: total heap usage after collection
- 4182400K: total heap size
- 0.0174410 secs: the garbage collection pause time
- 0.27 and 0.00: CPU time spent in user mode and kernel (sys) mode
- 0.02 secs: the actual elapsed (wall-clock) time of the GC
Note: the actual GC time above (real) is less than the sum of the user and sys times. The real value is wall-clock time, which includes waiting and I/O blocking, while user and sys are CPU time summed across all GC threads; since the collector is multi-threaded and the machine has multiple CPUs, that sum is usually greater than the real time. With a serial collector the two values would be roughly the same.
Advice on choosing the old-generation collector
- Parallel GC (-XX:+UseParallel[Old]GC)
- The Parallel GC has the fastest minor GC times; CMS's young GCs are slower than Parallel's because of memory fragmentation
- Maximum throughput can be guaranteed
- Switch to CMS or G1 (for old-generation collection) only when it is really necessary
Development recommendations
- Allocating small objects is very cheap, usually around 10 CPU instructions, and collecting dead young objects is also very cheap; do not worry about short-lived small objects
- Allocating and initializing large objects is expensive; large objects of varying sizes cause Java heap fragmentation, especially with CMS, ParallelGC, or G1; try to avoid allocating large objects
- Avoid resizing data structures, e.g. avoid growing arrays or array-backed collections/containers; it is best to set the expected size explicitly when the object is constructed (initialized). Resizing causes unnecessary object allocation and can lead to Java heap fragmentation (see the sketch after this list)
- Object pooling may be a potential problem
- It increases the number of live objects and may lengthen GC time
- Accessing a pool from multiple threads requires locks, which can cause scalability problems
- Beware of accessing the object pool too frequently
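A minimal sketch of the pre-sizing advice above (the class name and the count of 10,000 elements are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative comparison: growing a list on demand versus pre-sizing it.
public class PreSizing {
    public static void main(String[] args) {
        int expected = 10000;

        // Grown on demand: the backing array is reallocated and copied several times
        // as the list grows, producing extra short-lived garbage.
        List<Integer> grown = new ArrayList<Integer>();
        for (int i = 0; i < expected; i++) {
            grown.add(i);
        }

        // Pre-sized: a single backing array of the right capacity is allocated once.
        List<Integer> preSized = new ArrayList<Integer>(expected);
        for (int i = 0; i < expected; i++) {
            preSized.add(i);
        }

        System.out.println(grown.size() + " / " + preSized.size());
    }
}
```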
GC's Ponzi scheme
Although GC behaves normally in most cases, sometimes the JVM gets into a situation that deceives you: it keeps collecting garbage, yet the heap is still nearly full after every collection. The program's memory is clearly exhausted and it can no longer work properly, but the JVM does not throw an OutOfMemoryError (OOM) to tell the programmer what is happening inside; it just keeps doing the dirty work of collecting garbage until the server's resources are used up.
A typical cause of this behaviour is inappropriate settings of the GCTimeLimit and GCHeapFreeLimit parameters. GCTimeLimit defaults to 98, meaning that if 98% or more of the time is spent in GC, an OutOfMemoryError is thrown. GCHeapFreeLimit is the minimum fraction of the heap that must be free after a collection; it defaults to 2, meaning the GC counts as successful as long as at least 2% of the heap is free. If GCTimeLimit is set too high or GCHeapFreeLimit too low, the GC Ponzi scheme results: the JVM keeps collecting garbage endlessly.
IV. Some changes brought by Java 7 and 8
- A big memory change in Java 7 is that the string constant pool moved from the permanent generation into the heap. When String.intern() is called and an equal string already exists in the heap, a reference to that object is recorded directly instead of creating a new copy (see the sketch after this list).
- Java 7 formally introduced the G1 garbage collector as a replacement for CMS.
- In Java 8 the method area (permanent generation) is removed and replaced by the Metaspace, whose size is limited only by available native (system) memory.
- A nice optimization introduced in Java 8 update 20 is string deduplication in the G1 collector. Strings (including their internal char[] arrays) often occupy a large share of the heap, and duplicated strings make heap usage very inefficient; this optimization lets G1 identify duplicated strings and point them at the same internal char[] array, avoiding multiple copies of the same string. It can be enabled with -XX:+UseStringDeduplication.
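A small sketch of the intern() change mentioned above (the class name is mine; the per-JDK output is noted in comments and worth verifying on the JVM you actually run):

```java
// Illustrative only: the concatenated string "hellogc" is not a compile-time literal here.
public class InternDemo {
    public static void main(String[] args) {
        String s = new StringBuilder("hello").append("gc").toString();
        // JDK 6: intern() copies the string into the PermGen pool and returns that copy -> false
        // JDK 7+: the pool lives in the heap and can simply record a reference to the existing
        //         heap object, so the same reference comes back -> true
        System.out.println(s.intern() == s);
    }
}
```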