Introduction
On some servers with 8 GB of physical memory that mainly run a single Java service, memory is allocated as follows: the JVM heap is set to 6 GB, a monitoring process consumes about 600 MB, and Linux itself uses about 800 MB. On paper, physical memory should be sufficient, yet in practice these servers use a large amount of swap (a sign that physical memory is running short), as shown in the figure. Because swapping and GC occurring at the same time can make the JVM stall severely, we have to ask: where did the memory go?
To answer this question, we first need to understand the memory relationship between the JVM and the operating system. The following sections analyze the memory relationship between Linux and the JVM.
I. Linux and process memory models
The JVM runs on Linux as a process, so understanding the memory relationship between Linux and a process is the basis for understanding the memory relationship between the JVM and Linux.
The figure shows how memory is related at three levels: hardware, operating system, and process.
From a hardware perspective, the memory space of a Linux system consists of two parts: physical memory and swap (on disk). Physical memory is the main memory area that Linux uses while running. When physical memory runs short, Linux moves some temporarily unused memory data to swap on disk to free up memory; when data in swap is needed again, it must first be swapped back into memory.
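On a Linux host, the amount of swap in use can be read from /proc/meminfo, which reports SwapTotal and SwapFree in kB. The following is a minimal sketch of that calculation; the class and method names (SwapUsage, fieldKb, swapUsedKb) are illustrative and not part of the original analysis.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class SwapUsage {
    /** Returns the value in kB of a /proc/meminfo field such as "SwapTotal", or -1 if missing. */
    static long fieldKb(List<String> meminfoLines, String field) {
        for (String line : meminfoLines) {
            if (line.startsWith(field + ":")) {
                // Lines look like "SwapTotal:       2097148 kB".
                String[] parts = line.substring(field.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    /** Swap currently in use, in kB: SwapTotal minus SwapFree. */
    static long swapUsedKb(List<String> meminfoLines) {
        return fieldKb(meminfoLines, "SwapTotal") - fieldKb(meminfoLines, "SwapFree");
    }

    public static void main(String[] args) throws IOException {
        // Only works on Linux, where /proc/meminfo exists.
        List<String> lines = Files.readAllLines(Paths.get("/proc/meminfo"));
        System.out.println("swap used (kB): " + swapUsedKb(lines));
    }
}
```

The parsing is kept separate from the file read so that it can be exercised with sample lines on any platform.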
From the operating system's perspective, apart from the BIN area used to boot the system, the entire memory space is divided into two parts: kernel memory (kernel space) and user memory (user space).
Kernel memory is the memory used by Linux itself, mainly for system logic such as process scheduling, memory allocation, and access to hardware resources. User memory is the main space made available to processes. Linux gives every process the same virtual memory space, which keeps processes independent and prevents them from interfering with one another. It does this with virtual memory: each process is given a certain amount of virtual address space, and physical memory is allocated only when that virtual memory is actually used. As shown in the figure, on 32-bit Linux systems the 0~3G range of the virtual address space is generally assigned to user space and the 3~4G range to kernel space; 64-bit systems are divided in a similar way.
From a process perspective, the user memory (virtual memory space) that a process can access directly is divided into five parts: code area, data area, heap area, stack area, and unused area. The code area holds the application's machine code; it cannot be modified at run time, so it is read-only and fixed in size. The data area holds the application's global data, static data, and some constant strings, and its size is also fixed. The heap is the space the program allocates dynamically at run time, that is, memory requested and freed directly while the program is running. The stack area stores data such as function arguments, temporary variables, and return addresses. The unused area is reserved for allocating new memory.
II. Process and JVM memory models
The JVM is essentially a process, so its memory model has the general characteristics of a process. It is not an ordinary process, however, and its memory model has many new features, for two main reasons: 1. the JVM has moved many things that would otherwise be managed by the operating system into the JVM itself, in order to reduce the number of system calls; 2. Java NIO, which aims to reduce the system-call overhead of reading and writing I/O. The figure compares the memory model of the JVM process with that of an ordinary process.
Note that this model is not an exact model of the JVM's memory usage; it takes the operating system's perspective and omits some internal details of the JVM (although these are also important). The following explains the memory characteristics of the JVM process from two angles: user memory and kernel memory.
1. User memory
It should be emphasized that the code area and data area in the JVM process model belong to the JVM itself, not to the Java program. What serves as the stack area in an ordinary process is generally used only for thread stacks in the JVM. The heap area is where the JVM differs most from an ordinary process, as detailed below.
First is the permanent generation, which is essentially the code area and data area of the Java program. The classes of a Java program are loaded into various data structures in this region, including the constant pool, fields, method data, method bodies, constructors, and the special methods of classes, instance initialization, interface initialization, and so on. To the operating system this region is part of the heap; to the Java program it is the space that holds the program itself and its static resources, which is what lets the JVM interpret and execute the Java program.
Next are the young generation and the old generation. These are the real heap space used by Java programs, mainly for storing objects, but they are managed in a fundamentally different way from the heap of an ordinary process.
When an ordinary process allocates space for an object at run time, for example when a C++ program executes new, it can trigger a system call to allocate memory; the call returns once the operating system has allocated space according to the object's size. When the program releases the object, for example when a C++ program executes delete, a system call may likewise be triggered to tell the operating system that the space occupied by the object can be reclaimed.
The JVM uses memory differently from an ordinary process. The JVM requests a whole memory region from the operating system (its size can be set with JVM parameters) to serve as the heap for the Java program (divided into the young generation and the old generation). When the Java program requests memory, for example with a new operation, the JVM allocates the required amount from within that region. The Java program is not responsible for telling the JVM when an object's space can be freed; the memory of garbage objects is reclaimed by the JVM.
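The heap region described above can be observed from inside a running Java program. The following is a minimal sketch using the standard Runtime API; the class name HeapSnapshot and its helper methods are illustrative only.

```java
class HeapSnapshot {
    /** Maximum amount of memory the JVM will attempt to use for the heap (for example set with -Xmx), in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap memory the JVM currently holds for present and future objects, in bytes. */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Approximate part of the held heap that is currently occupied, in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):       " + maxHeapBytes() / mb);
        System.out.println("committed heap (MB): " + committedHeapBytes() / mb);
        System.out.println("used heap (MB):      " + usedHeapBytes() / mb);
    }
}
```

Allocating objects with new changes only the used figure; the committed figure changes only when the JVM resizes the heap, which is the point at which the operating system becomes involved.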
The advantages of the JVM's approach to memory management are clear. First, it reduces the number of system calls: the JVM does not need the operating system to intervene when it allocates memory to the Java program, and only has to request memory from, or return it to, the operating system when the Java heap size changes, whereas an ordinary program may need a system call for its allocations. Second, it reduces memory leaks: failing to release memory (or failing to release it in time) is one of the major causes of memory leaks in ordinary programs, and unified management by the JVM avoids leaks caused by programmers.
Last is the unused area, which is reserved for allocating new memory. For an ordinary process, this area is used when heap and stack space is requested and released; every heap allocation uses it, so its size changes frequently. For the JVM process, this area is used when the heap size is adjusted and when thread stacks are allocated; since the heap size is rarely adjusted, its size is relatively stable. The operating system adjusts the size of this area dynamically, and it is usually not backed by physical memory; it only allows the process to request heap or stack space from it.
2. Kernel memory
Applications typically do not deal with kernel memory directly; it is managed and used by the operating system. However, as Linux has paid more attention to performance, new features have allowed applications to use kernel memory or to map memory into kernel space. Java NIO was born in this context: it takes full advantage of these Linux features to improve the I/O performance of Java programs.
The figure shows how the memory used by Java NIO is distributed within Linux kernel memory. The NIO buffer area mainly includes the ByteBuffers used by the various NIO channels and the buffers that a Java program explicitly allocates with ByteBuffer.allocateDirect. In the page cache, the memory used by NIO mainly includes the mapped regions of files opened with FileChannel.map and the cache needed by FileChannel.transferTo and FileChannel.transferFrom (labeled "NIO file" in the figure).
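The two kinds of allocation mentioned here can be sketched as follows. This is only an illustration under assumed names (NioAllocations, directBuffer, mapReadOnly); it uses the standard ByteBuffer.allocateDirect and FileChannel.map calls.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class NioAllocations {
    /** Allocates a buffer outside the Java heap; it counts toward the NIO buffer area. */
    static ByteBuffer directBuffer(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    /** Maps a whole file read-only; the mapped pages are served from the page cache. */
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer direct = directBuffer(1024 * 1024);
        System.out.println("direct buffer: " + direct.isDirect() + ", capacity " + direct.capacity());
        if (args.length > 0) {
            MappedByteBuffer mapped = mapReadOnly(Paths.get(args[0]));
            System.out.println("mapped bytes: " + mapped.capacity());
        }
    }
}
```

Neither allocation is taken from the Java heap, so neither is bounded by the heap size setting.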
The memory used by NIO buffers and mapped files can be monitored through JMX, as shown in the figure. However, FileChannel is implemented on top of the native page cache through system calls; this is transparent to Java, so the size of that portion of memory cannot be monitored from within the JVM.
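The JMX figures mentioned above are exposed through BufferPoolMXBean. The following is a minimal sketch that reads them in-process; it assumes the pool names "direct" and "mapped" as reported by HotSpot, and the class name NioBufferPools is illustrative.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

class NioBufferPools {
    /** Bytes used by the named buffer pool, or -1 if no pool with that name exists. */
    static long memoryUsed(String poolName) {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hold a direct buffer so the "direct" pool is visibly non-empty.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.println("direct pool (bytes): " + memoryUsed("direct"));
        System.out.println("mapped pool (bytes): " + memoryUsed("mapped"));
        System.out.println("buffer capacity:     " + buffer.capacity());
    }
}
```

The page cache consumed by FileChannel.transferTo and FileChannel.transferFrom does not appear in either pool, which matches the limitation described above.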
Linux and Java NIO open up space in kernel memory for programs to use, mainly to reduce unnecessary copying and so cut the overhead of I/O system calls. For example, the figure compares the data flow when sending data from a disk file to a network card using the ordinary method and using NIO.
Copying data between kernel memory and user memory is costly in both resources and time. As the comparison shows, NIO eliminates two copies of the data between kernel memory and user memory. This is one of the key mechanisms behind the high performance of Java NIO (the other being asynchronous, non-blocking operation).
As can be seen from the above, kernel memory is also very important to the performance of Java programs, so when dividing up system memory, be sure to leave some free space for the kernel.
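The NIO path in this comparison corresponds to FileChannel.transferTo. The following is a minimal sketch, assuming a blocking target channel; the class name ZeroCopySend and the method send are illustrative. In the network case the target would be a SocketChannel; a FileChannel is used in main only to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class ZeroCopySend {
    /** Sends the whole file to the target channel with transferTo; returns the bytes sent. */
    static long send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so repeat until done.
            // Assumes a blocking target and a file that is not truncated meanwhile.
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ZeroCopySend <source-file> <destination-file>
        try (FileChannel out = FileChannel.open(Paths.get(args[1]),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            System.out.println("bytes sent: " + send(Paths.get(args[0]), out));
        }
    }
}
```

The data never passes through a byte array in the Java program, which is where the saved copies between kernel memory and user memory come from.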
III. Case studies
1. Memory allocation issues
Based on the analysis above, and omitting the smaller areas, the memory occupied by the JVM can be summarized as:
JVM memory ≈ Java permanent generation + Java heap (young generation and old generation) + thread stacks + Java NIO
Returning to the question raised at the beginning of the article, the original memory allocation was 6 GB (Java heap) + 600 MB (monitoring) + 800 MB (system), which leaves about 600 MB unallocated.
Now let us analyze how this 600 MB is actually used:
(1) Linux reserves about 200 MB, which it needs in order to run normally.
(2) The Java service has 160 threads and the JVM's default thread stack size is 1 MB, so thread stacks use 160 MB.
(3) Java NIO buffers, which according to JMX use up to 200 MB.
(4) The Java service reads and writes files heavily through NIO and therefore needs the page cache, whose size, as analyzed above, cannot be estimated quantitatively.
The first three items alone add up to 560 MB, and the page cache comes on top of that, so we can conclude that Linux physical memory is insufficient.
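The arithmetic above can be written out as a small calculation. This is only a sketch of the budget using the figures quoted in the text; the class name MemoryBudget is illustrative, and treating 1 GB as 1024 MB is an assumption (the text rounds the unallocated amount to about 600 MB).

```java
class MemoryBudget {
    /** Thread stack memory in MB: number of threads times stack size per thread. */
    static long threadStackMb(int threads, long stackSizeMb) {
        return threads * stackSizeMb;
    }

    /** Sum of several memory amounts in MB. */
    static long sumMb(long... partsMb) {
        long total = 0;
        for (long part : partsMb) {
            total += part;
        }
        return total;
    }

    public static void main(String[] args) {
        long physicalMb = 8 * 1024;                          // 8 GB server (assumes 1 GB = 1024 MB)
        long plannedMb = sumMb(6 * 1024, 600, 800);          // heap + monitoring + system
        long unallocatedMb = physicalMb - plannedMb;         // 648, "about 600" in the text
        long hiddenMb = sumMb(200, threadStackMb(160, 1), 200); // Linux reserve + stacks + NIO buffers
        System.out.println("unallocated (MB):         " + unallocatedMb);
        System.out.println("unplanned usage (MB):     " + hiddenMb);
        System.out.println("left for page cache (MB): " + (unallocatedMb - hiddenMb));
    }
}
```

Whichever rounding is used, only a few tens of megabytes remain for the page cache, which is far too little for a service that reads and writes files heavily.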
Careful readers will notice that, of the two servers in the introduction, one uses up to 2.16 GB of swap and the other up to 871 MB, yet our memory gap does not seem that large. This is in fact caused by swapping and GC happening at the same time, as can be seen from heavy swap usage coinciding with long GC pauses.
Swapping and GC occurring together can lead to very long GC times, severe JVM stalls, and, in extreme cases, a service crash. The reason is as follows. During a GC, the JVM must traverse the used memory of the heap region being collected. If part of that heap has been swapped out, it must be swapped back into memory when the traversal reaches it; and because memory is short, another part of the heap that is still in memory must be swapped out to make room. In the extreme case, the whole heap region is cycled through swap in turn during the traversal. Linux is also slow to reclaim swap space, which is why we see so much swap in use.
This problem can be solved by reducing the heap size or by adding physical memory.
We therefore conclude that a Linux system hosting a Java service must be sized so that swap is avoided. The specific allocation depends on the scenario and must take into account the memory the JVM needs for the Java permanent generation, the Java heap (young generation and old generation), thread stacks, and Java NIO.
2. Memory leak issues
In another case, on a server with 8 GB of memory, Linux uses 800 MB, the monitoring process uses 600 MB, and the heap size is set to 4 GB. About 2.5 GB should remain available to the system, yet a large amount of swapping still occurs.
Analyze the problem as follows:
(1) In this scenario, the memory used by the Java permanent generation, the Java heap (young generation and old generation), and the thread stacks is essentially fixed, so the excessive memory usage must come from Java NIO.
(2) According to the model above, the memory used by Java NIO is mainly distributed across the system area and the page cache area of Linux kernel memory. The monitoring records show that the page cache shrinks drastically before swapping occurs, that is, when physical memory runs short. The memory leak can therefore be located in the Java NIO buffers in the system area.
(3) Because the memory of a NIO DirectByteBuffer is only released late in a GC, programs that continually allocate DirectByteBuffers usually need to call System.gc(); otherwise a long period without a full GC lets DirectByteBuffers that are referenced from the old generation leak memory. This analysis leaves two possible causes: first, the Java program does not call System.gc() when necessary; second, System.gc() has been disabled.
(4) Finally, check the JVM startup parameters and the Java program's use of DirectByteBuffer. In this case, inspecting the JVM startup parameters shows that -XX:+DisableExplicitGC is enabled, which disables System.gc().
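The check in step (4) can also be made from inside the running service. The following is a minimal sketch that looks for the flag in the arguments reported by RuntimeMXBean.getInputArguments(); the class name ExplicitGcCheck is illustrative, and the sketch assumes that when the flag is given more than once the last occurrence takes effect. Flags supplied through other channels may not appear in this list.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class ExplicitGcCheck {
    /** True if the given JVM arguments leave -XX:+DisableExplicitGC in effect. */
    static boolean explicitGcDisabled(List<String> jvmArgs) {
        boolean disabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+DisableExplicitGC")) {
                disabled = true;
            } else if (arg.equals("-XX:-DisableExplicitGC")) {
                disabled = false; // assumption: a later occurrence overrides an earlier one
            }
        }
        return disabled;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("System.gc() disabled: " + explicitGcDisabled(jvmArgs));
    }
}
```

When the method returns true, calls to System.gc() in the program have no effect, which is the situation found in this case.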
IV. Summary
This article analyzes the memory relationship between Linux and the JVM in detail and compares the similarities and differences between an ordinary process and the JVM process. Understanding these characteristics helps with Linux memory allocation, JVM tuning, and Java program optimization. Owing to space constraints, only two cases are presented; hopefully they serve as a useful starting point.
Analysis of memory relationship between Linux and JVM