Some insights into Java memory issues

Source: Internet
Author: User
Tags: dynatrace

In Java, memory leaks and other memory-related issues are among the most prominent causes of performance and scalability problems, so they are worth discussing in detail.

The Java memory model, or more specifically the garbage collector, has solved many memory problems.

At the same time, however, it has introduced new problems. Particularly in Java EE environments with large numbers of parallel users, memory has become a critical resource.

At first glance this may seem odd, because memory is cheap these days, and we have 64-bit JVMs and increasingly sophisticated garbage collection algorithms.

Next we will discuss Java memory problems in detail. They can be divided into four groups:

    • Memory leaks in Java are caused by objects that are still referenced but no longer used.

      When multiple references to such objects exist and developers forget to clean them up once the objects are no longer needed, a memory leak easily results.

    • Excessive memory usage at runtime leads to unnecessarily high memory consumption. This is common in web applications that keep large amounts of state information per user for a better user experience.

      As the number of active users grows, memory quickly reaches its limit. An unbounded or inefficiently configured cache is another frequent source of persistently high memory consumption.

    • Inefficient object creation easily turns into a performance problem as user load increases. The garbage collector must constantly clean up the heap, which gives it an unnecessarily high CPU footprint. Because the CPU is blocked by garbage collection, response times rise even though the application is only under moderate load.

      This behavior is also known as "GC thrashing".

    • Inefficient garbage collection behavior is often the result of a missing or incorrect garbage collector configuration. The garbage collector takes care of cleaning up unused objects, but how and when it does so is determined by its configuration, by the developer, or by the system architect. People often simply "forget" to configure and tune the garbage collector properly. I have participated in a number of performance workshops where a simple parameter change yielded a performance improvement of up to 25%.
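As a sketch of the object-churn problem described above (the code is illustrative, not from the original article): concatenating strings in a loop allocates a new String and backing array on every iteration, producing large amounts of short-lived garbage, while a reused StringBuilder allocates far less and puts far less pressure on the garbage collector.

```java
public class ChurnExample {
    // Anti-pattern: each iteration allocates a new String and copies all
    // previous characters, creating O(n^2) work and lots of temporary garbage.
    static String concatNaive(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + i + ",";
        }
        return s;
    }

    // Better: one growing buffer, few allocations, little GC pressure.
    static String concatBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatNaive(100).equals(concatBuilder(100)));
    }
}
```

Both methods produce the same result; only the allocation behavior differs, which is exactly the kind of difference that shows up as GC overhead under load.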

In most cases, memory issues affect not only performance but also scalability: the more memory each request, user, or session consumes, the fewer of them can run in parallel.

In some cases, memory issues also affect availability. When the JVM runs out of memory, or comes close to the limit, it exits with an OutOfMemoryError. That is when the manager shows up at your office and you know you are in big trouble.

Memory problems are hard to solve for two reasons. First, the analysis is complex and in some cases very difficult, especially if you lack the right methodology. Second, they are often rooted in the architecture of the application, so simple code changes will not fix them.

To make things easier, I will present some anti-patterns that frequently occur in real applications. Avoiding these patterns during development helps prevent memory problems.

HttpSession as a cache

This anti-pattern refers to misusing the HttpSession object as a data cache. The session object is intended to store information that must survive across HTTP requests; this is also known as conversational state, meaning the data is kept until it has been processed. Such state exists in nearly every non-trivial web application, and the server is usually the only place the application can store it.

Some information can be stored in cookies instead, but this has other implications of its own.

In the session, it is important to keep as little data as possible, for as short a time as possible.

It happens all too easily that sessions end up storing megabytes of data. This immediately leads to high heap usage and memory shortage: the number of concurrent users becomes very limited, and the JVM responds to additional users with OutOfMemoryErrors. Large sessions cause other performance penalties as well.

In clustered scenarios with session replication, this adds serialization and communication overhead, leading to additional performance and scalability problems.

In some projects the "solution" to these problems is to add more memory and switch to a 64-bit JVM; the temptation to simply add a few gigabytes of heap is hard to resist. However, this hides the symptom rather than solving the real problem, and the fix is only temporary while introducing new issues. A growing heap makes it harder to find the "real" memory problems: for very large heaps (around 6 GB), most available analysis tools cannot cope with the amount of data. We have invested a lot of R&D work in Dynatrace to be able to analyze very large heaps effectively, and as the problem becomes more important, a new JSR specification addresses it as well.

Session caching problems often arise because the application's architecture is not well understood. During development, data is put into the session simply because it is easy, often in an "add and forget" manner: nobody makes sure the data is removed once it is no longer needed. Session data that is no longer required would normally be cleaned up by the session timeout, but enterprise applications often use very generous timeouts, so this cleanup does not work effectively.

Very high session timeouts of up to 24 hours are also often used to give users the additional "comfort" of not having to log in again.
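A minimal sketch of the "add and forget" anti-pattern and its fix (the attribute name and methods are made up for illustration, and HttpSession is simulated with a plain map so the snippet is self-contained): the point is to remove session data explicitly as soon as it is no longer needed, instead of relying on the session timeout.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionSketch {
    // Stand-in for HttpSession#setAttribute / #removeAttribute.
    static final Map<String, Object> session = new HashMap<>();

    static void renderSelectionList() {
        // Anti-pattern: cache a large result in the session "for later" ...
        session.put("countryList", new byte[200 * 1024]); // ~200 KB per user
        // ... render the page ...
    }

    static void submitForm() {
        // Fix: remove the data as soon as the workflow is finished,
        // rather than letting it live until the session times out.
        session.remove("countryList");
    }

    public static void main(String[] args) {
        renderSelectionList();
        submitForm();
        System.out.println(session.containsKey("countryList"));
    }
}
```

With hundreds of concurrent users, the difference between removing such data promptly and keeping it for a 24-hour timeout is the difference between a stable heap and an OutOfMemoryError.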

A practical example: selection lists read from the database were kept in the session to avoid unnecessary database queries (does that smell like premature optimization?). This placed several kilobytes of data in the session object for every single user. Caching this information is reasonable, but the user session is certainly the wrong place for it.

Another example is misusing the Hibernate session to manage conversational state: the Hibernate Session object is simply placed into the HttpSession for fast access to the data. However, this stores far more data than necessary, and memory consumption per user increases significantly.

Today, the conversational state of AJAX applications can also be managed on the client side. This makes the server-side program stateless, or nearly stateless, which clearly scales better.

ThreadLocal variable memory leaks

ThreadLocal variables are used in Java to bind a variable to a particular thread, meaning each thread gets its own instance. They are typically used to carry per-thread state, such as user authorization information.

The lifecycle of a ThreadLocal variable, however, is tied to the lifecycle of its thread, and forgotten ThreadLocal variables easily cause memory problems, especially in application servers.

Application servers use thread pools to avoid constantly creating and destroying threads. An HttpServletRequest, for example, is handled by a pooled thread, which is returned to the pool afterward. If the application logic uses ThreadLocal variables and forgets to remove them explicitly, that memory is never freed.

Depending on the pool size (in large systems, pools can contain hundreds of threads) and the size of the objects referenced by the ThreadLocal variables, this can become a serious problem. In the worst case, a pool of 200 threads each holding a 5 MB ThreadLocal value wastes 1 GB of memory. This immediately leads to heavy garbage collection activity, bad response times, and potential OutOfMemoryErrors.

A real-world example is a bug in JBossWS 1.2.0 (fixed in JBossWS 1.2.1): "DOMUtils doesn't clear thread locals". The problem was a ThreadLocal variable that kept referencing a 14 MB parsed document.
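A runnable sketch of the leak mechanics (the buffer and task are made up for illustration): each pooled thread lazily allocates a 5 MB per-thread buffer, and unless the request handler calls remove(), that buffer stays pinned to the pooled thread for as long as the pool lives.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadLocalLeak {
    // Each pooled thread gets its own 5 MB buffer, e.g. for request parsing.
    static final ThreadLocal<byte[]> buffer =
            ThreadLocal.withInitial(() -> new byte[5 * 1024 * 1024]);

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> {
                byte[] buf = buffer.get(); // allocates on first use per thread
                // ... handle the request using buf ...
                buffer.remove();           // without this line, each pooled thread
                                           // pins 5 MB until the pool shuts down
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Scaled to a 200-thread pool, omitting the remove() call is exactly the 1 GB worst case described above.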

Large temporary objects

Large temporary objects can, in the worst case, cause OutOfMemoryErrors, or at least trigger heavy GC activity. For example, when a very large document (XML, PDF, image ...) must be read and processed, the application may become unresponsive for minutes, or its performance so limited that it is practically unusable. The root cause is excessive garbage collection. Below is a detailed analysis of a piece of code that reads a PDF document:

    byte tmpData[] = new byte[1024];
    int offs = 0;
    do {
        int readLen = bis.read(tmpData, offs, tmpData.length - offs);
        if (readLen == -1)
            break;
        offs += readLen;
        if (offs == tmpData.length) {
            byte newres[] = new byte[tmpData.length + 1024];
            System.arraycopy(tmpData, 0, newres, 0, tmpData.length);
            tmpData = newres;
        }
    } while (true);

The document is read in chunks of a fixed number of bytes into an intermediate byte array and then sent to the user's browser. Even a few parallel requests caused the heap to overflow, and the very inefficient reading algorithm made things worse. The idea was to start with a 1 KB array; whenever the array is full, a new array 1 KB larger is created and the old array is copied into it.

This means a new array is created for every kilobyte of the document, and every byte already read is copied again each time.

The result is a huge number of temporary objects and memory consumption of twice the actual data size, because the data is copied over and over.

When processing large amounts of data, optimizing the processing logic is critical. In such cases a simple load test will reveal the problem.
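A straightforward fix (a sketch, not the original application's code) is to let ByteArrayOutputStream manage buffer growth: it grows its buffer geometrically rather than 1 KB at a time, so the number of reallocations and copies stays small even for large documents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class DocumentReader {
    // Reads the whole stream using a geometrically growing buffer,
    // avoiding a full copy of all previously read bytes per extra KB.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(8192);
        byte[] chunk = new byte[8192];
        int readLen;
        while ((readLen = in.read(chunk)) != -1) {
            out.write(chunk, 0, readLen);
        }
        return out.toByteArray();
    }
}
```

For very large documents, streaming the data directly to the response instead of materializing it in memory at all would be better still.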

Bad garbage collector configuration

The problems discussed so far are basically caused by application code. Another cause, however, is a wrong, or missing, garbage collector configuration. I often see users trusting the default settings of their application server, believing that the server's developers know best what is right for their application.

However, the configuration of the heap is largely dependent on the application and the actual usage scenario.

Tuning the parameters to the scenario makes the application run better. An application that creates many short-lived objects behaves completely differently from a batch application running long tasks. The right configuration also depends on the JVM in use: settings that make a Sun JVM work well can be a nightmare (or at least far from ideal) on an IBM JVM. Misconfigured garbage collectors are often not immediately identified as the source of a performance problem (unless you monitor GC activity). What we usually see is simply that responses are too slow, and the connection between garbage collection activity and response time is not obvious. When GC times are not correlated with response times, people tend to suspect a far more complex performance problem. The typical symptom is response-time problems that show up in different places with no obvious pattern.

A chart (not reproduced here) showed the relationship between transaction response times and garbage collection time in Dynatrace. In several garbage collector tuning engagements I have found the same thing:

people spend weeks working out a performance problem that the right setup would have solved in minutes.
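As an illustration only (the flags, sizes, and collector choice are assumptions that depend heavily on JVM vendor, version, and workload; these are HotSpot-style options), an explicit configuration instead of the defaults might look like this:

```shell
# HotSpot-style example; exact flags and sizes depend on JVM vendor,
# version, and the application's allocation behavior.
# A fixed heap size avoids resize pauses, and GC logging lets you
# correlate collector activity with response-time problems.
java -Xms2g -Xmx2g -XX:+UseG1GC -Xloggc:gc.log -XX:+PrintGCDetails -jar myapp.jar
```

The important part is not these specific values but that the configuration is chosen deliberately for the application, and that GC activity is logged so it can be checked against response times.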

Class Loader memory leak

When it comes to memory leaks, most people think only of objects on the heap.

However, classes and constants are also managed in the heap.

Depending on the JVM, they are placed in a specific area; the Sun JVM, for example, uses the so-called permanent generation, or PermGen. Classes frequently end up in the heap several times, simply because they were loaded by different class loaders. In modern enterprise applications, loaded classes can account for hundreds of MB of memory.

The key is therefore to avoid unnecessarily increasing the size of classes. A good example is the definition of large numbers of string constants, as in GUI applications where all text is typically kept in constants. Using string constants is in principle good design, but the memory consumption should not be ignored. In one real-world case, the constants of an internationalized application were defined separately for every language, and a humble code error caused all of these classes to be loaded. The result: the JVM crashed with an OutOfMemoryError in the permanent generation.

Application servers also face the problem of class loader leaks. The main cause is that a class loader cannot be garbage collected as long as a single object of any of its classes is still alive; as a result, those classes never release their memory. While Java EE application servers handle this problem quite well today, it seems to be more common in modern OSGi-based application environments.

Summary

Memory issues in Java applications are multifaceted and easily lead to performance and scalability problems. Especially in Java EE applications with large numbers of concurrent users, memory management must be a core part of the application architecture.

The garbage collector cleans up unused objects, but it cannot clean up objects that are still referenced, so developers remain responsible for proper memory management. In addition, garbage collector configuration must be treated as a core part of the application's configuration.
