Java Virtual Machine (JVM) Performance Optimization, Part 1: A JVM Knowledge Summary

Source: Internet
Author: User
Tags: data structures, garbage collection, virtual environment

Java applications run on the JVM, but how much do you know about JVM technology? This article (the first in the series) describes how the classic Java virtual machine works, covering the pros and cons of Java's write-once-run-anywhere promise, garbage collection basics, classic GC algorithms, and compiler optimizations. Later articles will turn to JVM performance tuning, including the latest JVM designs that support the performance and scalability of today's highly concurrent Java applications.

If you are a developer, you have surely experienced that special feeling when an insight suddenly hits you, all the ideas connect, and you can look back at your earlier thinking with a fresh perspective. I personally love the feeling of learning something new. I have had this experience many times while working with JVM technology, especially with garbage collection and JVM performance optimization. In this new Java world I want to share that inspiration with you, and I hope you will be as excited reading about JVM performance as I am writing this article.

This series is written for any Java developer who is interested in learning more about the underlying workings of the JVM and what the JVM actually does. At a high level, I will discuss garbage collection and the never-ending pursuit of freeing memory safely and quickly without impacting running applications. You will learn about the key parts of the JVM: garbage collection and GC algorithms, compiler optimizations, and some common tuning techniques. I will also discuss why benchmarking Java is so difficult, and offer advice on when to consider measuring performance. Finally, I will talk about some of the newest innovations in JVMs and GC, including Azul's Zing JVM, the IBM JVM, and Oracle's Garbage-First (G1) garbage collector.

I hope you finish this series with a deeper understanding of the factors that limit Java scalability today, and of how those limitations force us to architect our Java deployments the way we do. Hopefully you will experience a moment of enlightenment and some good Java inspiration: stop accepting those limitations and change them! If you are not already an open-source contributor, this series may encourage you to move in that direction.

JVM performance and the challenge of "compile once, run anywhere"

I have news for those who stubbornly believe that the Java platform is inherently slow. The JVM was criticized for Java's performance more than a decade ago, when Java first came into use for enterprise applications, but that conclusion is now obsolete. It is true that if you run simple, static, deterministic tasks on different development platforms today, you will most likely find that machine-optimized code performs better than code run in any virtual environment, the JVM included. But Java performance has improved enormously over the past ten years. Demand and growth in the Java industry have produced a handful of new garbage collection algorithms, new compilation innovations, and a wealth of heuristics and optimizations that have moved JVM technology forward. I will introduce some of them in later sections.

The technical beauty of the JVM is also its biggest challenge: nothing can be taken for granted in a "compile once, run anywhere" application. Rather than optimizing for one use case, one application, or one specific user load, the JVM continually tracks what the Java application is doing and optimizes accordingly. This dynamic behavior leads to a set of dynamic problems: developers working on the JVM cannot rely on static compilation and predictable allocation rates when designing for innovation, at least not if we want predictable performance in production environments.

My career in JVM performance

Early in my career I realized that garbage collection is very hard to "solve," and I have been fascinated by JVMs and middleware technology ever since. My passion for JVMs began on the JRockit team, where I coded a novel way to self-learn and self-tune the garbage collection algorithm (see Resources). That project, which became an experimental feature of JRockit and laid the groundwork for the Deterministic Garbage Collection algorithm, started my journey through JVM technology. I have worked for BEA Systems, Intel, Sun, and briefly for Oracle (following Oracle's acquisition of BEA Systems). I then joined the Azul Systems team to manage the Zing JVM, and today I work for Cloudera.

Machine-optimized code may achieve better performance, but it comes at the cost of flexibility, and that is not a worthwhile trade-off for dynamically loaded and rapidly changing enterprise applications. Most enterprises are willing to sacrifice the narrowly perfect performance of machine-optimized code for the benefits of Java:

1. Ease of coding and feature development (meaning shorter time to market)
2. Access to knowledgeable programmers
3. Faster development using Java APIs and standard libraries
4. Portability: no need to rewrite a Java application for every new platform

From Java code to bytecode

As a Java programmer, you are probably familiar with coding, compiling, and executing Java applications. For example, assume you have a program, MyApp.java, and you want it to run. To execute this program you first compile it with javac, the JDK's built-in static Java-language-to-bytecode compiler. Based on the Java code, javac generates the corresponding executable bytecode and saves it in a class file of the same name: MyApp.class. After compiling the Java code into bytecode, you start the executable class file with the java command (from the command line or a startup script, with or without startup options) to run your application. Your class is then loaded into the runtime (meaning the running Java virtual machine), and the program begins executing.
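
A minimal, hypothetical example of that cycle (MyApp is a placeholder name, not from any real project):

```java
// MyApp.java -- compile with:  javac MyApp.java   (produces MyApp.class)
//               then run with: java MyApp
public class MyApp {
    public static void main(String[] args) {
        // Executed by the JVM after MyApp.class is loaded into the runtime.
        System.out.println("Hello from the JVM!");
    }
}
```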

That is what happens on the surface every time you execute an application, but now let's explore what really happens when you issue the java command. What is a Java virtual machine? Most developers interact with the JVM through continual tuning, a.k.a. selecting and assigning values to startup options to make their Java programs run faster while avoiding the notorious "out of memory" error. But have you ever wondered why we need a JVM to run Java applications in the first place?
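
As an aside, that day-to-day tuning usually looks something like the line below; a hedged illustration only, since the available options vary by JVM vendor and version (-Xms and -Xmx set the initial and maximum heap size, -verbose:gc logs collections on mainstream JVMs, and MyApp is the hypothetical class from above):

```
java -Xms512m -Xmx2g -verbose:gc MyApp
```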

What is a Java virtual machine?

Simply put, a JVM is the software module that executes Java application bytecode and translates that bytecode into hardware- and operating-system-specific instructions. In doing so, the JVM allows Java programs to execute in different environments after being written just once, without any changes to the original code. Java's portability is key to its status as an enterprise application language: developers do not need to rewrite application code for every platform, because the JVM handles the translation and platform-specific optimization.

A JVM is essentially a virtual execution environment that acts as a machine for bytecode instructions, dispatching execution tasks and performing memory operations through interaction with the underlying layers.

A JVM also takes care of dynamic resource management for running Java applications. It manages the allocation and release of memory, maintains a consistent threading model on each platform, and organizes executable instructions in a way that suits the CPU architecture on which the application executes. The JVM frees developers from tracking references to objects and from worrying about how long those objects need to live in the system. Neither does it require us to manage when to release memory, a pain point of a non-dynamic language like C.
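
A small sketch of what this buys you in practice (the class name is hypothetical):

```java
// In Java there is no free()/delete. An object becomes eligible for
// garbage collection as soon as no live reference points to it.
public class NoManualFree {
    public static void main(String[] args) {
        byte[] buffer = new byte[1024 * 1024]; // 1 MB allocated on the heap
        buffer = null; // drop the only reference; the array is now unreachable
        // No explicit release needed: the JVM's garbage collector reclaims
        // the memory at some point of its own choosing.
    }
}
```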

You can think of the JVM as an operating system built specifically for Java; its job is to manage the runtime environment for Java applications.

JVM Components Overview

Much has been written about JVM internals and performance optimization. As the foundation for this series, I will summarize the main JVM components below. This brief walkthrough will be especially helpful for developers who are new to the JVM, and it should whet your appetite for the deeper discussions to come.

From one language to another: about Java compilers

A compiler takes one language as input and outputs executable statements in another language. The Java compiler has two main tasks:

1. Allow the Java language to be more portable, not tied to any specific platform when first written;

2. Ensure that valid executable code is produced for the specific target platform.

Compilers can be static or dynamic. An example of a static compiler is javac: it takes Java code as input and translates it into bytecode, a language executable by the Java virtual machine. A static compiler interprets the input code once and outputs an executable form that is used every time the program executes. Because the input is static, you will always see the same result; only if you modify the source code and recompile will you see different output.

Dynamic compilers, such as just-in-time (JIT) compilers, translate one language into another on the fly, meaning they do so while the code executes. A JIT compiler lets you collect runtime profiling data (by inserting performance counters) and make compilation decisions based on the data at hand. A dynamic compiler can produce better instruction sequences as it compiles, replace sequences of instructions with more efficient ones, and even eliminate redundant operations. Over time you collect more profiling data and make more, and better, compilation decisions; the whole process is what we usually call code optimization and recompilation.
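
You can actually watch a JIT compiler make these decisions. A hedged sketch, assuming a HotSpot-derived JVM where the -XX:+PrintCompilation flag is available (the class name is hypothetical):

```java
// HotLoop.java -- run with:  java -XX:+PrintCompilation HotLoop
// The JVM starts out interpreting sum(); once profiling shows it is
// "hot," the JIT compiles it to optimized machine code mid-run, and
// the compilation events appear on standard output.
public class HotLoop {
    static long sum(long n) {
        long s = 0;
        for (long i = 0; i < n; i++) s += i;
        return s;
    }

    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 10_000; i++) {
            total += sum(100_000); // repeated calls make sum() hot
        }
        System.out.println(total);
    }
}
```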

Dynamic compilation gives you the advantage of adapting to dynamic changes in behavior, or to new optimization opportunities that emerge as application load grows over time. This is why dynamic compilers are such a good fit for Java workloads. The caveat is that a dynamic compiler needs extra data structures, thread resources, and CPU cycles for profiling and optimization; the deeper the optimization, the more resources it needs. In most environments, however, this overhead is very small relative to the substantial performance gain: five to ten times the throughput of purely interpreted execution.

Allocation can cause garbage collection

Allocation is done on a per-thread basis in each "Java process-dedicated memory address space," known as the Java heap, or simply the heap. Single-threaded allocation is common in client-side Java applications. However, single-threaded allocation brings no benefit to enterprise applications and workload-serving applications, because it does not take advantage of the parallelism of modern multicore environments.

Concurrent application design also forces the JVM to ensure that multiple threads do not allocate the same address space at the same time. You could control this by placing a lock on the entire allocation space, but this technique (known as a heap lock) is costly: holding or queuing threads hurts resource utilization and application performance. The upside of multicore systems is that they have created demand for a variety of new approaches that prevent the single-thread bottleneck and the serialization of resource allocation.

A common approach is to divide the heap into several partitions, each of a size appropriate for the application. Obviously these need tuning, since allocation rates and object sizes vary significantly between applications, and so does the number of threads. A thread-local allocation buffer (TLAB), or sometimes a thread-local area (TLA), is a dedicated partition within which a thread can allocate freely, without claiming the full heap lock. When the area is full, the thread is assigned a new one until the heap has no more areas to hand out. When the heap is full, meaning there is not enough free space on the heap to fit the object that needs to be allocated, garbage collection begins.
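
On HotSpot-derived JVMs, TLABs are on by default and can be inspected or resized; a hedged illustration (these flag names are HotSpot-specific and may differ or not exist on other JVMs; MyApp is hypothetical):

```
java -XX:+UseTLAB -XX:TLABSize=256k -verbose:gc MyApp
```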

Fragmentation

The catch with using TLABs is the risk of fragmenting the heap and reducing memory efficiency. If an application happens to allocate object sizes that do not add up to, or fully fill, a TLAB, tiny chunks of empty space may be left over, too small to host a new object. These leftover chunks are referred to as "fragments." If the application also keeps references to objects allocated next to these leftover spaces, the space can remain unused for a long time.

Fragmentation is what you get when fragments are scattered across the heap, wasting heap space in small chunks of unused memory. Configuring the "wrong" TLAB size for your application (with respect to object sizes, the mix of object sizes, and reference-holding patterns) is a recipe for increasing heap fragmentation. As the application runs, the fragments occupy a growing share of the heap. Fragmentation degrades performance, as the system becomes unable to allocate space for new threads and objects, and the garbage collector then struggles harder and harder to prevent out-of-memory exceptions.

Some TLAB waste simply comes with the territory. One way to avoid fragmentation, completely or temporarily, is to tune the TLAB size for each allocation scenario; the catch is that the tuning must be redone whenever the application's allocation behavior changes. Alternatively, more sophisticated JVM algorithms can be used. Another way is to organize heap partitions to achieve more efficient memory allocation. For example, the JVM can implement free-lists, which are linked lists of freed memory chunks of specific sizes. A freed contiguous chunk of memory is linked to other chunks of similar size, creating a small set of lists, each with its own size bounds. In some cases free-lists lead to a better fit for memory allocation: a thread can allocate an object in a chunk of roughly its own size, potentially causing less fragmentation than relying only on fixed-size TLABs.
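
To make the free-list idea concrete, here is a deliberately toy sketch; this is not how any real JVM implements its heap, just an illustration of size-class free-lists (all names are made up):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Freed blocks are kept in per-size-class lists, so an allocation can be
// served from a block of roughly the right size instead of carving a
// fixed-size region -- which tends to leave fewer unusable fragments.
public class FreeListSketch {
    private final Map<Integer, Deque<long[]>> freeLists = new HashMap<>();

    // Round a requested size up to a power-of-two size class (minimum 16).
    private int sizeClass(int size) {
        int c = 16;
        while (c < size) c <<= 1;
        return c;
    }

    long[] allocate(int size) {
        Deque<long[]> list = freeLists.get(sizeClass(size));
        if (list != null && !list.isEmpty()) {
            return list.pop();            // reuse a freed block of this class
        }
        return new long[sizeClass(size)]; // otherwise take fresh memory
    }

    void free(long[] block) {
        // block.length is already a size class, because allocate() rounds up.
        freeLists.computeIfAbsent(block.length, k -> new ArrayDeque<>()).push(block);
    }
}
```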

GC Trivia

Some early garbage collectors had multiple old generations, but it turned out that having more than two generational spaces caused the overhead to exceed the value. Another way to optimize allocation and reduce fragmentation is to create what is called a young generation, a dedicated heap space for allocating new objects. The rest of the heap becomes the so-called old generation. The old generation holds objects with longer lifetimes, meaning objects that have survived garbage collection, as well as large objects. To make better sense of this approach to allocation, we need to talk about garbage collection.
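
On HotSpot-derived JVMs the split between the generations is itself tunable; a hedged illustration (the flag names are HotSpot-specific, and MyApp is hypothetical):

```
java -Xmn256m -Xmx1g MyApp        # fix the young generation at 256 MB
java -XX:NewRatio=2 -Xmx1g MyApp  # make the old generation twice the young
```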

Garbage collection and application performance

Garbage collection is performed by the JVM's garbage collector, which frees occupied heap memory that is no longer referenced. When a garbage collection is triggered, all still-referenced objects are kept, and the space occupied by objects that are no longer referenced is freed or reclaimed. Once all reclaimable memory has been collected, the space is available to be grabbed and allocated again for new objects.

A garbage collector must never reclaim an object that is still referenced; doing so would violate the JVM specification. The exception to this rule is soft or weak references, which may be collected if the garbage collector is approaching an out-of-memory condition. I strongly recommend that you avoid weak references as much as possible, however, because ambiguity in the Java specification has led to misinterpretation and errors in their use. What's more, Java was designed for dynamic memory management precisely so that you would not need to think about when and where to release memory.
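
For completeness, here is what a weak reference looks like; a minimal sketch using the standard java.lang.ref API, whose deliberately non-deterministic output is part of why caution is advised:

```java
import java.lang.ref.WeakReference;

public class WeakRefDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        strong = null; // drop the only strong reference
        System.gc();   // only a hint; a collection is never guaranteed

        // If a collection ran, the weakly referenced object has been
        // reclaimed and get() returns null; otherwise it is still there.
        System.out.println("Referent: " + weak.get());
    }
}
```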

One challenge for a garbage collector is to reclaim memory wherever possible without impacting the running application. If you do not collect enough, your application will run out of memory; if you collect too frequently, you lose throughput and response time, which hurts the running application just as badly.

GC algorithms

There are many different garbage collection algorithms, and a few of them will be discussed in depth later in this series. At the highest level, the two main approaches to garbage collection are reference counting and tracing collectors.

A reference-counting collector tracks how many references point to an object. When an object's reference count reaches zero, the memory is reclaimed immediately; this immediacy is one of the advantages of the approach. The difficulties with reference counting lie in circular data structures and in keeping all reference counts up to date at every moment.
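
Java's own collectors are tracing collectors, so the following only simulates the counting in comments; a toy sketch of why cycles defeat pure reference counting:

```java
// A and B reference each other. Under pure reference counting, each
// object's count never drops to zero even after the program lets go of
// both, so neither would ever be reclaimed.
public class CycleProblem {
    static class Node {
        Node next;
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        a.next = b; // b's count would be 2: the variable 'b' plus a.next
        b.next = a; // a's count would be 2: the variable 'a' plus b.next
        a = null;   // a's count drops to 1 (b.next still points at it)
        b = null;   // b's count drops to 1 (a.next still points at it)
        // The counts are stuck at 1 forever. A tracing collector, by
        // contrast, sees neither node is reachable from a root and frees both.
    }
}
```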

A tracing collector marks every object that is still referenced, iteratively following references out from already-marked objects and marking whatever it finds. When all still-referenced objects are marked "live," all unmarked space is reclaimed. This approach copes with circular data structures, but in many cases the collector must wait until all marking is complete before it can reclaim unreferenced memory.
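
A toy sketch of the mark phase follows; it is nothing like a production collector, just the reachability idea in runnable form (all names are invented):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class MarkSketch {
    static class Obj {
        final List<Obj> refs; // outgoing references
        boolean marked;       // "live" flag set during the mark phase
        Obj(List<Obj> refs) { this.refs = refs; }
    }

    // Mark everything transitively reachable from the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue; // already visited
            o.marked = true;
            pending.addAll(o.refs); // follow outgoing references
        }
    }

    public static void main(String[] args) {
        Obj leaf = new Obj(List.of());
        Obj root = new Obj(List.of(leaf));
        Obj garbage = new Obj(List.of()); // referenced by nothing
        mark(List.of(root));
        System.out.println(leaf.marked);    // true: reachable, kept
        System.out.println(garbage.marked); // false: unmarked, reclaimable
    }
}
```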

There are various ways to implement these two approaches. The best-known algorithms are variations on mark-and-sweep, copying, and parallel or concurrent collection. I will discuss these in a later article.

Generally speaking, generational garbage collection means dedicating separate address spaces in the heap to new objects and to older objects, where an "older object" is one that has survived a number of garbage collections. Splitting the heap into a young generation for new allocations and an old generation for survivors reduces fragmentation: short-lived objects that occupy memory are reclaimed quickly, while longer-lived objects are grouped together and moved into the old-generation address space. All of this reduces fragmentation among long-lived objects and keeps the heap from fragmenting. A positive side effect of the young generation is that it defers the more costly collection of the old generation, since the same space can be reused over and over for short-lived objects. (Collecting the old space costs more, because long-lived objects contain more references and require more traversal.)

A final algorithm worth mentioning is compaction, which is a way of managing memory fragmentation. Compaction basically means moving objects next to each other in order to free up larger contiguous chunks of memory. If you are familiar with disk defragmentation and the tools that perform it, you will find compaction very similar, except that it operates on Java heap memory. I will discuss compaction in detail later in the series.
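
A toy sketch of the sliding idea behind compaction; note that real compaction must also rewrite every reference to each moved object, which this array stand-in glosses over:

```java
import java.util.Arrays;

public class CompactionSketch {
    public static void main(String[] args) {
        // Non-null slots stand in for live objects; nulls are free holes.
        String[] heap = { "A", null, "B", null, null, "C", null, "D" };

        int next = 0; // next free slot at the front of the heap
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                heap[next++] = heap[i]; // slide the live object down
            }
        }
        while (next < heap.length) {
            heap[next++] = null; // the free space is now one contiguous run
        }

        System.out.println(Arrays.toString(heap)); // [A, B, C, D, null, ...]
    }
}
```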

Summary: review and highlights

The JVM enables portability (program once, run everywhere) and dynamic memory management, the signature features of the Java platform and key reasons for its popularity and productivity gains.

In this first article of the JVM performance optimization series, I explained how a compiler translates bytecode into the instruction language of a target platform and helps optimize the execution of a Java program dynamically. Different applications call for different compilers.

I also outlined memory allocation and garbage collection and how both relate to Java application performance. Basically, the faster you fill the heap, the more frequently garbage collection is triggered, and the greater its impact on your application. One challenge for the garbage collector is to reclaim memory without impacting the running application, yet to do so before the application runs out of memory. In future articles we will examine traditional and newer garbage collection approaches and JVM performance optimizations in more detail.
