An in-depth analysis of Java memory allocation and strings

Source: Internet
Author: User

First, introductory questions

Among the data types in the Java language, String is a rather special one, and it comes up frequently in interviews. This article uses Java memory allocation to analyze many of the confusing questions about String. The questions covered are listed below; readers who already know the answers can skip the article.

1. Which memory does "Java memory" actually refer to? Why is this memory divided into areas, and how is it divided? What does each area do after the division? How is the size of each area set?

2. Why is the String type less efficient than StringBuffer or StringBuilder when performing concatenation? How are StringBuffer and StringBuilder related, and how do they differ?

3. What is a constant in Java? What is the difference between String s = "s" and String s = new String("s")?

This article was written after collecting and organizing material from various sources. If it contains mistakes, corrections are welcome.

Second, Java memory allocation

1. Introduction to the JVM
The Java Virtual Machine (JVM) is an abstract computer that runs all Java programs. It is the runtime environment of the Java language and one of Java's most attractive features. The JVM has its own complete virtual hardware architecture, such as a processor, stack, and registers, together with a corresponding instruction set. It hides the details of the underlying operating system platform, so a Java program only needs to be compiled to the target code (bytecode) that runs on the JVM and can then run unmodified on many platforms.
The sole job of a running JVM instance is to run one Java program. When you start a Java program, a virtual machine instance is born; when the program exits, that instance dies. If you run three Java programs at the same time on one computer, you get three JVM instances. Each Java program runs in its own JVM instance.
As the following illustration shows, the JVM's architecture consists of several major subsystems and memory areas:
Garbage collector (Garbage Collection): reclaims objects in heap memory (Heap) that are no longer in use, that is, objects that are no longer referenced.
Class loader subsystem (ClassLoader Subsystem): besides locating and importing binary class files, it is responsible for verifying the correctness of the imported classes, allocating and initializing memory for class variables, and helping to resolve symbolic references.
Execution engine (Execution Engine): executes the instructions contained in the methods of loaded classes.
Runtime data area (the Java memory allocation area): also called virtual machine memory or Java memory. The virtual machine sets aside a region of the computer's memory to store many things, for example bytecode, other information from loaded class files, objects created by the program, parameters passed to methods, return values, and local variables.
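As a small illustration of the class loader subsystem described above, the following sketch asks which loader defined a given class. The class name ClassLoaderDemo and the helper loaderOf are made up for this example; the loader names printed depend on the JVM in use, so no output is shown.

```java
public class ClassLoaderDemo {

    // Returns a readable name for the loader that defined the given class.
    // Core classes such as String are defined by the bootstrap loader,
    // which Class.getClassLoader() reports as null.
    public static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String          -> " + loaderOf(String.class));
        System.out.println("ClassLoaderDemo -> " + loaderOf(ClassLoaderDemo.class));
    }
}
```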


2. Java memory partitions
As the previous section explained, the runtime data area is what we call Java memory. It has to store many different things, and without dividing and managing it, everything would become disorganized. Programmers like order and dislike mess. Depending on the data stored, Java memory is usually divided into 5 areas: the program counter (Program Counter Register), the native method stack (Native Method Stack), the method area (Method Area), the stack (Stack), and the heap (Heap).
Program counter (Program Counter Register): also called the PC register. The JVM supports many threads running at the same time, and each newly created thread gets its own PC register. If the thread is executing a Java method (not a native one), the PC register holds the address of the instruction currently being executed; if the method is native, the value of the PC register is undefined. The PC register is wide enough to hold a return address or a native pointer on the given platform.
Stack (Stack): also called the Java stack. The JVM assigns a stack to each newly created thread, and a Java program runs by operating on these stacks. The stack saves the state of its thread in units of frames, and the JVM performs only two operations on it: pushing a frame and popping a frame. The method a thread is currently executing is called the thread's current method, and the frame used by the current method is called the current frame. When a thread invokes a Java method, the JVM pushes a new frame onto that thread's Java stack, and this frame becomes the current frame. While the method executes, the frame holds its parameters, local variables, intermediate results, and other data. Seen from this allocation mechanism, a stack can also be understood as the storage area that the operating system sets up when it creates a process or a thread (in operating systems that support multithreading), and this area behaves in a last-in, first-out manner. Its related setting parameter:

-Xss: sets the stack size of each thread
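The frame mechanism can be observed with a deliberately unbounded recursion. The sketch below (the class StackDepthDemo and its methods are hypothetical names for illustration) counts how many frames are pushed before the JVM throws StackOverflowError. The count depends on the JVM, the platform, and the stack size, so no expected number is given.

```java
public class StackDepthDemo {

    private static int depth;

    // Every call pushes one more frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the thread's stack is exhausted and returns how many
    // frames were pushed before the JVM threw StackOverflowError.
    public static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The frames have been popped again; the thread continues normally.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

Running it as java -Xss256k StackDepthDemo and again as java -Xss4m StackDepthDemo should typically give a smaller and a larger count respectively, although the exact figures vary between runs.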

Native method stack (Native Method Stack): stores the invocation state of native methods.


Method area (Method Area): when the virtual machine loads a class file, it parses the type information from the binary data contained in the file and places that information (including class information, constants, static variables, and so on) in the method area. The method area is shared by all threads, as shown in the following figure. The method area contains a special region called the constant pool (Constant Pool), which is closely related to the analysis of the String type.
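The effect of the constant pool on strings can be seen by comparing references with ==. The following is a minimal sketch (ConstantPoolDemo and its method names are made up for illustration): string literals and compile-time constant expressions with the same content resolve to one shared object, while a string concatenated at run time is a separate object.

```java
public class ConstantPoolDemo {

    // A static final field initialized with a literal is a compile-time constant.
    static final String PREFIX = "ja";

    // Two identical literals refer to the same pooled String object.
    public static boolean sameLiteral() {
        String a = "java";
        String b = "java";
        return a == b;
    }

    // PREFIX + "va" is a constant expression, so the compiler folds it
    // into the literal "java" and it shares the pooled object.
    public static boolean constantExpression() {
        String c = PREFIX + "va";
        return c == "java";
    }

    // 'part' is an ordinary (non-final) variable, so the concatenation
    // happens at run time and produces a new object.
    public static boolean runtimeConcatenation() {
        String part = "ja";
        String d = part + "va";
        return d == "java";
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());          // true
        System.out.println(constantExpression());   // true
        System.out.println(runtimeConcatenation()); // false
    }
}
```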


Heap (Heap): the Java heap (Java Heap) is the largest piece of memory managed by the Java virtual machine, and it is shared by all threads. Its sole purpose is to hold object instances: almost all object instances are allocated here, while the references to those objects are allocated on the stack. Therefore, executing String s = new String("s") allocates memory in two places: memory for the String object on the heap, and memory on the stack for the reference (the memory address of the heap object, that is, a pointer), as shown in the following illustration.
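The distinction between a reference and the heap object it points to can be sketched as follows. HeapReferenceDemo and its methods are hypothetical names used only for this illustration.

```java
public class HeapReferenceDemo {

    // new String("s") always creates a separate object on the heap,
    // so its reference differs from that of the pooled literal "s".
    public static boolean literalVsNew() {
        String literal = "s";
        String created = new String("s");
        return literal == created;
    }

    // equals() compares contents, not references.
    public static boolean sameContent() {
        return "s".equals(new String("s"));
    }

    // intern() returns the pooled instance with the same content.
    public static boolean internedMatchesLiteral() {
        return new String("s").intern() == "s";
    }

    // Assigning a reference copies the pointer, not the object:
    // both variables see the change made through either one.
    public static boolean aliasing() {
        StringBuilder first = new StringBuilder("s");
        StringBuilder second = first;
        second.append("!");
        return first.toString().equals("s!");
    }

    public static void main(String[] args) {
        System.out.println(literalVsNew());           // false
        System.out.println(sameContent());            // true
        System.out.println(internedMatchesLiteral()); // true
        System.out.println(aliasing());               // true
    }
}
```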


The Java virtual machine has an instruction for allocating new objects on the heap but no instruction for releasing that memory, just as you cannot explicitly free an object in Java code. The virtual machine itself decides how and when to release the memory occupied by objects that the running program no longer references, and it usually hands this task to the garbage collector (Garbage Collection). Its related setting parameters:

-Xms: sets the initial size of heap memory

-Xmx: sets the maximum size of heap memory

-XX:MaxTenuringThreshold: sets the maximum number of collections an object can survive in the young generation before it is promoted

-XX:PretenureSizeThreshold: objects larger than the specified size are allocated directly in the old generation
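To see what a given set of heap options amounts to at run time, the standard Runtime class can be queried. This is only a rough sketch: HeapSizeDemo is a made-up class name, and the values reported by Runtime correspond approximately, not exactly, to -Xms and -Xmx.

```java
public class HeapSizeDemo {

    private static final long MB = 1024L * 1024L;

    // Upper bound the JVM will try to use for the heap (influenced by -Xmx).
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM (starts near -Xms).
    public static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Part of the reserved heap that is currently unused.
    public static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("max       = " + maxHeapBytes() / MB + " MB");
        System.out.println("committed = " + committedHeapBytes() / MB + " MB");
        System.out.println("free      = " + freeHeapBytes() / MB + " MB");
    }
}
```

For example, java -Xms64m -Xmx256m HeapSizeDemo should report a maximum close to 256 MB and a committed size close to 64 MB.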

The Java heap is the main area managed by the garbage collector and is therefore also called the GC heap (Garbage Collected Heap). Today's garbage collectors are almost all based on generational collection algorithms, so the Java heap can be subdivided into the young generation (Young Generation) and the old generation (Old Generation), as shown in the following figure. The idea of generational collection is this: the young generation is scanned and collected at a high frequency, which is called a minor collection, while the old generation is checked much less often, which is called a major collection. This way not every GC has to check all objects in memory, which leaves more system resources for the application. Put another way, a young GC collects only the young generation and is triggered when an allocation finds too little free memory there, while a full GC collects the entire heap and the method area.
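The collectors that a running JVM uses for its generations can be listed through the standard java.lang.management API. The sketch below assumes nothing beyond that API; GcCollectorsDemo is a made-up class name, and the collector names and counts differ between JVMs and GC settings, so no output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcCollectorsDemo {

    // Written to from main so the allocations below are not optimized away.
    private static byte[] sink;

    // Names of the collectors the running JVM exposes.
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so that young collections are likely to run.
        for (int i = 0; i < 200_000; i++) {
            sink = new byte[10 * 1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

On a generational collector, the bean for the young generation will usually show a much higher collection count than the one for the old generation.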


Some readers may ask: isn't there also a permanent generation (Permanent Generation), and isn't it outside the Java heap? That is correct. The permanent generation is in fact the method area mentioned above. It stores the type information loaded by the class loader (including class information, constants, static variables, and so on). This information has a long life cycle, and the GC rarely reclaims PermGen space while the program is running, so an application that loads a large number of classes is likely to run into a "PermGen space" error. Note that the permanent generation exists only in the HotSpot JVM up to Java 7; Java 8 replaced it with Metaspace. For JVMs that have a permanent generation, the related setting parameters are:

-XX:PermSize: sets the initial size of the perm area

-XX:MaxPermSize: sets the maximum size of the perm area

The young generation (Young Generation) is divided into the Eden area and the Survivor area, and the Survivor area is further divided into the From and To spaces. Eden is where objects are first allocated. By default the From and To spaces are equal in size. During a minor GC, surviving objects in Eden are copied to the Survivor area, and surviving objects already in the Survivor area that have lived through enough collections are copied to the tenured (old) area. The JVM distinguishes the From and To spaces to make GC more efficient and to separate object collection from object promotion. Two parameters control the size of the young generation:

-Xmn: sets the size of the young generation

-XX:SurvivorRatio: sets the size ratio of Eden to one Survivor space
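The heap areas described above show up as memory pools in the standard java.lang.management API. The following sketch lists them; HeapPoolsDemo is a hypothetical class name, and the pool names (for example an Eden, a Survivor, and an old-generation pool) depend on the collector in use, so no output is shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapPoolsDemo {

    // Names of the memory pools that belong to the Java heap.
    public static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            if (pool.getType() == MemoryType.HEAP && usage != null) {
                System.out.println(pool.getName()
                        + ": used=" + usage.getUsed() / 1024 + " KB"
                        + ", committed=" + usage.getCommitted() / 1024 + " KB");
            }
        }
    }
}
```

Changing -Xmn or -XX:SurvivorRatio and running the program again is one way to check how the options affect the committed size of each pool.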

Old generation (Old Generation): when the old generation runs out of space, the JVM performs a major collection there. If, after the collection, the Survivor and old areas still cannot hold the objects copied from Eden, the JVM cannot allocate memory for new objects in Eden, and an out-of-memory error (OutOfMemoryError) occurs.

Third, an in-depth look at the String type

Let's start with Java's data types. They fall into two broad categories: primitive types and reference types. A variable of a primitive type holds the value itself, while a variable of a reference type holds a reference to the actual object, whose value is usually the object's memory address. The subdivision of primitive and reference types is easiest to see in a diagram. Of course, the figure below shows only one possible classification.
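The difference between holding a value and holding a reference can be sketched in a few lines. ValueVsReferenceDemo and its method names are made up for this illustration.

```java
public class ValueVsReferenceDemo {

    // Assigning a primitive copies the value; the original is unaffected.
    public static int copyPrimitive() {
        int a = 1;
        int b = a;
        b = 2;
        return a;
    }

    // Assigning a reference copies the address; both variables
    // refer to the same array on the heap.
    public static int copyReference() {
        int[] x = {1};
        int[] y = x;
        y[0] = 2;
        return x[0];
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive()); // 1
        System.out.println(copyReference()); // 2
    }
}
```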

There are 3 points to note about the diagram above:

The char type can be treated as a category of its own; many sources classify the primitive types as numeric, character (char), and boolean.

Re
