JVM Learning (2): The "heap" and "stack" that technical articles keep talking about, and what they really are, summarized from the OS perspective

As the saying goes, the code you wrote six months ago might as well be someone else's code ... so review, review, review! The knowledge points involved are summarized as follows:

    • What the heap and the stack actually are
    • How the JVM stack and the native method stack are divided
    • Heap and stack in Java
    • Heap and stack at the data-structure level
    • Heap and stack at the OS level
    • How the JVM's heap and stack correspond to the OS
    • Why method calls need a stack

These are perennial interview questions. Someone recently asked me about exactly this kind of fairly basic knowledge, and unfortunately my off-the-cuff answer did not land well, so here is a proper summary:

The previous article summarized that the JVM's memory is divided into three areas:
  • Heap: stores only the objects (and arrays) themselves, i.e. the data of reference types; no primitive-type variables or object references live here. Each JVM instance has exactly one heap, a heap in the dynamic-memory-allocation sense: a region used to manage data with dynamic lifetimes. The heap is shared by all Java threads in the same JVM instance and is usually managed by some form of automatic memory management, commonly called garbage collection (GC). The JVM specification does not mandate any particular GC algorithm for an implementation.
  • Stack: stores primitive-type variables and object references, not the objects themselves. Each thread has its own stack; the data in each stack is private to that thread and inaccessible from other threads' stacks. The stack is made of frames (each method call pushes a stack frame), and a frame has three parts: the area for primitive-type variables, the execution-environment context, and the operation/instruction area.
  • Method area (also called the static area): shared by all threads, just like the heap. The method area holds class data and static variables, the elements that are unique for the whole program, such as Class data and static fields. A minimal code sketch of all three areas follows this list.
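To make the three areas concrete, here is a minimal Java sketch. The class and field names are made up for illustration, and the exact placement of class metadata has shifted between PermGen and Metaspace across JVM versions:

    public class MemoryAreasDemo {
        static int counter = 0;                  // static variable: stored with the class data in the method area

        int instanceField = 42;                  // lives inside each MemoryAreasDemo object, i.e. in the heap

        void run() {
            int local = 7;                                // primitive local variable: this thread's stack frame
            MemoryAreasDemo ref = new MemoryAreasDemo();  // 'ref' sits in the stack frame;
                                                          // the object it points to sits in the heap
        }

        public static void main(String[] args) {
            new MemoryAreasDemo().run();
        }
    }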

What is the "heap-stack" (堆栈)? Is it a heap or a stack?

When I was a C++ beginner I was misled by this term into thinking that a stack was some kind of heap. In fact this is an artifact of translation: the English word "stack" was rendered as "heap-stack" (堆栈), when translating it simply as "stack" would be more accurate and would keep it clearly distinct from the heap. The stack lives in RAM (random access memory), and its access speed is second only to the registers. It stores primitive variables and object references, and data in the stack can be shared. However, the size and lifetime of data on the stack must be fixed and known, which is the stack's main limitation.

So the stack is not a heap; it is a stack. The heap is where all Java objects are stored (leaving aside escape analysis and the stack allocation it may enable).

   How are the native method stack and the JVM stack divided? The JVM specification says that each Java thread has its own independent JVM stack, the call stack for Java methods. At the same time, the specification allows native code to call Java code and Java code to call native methods, and it also requires each Java thread to have its own independent native method stack. These are conceptual definitions in the specification; a concrete JVM implementation does not literally have to maintain two separate stacks per thread. Taking the Oracle JDK / OpenJDK HotSpot VM as an example, it uses a so-called "mixed stack": frames of Java methods and frames of native methods live on the same call stack, so each Java thread actually has only one call stack, which merges the JVM stack and native method stack concepts. See the structure diagram in the previous article.
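As a hedged illustration of Java calling into native code, here is a minimal sketch of a JNI-style native method declaration. The library name nativedemo and the method nativePing are hypothetical, and the C/C++ side is not shown; the point is only that, on a HotSpot-style mixed stack, the native frame and the Java frames share the same per-thread call stack:

    public class NativeDemo {
        static {
            // Hypothetical library name; this would load libnativedemo.so / nativedemo.dll
            // built against the JNI headers.
            System.loadLibrary("nativedemo");
        }

        public native void nativePing();          // implemented on the C/C++ side via JNI (not shown)

        public static void main(String[] args) {
            new NativeDemo().nativePing();        // in HotSpot, its frame and Java frames share one call stack
        }
    }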

Heap and stack at the data-structure level

Here "stack" refers to the stack we know from data structures; it has nothing to do with the heap and stack of memory allocation. A stack is a first-in, last-out structure, like plates piled up one by one: the last plate put down sits on top.

The stack data structure is relatively simple. The heap, in the data-structure sense, is an ordered tree (typically a complete binary tree satisfying the heap property).
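To keep the two meanings of "heap" apart, here is a small sketch using java.util.PriorityQueue, which the JDK implements on top of a binary heap; this data-structure heap has nothing to do with the JVM's heap memory area:

    import java.util.PriorityQueue;

    public class HeapStructureDemo {
        public static void main(String[] args) {
            // java.util.PriorityQueue is backed by a binary min-heap (the data-structure heap).
            PriorityQueue<Integer> minHeap = new PriorityQueue<>();
            minHeap.add(5);
            minHeap.add(1);
            minHeap.add(3);
            System.out.println(minHeap.poll());   // 1: the smallest element sits at the root
            System.out.println(minHeap.poll());   // 3
        }
    }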

Are the JVM's heap and stack the same as the heap and stack in C and C++? Before answering this, we have to look at how memory is allocated while a program runs. Compiler theory says a running program uses three allocation strategies:
    • Static storage allocation: the storage requirement of every data object can be determined at compile time, so each can be assigned a fixed memory location at compile time. This strategy rules out variable-size data structures (such as dynamically sized arrays) and nested or recursive structures, because those make it impossible for the compiler to compute exact storage requirements.
    • Stack storage allocation: also called dynamic storage allocation; in contrast to static allocation, the stack is temporary. Under this scheme the stack holds local and temporary variables such as primitive values and object references. Because the stored items are primitive types whose sizes the compiler already knows (an int, for example, has a fixed size and range), the system can allocate them on the stack directly, without the program having to request memory. A reference type such as a user-defined class has a size that is not known in the same way, so the program requests space for the instance and it is allocated on the heap instead. Stack allocation works just like the stack we know from data structures: space is reclaimed in first-in, last-out order.
    • Heap storage allocation: handles data whose memory requirements cannot be determined at compile time or at module entry at run time, such as variable-length strings and object instances.

So we conclude that the heap is primarily used to store objects, while the stack is primarily used to execute the program. The division follows from the characteristics of heap and stack. In C++, for example, all method calls go through the stack, and all local variables and formal parameters get their memory from the stack, rather like a conveyor belt in a factory: the stack top pointer automatically tells you where to put things, and all you have to do is put them down. When a function exits, its contents are destroyed simply by moving the stack top pointer back. This is, of course, the fastest way to run a program.

As the previous article summarized, the JVM is a stack-based virtual machine. Each JVM instance allocates a stack for every newly created thread, and all threads share the single heap; that is, a Java program runs by operating on stacks. The stack holds the thread's state in frames, and the JVM performs only two operations on it: pushing and popping frames. When a thread is executing a method, that method is called the current method and the frame it uses is called the current frame. When a thread invokes a Java method, the JVM first pushes a new frame onto that thread's Java stack, and this frame becomes the current frame. While the method executes, the frame is used to hold its parameters, local variables, intermediate results and other data. The frame is analogous to the activation record in compiler theory.
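A small sketch of frames being pushed for nested calls, using the current thread's stack trace to make the call chain visible (output details vary slightly by JDK):

    // Nested calls push one frame per method; printing the current thread's stack trace
    // inside c() shows the call chain (the JDK may also list the getStackTrace call itself).
    public class FrameDemo {
        static void a() { b(); }
        static void b() { c(); }
        static void c() {
            for (StackTraceElement e : Thread.currentThread().getStackTrace()) {
                System.out.println(e);   // prints something like: ...c, ...b, ...a, ...main
            }
        }
        public static void main(String[] args) {
            a();
        }
    }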

Okay, that was long-winded. From this allocation mechanism, the stack can be understood like this: when the OS creates a process or a thread (a thread, on operating systems that support multithreading), it sets aside a storage area for it, and this area has first-in, last-out behavior. A new data item is placed on top of the existing data, and only the topmost item can be removed; you cannot reach past it.

To repeat: every JVM instance has exactly one heap, and this single heap is shared by all threads. All class instances and arrays that the program creates at run time are placed in the heap and are shared by every thread that uses them. Items in the heap have no fixed order; you can allocate and free them in any order, and there is no notion of a "top" item.

Unlike C++, heap memory in Java is reclaimed automatically (by the JVM's garbage collector); the flip side is that, because memory is allocated dynamically at run time, the storage for every Java object lives in the heap. The object reference, however, is allocated on the stack: the heap holds the actual object, while the stack holds only a pointer-like reference variable whose value is, in effect, the address of the array or object in heap memory. In C++, heap memory has to be managed by the programmer by hand with the new and delete operators.

  What are the characteristics of stack allocation in memory management? What are its pros and cons?

The first things that come to mind are speed and the first-in, last-out behavior; and, after all the rambling above, one more conclusion: data in the stack can be shared.

  int a = 3;
  int b = 3;

  The compiler processes int a = 3 by first creating the variable a's reference on the stack and then looking for the value 3 on the stack. If it is not found, 3 is stored and a is pointed at it. When it then processes int b = 3, after creating b's reference it finds the 3 that is already there, so b is pointed directly at the same 3; now a and b both point to 3. If we then assign a = 4, the compiler searches the stack again for the value 4, stores it if it is absent and points a at it, or points a directly at the existing 4 if there already is one. Therefore changing a's value does not affect b's value. It is important to note that this kind of sharing is different from two object references pointing to the same object: with the shared literals, modifying a does not affect b, because the sharing is done by the compiler purely to save space; with object references, modifying the object's internal state through one reference does affect what the other reference sees.
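A short sketch of the difference just described, regardless of where the compiler actually keeps the literal values: reassigning a primitive only repoints that variable, while mutating a shared object is visible through every reference to it:

    public class SharingDemo {
        public static void main(String[] args) {
            int a = 3;
            int b = 3;                            // a and b hold the same value
            a = 4;                                // only a changes; b is still 3
            System.out.println(a + " " + b);      // prints: 4 3

            StringBuilder x = new StringBuilder("hi");
            StringBuilder y = x;                  // two references, one heap object
            x.append("!");                        // mutating the object through x ...
            System.out.println(y);                // ... is visible through y: prints "hi!"
        }
    }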

Advantages: it is fast and no memory management is needed. Disadvantages: the stack is small, deep chains of method calls can easily overflow it, and stack data is temporary, with a lifetime tied to the frame; it is only short-term storage.

  From the point of view of the computer's actual physical memory, where do the stack and the heap live?

    1. Are they normally controlled by the operating system (OS) or by the language runtime?

    2. What is their scope?

    3. What determines their size?

    4. Which is faster?

    5. How does the JVM stack correspond to the OS?

Before answering these questions, you must first know that memory management mechanisms differ between compilers and processor architectures. To help with understanding, let's first summarize a few principles:

  What is the principle of locality?

 The OS textbooks state the principle of locality like this: when the CPU accesses memory, whether it is fetching instructions or accessing data, the storage locations accessed tend to cluster in a smaller contiguous region.

  The way I understand it: the computer's storage hierarchy, from small and fast to large and slow, consists of registers, L1 cache, L2 cache, L3 cache, main memory, disk ... The registers are where the CPU keeps the data it is computing on. When the CPU needs data or an address, it first looks in the L1 cache; if that misses, it looks in L2; if L2 misses, it goes to L3, and so on. If the target data is finally found on disk, it is brought into memory, then into the L3 cache, L2, L1, and finally a register for the CPU to use. You can say the L1 cache is a cache for the registers, L2 is a cache for L1, L3 is a cache for L2 ... each level caches the one above it. As for locality: put simply, the CPU is extremely fast, while access to memory and especially to disk is slow (I/O bottlenecks are unavoidable). If the CPU had to wait for disk or memory for every piece of data, it would spend most of its time waiting, so we add caches. When the CPU uses a piece of data frequently, the machine speculatively pulls that data and its neighbours into the cache, because the odds are high that they will be needed; in other words, the data the CPU accesses tends to be concentrated in a small contiguous region, which is what makes caching worthwhile. So the question becomes:

  How can the computer tell whether a piece of data will be needed next?

    • Temporal locality: if a piece of data is being accessed now, it is likely to be accessed again in the near future. This makes sense: data that has just been used may well be used again.
    • Spatial locality: the information that will be used in the near future is likely to be close, in address space, to the information being used now. The data next to the address currently being used is of course likely to be used too, as with arrays and the like. A rough illustration follows this list.
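As a hedged illustration of spatial locality in Java (whether and how strongly the effect shows up depends on the machine's cache sizes, the array size N, and the JIT), traversing a 2D array row by row touches memory mostly sequentially, while traversing it column by column jumps around:

    public class LocalityDemo {
        static final int N = 4096;
        static final int[][] data = new int[N][N];

        public static void main(String[] args) {
            long t1 = System.nanoTime();
            long sumRow = 0;
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    sumRow += data[i][j];      // walks each row sequentially: cache-friendly
            long t2 = System.nanoTime();
            long sumCol = 0;
            for (int j = 0; j < N; j++)
                for (int i = 0; i < N; i++)
                    sumCol += data[i][j];      // jumps between rows on every access: many more cache misses
            long t3 = System.nanoTime();
            System.out.println("sums: " + sumRow + ", " + sumCol);
            System.out.println("row-major:    " + (t2 - t1) / 1_000_000 + " ms");
            System.out.println("column-major: " + (t3 - t2) / 1_000_000 + " ms");
        }
    }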

Right, back to the earlier questions. The first conclusion: the stack and the heap are both ways of obtaining memory from the underlying operating system. In a multithreaded environment, each thread gets its own completely independent stack, while all threads share the heap. Concurrent access therefore has to be controlled on the heap, not on the stack.

    • Heap: maintained as a list of used and free memory blocks.

  

A new allocation on the heap (with new or malloc) searches the free blocks for one large enough to satisfy the request, and the block list is updated accordingly. This metadata is itself stored on the heap, often in a small header at the front of each block. New heap blocks are usually added from low addresses toward high addresses, so you can picture the heap growing upward as memory is allocated. If the requested size is small, the allocator usually obtains more memory than that from the underlying operating system and hands out pieces of it. Allocating and freeing many small blocks can leave the heap in a state where lots of small wasted free blocks sit between used blocks. A request for a large block may then fail even though the total free space is sufficient, because no single free block is big enough; this is called memory fragmentation. When a used block adjacent to a free block is released, the new free block can be merged with its free neighbours into one larger free block, which effectively reduces fragmentation.
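To make the block-list bookkeeping concrete, here is a toy first-fit free list in Java. It is only a sketch of the idea described above, not how any real allocator or the HotSpot heap actually works:

    import java.util.ArrayList;
    import java.util.List;

    public class ToyHeap {
        static class Block {
            int start, size;
            boolean free = true;
            Block(int start, int size) { this.start = start; this.size = size; }
        }

        final List<Block> blocks = new ArrayList<>();

        ToyHeap(int capacity) { blocks.add(new Block(0, capacity)); }

        // First-fit: return the start address of an allocated block, or -1 if no free block is large enough.
        int malloc(int size) {
            for (int i = 0; i < blocks.size(); i++) {
                Block b = blocks.get(i);
                if (b.free && b.size >= size) {
                    if (b.size > size) {               // split off the remainder as a new free block
                        blocks.add(i + 1, new Block(b.start + size, b.size - size));
                        b.size = size;
                    }
                    b.free = false;
                    return b.start;
                }
            }
            return -1;   // can happen even when total free space is sufficient: fragmentation
        }

        void free(int start) {
            for (int i = 0; i < blocks.size(); i++) {
                Block b = blocks.get(i);
                if (b.start == start && !b.free) {
                    b.free = true;
                    // merge with the next block if it is also free, reducing fragmentation
                    if (i + 1 < blocks.size() && blocks.get(i + 1).free) {
                        b.size += blocks.get(i + 1).size;
                        blocks.remove(i + 1);
                    }
                    // merge with the previous block as well
                    if (i > 0 && blocks.get(i - 1).free) {
                        blocks.get(i - 1).size += b.size;
                        blocks.remove(i);
                    }
                    return;
                }
            }
        }

        public static void main(String[] args) {
            ToyHeap h = new ToyHeap(90);
            int a = h.malloc(30);
            h.malloc(30);                       // the middle block stays allocated
            int c = h.malloc(30);
            h.free(a);
            h.free(c);
            // 60 bytes are free in total, but split into two 30-byte holes around the middle block,
            // so a 40-byte request still fails: this is memory fragmentation.
            System.out.println(h.malloc(40));   // prints -1
        }
    }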

Heap management depends on the runtime environment: C uses malloc and free, C++ uses new and delete, and many languages have a garbage collector instead, such as Java's GC.

    • Stack: the stack usually works together with the SP register. Initially, SP points to the top of the stack (the highest address of the stack region). The stack grows downward!

The CPU uses the push instruction to put data on the stack and the pop instruction to take it off. A push decreases the value of SP (extending toward lower addresses); a pop increases it. The values stored and retrieved are CPU register values.

      • When a function is called, the CPU uses a specific instruction to push the current IP onto the stack, then loads the address of the called function into IP so that execution jumps into it. When the function returns, the old IP is popped and the CPU continues with the code after the call.
      • On entering a function, SP is moved down to reserve enough space for the function's local variables. If the function has a 32-bit local variable, four bytes are reserved on the stack for it. When the function returns, SP is moved back to its original position, releasing the space.
        • If the function has parameters, the arguments are pushed onto the stack before the call. The code inside the function locates and accesses them relative to the current SP.
        • With nested calls, each newly called function pushes its parameters, return address, local variable space, and the activation record of the nested call onto the stack. When each function returns, its portion is unwound in the correct order.

The stack is limited to a fixed block of memory, so long chains of nested calls or allocating too much space for local variables can cause a stack overflow. When the stack's memory region has been used up and writes continue past it (toward lower addresses), a CPU exception is triggered, which the language runtime then translates into some kind of stack overflow exception. In general the stack sits at a lower level, tightly integrated with the processor architecture; the heap can be extended when it runs short, but extending the stack is usually not possible, since by the time the stack overflows the operating system has already shut down the executing thread, and it is too late.
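A minimal demonstration of such a stack overflow in Java: unbounded recursion keeps pushing frames until the thread's stack is exhausted and the JVM throws StackOverflowError. The per-thread stack size can be tuned with HotSpot's -Xss option, which changes the depth reached:

    public class StackOverflowDemo {
        static int depth = 0;

        static void recurse() {
            depth++;
            recurse();                  // no base case: every call pushes one more frame
        }

        public static void main(String[] args) {
            try {
                recurse();
            } catch (StackOverflowError e) {
                System.out.println("Stack overflowed at depth " + depth);
            }
        }
    }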

Now let's answer the questions raised above:

  Are they normally controlled by the operating system (OS) or by the language runtime?

As mentioned earlier, "heap" and "stack" are collective terms and can be implemented in many ways. A program usually has a call stack that stores information about the current function call (for example, the return address into the caller and the local variables), because a function call has to return to its caller; the stack grows and shrinks to hold this information. In practice this is not controlled by the runtime alone; it is determined by the programming language, the operating system, and even the hardware architecture. The heap, by contrast, is a general term for memory that is allocated dynamically and in no particular order. The memory is usually obtained from the operating system, with the application calling API functions to perform the allocation. Managing dynamically allocated memory involves some extra overhead, which is handled by the allocator and the operating system.

  What is their scope?

The call stack is a low-level concept and has little to do with "scope" in the programming-language sense. High-level languages impose their own scope rules; once a function returns, its local variables are released. For the heap, scope is also hard to pin down. It is bounded by the operating system, but the programming language may add its own rules about how heap memory is visible inside the application. The architecture and the operating system use virtual addresses, which the processor translates into physical addresses, with page faults and the like along the way; they keep track of which pages belong to which application. You do not really have to worry about any of that, because in a programming language you only allocate and free memory, plus some error checking (the reasons an allocation or a free might fail).

  What determines their size?

That depends on the language, the compiler, the operating system and the architecture. The stack is usually allocated in advance, because it must be a contiguous block of memory; the language's compiler or the operating system determines its size. Do not store large chunks of data on the stack, so that it has enough room and will not overflow unless there is unbounded recursion or something similar. The heap is a general term for anything that can be dynamically allocated, and its size is variable. Modern processors and operating systems make it work at a high level of abstraction, so under normal circumstances you do not need to worry about its actual size, unless you touch memory that you have not allocated or have already freed.

  Which is faster?

The stack is faster, because all of its free memory is contiguous, so it never needs to maintain a list of free blocks; a single pointer to the current stack top is enough, and it is typically kept in a dedicated, fast register (SP). More importantly, successive operations on the stack tend to follow the principle of locality.

  How does the JVM stack correspond to the OS?

Take the virtual memory layout of a process on Linux as an example:

In the figure, address 0 is at the bottom and addresses grow upward. Taking a 32-bit operating system as an example, a process has a virtual address range of 0 to 2^32. It is split into two parts: one for the kernel (kernel virtual memory) and one for the process itself (the process virtual memory in the diagram). An ordinary Java program lives in the process virtual memory. The topmost part of that memory is the user stack; this is the OS-level stack, and on 32-bit x86 the stack-top pointer register is ESP. The run-time heap sits in the middle. Note that these are not the heap and stack of data structures. As summarized earlier, the stack grows downward and the heap grows upward. When the program makes function calls, each function gets a call frame on the stack.

Summary

Having summed up all of the above, one last question remains: why do method calls need a stack? Strictly speaking, method calls do not have to be implemented with a stack, but they are designed to be. We know that each method's activation record (its local or automatic variables) is allocated on the stack, which not only stores those variables but also keeps track of nested method calls. Observing what a method call involves, we see the following steps:

1. Evaluate the arguments and pass the parameters
2. Save the method's return address
3. Transfer control to the callee
4. Save whatever caller state needs to be preserved

The order of these steps can vary, and in theory none of them has to be implemented with a stack. If we had plenty of registers we could in principle abandon the stack entirely, but in practice we do not, so from a practical standpoint the stack is the natural implementation. Put simply, the lifetimes of the local data of called methods follow a first-in, last-out (FILO) order, and the basic operations of a stack support exactly this access pattern, which the heap cannot do.
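As a hedged sketch of how this looks inside the JVM, here is a tiny method together with, in comments, roughly the bytecode that javap -c would show for it (exact output depends on the compiler): arguments and intermediate results are pushed onto and popped off the frame's operand stack, and invokestatic pushes a whole new frame that is popped again on return:

    public class CallDemo {
        static int add(int a, int b) {
            return a + b;
            // Roughly, javap -c shows for this method:
            //   iload_0      // push argument a onto the frame's operand stack
            //   iload_1      // push argument b
            //   iadd         // pop both, push the sum
            //   ireturn      // pop the sum and return it to the caller's frame
        }

        public static void main(String[] args) {
            int r = add(1, 2);
            // Roughly:
            //   iconst_1                  // push the constant 1
            //   iconst_2                  // push the constant 2
            //   invokestatic add:(II)I    // push a new frame for add(); it is popped on return
            //   istore_1                  // store the returned value into local variable r
            System.out.println(r);
        }
    }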

This article draws on: Thinking in Java, Modern Operating Systems, Computer Systems: A Programmer's Perspective, Modern Compiler Implementation, Understanding the Java Virtual Machine in Depth, the Java Virtual Machine Specification (Java SE 7), Zhihu, Stack Overflow ...
