Processes, threads, and the memory structure of Java objects in Java (repost)

Last Update:2014-08-29 Source: Internet

Author: User



Original address: http://rainforc.iteye.com/blog/2039501

1. Three ways to implement threads

Using kernel threads

A kernel thread (kernel-level thread, KLT) is a thread supported directly by the operating system kernel. The kernel performs thread switching, schedules threads through its scheduler, and maps each thread's work onto the processors. Programs generally do not use kernel threads directly. Instead they use a higher-level interface to them, the lightweight process (LWP), which is what we normally mean by "thread". Each lightweight process is backed by one kernel thread, so a lightweight process can exist only if a kernel thread supports it. This 1:1 relationship between lightweight processes and kernel threads is called the one-to-one threading model. Lightweight processes consume kernel resources (such as the stack space of the kernel thread), and operations on them require relatively expensive system calls, so the number of lightweight processes a system can support is limited.

Using user threads

Broadly speaking, any thread that is not a kernel thread can be considered a user thread (UT). In the narrow sense, a user thread is one implemented by a thread library built entirely in user space, whose existence the kernel cannot perceive. Creation, synchronization, destruction, and scheduling of user threads are done entirely in user mode without help from the kernel. If implemented properly, such threads never need to switch into kernel mode, so operations on them can be very fast and cheap, and a much larger number of threads can be supported. Multithreading in some high-performance databases is implemented with user threads. This 1:N relationship between a process and its user threads is called the one-to-many threading model. The advantage of user threads is that they need no kernel support. The disadvantage follows from the same fact: without kernel support, every thread operation must be handled by the user program itself, so programs built on user threads are generally more complex, and fewer and fewer programs use them. Languages such as Java and Ruby once used user threads and eventually abandoned them.

Hybrid implementation

In the hybrid model, user threads and lightweight processes coexist. User threads are still built entirely in user space, while the lightweight processes supported by the operating system act as a bridge between user threads and kernel threads. The ratio of user threads to lightweight processes is variable, an M:N relationship. Many Unix-family systems provide an M:N threading model.

2. The scheduling relationship between Java threads and the operating system

Early versions of Java implemented their own thread library, so a Java thread did not correspond to an operating system thread. To the operating system the JVM was a single process; when that process was scheduled, the JVM's thread library in turn scheduled the Java threads. This was the original many-to-one relationship (the user thread implementation). Why was it done this way? Operating system kernels of that period, Linux among them, did not directly support threads, and the same held for Solaris, so user threads mapped many-to-one onto kernel threads. From Java's point of view, if the operating system could not properly support threads, the JVM had to implement them itself. In that era Java's multithreaded scheduling was completely autonomous: the operating system did not know that Java was multithreaded, and the scheduling policy was implemented entirely by the JVM. On a single CPU the threads were necessarily time-sliced, and on a multi-CPU machine one would run multiple JVM instances to make use of the processors.

Later, operating system kernels came to support multithreading (Windows supported it all along), and it made sense for Java to hand over some of this responsibility. Java threads could then correspond to operating system threads either one-to-one (scheduled by the operating system, the kernel thread implementation) or many-to-many (scheduled jointly by the operating system and the JVM, the hybrid implementation). With a one-to-one mapping, thread scheduling is left entirely to the operating system kernel. Linux has supported NPTL (Native POSIX Thread Library) since kernel 2.6, although its threads are still essentially lightweight processes. Java threads are managed by the JVM, and how they map to operating system threads is determined by the JVM implementation. HotSpot on Linux 2.6 uses NPTL, so each JVM thread corresponds one-to-one to a kernel lightweight process (LWP). Scheduling is handed entirely to the kernel, although the JVM retains some policies that influence its internal thread scheduling. On Linux, for example, each call to `Thread.start()` ends in a `clone` system call that creates the thread.

On both Windows and Linux, Java threads are now implemented as kernel threads. Threads implemented this way are supported directly by the operating system kernel: the kernel performs thread switching, schedules threads through its thread scheduler, and maps the threads' work onto the processors. Each kernel thread can be regarded as a clone of the kernel. Programs do not use kernel threads directly but their higher-level interface, the lightweight process (LWP), which is what is normally called a thread.

In this scheduling relationship, each kernel-level thread (KLT) backs one lightweight process (LWP, that is, one thread) of a process P. Running a thread involves switches between user mode and kernel mode, and the thread scheduler assigns the kernel threads to the CPUs. Java now uses preemptive thread scheduling. Each thread is given its execution time by the system, and thread switching is not decided by the thread itself (in Java, `Thread.yield()` can give up execution time, but a thread has no way to claim execution time for itself). With this scheduling method the execution time of a thread is under the system's control, and one blocked thread does not block the whole process. Preemptive scheduling is also related to thread priority, but thread priority is not very reliable: Java threads are mapped to native threads, so scheduling is ultimately up to the operating system. Many operating systems do provide a notion of thread priority, but it does not necessarily correspond one-to-one to Java's thread priorities. Do not rely too heavily on priority.

3. Java process and operating system process

In the JDK code, only the `ProcessImpl` class implements the abstract `Process` class. It references the native `create`, `close`, `waitFor`, `destroy`, and `exitValue` methods. In Java, a native method is one whose implementation depends on the operating system platform and is written in a lower-level language such as C or C++. The corresponding native functions can be found in the JVM's source code and analyzed there. The JVM's implementation of processes is fairly direct. Taking the JVM on Windows as an example, the JVM passes the arguments given to the Java method on to the corresponding operating system function, which does the actual work, so each of these Java methods maps onto an operating system call.

Every process created in the JVM corresponds to one process in the operating system. However, to make processes more convenient to use, Java hides some platform-specific information from the user, and this causes some inconvenience when that information is needed.

When you create a system process from C or C++, you obtain the PID of the process and can then operate on the process directly through that PID. In Java, however, the user can only operate on the process through the reference to the `Process` instance (a `Process.pid()` method was only added later, in Java 9). Once that reference is lost or unreachable, there is no way to learn anything about the process.
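
A minimal sketch of driving a child process purely through its `Process` reference, as described above. The class name `ProcessHandleDemo`, the helper names, and the 30-second timeout are illustrative assumptions; the child command is simply the `java` launcher of the running JVM, chosen so the example does not depend on a particular operating system.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

class ProcessHandleDemo {

    // Hypothetical helper: path of the `java` launcher of the running JVM,
    // used here only as a portable child command.
    static String javaLauncher() {
        return System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
    }

    // Starts a child process and controls it only through the Process reference.
    static int runAndWait(String... command) {
        try {
            Process child = new ProcessBuilder(command)
                    .inheritIO()   // child shares our console, so no pipe buffer can fill up
                    .start();
            if (!child.waitFor(30, TimeUnit.SECONDS)) {   // assumed timeout
                child.destroy();                          // give up on a child that hangs
                throw new IllegalStateException("child process did not finish in time");
            }
            return child.exitValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        }
    }

    public static void main(String[] args) {
        int exitCode = runAndWait(javaLauncher(), "-version");
        System.out.println("child exit code: " + exitCode);
    }
}
```

Everything the caller can do (wait, read the exit value, destroy) goes through the `child` variable; if that reference were dropped, the program would have no other handle on the operating system process.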

There are also some things to watch out for when using Java processes:

The input and output pipes that Java provides have very limited capacity. If the output is not read in time, the child process may hang or even deadlock.
When creating a process to run Windows system commands such as dir or copy, you must run the Windows command interpreter (command.com or cmd.exe, depending on the Windows version) and have it execute the command.
The shell pipe operator '|' and the redirection operator '>' found on each platform cannot be passed as command arguments. They have to be handled in Java code, for example by defining a new stream to hold the standard output.
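
The first and third points can be sketched as follows. This is an illustration under assumptions, not a definitive recipe: the class name `ProcessOutputDemo` and its helpers are hypothetical, and the child command is again the running JVM's own `java` launcher so that the example stays platform neutral. `runAndCapture` drains the child's output continuously so the pipe cannot fill up, and `runToFile` performs the equivalent of `command > file` in Java code rather than as a command argument.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ProcessOutputDemo {

    // Hypothetical helper: the `java` launcher of the running JVM.
    static String javaLauncher() {
        return System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
    }

    // Drains the merged stdout/stderr of the child line by line until it closes the stream.
    static String runAndCapture(String... command) {
        try {
            Process child = new ProcessBuilder(command)
                    .redirectErrorStream(true)   // one stream to drain instead of two
                    .start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            child.waitFor();
            return output.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        }
    }

    // Equivalent of `command > target`: the redirection is configured on the builder.
    static int runToFile(File target, String... command) {
        try {
            Process child = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(target)
                    .start();
            return child.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(runAndCapture(javaLauncher(), "-version"));
        File target = new File(System.getProperty("java.io.tmpdir"), "java-version.txt");
        System.out.println("exit code: " + runToFile(target, javaLauncher(), "-version"));
    }
}
```

Merging the error stream into standard output is a design choice made here to keep a single reader; a program that needs the two streams separately would have to drain each of them, typically on separate threads.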

In summary, Java wraps the operating system process and in doing so hides information about it. Take care when using the process-creation facilities of Java to run local commands.

In general, a process is used to perform a task, and modern operating systems allocate the computing resources for that task with the thread as the scheduling unit. (Early Unix-like systems did not support threads, so the process was also the scheduling unit, but it was a relatively lightweight process; this is not discussed further here.) When a process is created, the operating system actually creates a thread to run its instructions. In particular, when a task is large and complex, several threads may be needed to carry it out concurrently, and threads become all the more important. It is therefore necessary to understand threads in Java in order to avoid problems.

`java.lang.Process` and its related classes let the JVM create processes by calling the APIs of the underlying operating system platform; `ProcessBuilder` was added to them in JDK 1.5.

4. Implementation of Java Threads

Conceptually, creating a Java thread corresponds to creating a native thread, one to one. The catch is that a native thread is supposed to execute native code, whereas the body of a Java thread is a Java method compiled to bytecode. One can therefore expect that all Java threads share a single uniform thread function, and that this function invokes the Java thread's method through the Java virtual machine. This is done through Java native method calls.

Take the `Thread#start` method as an example:

    public synchronized void start() {
        ...
        start0();
        ...
    }

It actually calls the native method `start0`, which is declared as follows:

    private native void start0();

The `Thread` class has a native method `registerNatives`, whose main job is to register the native methods used by the `Thread` class, such as `start0()` and `stop0()`. In other words, every native method that operates on the native thread is registered by it. The call sits in a static initializer block, which means it runs when the class is loaded into the JVM and registers the native methods at that point.

    private static native void registerNatives();
    static {
        registerNatives();
    }

The native method `registerNatives` is defined in Thread.c. Thread.c is a very small file that defines the thread-related data and operations common to all operating system platforms, as shown in Listing 2.

Listing 2

    /* THD, OBJ and STE are string macros defined earlier in Thread.c; they expand to the JNI
       type descriptors of java.lang.Thread, java.lang.Object and java.lang.StackTraceElement. */
    static JNINativeMethod methods[] = {
        {"start0",           "()V",                  (void *)&JVM_StartThread},
        {"stop0",            "(" OBJ ")V",           (void *)&JVM_StopThread},
        {"isAlive",          "()Z",                  (void *)&JVM_IsThreadAlive},
        {"suspend0",         "()V",                  (void *)&JVM_SuspendThread},
        {"resume0",          "()V",                  (void *)&JVM_ResumeThread},
        {"setPriority0",     "(I)V",                 (void *)&JVM_SetThreadPriority},
        {"yield",            "()V",                  (void *)&JVM_Yield},
        {"sleep",            "(J)V",                 (void *)&JVM_Sleep},
        {"currentThread",    "()" THD,               (void *)&JVM_CurrentThread},
        {"countStackFrames", "()I",                  (void *)&JVM_CountStackFrames},
        {"interrupt0",       "()V",                  (void *)&JVM_Interrupt},
        {"isInterrupted",    "(Z)Z",                 (void *)&JVM_IsInterrupted},
        {"holdsLock",        "(" OBJ ")Z",           (void *)&JVM_HoldsLock},
        {"getThreads",       "()[" THD,              (void *)&JVM_GetAllThreads},
        {"dumpThreads",      "([" THD ")[[" STE,     (void *)&JVM_DumpThreads},
    };

    /* Called from the static initializer of java.lang.Thread. */
    JNIEXPORT void JNICALL
    Java_java_lang_Thread_registerNatives(JNIEnv *env, jclass cls)
    {
        (*env)->RegisterNatives(env, cls, methods, ARRAY_LENGTH(methods));
    }

This makes it easy to see that when a Java thread calls `start`, the call actually ends up in `JVM_StartThread`. What is the logic of that function? What we expect (that is, the behavior Java exhibits) is that it eventually calls the Java thread's `run` method, and that is indeed the case. jvm.cpp contains the following snippet:

    JVM_ENTRY(void, JVM_StartThread(JNIEnv* env, jobject jthread))
        ...
        native_thread = new JavaThread(&thread_entry, sz);
        ...

Here `JVM_ENTRY` is a macro used to define the `JVM_StartThread` function. Inside the function a real, platform-specific native thread is created, and its thread function is `thread_entry`, shown in Listing 3.

Listing 3

    static void thread_entry(JavaThread* thread, TRAPS) {
        HandleMark hm(THREAD);
        Handle obj(THREAD, thread->threadObj());
        JavaValue result(T_VOID);
        JavaCalls::call_virtual(&result, obj,
            KlassHandle(THREAD, SystemDictionary::Thread_klass()),
            vmSymbolHandles::run_method_name(),
            vmSymbolHandles::void_method_signature(),
            THREAD);
    }

You can see that `vmSymbolHandles::run_method_name` is called. It is defined with a macro in vmSymbols.hpp:

    class vmSymbolHandles: AllStatic {
        ...
        template(run_method_name, "run")
        ...
    }

How exactly `run_method_name` is defined involves fairly tedious code details, so it is not covered here. Interested readers can consult the JVM source code.

To recap: the Java thread's `start` method first creates a real, platform-specific native thread (by calling `JVM_StartThread`). The thread function of that native thread is `thread_entry`, defined in jvm.cpp, which in turn calls the `run` method. The `run` method of a Java thread is therefore no different in essence from an ordinary method. Calling `run` directly is not an error, but it executes on the current thread and no new thread is created.
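
The difference between `start` and a direct call to `run` can be observed with a short experiment. This is a minimal sketch; the class name `StartVersusRunDemo`, the helper `executingThreadName`, and the thread name `worker-1` are made up for illustration.

```java
class StartVersusRunDemo {

    // Returns the name of the thread that actually executed the task body.
    static String executingThreadName(boolean useStart) {
        final String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = Thread.currentThread().getName(), "worker-1");
        if (useStart) {
            worker.start();          // the JVM creates a native thread, which then calls run()
            try {
                worker.join();       // wait so that the result is visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining worker", e);
            }
        } else {
            worker.run();            // ordinary method call on the caller's own thread
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("start(): body ran on " + executingThreadName(true));
        System.out.println("run():   body ran on " + executingThreadName(false));
    }
}
```

With `start()` the body reports `worker-1`; with `run()` it reports the name of whichever thread made the call, because no new thread was ever created.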
From the above we know that Java threads are built on the native threads of the system and add another layer of encapsulation. As a result, the interface the `Thread` class offers to Java developers has the following limitations:

Thread return value
`Thread` offers no way to get the exit value of a thread. A native thread can in fact have an exit value, which the operating system typically stores in the thread control block (TCB); the caller can inspect it to tell whether the thread exited normally or terminated abnormally.

Synchronization of threads
Java provides `Thread#join()` to wait for a thread to end, which is usually sufficient. What it cannot express is waiting on several threads at once (for example, until any one of them ends, or until all of them end). Looping over the threads and calling `join` on each is not a workable substitute and can lead to very strange synchronization problems.

The ID of the thread
`Thread#getId()` returns a simple counter value that has nothing to do with the operating system's thread ID.

Thread run-time statistics
The `Thread` class offers no way to measure how long a piece of code in a thread has run. You can time it yourself (record the start and end times and subtract), but because threads are scheduled in and out, this does not give the CPU time the thread actually used, so the result is bound to be inaccurate.
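
For the first two limitations, the `java.util.concurrent` package (part of the JDK since Java 5) is commonly used as a workaround on top of plain threads. The sketch below is one possible approach, not something the original article prescribes: `Future.get()` delivers a task's return value, `invokeAll` waits for all tasks, and `invokeAny` waits for any one of them. The class name `ThreadResultDemo` and its helpers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadResultDemo {

    // Three trivial tasks, each returning a value (illustrative data only).
    static List<Callable<Integer>> sampleTasks() {
        List<Callable<Integer>> tasks = new ArrayList<>();
        tasks.add(() -> 1);
        tasks.add(() -> 2);
        tasks.add(() -> 3);
        return tasks;
    }

    // Waits until ALL tasks have finished and sums their return values.
    // Assumes a non-empty task list (the pool size must be positive).
    static int sumOfAll(List<Callable<Integer>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            int sum = 0;
            for (Future<Integer> future : pool.invokeAll(tasks)) {  // blocks until every task is done
                sum += future.get();   // the task's return value, or ExecutionException if it threw
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task terminated abnormally", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Waits until ANY one task has finished successfully and returns its value.
    static int firstOf(List<Callable<Integer>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            return pool.invokeAny(tasks);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("no task completed successfully", e.getCause());
        } finally {
            pool.shutdownNow();   // cancel whatever is still running
        }
    }

    public static void main(String[] args) {
        System.out.println("sum of all results: " + sumOfAll(sampleTasks()));
        System.out.println("first result:       " + firstOf(sampleTasks()));
    }
}
```

An `ExecutionException` from `Future.get()` also plays the role of the "terminated abnormally" signal that a raw thread exit value would otherwise carry.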
This analysis of Java processes and threads shows that Java encapsulates both of these operating system "resources", so that developers can concentrate on how to use them without worrying much about the details. The encapsulation reduces the complexity of the developer's work and improves productivity. On the other hand, because it hides some features of the operating system itself, Java processes and threads come with certain limitations, which is an unavoidable consequence of encapsulation. The evolution of a language is a process of deciding what can be left out, and as Java develops, the subset of functionality it encapsulates can be expected to become more and more complete.
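
One of the limitations listed above, per-thread CPU time, can be partly worked around outside the `Thread` class: the JDK's `java.lang.management` API reports CPU time for a thread on platforms where the JVM supports it. The following is a hedged sketch; the class name `ThreadCpuTimeDemo` and the helper `cpuNanos` are illustrative, and the helper returns -1 when the measurement is unavailable.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class ThreadCpuTimeDemo {

    // CPU nanoseconds the current thread spent running the task, or -1 if the JVM cannot report it.
    static long cpuNanos(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return -1L;
        }
        long before = threads.getCurrentThreadCpuTime();
        task.run();
        long after = threads.getCurrentThreadCpuTime();
        return after - before;
    }

    public static void main(String[] args) {
        long wallStart = System.nanoTime();
        long cpu = cpuNanos(() -> {
            try {
                Thread.sleep(200);   // wall-clock time passes, but the thread is off the CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        long wall = System.nanoTime() - wallStart;
        System.out.println("wall-clock ns: " + wall);
        System.out.println("CPU ns:        " + cpu);
    }
}
```

A sleeping thread accumulates wall-clock time but very little CPU time, which is exactly the gap that simple start/end timing cannot see.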

5. Memory layout of a machine running a Java application

When a machine runs a Java application, its memory is logically divided into the Java heap and the native (non-Java) heap. In the layout on a 32-bit machine, the operating system and the C runtime use approximately 1 GB of the address space, the Java heap uses 2 GB, and the JVM's native heap uses 1 GB.

When we create a Java object with `new`, it takes up more memory than we might expect, because in addition to the object's own data the JVM attaches metadata that describes the object. This metadata consists of three parts:

Class: a pointer to the class information, which describes the type of the object.
Flags: flags that describe the state of the object, including its hash code and whether it is an array.
Lock: the synchronization information of the object, indicating whether it is currently locked.

(Figure: memory layout of an Integer object on a 32-bit machine.)

The metadata of an array object contains one more field, size, which holds the length of the array.

(Figure: memory layout of an array object.)

More complex objects hold references to other objects. The String object is an example.

(Figure: memory layout of a String object.)

A String object containing 8 characters (16 bytes of character data) needs 224 bits to describe the String object itself and 256 bits for the character array it refers to, 480 bits or 60 bytes in total.
On 64-bit machines the memory structure of objects is the same, but they take up more space.
(Table: sizes of the fields of the different data types on 32-bit and 64-bit machines.)
With compressed ordinary object pointers (compressed oops), pointer fields on a 64-bit machine can be compressed to 32 bits, which shortens the object header to 12 bytes.
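
The 60-byte figure for the 8-character String on a 32-bit machine can be checked with a little arithmetic. The breakdown below is an assumption made for illustration: three 32-bit header words (class, flags, lock) per object as described above, a String layout with `hash`, `count`, and `offset` fields plus a reference to the character array, and one extra size word in the array header. The class name `ObjectSizeEstimate` is hypothetical.

```java
class ObjectSizeEstimate {

    static final int WORD_BITS = 32;   // header word, int field, or reference on a 32-bit JVM (assumed)
    static final int CHAR_BITS = 16;   // one Java char

    // String object itself: 3 header words + hash, count, offset + reference to the char[] (assumed layout).
    static int stringObjectBits() {
        return (3 + 3 + 1) * WORD_BITS;
    }

    // Backing char[]: 3 header words + the array size word + the character data.
    static int charArrayBits(int length) {
        return (3 + 1) * WORD_BITS + length * CHAR_BITS;
    }

    // Total footprint in bits of a String with the given number of characters.
    static int totalBits(int length) {
        return stringObjectBits() + charArrayBits(length);
    }

    public static void main(String[] args) {
        int bits = totalBits(8);
        // prints: 224 + 256 = 480 bits = 60 bytes
        System.out.println(stringObjectBits() + " + " + charArrayBits(8)
                + " = " + bits + " bits = " + (bits / 8) + " bytes");
    }
}
```

Only 16 of the 60 bytes are character data; the rest is object metadata and bookkeeping fields, which is why small strings are proportionally expensive.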


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
