Today I want to discuss the other side of Java, the side we rarely notice or use day to day. More precisely, this article is about low-level bindings, native code, and how to pull off a little magic. We will not explore how any of this is achieved inside the JVM, but we will demonstrate a few of its wonders.
I work on the RebelLabs team at ZeroTurnaround, where I mostly do research, writing, and programming. The company builds tools for Java developers, most of which run as Java plug-ins (javaagents). That is a common pattern: if you want to enhance the JVM or improve its performance without rewriting the JVM itself, you have to study the magical world of Java plug-ins thoroughly. There are two types of plug-in: Java javaagents and native javaagents. This article mainly discusses the latter.
Anton Arhipov, the XRebel product lead, gave a talk called "Having fun with Javassist" at the GeeCON conference in Prague. That talk can serve as a starting point for developing plug-ins written in Java.
In this article, we will create a small native JVM plug-in to explore how native methods can be provided to a Java application and how to use the Java Virtual Machine Tool Interface (JVMTI).
If you want one takeaway from this article, here is the spoiler: we will be able to count how many instances of a given class are on the heap.
Suppose you are Santa Claus's trusted hacker genie, and Santa has a challenge for you:
Santa: My dear hacker genie, can you write a program that figures out how many Thread instances are on the current JVM heap?
A genie who would rather not think too hard might reply that this is trivial:
    return Thread.getAllStackTraces().size();
But what if we generalize the problem to any given class, not just Thread? How would we redesign our solution? Say we have to implement the following interface:
    public interface HeapInsight {
        int countInstances(Class klass);
    }
This is impossible, right? What if String.class is passed in as the argument? Don't be afraid. We only need to go a little deeper into the JVM. Developers of JVM libraries can use JVMTI, the Java Virtual Machine Tool Interface. JVMTI has been part of Java for many years, and many interesting tools are built on it. JVMTI provides two kinds of interface:
A native API.
An instrumentation API, used to monitor and transform the bytecode of classes loaded into the JVM.
In our example, we will use the native API, and specifically the IterateThroughHeap function. It accepts a custom callback and invokes that callback for each instance of a given class.
First, let's create a native plug-in that just loads and prints something, to make sure our setup works.
A native plug-in is implemented in C/C++ and compiled into a dynamic library. It is loaded before we even start thinking about Java. If you are not familiar with C++, it doesn't matter. Many genies are not, and it is not that hard. When I write C++, I follow two main strategies: programming by coincidence and avoiding segfaults. So if I managed to write and explain the code in this article, we can work through it together.
The first native plug-in is shown below:

    #include <jvmti.h>
    #include <iostream>

    using namespace std;

    // Called by the JVM when the plug-in library is loaded.
    JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *jvm, char *options, void *reserved) {
        cout << "A message from my SuperAgent!" << endl;
        return JNI_OK;
    }

The most important part is that we declare a function named Agent_OnLoad, as the documentation for dynamically linked plug-ins requires.
Save the file as native-agent.cpp and let's compile it into a dynamic library.
I use OS X, so I can compile with clang. To save you some googling, here is the complete command:
    clang -shared -undefined dynamic_lookup -o agent.so -I /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/include/ -I /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/include/darwin native-agent.cpp
This generates an agent.so file, which is the dynamic library we will use. To test it, we create a hello world class.

    package org.shelajev;

    public class Main {
        public static void main(String[] args) {
            System.out.println("Hello World!");
        }
    }

When you run it, use the -agentpath option and make sure it points to the agent.so file. You should see the following output:

    java -agentpath:agent.so org.shelajev.Main
    A message from my SuperAgent!
    Hello World!

Well done! Now we are ready to make this plug-in do real work. First, we need a jvmtiEnv instance. It can be obtained through the JavaVM *jvm argument while Agent_OnLoad is executing, but it is not available afterwards, so we must store it somewhere globally accessible. We declare a global struct to hold it.

    #include <jvmti.h>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>

    using namespace std;

    typedef struct {
        jvmtiEnv *jvmti;
    } GlobalAgentData;

    static GlobalAgentData *gdata;

    JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *jvm, char *options, void *reserved) {
        jvmtiEnv *jvmti = NULL;
        jvmtiCapabilities capa;
        jvmtiError error;

        // Put a jvmtiEnv instance at jvmti.
        jint result = jvm->GetEnv((void **) &jvmti, JVMTI_VERSION_1_1);
        if (result != JNI_OK) {
            printf("ERROR: Unable to access JVMTI!\n");
            return JNI_ERR;  // nothing below can work without a jvmtiEnv
        }

        // Add a capability to tag objects.
        (void) memset(&capa, 0, sizeof(jvmtiCapabilities));
        capa.can_tag_objects = 1;
        error = jvmti->AddCapabilities(&capa);
        if (error != JVMTI_ERROR_NONE) {
            printf("ERROR: Unable to add the tagging capability!\n");
            return JNI_ERR;
        }

        // Store jvmti in global data.
        gdata = (GlobalAgentData *) malloc(sizeof(GlobalAgentData));
        gdata->jvmti = jvmti;
        return JNI_OK;
    }

We also updated the code to give the jvmti instance the capability to tag objects (a tag is a value attached to an object; see the JVMTI documentation), because this is required when traversing the heap. Everything is now in place: the JVMTI instance is initialized, and we will use JNI to make it available to Java code.
JNI stands for Java Native Interface, the standard way to call native code from a Java application. The Java side is simple and straightforward. Add the countInstances method declaration to the Main class, as shown below:

    package org.shelajev;

    public class Main {
        public static void main(String[] args) {
            System.out.println("Hello World!");
            int a = countInstances(Thread.class);
            System.out.println("There are " + a + " instances of " + Thread.class);
        }

        // Implemented in the native plug-in.
        private static native int countInstances(Class klass);
    }

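A native declaration on its own compiles fine, but calling it fails at run time until some loaded library exports a matching function. The following is a minimal sketch of how the Java side might guard against that; the class name NativeGuardDemo and the -1 fallback value are illustrative assumptions, not part of the original program.

```java
// Hypothetical demo class: the name and the -1 fallback are assumptions
// made for illustration only.
final class NativeGuardDemo {

    // Declared like the article's method. Nothing implements it unless a
    // native library exporting the matching symbol has been loaded.
    private static native int countInstances(Class<?> klass);

    // Returns the instance count, or -1 when no native implementation is linked.
    static int countOrFallback(Class<?> klass) {
        try {
            return countInstances(klass);
        } catch (UnsatisfiedLinkError e) {
            // The JVM could not find a native function for this method.
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("Thread instances: " + countOrFallback(Thread.class));
    }
}
```

Run without any plug-in, countOrFallback returns -1 instead of crashing the program with an UnsatisfiedLinkError.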
To back the native method, we must modify our native plug-in code. I will explain the details afterwards; for now, add the following function definitions to it:

    // Invoked by JVMTI once for every matching object on the heap.
    extern "C"
    jint JNICALL objectCountingCallback(jlong class_tag, jlong size, jlong *tag_ptr, jint length, void *user_data) {
        int *count = (int *) user_data;
        *count += 1;
        return JVMTI_VISIT_OBJECTS;
    }

    // Native implementation of org.shelajev.Main.countInstances(Class).
    extern "C"
    JNIEXPORT jint JNICALL Java_org_shelajev_Main_countInstances(JNIEnv *env, jclass thisClass, jclass klass) {
        int count = 0;
        jvmtiHeapCallbacks callbacks;
        (void) memset(&callbacks, 0, sizeof(callbacks));
        callbacks.heap_iteration_callback = &objectCountingCallback;
        // The returned error code is not checked here, for brevity.
        jvmtiError error = gdata->jvmti->IterateThroughHeap(0, klass, &callbacks, &count);
        return count;
    }

The Java_org_shelajev_Main_countInstances function is the more interesting one. Its name starts with "Java", followed by the fully qualified class name with "_" as the separator, and finally the method name as declared in Java. Do not forget the JNIEXPORT declaration, which indicates that the function is exported to the Java world.
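The naming rule can be sketched in a few lines of Java. JniNames and its methods are hypothetical helpers written for this illustration. The sketch covers the rule described above plus the JNI escapes for a literal underscore ("_1") and for other non-alphanumeric characters ("_0" followed by four hex digits); overloaded native methods, which need an extra signature suffix, are deliberately left out.

```java
// Hypothetical helper, not part of the JDK: derives the native function
// name that the JVM looks up for a non-overloaded native method.
final class JniNames {

    // Escapes one component (class name or method name) of the symbol.
    static String escape(String part) {
        StringBuilder sb = new StringBuilder();
        for (char c : part.toCharArray()) {
            boolean asciiAlnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (asciiAlnum) {
                sb.append(c);
            } else if (c == '.') {
                sb.append('_');                 // package separator
            } else if (c == '_') {
                sb.append("_1");                // literal underscore
            } else {
                sb.append(String.format("_0%04x", (int) c)); // any other character
            }
        }
        return sb.toString();
    }

    static String nativeName(String className, String methodName) {
        return "Java_" + escape(className) + "_" + escape(methodName);
    }

    public static void main(String[] args) {
        // prints Java_org_shelajev_Main_countInstances
        System.out.println(nativeName("org.shelajev.Main", "countInstances"));
    }
}
```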
Inside Java_org_shelajev_Main_countInstances, we first register objectCountingCallback as the callback and then call IterateThroughHeap, passing along the class argument that came in from the Java program.
Note that our native method is static, so the corresponding parameters on the C side are:
    JNIEnv *env, jclass thisClass, jclass klass
For an instance method, the parameters would be slightly different:
    JNIEnv *env, jobject thisInstance, jclass klass
Here thisInstance refers to the instance on which the Java method was called.
The definition of objectCountingCallback follows the documentation directly. All its body does is increment an int variable.
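For readers more comfortable in Java than in C++, the callback contract can be mimicked in plain Java. This is only an analogy under loud assumptions: the "heap" is an ordinary list, HeapWalkAnalogy and ObjectCallback are invented names, and the class filter here matches the exact class only (check the JVMTI documentation for how the real klass filter treats subclasses). No JVMTI is involved.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy of the native callback pattern. All names are hypothetical.
final class HeapWalkAnalogy {

    // Stand-in for the native callback: receives the caller's "user data"
    // and returns whether the walk should continue.
    interface ObjectCallback {
        boolean visit(Object obj, int[] userData);
    }

    // Stand-in for the heap walk: calls the callback for each object whose
    // class is exactly klass, handing the same userData to every call.
    static void iterate(List<Object> fakeHeap, Class<?> klass, ObjectCallback cb, int[] userData) {
        for (Object o : fakeHeap) {
            if (o.getClass() == klass && !cb.visit(o, userData)) {
                return;
            }
        }
    }

    // Mirrors the native function: the counter travels as user data and the
    // callback increments it once per visited object.
    static int countInstances(List<Object> fakeHeap, Class<?> klass) {
        int[] count = {0};
        iterate(fakeHeap, klass, (obj, data) -> { data[0] += 1; return true; }, count);
        return count[0];
    }

    public static void main(String[] args) {
        List<Object> fakeHeap = Arrays.<Object>asList("a", 1, "b", 2.0);
        System.out.println(countInstances(fakeHeap, String.class)); // prints 2
    }
}
```

The int array plays the role of the void *user_data pointer: the caller owns the counter, and the callback only sees it through the argument it is handed.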
Done! Thank you for your patience. If you are still reading, you can try running all of the code above.
Re-compile the native plug-in and run the Main class. My results are as follows:
    java -agentpath:agent.so org.shelajev.Main
    Hello World!
    There are 7 instances of class java.lang.Thread
If I add Thread t = new Thread(); to the main method, the result is 8, so the plug-in really does seem to work. Your numbers will almost certainly differ from mine. That is fine, because the count includes bookkeeping, compiler, GC, and other threads.
If you want to know how many String instances are on the heap, you only need to change the class argument. This is a truly generic solution. I think Santa Claus will be happy.
In case you are curious, the result is 2423 String instances, which is quite a lot for such a small program.
If I try the simple solution instead:
    return Thread.getAllStackTraces().size();
the result is 5, not 8, because it does not count the bookkeeping threads. So much for the simple solution.
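One difference between the two approaches can be checked directly from Java: Thread.getAllStackTraces() reports live threads only, so a Thread object that was created but never started exists on the heap without appearing in that map. A minimal sketch, with LiveThreadsDemo and isListed as hypothetical names:

```java
// Hypothetical demo class showing which Thread objects
// Thread.getAllStackTraces() reports.
final class LiveThreadsDemo {

    // True if the given thread appears in the map of live threads.
    static boolean isListed(Thread t) {
        return Thread.getAllStackTraces().containsKey(t);
    }

    public static void main(String[] args) {
        Thread unstarted = new Thread(); // a heap object, but not a live thread
        System.out.println("unstarted thread listed: " + isListed(unstarted));
        System.out.println("current thread listed: " + isListed(Thread.currentThread()));
    }
}
```

A heap walk counts the unstarted Thread object all the same, which is consistent with the count rising to 8 after adding new Thread() to main.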
I would not claim that this article alone equips you to write your own JVM monitoring or enhancement tools, but it is definitely a starting point.
In this article, we wrote a native Java plug-in from scratch, then compiled, loaded, and ran it successfully. The plug-in uses JVMTI to look inside the JVM, which would not be possible otherwise. The corresponding Java code calls into the native library and prints the result.
This is a common strategy used by many excellent JVM tools. I hope I have demystified some of these techniques for you.