I. Overview
Software increasingly depends on shared components developed by different vendors and authors, so component management matters more and more. A central issue here is binary compatibility between versions of a class: when a class changes, can the new class directly replace the original without breaking components from other vendors or authors that depend on it?
The main goal of Java binary compatibility is to promote widespread reuse of software on the Internet while avoiding the fragile base class problem that affects most C++ environments. In C++, for example, access to a field (a data member or instance variable) is compiled into an offset relative to the start of the object, and that offset is fixed at compile time. If a new field is added to the class and the class is recompiled, the offsets change, and code compiled against the old definition no longer runs correctly. Virtual method calls have the same problem.
In a C++ environment, the usual remedy is to recompile all code that references the modified class. A few Java development environments adopt the same policy, but it has serious limits. Suppose someone has developed a program P that references an external library L1, the author of P does not have the source code of L1, and L1 in turn uses another library L2. If L2 changes, L1 cannot be recompiled, so the development and evolution of P are constrained as well.
To address this, Java introduces the concept of binary compatibility: if the changes to L2 are binary compatible, then the changed L2, the original L1, and the current P link together without errors.
First, a simple example. The Authorization and Hello classes come from two different authors. Authorization provides authentication and authorization services, and Hello calls Authorization.
package com.author1;
public class Authorization {
    public boolean authorized(String userName) {
        return true;
    }
}
package com.author2;
import com.author1.*;
class Hello {
    public static void main(String[] arg) {
        Authorization auth = new Authorization();
        if (auth.authorized("myname"))
            System.out.println("You have passed verification");
        else
            System.out.println("You failed to pass authentication");
    }
}
Now author1 has released version 2.0 of the Authorization class. author2, the author of Hello, wants to use the new version without changing the original Hello class. Version 2.0 of Authorization is considerably more complex than the original:
package com.author1;
public class Authorization {
    public Token authorized(String userName, String pwd) {
        return null;
    }
    private boolean determineAuthorization(String userName, String pwd) {
        return true;
    }
    public boolean authorized(String userName) {
        return true;
    }
    public class Token {}
}
author1 promises that version 2.0 of the Authorization class is binary compatible with version 1.0; in other words, version 2.0 still honors the contract between the version 1.0 Authorization class and the Hello class. Whichever version of Authorization author2 compiles Hello against, no error occurs. In fact, Hello does not need to be recompiled at all when Authorization is upgraded: the same Hello.class can call either Authorization.class.
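As a rough analogy for why the old caller keeps working, the sketch below looks up authorized(String) by name and parameter types against a stand-in for version 2.0. Reflection is used here only to make the name-based lookup visible; it is not what the JVM literally does when linking Hello.class. The class SymbolicLookupDemo and the nesting of Authorization and Token inside it are assumptions made so that the example fits in one file.

```java
import java.lang.reflect.Method;

final class SymbolicLookupDemo {
    // Stand-in for the Token type of version 2.0 (hypothetical placement).
    static class Token {}

    // Stand-in for version 2.0 of Authorization: new members were added,
    // but authorized(String) is still present with the same signature.
    static class Authorization {
        public Token authorized(String userName, String pwd) { return null; }
        private boolean determineAuthorization(String userName, String pwd) { return true; }
        public boolean authorized(String userName) { return true; }
    }

    // Finds authorized(String) by name and parameter types and invokes it,
    // the way a caller written against version 1.0 would expect to.
    static boolean callOldEntryPoint(String userName) {
        try {
            Method m = Authorization.class.getMethod("authorized", String.class);
            return (Boolean) m.invoke(new Authorization(), userName);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("authorized(String) is missing or inaccessible", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callOldEntryPoint("myname")); // prints "true"
    }
}
```

The added overload and the private helper do not get in the way, because the lookup asks for one specific name and parameter list.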
This feature is not unique to Java. UNIX systems have shared object libraries (.so files) and Windows has dynamic link libraries (.DLL files); replacing the file swaps one version of a library for another. As with Java binary compatibility, names are linked at run time rather than when the code is compiled and linked, so these mechanisms share some of its advantages: for example, changing one part of a program requires recompiling only one library. Java binary compatibility nevertheless has advantages of its own:
(1) Java refines the granularity of binary compatibility from an entire library (which may contain dozens or hundreds of classes) to a single class.
(2) In C/C++ and similar languages, creating a shared library is usually a deliberate act. An application generally does not provide many shared libraries, and which code is shared is planned in advance. In Java, binary compatibility is an inherent feature of the language.
(3) Shared objects resolve function names only, whereas Java binary compatibility also takes overloading, method signatures, and return types into account.
(4) Java provides a better error-handling mechanism. An incompatible version triggers an exception that can easily be caught and handled. In C/C++, by contrast, an incompatible shared library version often causes serious problems.
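Point (4) can be sketched as follows. In Java the failures in question are subclasses of java.lang.LinkageError (for example NoSuchMethodError and NoSuchFieldError), so one catch clause covers them. The class LinkageErrorDemo and its runGuarded helper are hypothetical names, and the error in main is thrown by hand to simulate what the JVM would raise at link time.

```java
final class LinkageErrorDemo {
    // Runs a task and reports a linkage failure instead of letting it propagate.
    static String runGuarded(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (LinkageError e) {
            // NoSuchMethodError, NoSuchFieldError and IncompatibleClassChangeError
            // are all subclasses of LinkageError.
            return "incompatible class version: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Simulated failure; a real one would be raised by the JVM itself.
        System.out.println(runGuarded(() -> {
            throw new NoSuchMethodError("Ticket.isValid()Z");
        }));
    }
}
```

Whether recovering is sensible depends on the application; the sketch only shows that the failure surfaces as an ordinary throwable.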
II. Compatibility between classes and objects
Binary compatibility resembles object serialization in some respects, and the goals of the two overlap. When a Java object is serialized, the class name and field names are written to a binary output stream. An object serialized to disk can be read back using a different version of the class, provided that the names and fields the class requires exist and their types match. The following table compares binary compatibility and serialization.
| | Object serialization | Binary compatibility |
|---|---|---|
| Applies to | Objects | Classes |
| Compatibility requirements | Class, field | Class, field, method |
| Deletion causes incompatibility | Always | Not necessarily |
| Compatible after an access modifier (public, private, etc.) is changed | Yes | No |
Both binary compatibility and serialization allow for the continual evolution of class versions. Adding methods and fields to a class is allowed, and a pure addition does not affect the semantics of the program. Likewise, simple structural changes, such as reordering fields or methods, cause no problems.
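On the serialization side, the names that get matched can be inspected with java.io.ObjectStreamClass, which describes a class as the serialization mechanism sees it. The sketch below lists the serializable fields of a small class by name and type. The Ticket class and its fields are hypothetical and exist only for this illustration.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class StreamDescriptorDemo {
    // Hypothetical serializable class used only for illustration.
    static class Ticket implements Serializable {
        private static final long serialVersionUID = 1L;
        String holder = "myname";
        int seat = 12;
    }

    // Returns each serializable field as "name:type", taken from the
    // class descriptor that serialization uses for Ticket.
    static List<String> describedFields() {
        ObjectStreamClass desc = ObjectStreamClass.lookup(Ticket.class);
        List<String> out = new ArrayList<>();
        for (ObjectStreamField f : desc.getFields()) {
            out.add(f.getName() + ":" + f.getType().getSimpleName());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(ObjectStreamClass.lookup(Ticket.class).getName());
        System.out.println(describedFields());
    }
}
```

The static serialVersionUID is not listed, since only instance fields are part of the serialized form.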
III. Late binding
The key to understanding binary compatibility is late binding (also called delayed binding). Late binding means that Java does not check the names of classes, fields, and methods until run time. A C/C++ compiler, by contrast, discards the names of classes, fields, and methods at compile time and replaces them with numeric offsets. This difference is what makes Java binary compatibility possible.
Because of late binding, the names of methods, fields, and classes are not resolved until run time, which means that as long as the names (and types) of the fields, methods, and so on stay the same, the body of the class can be replaced freely. This is a simplification, of course. Other rules also constrain the binary compatibility of Java classes, such as access modifiers (private, public) and whether a method is abstract (an abstract method cannot be called directly), but late binding is undoubtedly the core of binary compatibility.
Only with a grasp of the binary compatibility rules can you rewrite a class and be sure that other classes are unaffected. Here is an example in which FrodoMail and SamMail are two email programs:
abstract class Message implements Classifiable {}
class EmailMessage extends Message {
    public boolean isJunk() { return false; }
}
interface Classifiable {
    boolean isJunk();
}
class FrodoMail {
    public static void main(String[] a) {
        Classifiable m = new EmailMessage();
        System.out.println(m.isJunk());
    }
}
class SamMail {
    public static void main(String[] a) {
        EmailMessage m = new EmailMessage();
        System.out.println(m.isJunk());
    }
}
Suppose we reimplement Message so that it no longer implements the Classifiable interface. SamMail still runs normally, but FrodoMail throws java.lang.IncompatibleClassChangeError at FrodoMail.main. The reason is that SamMail does not require EmailMessage to be a Classifiable, whereas FrodoMail does: the compiled FrodoMail .class file references the Classifiable interface by name. The method required by Classifiable still exists, but the class no longer declares that it implements Classifiable.
IV. Compatibility rules: methods
From the perspective of binary compatibility, a method is identified by four things: its name, its return type, its parameters, and whether it is static. Change any of the four and, to the JVM, it is a different method.
Take the method "boolean isValid()" as an example. If isValid is changed to accept a date parameter, becoming "boolean isValid(Date when)", the modified class cannot directly replace the original. Existing code that calls isValid() on the new class gets only the error java.lang.NoSuchMethodError: Ticket.isValid()Z. The JVM uses the notation "()Z" to indicate that the method takes no parameters and returns a boolean. Later sections discuss this in more detail.
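The "()Z" notation can be reproduced with java.lang.invoke.MethodType, whose toMethodDescriptorString method returns the JVM descriptor for a return type and parameter list. The sketch below prints the descriptors of the old and new isValid methods; the class DescriptorDemo and its descriptor helper are hypothetical names, and java.util.Date is assumed to be the Date type meant above.

```java
import java.lang.invoke.MethodType;
import java.util.Date;

final class DescriptorDemo {
    // Builds the JVM descriptor string for a method with the given
    // return type and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // boolean isValid()
        System.out.println(descriptor(boolean.class));             // prints "()Z"
        // boolean isValid(Date when), assuming java.util.Date
        System.out.println(descriptor(boolean.class, Date.class)); // prints "(Ljava/util/Date;)Z"
    }
}
```

The two strings differ, which is why a caller compiled against isValid()Z cannot find the method once only the Date version exists.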
The JVM uses a technique called virtual method dispatch to decide which method body to call. It chooses the method body based on the actual class of the instance on which the method is invoked, which can be seen as an extension of the late binding strategy.
If the class does not provide a method whose name, parameters, and return type match exactly, the JVM uses the method inherited from the superclass. Under Java's binary compatibility rules, this inheritance is determined at run time rather than at compile time. Suppose we have the following classes:
class Poem {
    void perform() {
        System.out.println("day by day");
    }
}
class ShakespearePoem extends Poem {
    void perform() {
        System.out.println("To be or not to be.");
    }
}
class Hamlet extends ShakespearePoem {}
Then the code
Poem poem = new Hamlet();
poem.perform();
outputs "To be or not to be.". This is because the perform method body is determined at run time. Although Hamlet does not provide a perform method body, it inherits one from ShakespearePoem. The perform defined by Poem is not used because the one defined by ShakespearePoem overrides it. We can modify Hamlet at any time without recompiling ShakespearePoem, as in the following example:
class Hamlet extends ShakespearePoem {
    void perform() {
        System.out.println("not even a mouse");
    }
}
Now the previous example outputs "not even a mouse". However,
Poem poem = new ShakespearePoem();
poem.perform();
still outputs "To be or not to be." If we delete the perform method from ShakespearePoem, the same code outputs "day by day".
V. Compatibility rules: fields
Fields differ from methods. When a method is deleted from a class, the class may acquire a different method with the same name and parameters through inheritance; fields, however, cannot be overridden, so fields behave differently with respect to binary compatibility.
For example, assume there are three classes:
class Language {
    String greeting = "hello";
}
class German extends Language {
    String greeting = "Guten Tag";
}
class French extends Language {
    String greeting = "Bon jour";
}
Then "void test1() { System.out.println(new French().greeting); }" outputs "Bon jour", whereas "void test2() { System.out.println(((Language) new French()).greeting); }" outputs "hello". This is because the field actually accessed depends on the type of the expression at compile time, not on the class of the object at run time. In test1 the expression has type French, so the French greeting is printed. In test2 the object is still a French object, but because it has been cast to Language, the Language greeting is printed.
Suppose you change Language in the previous example to the following:
class Language {}
Running test2 again without recompiling produces the error java.lang.NoSuchFieldError: greeting. Recompiling test2 produces a compile error instead: cannot resolve symbol, symbol: variable greeting, location: class Language, reported for the statement System.out.println(((Language) new French()).greeting);. test1 still runs normally and needs no recompilation, because it does not use the greeting variable declared in Language.
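The behavior of test1 and test2 with the original three classes can be checked with a small runnable sketch. The wrapper class FieldHidingDemo is a hypothetical name, the classes are nested inside it only to keep the example in one file, and the two methods return the greeting instead of printing it so the result can be inspected.

```java
final class FieldHidingDemo {
    static class Language { String greeting = "hello"; }
    static class German extends Language { String greeting = "Guten Tag"; }
    static class French extends Language { String greeting = "Bon jour"; }

    // The expression has type French, so French.greeting is read.
    static String test1() {
        return new French().greeting;
    }

    // The expression has type Language after the cast, so Language.greeting
    // is read even though the object is a French.
    static String test2() {
        return ((Language) new French()).greeting;
    }

    public static void main(String[] args) {
        System.out.println(test1()); // prints "Bon jour"
        System.out.println(test2()); // prints "hello"
    }
}
```

Unlike a method call, the field access is never redirected to the subclass at run time.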
VI. A closer look at late binding
The following classes decide which wine to drink with dinner and the temperature at which to serve it.
class Sommelier {
    Wine recommend(String meal) {...}
}
abstract class Wine {
    // Recommended temperature
    abstract float temperature();
}
class RedWine extends Wine {
    // Red wine is usually served slightly warmer than white wine
    float temperature() { return 63; }
}
class WhiteWine extends Wine {
    float temperature() { return 47; }
}
class Bordeaux extends RedWine {
    float temperature() { return 64; }
}
class Riesling extends WhiteWine {
    // Inherits the temperature of the WhiteWine class
}
The following example uses the classes above to recommend a wine:
void example1() {
    Wine wine = sommelier.recommend("Duck");
    float temp = wine.temperature();
}
In the second line of example1, all we know for certain about the wine object is that it is a Wine; it may be a Bordeaux, a Riesling, or something else. We can also be sure that it is not an instance of Wine itself, because Wine is an abstract class. When the source is compiled, the wine.temperature() call becomes "invokevirtual Wine/temperature()F". (The class file actually contains binary code; the textual, mnemonic form shown here is a notation called Oolong.) The instruction denotes an ordinary (virtual) method call rather than a static call, and the method it calls is temperature on the wine object. The "()F" on the right is called the signature: "()" indicates that the method takes no parameters, and F indicates that the return value is a float.
When the JVM executes this instruction, it does not necessarily call the temperature method defined by Wine. In this example it cannot, because the temperature method in Wine is abstract. The JVM first examines the class of the object, looking for a method that matches the name and signature given in the invokevirtual instruction. If none is found, it examines the superclass, then the superclass of the superclass, and so on until it finds a suitable implementation.
In this example, if the object actually created is a Bordeaux, the JVM calls temperature()F as defined by the Bordeaux class, which returns 64. If the object is a Riesling, the JVM finds no suitable method in the Riesling class, so it moves on to WhiteWine, where it finds a suitable temperature()F method that returns 47.
The search for a method is therefore a walk up the class's inheritance tree, matching strings along the way. Understanding this principle helps you see which modifications do not affect binary compatibility.
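The walk up the inheritance tree can be imitated with reflection. The sketch below starts at an object's class and climbs through the superclasses until it finds one that declares temperature. It is a simplified model only: getDeclaredMethod matches on name and parameter types, whereas the JVM also matches the return type, and a real JVM does not perform this search on every call. The class DispatchSearchDemo and the method declaringClass are hypothetical names.

```java
final class DispatchSearchDemo {
    static abstract class Wine { abstract float temperature(); }
    static class RedWine extends Wine { float temperature() { return 63; } }
    static class WhiteWine extends Wine { float temperature() { return 47; } }
    static class Bordeaux extends RedWine { float temperature() { return 64; } }
    static class Riesling extends WhiteWine { }

    // Returns the first class, starting at 'start' and moving up through the
    // superclasses, that declares a method with this name and parameter types.
    static Class<?> declaringClass(Class<?> start, String name, Class<?>... params) {
        for (Class<?> c = start; c != null; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod(name, params);
                return c;
            } catch (NoSuchMethodException e) {
                // Not declared in this class; try the superclass next.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(declaringClass(Riesling.class, "temperature").getSimpleName()); // prints "WhiteWine"
        System.out.println(declaringClass(Bordeaux.class, "temperature").getSimpleName()); // prints "Bordeaux"
        System.out.println(new Riesling().temperature());                                  // prints "47.0"
    }
}
```

Because the search is by name, reordering the methods inside any of these classes changes nothing.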
First, reordering the methods in a class clearly does not affect binary compatibility. This is generally not possible in C++ programs, because C++ uses a numeric offset rather than a name to determine which method to call. This is the key advantage of late binding: if Java also used the offset of a method within its class to decide which method to call, binary compatibility would be severely limited, and even minor changes could force a large amount of code to be recompiled.
● Note: Some people assume that the C++ approach is faster than Java's, because finding a method by numeric offset must be faster than string matching. That is true, but only for the moment when a class has just been loaded. After that, Java's JIT compiler also works with numeric offsets and no longer relies on string matching to find methods. Because a class cannot change once it has been loaded into memory, the JIT compiler need not worry about binary compatibility. At least for method calls, then, there is no reason for Java to be slower than C++.
Second, and just as important, the class inheritance hierarchy is checked not only at compile time but also by the JVM at run time.
VII. Overloading and overriding
The most important point to take from the previous example is that method matching is based on the textual form of the method name and signature. Next, we add some methods to the Sommelier class:
Glass fetchGlass(Wine wine) {...}
Glass fetchGlass(RedWine wine) {...}
Glass fetchGlass(WhiteWine wine) {...}
Compile the following code:
void example2() {
    Glass glass;
    Wine wine = sommelier.recommend("Duck");
    if (wine instanceof Bordeaux)
        glass = sommelier.fetchGlass((Bordeaux) wine);
    else
        glass = sommelier.fetchGlass(wine);
}
There are two fetchGlass calls: in the first the argument is a Bordeaux, and in the second it is a Wine. The Java compiler generates the following instructions for these two lines:
invokevirtual Sommelier/fetchGlass(LRedWine;)LGlass;
invokevirtual Sommelier/fetchGlass(LWine;)LGlass;
Note that the difference between the two is decided at compile time, not at run time. The JVM uses the notation "L<class name>;" to denote a class type (just as F denotes float in the earlier example). The parameter of these two methods is a RedWine or a Wine respectively, and the return value is a Glass.
Sommelier provides no method that takes a Bordeaux parameter, but it does have one that takes a RedWine, so the first call uses the signature with a RedWine parameter. For the second call, all that is known at compile time is that the argument is a Wine, so the compiled instruction uses the method whose parameter is a Wine. Even if the sommelier recommends a Riesling, the second call invokes fetchGlass(Wine) and not fetchGlass(WhiteWine), for the same reason: the method called is always the one that matches the signature exactly.
In this example, the different definitions of fetchGlass are overloads rather than overrides, because their signatures differ. For one method to override another, the two must have the same parameters and return type. Virtual method dispatch, which finds the specific type at run time, applies only to overridden methods (same signature) and not to overloaded methods (different signatures). Overloads are resolved at compile time; overrides are resolved at run time.
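A runnable variant of example2 makes the compile-time choice visible. In this sketch each fetchGlass overload returns its own name as a String instead of a Glass, and the wine is passed in as a parameter instead of coming from recommend; both changes are assumptions made so that the chosen overload can be observed. The wrapper class OverloadDemo is a hypothetical name.

```java
final class OverloadDemo {
    static class Wine {}
    static class RedWine extends Wine {}
    static class WhiteWine extends Wine {}
    static class Bordeaux extends RedWine {}
    static class Riesling extends WhiteWine {}

    static class Sommelier {
        // Each overload reports which signature the compiler selected.
        String fetchGlass(Wine wine) { return "fetchGlass(Wine)"; }
        String fetchGlass(RedWine wine) { return "fetchGlass(RedWine)"; }
        String fetchGlass(WhiteWine wine) { return "fetchGlass(WhiteWine)"; }
    }

    // Same shape as example2: the overload depends on the static type of
    // the argument expression, not on the class of the object.
    static String example2(Wine wine) {
        Sommelier sommelier = new Sommelier();
        if (wine instanceof Bordeaux)
            return sommelier.fetchGlass((Bordeaux) wine);
        else
            return sommelier.fetchGlass(wine);
    }

    public static void main(String[] args) {
        System.out.println(example2(new Bordeaux())); // prints "fetchGlass(RedWine)"
        System.out.println(example2(new Riesling())); // prints "fetchGlass(Wine)"
    }
}
```

The Riesling case never reaches fetchGlass(WhiteWine), because the argument expression has static type Wine.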
If you delete fetchGlass(RedWine) and run example2 without recompiling it, the JVM reports the error java.lang.NoSuchMethodError: Sommelier.fetchGlass(LRedWine;)LGlass;.
After the method is deleted, example2 still compiles successfully, but both calls to sommelier.fetchGlass now generate the same invokevirtual instruction, namely invokevirtual Sommelier/fetchGlass(LWine;)LGlass;.
If fetchGlass(RedWine) is then put back, it is not called unless example2 is recompiled; the JVM keeps using fetchGlass(Wine). For the same reason, fetchGlass(WhiteWine) is not used when the object passed in is a Riesling: the specific type cannot be determined at compile time, so the more general method is used.
In the instruction "invokevirtual Wine/temperature()F", the JVM does not insist on Wine itself but automatically searches for the class that actually implements temperature; in "invokevirtual Sommelier/fetchGlass(LRedWine;)LGlass;", however, the JVM does care about RedWine. Why? In the first instruction, Wine is not part of the method signature and serves only for type checking before the call. In the second, RedWine is part of the method signature, and the JVM must find the method to call from the method name and signature.
Suppose we add a fetchGlass method to the Sommelier class:
class RedWineGlass extends Glass {...}
RedWineGlass fetchGlass(RedWine wine) {...}
Consider the originally compiled example2, which calls fetchGlass with the instruction "invokevirtual Sommelier/fetchGlass(LRedWine;)LGlass;". The newly added method is not picked up automatically, because RedWineGlass and Glass are two different types. If we recompile example2, however, the call for the Bordeaux case becomes "invokevirtual Sommelier/fetchGlass(LRedWine;)LRedWineGlass;".
To sum up, the important principles of Java binary compatibility are:
(1) At compile time, the Java compiler selects the best-matching method signature.
(2) At run time, the JVM looks for a method whose name and signature match exactly. Similar names and signatures are ignored.
(3) If no suitable method is found, the JVM throws an exception and does not load the specified class.
(4) Overloaded methods are resolved at compile time; overridden methods are resolved at run time.
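Principle (2) can be approximated with java.lang.invoke.MethodHandles, whose findVirtual lookup requires the method name, parameter types, and return type to match exactly. The sketch below asks whether Sommelier has a fetchGlass method with a given parameter and return type. The class ExactMatchDemo and its resolves helper are hypothetical names, and the lookup is offered as an analogy to JVM method resolution, not as a description of its implementation.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ExactMatchDemo {
    static class Glass {}
    static class RedWineGlass extends Glass {}
    static class Wine {}
    static class RedWine extends Wine {}
    static class Bordeaux extends RedWine {}

    static class Sommelier {
        Glass fetchGlass(Wine wine) { return new Glass(); }
        Glass fetchGlass(RedWine wine) { return new Glass(); }
    }

    // Returns true only if Sommelier has a fetchGlass method with exactly
    // this return type and exactly this parameter type.
    static boolean resolves(Class<?> returnType, Class<?> paramType) {
        try {
            MethodHandles.lookup().findVirtual(Sommelier.class, "fetchGlass",
                    MethodType.methodType(returnType, paramType));
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolves(Glass.class, RedWine.class));        // prints "true"
        System.out.println(resolves(Glass.class, Bordeaux.class));       // prints "false"
        System.out.println(resolves(RedWineGlass.class, RedWine.class)); // prints "false"
    }
}
```

A Bordeaux argument would compile against fetchGlass(RedWine), but a lookup that names Bordeaux or RedWineGlass in the signature finds nothing, which mirrors the "similar signatures are ignored" rule.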