Implementation of polymorphism in Java and C++

Last Update:2015-08-31 Source: Internet

Author: User

Tags instance method

Reposted from an excellent article that is well worth sharing: http://www.ibm.com/developerworks/cn/java/j-lo-polymorph/

Many of the pictures were lost when the article was pasted over /(ㄒoㄒ)/~~

It is well known that polymorphism is an important feature of object-oriented programming languages: it allows a pointer or reference to a base class to refer to an object of a derived class, and binds methods dynamically at the point of call. C++ and Java are the two most popular object-oriented programming languages, and this article gives a comprehensive introduction to how each of them supports polymorphism internally.

Note that in this article, pointers and references are used interchangeably, and they are only an abstract concept that represents a connection to another object without worrying about its specific implementation.

How Java implements it

Java's dynamic binding of method calls relies primarily on method tables, but calls made through a class reference and calls made through an interface reference are implemented differently. In general, when a method is called, the JVM first consults the constant pool to obtain a symbolic reference to the method, then looks in the method table of the invoked class to determine the direct reference to the method, and finally calls the method. The parts involved in this process are described in detail below.

Structure of the JVM

The run-time structure of a typical Java virtual machine is shown in Figure 1.

Figure 1. JVM run-time structure

In this structure, we discuss only the method area, which is the part most closely related to this article. When a running program needs the definition of a class, the class loader subsystem loads the required class file and builds the type information for that class internally; this type information is stored in the method area. Type information typically includes the method code of the class, its class variables, the definitions of its member variables, and so on. The type information can be regarded as the run-time internal structure of the class's Java file, and it contains everything defined in the Java file of that class.

Note that the type information is different from the class object. A class object is an object that the JVM creates in the heap after loading a class in order to represent that class, and the type information can be accessed through it. The most typical use is Java reflection, where a class object gives access to all the methods supported by the class, its member variables, and so on. As you can imagine, the JVM keeps references between the type information and the class object so that each can reach the other. The relationship between the two is analogous to the relationship between a process object and the real process.
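As a small illustration of reaching type information through a class object, the sketch below uses standard Java reflection. The class Sample and the helper declaredMethodNames are hypothetical names made up for this example; only the java.lang.reflect calls are real API.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReflectionSketch {
    // Hypothetical class used only for this illustration.
    static class Sample {
        int count;
        public void speak() {}
        public void eat() {}
    }

    // Returns the sorted names of the methods declared directly by the given class.
    static List<String> declaredMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (!m.isSynthetic()) {          // skip compiler- or tool-generated methods
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // The class object is obtained from an instance, then queried for type information.
        Class<?> type = new Sample().getClass();
        System.out.println(type.getName());
        System.out.println(declaredMethodNames(type));
        for (Field f : type.getDeclaredFields()) {
            System.out.println(f.getType() + " " + f.getName());
        }
    }
}
```

The class object here is only the entry point; where and how the JVM stores the underlying type information is up to the implementation.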

Method invocation in Java

There are two kinds of method calls in Java: static method calls and dynamic method calls. A static method call invokes a static method of a class and is statically bound, whereas a dynamic method call requires an object for the call to act on and is dynamically bound. For a class method call (invokestatic), the specific method to invoke is already determined at compile time, whereas an instance call (invokevirtual) determines the specific method only at the time of the call. This is dynamic binding, and it is the core problem that polymorphism has to solve.

The JVM has four method invocation instructions that concern us here: invokestatic, invokespecial, invokevirtual, and invokeinterface (invokedynamic, added in Java 7, is not discussed). The first two are statically bound, and the latter two are dynamically bound. This article can therefore be seen as an examination of how the JVM implements the latter two kinds of call.
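The difference between the two kinds of binding can be observed from plain Java code. The following is a minimal sketch; the class and method names (BindingSketch, kind, callBoth) are invented for the illustration. The static method is deliberately called through the reference, which is legal but discouraged style, to show that the declared type decides which static method runs.

```java
class BindingSketch {
    static class Person {
        static String kind() { return "Person.kind"; }   // static: bound at compile time
        String speak() { return "Person.speak"; }        // instance: bound at run time
    }

    static class Girl extends Person {
        static String kind() { return "Girl.kind"; }     // hides Person.kind, does not override it
        @Override
        String speak() { return "Girl.speak"; }          // overrides Person.speak
    }

    // Both calls go through a reference whose declared type is Person.
    static String callBoth(Person p) {
        // p.kind() is compiled to invokestatic on Person.kind, whatever p refers to;
        // p.speak() is compiled to invokevirtual and resolved against the actual object.
        return p.kind() + " / " + p.speak();
    }

    public static void main(String[] args) {
        System.out.println(callBoth(new Girl()));   // prints "Person.kind / Girl.speak"
    }
}
```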

Constant pool

The constant pool holds the constant information referenced by a Java class, including string constants and the symbolic references used by the class. The constant pool in the class file produced by compiling Java code is the static constant pool; when the class is loaded into the virtual machine, the constant pool built for it in memory is called the run-time constant pool.

The constant pool can be logically divided into several tables, each containing one kind of constant information. This article explores only the constant pool tables related to Java method calls.

CONSTANT_Utf8_info

The string constant table, which contains all the string constants used by the class, such as string literals in code, referenced class names, method names, and the string descriptions of other referenced classes and methods. Any constant string needed by the other constant pool tables is an index into this table.

CONSTANT_Class_info

The class information table, which contains a symbolic reference for every referenced class or interface. Each entry consists mainly of an index into the CONSTANT_Utf8_info table that gives the fully qualified name of the class or interface.

CONSTANT_NameAndType_info

The name-and-type table, which contains, for every referenced method or field, the indexes of its name and its descriptor in the string constant table.

CONSTANT_InterfaceMethodref_info

The interface method reference table, which contains descriptive information for every referenced interface method, mainly the class information index and the name-and-type index.

CONSTANT_Methodref_info

The class method reference table, which contains descriptive information for every referenced class method, mainly the class information index and the name-and-type index.

Figure 2. Relationships between the tables in a constant pool

As you can see, given the index of any method, once the corresponding entry has been found in the constant pool, we can obtain the method's class index (class_index) and name-and-type index (name_and_type_index), and from these the type the method belongs to and the method's name and descriptor (parameters, return value, and so on). Note that all constant strings are stored in CONSTANT_Utf8_info and are referenced from the other tables by index.
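The chain of indexes described above can be modeled in a few lines. This is only a sketch under simplifying assumptions: real constant pool entries are binary structures in the class file, and the Java classes below (Utf8, ClassInfo, NameAndType, MethodRef) and the sample pool contents are hypothetical stand-ins for them.

```java
class ConstantPoolSketch {
    // Simplified stand-ins for constant pool entries.
    static final class Utf8 {
        final String text;
        Utf8(String text) { this.text = text; }
    }
    static final class ClassInfo {
        final int nameIndex;                       // points at a Utf8 entry
        ClassInfo(int nameIndex) { this.nameIndex = nameIndex; }
    }
    static final class NameAndType {
        final int nameIndex, descriptorIndex;      // both point at Utf8 entries
        NameAndType(int n, int d) { nameIndex = n; descriptorIndex = d; }
    }
    static final class MethodRef {
        final int classIndex, nameAndTypeIndex;    // ClassInfo and NameAndType entries
        MethodRef(int c, int nt) { classIndex = c; nameAndTypeIndex = nt; }
    }

    // Follows the indexes from a method reference down to its strings.
    static String resolve(Object[] pool, int methodRefIndex) {
        MethodRef ref = (MethodRef) pool[methodRefIndex];
        ClassInfo cls = (ClassInfo) pool[ref.classIndex];
        NameAndType nt = (NameAndType) pool[ref.nameAndTypeIndex];
        String className = ((Utf8) pool[cls.nameIndex]).text;
        String name = ((Utf8) pool[nt.nameIndex]).text;
        String descriptor = ((Utf8) pool[nt.descriptorIndex]).text;
        return className + "." + name + ":" + descriptor;
    }

    // A made-up pool describing a call to Person.speak().
    static Object[] samplePool() {
        Object[] pool = new Object[7];   // index 0 is unused, as in a real class file
        pool[1] = new Utf8("Person");
        pool[2] = new Utf8("speak");
        pool[3] = new Utf8("()V");
        pool[4] = new ClassInfo(1);
        pool[5] = new NameAndType(2, 3);
        pool[6] = new MethodRef(4, 5);
        return pool;
    }

    public static void main(String[] args) {
        System.out.println(resolve(samplePool(), 6));   // prints "Person.speak:()V"
    }
}
```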

Method tables and method calls

The method table is the core of dynamic invocation and the main mechanism by which Java implements it. It is stored in the type information in the method area and contains all the methods defined by the type together with pointers to their method code. Note that the method code pointed to may be either an overriding method or a method inherited from the base class.

Suppose the classes Person, Girl, and Boy are defined as follows:

Listing 1

    class Person {
        public String toString() {
            return "I'm a person.";
        }
        public void eat() {}
        public void speak() {}
    }

    class Boy extends Person {
        public String toString() {
            return "I'm a boy";
        }
        public void speak() {}
        public void fight() {}
    }

    class Girl extends Person {
        public String toString() {
            return "I'm a girl";
        }
        public void speak() {}
        public void sing() {}
    }

When these three classes are loaded into the Java virtual machine, the method area contains the information for each class. The method tables of Girl and Boy in the method area can be represented as follows:

Figure 3. Boy and Girl method tables

As you can see, the method tables of Girl and Boy contain the methods inherited from Object, the methods inherited from the direct parent class Person, and their own newly defined methods. Note that each method table entry points to a specific method address. For Girl, among the methods inherited from Object, only toString() points to its own implementation (Girl's method code) and the rest point to Object's method code; among the methods inherited from Person, eat() points to Person's implementation and speak() points to Girl's own implementation.

Any method of Person or Object occupies the same position (index) in its own method table and in the method tables of the subclasses Girl and Boy. As a result, when invoking an instance method the JVM only needs to specify which entry of the method table to call.

Consider the following call:

Listing 2

    class Party {
        ...
        void happyHour() {
            Person girl = new Girl();
            girl.speak();
            ...
        }
    }

When the Party class is compiled, assume that the method invocation generated for girl.speak() is:

    invokevirtual #12

Here #12 is an index into the constant pool of the Party class. The JVM executes this call instruction as follows:

Figure 4. Parsing the calling procedure

The JVM first looks at entry 12 of Party's constant pool (which should be of type CONSTANT_Methodref_info and can be regarded as the symbolic reference of the method call). By following it further through the constant pool (CONSTANT_Class_info, CONSTANT_NameAndType_info, CONSTANT_Utf8_info), the JVM concludes that the method to call is the speak method of Person (note that the reference girl is declared with the base class type Person). It then looks in Person's method table to find the offset of speak in that table; this offset is the direct reference for the method call.

Once the direct reference has been resolved (that is, the method table offset is known), the JVM performs the actual call: it obtains the concrete object from the this argument of the instance method (the object in the heap that girl points to), gets that object's method table (Girl's method table), and calls the method at that offset in the table (Girl's implementation of speak()).
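The two steps, resolving an offset against the declared type and then indexing the actual object's table, can be sketched as a toy model. Everything here is an assumption made for illustration: the Slot class, the build and dispatch helpers, and the fact that Object is given only two methods. It is not how any particular JVM lays out its tables.

```java
import java.util.ArrayList;
import java.util.List;

class MethodTableSketch {
    // One method table slot: a method name plus the class whose code it points to.
    static final class Slot {
        final String name, owner;
        Slot(String name, String owner) { this.name = name; this.owner = owner; }
    }

    static int indexOf(List<Slot> table, String name) {
        for (int i = 0; i < table.size(); i++) {
            if (table.get(i).name.equals(name)) return i;
        }
        return -1;
    }

    // Builds a class's table: copy the parent's slots, replace overridden methods
    // in place (same index), and append newly defined methods at the end.
    static List<Slot> build(List<Slot> parent, String owner, String... methods) {
        List<Slot> table = new ArrayList<>(parent);
        for (String m : methods) {
            int i = indexOf(table, m);
            if (i >= 0) table.set(i, new Slot(m, owner));
            else table.add(new Slot(m, owner));
        }
        return table;
    }

    // Object is reduced to two methods to keep the model short.
    static final List<Slot> OBJECT = build(new ArrayList<Slot>(), "Object", "toString", "hashCode");
    static final List<Slot> PERSON = build(OBJECT, "Person", "toString", "eat", "speak");
    static final List<Slot> GIRL = build(PERSON, "Girl", "toString", "speak", "sing");

    // Step 1: resolve the offset against the declared type's table.
    // Step 2: use that offset to index the runtime type's table.
    static String dispatch(List<Slot> declaredType, List<Slot> runtimeType, String method) {
        int offset = indexOf(declaredType, method);
        Slot target = runtimeType.get(offset);
        return target.owner + "." + target.name;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(PERSON, GIRL, "speak"));   // prints "Girl.speak"
        System.out.println(dispatch(PERSON, GIRL, "eat"));     // prints "Person.eat"
    }
}
```

Because overriding replaces a slot in place rather than appending, the offset found in Person's table is valid for every subclass table, which is the property the dispatch relies on.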

Interface calls

Because a Java class can implement several interfaces at the same time, which is in a sense equivalent to multiple inheritance, the situation is different when a method is invoked through an interface reference: the same interface method may occupy different positions in the method tables of different classes.

Listing 3

    interface IDance {
        void dance();
    }

    class Person {
        public String toString() {
            return "I'm a person.";
        }
        public void eat() {}
        public void speak() {}
    }

    class Dancer extends Person implements IDance {
        public String toString() {
            return "I'm a dancer.";
        }
        public void dance() {}
    }

    class Snake implements IDance {
        public String toString() {
            return "A snake.";
        }
        public void dance() {
            // snake dance
        }
    }

Figure 5. Dancer method table

As you can see, because of the interface, the method dance() inherited from IDance does not sit at the same position in the method tables of Dancer and Snake, so a single method table offset obviously cannot be used to call both Dancer's and Snake's implementations correctly. This is also why calling an interface method in Java has its own invocation instruction (invokeinterface).

Java calls an interface method by searching the method table. For a method call such as

    invokeinterface #13

the JVM first looks at the constant pool to determine the symbolic reference of the method call (name, return value, and so on), then uses the this reference to reach the instance and obtain its method table, and finally searches that method table for the matching method address.

Because the method table is searched on every interface call, invoking an interface method in this scheme is slower than invoking a class method (real JVMs typically reduce the cost with caching).
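The search-based lookup can be modeled as follows. This is a deliberately naive sketch: the tables are plain lists of name-plus-descriptor strings, the slot order is an assumption that follows the Dancer and Snake classes of Listing 3, and findSlot is a hypothetical helper, not a JVM API.

```java
import java.util.Arrays;
import java.util.List;

class InterfaceSearchSketch {
    // Method tables modeled as ordered lists of "name + descriptor" keys.
    // Methods inherited from Object other than toString are left out for brevity.
    static final List<String> DANCER =
            Arrays.asList("toString()Ljava/lang/String;", "eat()V", "speak()V", "dance()V");
    static final List<String> SNAKE =
            Arrays.asList("toString()Ljava/lang/String;", "dance()V");

    // Models the lookup done for invokeinterface: a linear search of the
    // receiver's own method table for a matching name and descriptor.
    static int findSlot(List<String> methodTable, String nameAndDescriptor) {
        for (int i = 0; i < methodTable.size(); i++) {
            if (methodTable.get(i).equals(nameAndDescriptor)) return i;
        }
        throw new AbstractMethodError(nameAndDescriptor);
    }

    public static void main(String[] args) {
        // The same interface method lands in different slots, so no single
        // fixed offset would work for both classes.
        System.out.println(findSlot(DANCER, "dance()V"));   // prints 3
        System.out.println(findSlot(SNAKE, "dance()V"));    // prints 1
    }
}
```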

How C++ implements it

As the above shows, Java relies on the method table to implement polymorphism, although interface calls are handled quite differently, with the method table searched on each call. In C++, polymorphism under single inheritance is implemented in much the same way as in Java. Because C++ supports multiple inheritance, however, it runs into the same problem that Java faces with dynamic calls through interfaces. The C++ solution is to give an object several method table pointers, which unfortunately introduces the extra complexity of pointer adjustment.

Single inheritance

Under single inheritance, the C++ implementation of polymorphism is essentially the same as Java's and is based on the method table. However, C++ determines the position of the called method in the method table at compile time, so there is no counterpart to the JVM consulting the constant pool when the method is called.

When C++ code is compiled, the compiler does a lot of work automatically, one part of which is to insert, where needed, a variable vptr that points to the method table of the class. Suppose Person and Girl are defined much as in the Java example above:

Listing 4

    class Person {
        ...
    public:
        Person() {}
        virtual ~Person() {}
        virtual void speak() {}
        virtual void eat() {}
    };

    class Girl : public Person {
        ...
    public:
        Girl() {}
        virtual ~Girl() {}
        virtual void speak() {}
        virtual void sing() {}
    };

The in-memory object model of Person and Girl instances is:

Figure 6. Object model of Person and Girl

For the following calling code:

    Person *p = new Girl();
    p->speak();
    p->eat();

the compiler generates calls of the form:

    p->vptr[1](p);
    p->vptr[2](p);

At run time these resolve naturally to the functions recorded in Girl's method table.

You can see that the constructors do not appear in the method table. The C++ method table contains only methods declared virtual; non-virtual methods are statically bound and need no space in the method table. This differs from Java, whose method table contains all the instance methods supported by the class; in that sense, all instance methods of a Java class are "virtual" (dynamically bound).

Multiple inheritance

Under multiple inheritance the situation is completely different, because a method inherited from the same base class may sit at different positions in the method tables of two different classes (similar to interfaces in Java). Java has the JVM to help at run time, whereas C++ solves the problem by introducing several pointers to method tables, which brings the extra complexity of adjusting pointer positions.

Suppose three classes are related as follows: Engineer inherits from both Person and Employee.

Figure 7. Class static structure diagram

The Engineer instance object model is:

Figure 8. Engineer object model

You can see that the Engineer instance has two pointers to the method table, which is very different from Java.

With the following code,

Listing 5

    Engineer *p = new Engineer();
    Person *p1 = (Person *)p;
    Employee *p2 = (Employee *)p;

each pointer points to its own sub-object at run time, as follows:

Figure 7. Engineer example

A pointer to an object in C++ always points to the start of that object. In the code above, p holds the starting address of the Engineer object, and p1 is p converted to a pointer to the Person sub-object; the two are in fact equal. The pointer p2 to the Employee sub-object, however, differs from p and p1. In terms of byte addresses, in effect:

    p2 = p + sizeof(Person);

Now consider the calls:

    p1->eat();
    p2->work();

The calling code generated after compilation is:

    (*(p1->vptr1[i]))(p1)
    (*(p2->vptr2[j]))(p2)

In some cases the this pointer even has to be adjusted to the start of the whole object, for example:

    delete p2;

Here the this pointer passed to the destructor must be adjusted to the position that p points to, otherwise a memory leak occurs. Assuming the destructor is at index 0 of the method table, the call is compiled to:

    (*(p2->vptr2[0]))(p)

The compiler does not have enough knowledge to perform this pointer adjustment at compile time. In the example above, the object pointed to by p2 may be an Employee or an instance of any subclass of Employee (Engineer, or another subclass such as Teacher), so the compiler cannot know exactly the distance (offset) between p2 and the starting address of the whole object. The adjustment can therefore only happen at run time.

There are generally two ways to adjust the pointer. The first is shown below:

Figure 8. Pointer adjustment: extended method table

This approach stores the offset by which the pointer must be adjusted in each entry of the method table. When a method in the table is called, the offset is first used to adjust the pointer, and then the actual call is made. The drawbacks are obvious: the method table grows larger, and not every method needs a pointer adjustment.

Figure 9. Pointer adjustment: thunk technique

The second approach is called the thunk technique. Each method table entry that needs it points to a small piece of assembly code, which adjusts the pointer and then calls the correct method. This is equivalent to adding a layer of indirection.
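The thunk idea can be mimicked in Java as a rough model, though not as real C++ object layout. In this sketch addresses are plain int values, PERSON_SIZE is an assumed size for the Person sub-object, and the names thunk, engineerDestructor, and EMPLOYEE_VTABLE_SLOT_0 are hypothetical. A real thunk is a few machine instructions emitted by the compiler.

```java
import java.util.function.IntFunction;

class ThunkSketch {
    // Assumed size in bytes of the Person sub-object at the start of an Engineer.
    static final int PERSON_SIZE = 16;

    // The "real" method: it expects this to be the start of the whole Engineer object.
    static String engineerDestructor(int thisAddress) {
        return "destroy Engineer at " + thisAddress;
    }

    // A thunk: adjust this by a fixed offset, then forward to the real method.
    static IntFunction<String> thunk(int offset, IntFunction<String> target) {
        return thisAddress -> target.apply(thisAddress + offset);
    }

    // Slot 0 of the method table reached through the Employee sub-object's vptr.
    // It subtracts the size of the Person sub-object before calling the destructor.
    static final IntFunction<String> EMPLOYEE_VTABLE_SLOT_0 =
            thunk(-PERSON_SIZE, ThunkSketch::engineerDestructor);

    public static void main(String[] args) {
        int p = 1000;                  // pretend address of the Engineer object
        int p2 = p + PERSON_SIZE;      // address of its Employee sub-object
        // Calling through p2 still hands the destructor the address p.
        System.out.println(EMPLOYEE_VTABLE_SLOT_0.apply(p2));   // prints "destroy Engineer at 1000"
    }
}
```

Compared with storing an offset in every table entry, only the slots that actually need an adjustment pay for the extra hop.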

Comparison of the polymorphism implementations in Java and C++

The sections above described the implementation of polymorphism in Java and in C++ in some detail. The similarities and differences between the two implementations are summarized below:

Under single inheritance, the two implementations are essentially the same: both use a method table and invoke the specific method through its offset in the table.

Java's method table contains all the instance methods defined by the Java class, while the C++ method table contains only the methods that need dynamic binding (those declared virtual). Thus all instance methods in Java are called through the method table, whereas non-virtual methods in C++ are statically bound.

Any Java object "points" to only one method table, whereas a C++ object may point to several method tables under multiple inheritance, with the compiler guaranteeing that the multiple method table pointers are initialized correctly.

The main problem for C++ under multiple inheritance is the adjustment of the this pointer; the design is more elaborate and more complex. Java's approach to calls through interfaces is more intuitive, but such calls are slower than ordinary instance method calls.

As you can see, the two implementations have both similarities and differences. Under single inheritance they are essentially the same, with subtle differences (such as the contents of the method table); the biggest difference lies in the support for multiple inheritance (multiple interfaces). In fact, because C++ is a statically compiled language, it has no way of performing a dynamic "lookup" at run time as Java does.

This article is an English version of an article originally published in Chinese on aliyun.com.
