Implement OOP using assembly

Source: Internet
Author: User

Object-oriented and procedural programming are both programming ideas; "paradigm" is the academic term for them. It has been said that since cfront generated C code, OOP can be implemented in C itself, and even in assembly, although far too much then has to be done by hand. Procedural design has indeed been used in assembly for a long time, and OOP has been available in assembly for a long time too (TASM introduced OOP support before 1995). OOP implemented in assembly is informal. It cannot offer the safety guarantees of C++, such as strong typing and access control. Encapsulation exists only as a concept that the programmer observes voluntarily.

OOP has three key points: encapsulation, inheritance, and polymorphism. Encapsulation means keeping data together with the functions that operate on it: the data lives in objects, and interfaces are provided to access it. Inheritance can be semantic inheritance or implementation inheritance, reflected respectively in a conceptual hierarchy and in code reuse. Polymorphism means that a call to the same function through a pointer or reference behaves differently for different derived classes.

The OOP object model can be implemented in several ways, which are described in detail in "Inside the C++ Object Model":

1. Put only data in the object, and associate member functions with the class through name mangling.

2. Single table model: place the member function pointers in a separate table and store the table's address in the object (one table per class). In C++ this appears as the vtbl and vptr. This model gives dynamic flexibility at run time, at the cost of two extra dereferences.

3. Double table model: data and functions are split into two tables, and the addresses of both tables are stored in the object, so that every object has a fixed size.

4. Simple model: this is the model actually used in assembly. The data is stored in the object, and so are the function addresses. Both TASM and MASM do this.

In terms of efficiency, the C++ approach is optimal. Assembly is forced into the fourth method for the sake of simplicity, which to some extent runs counter to the efficiency-minded spirit of assembly.
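To make the difference between model 2 and model 4 concrete, here is a minimal C sketch, with C standing in for what the assembler macros arrange by hand. All struct and function names are illustrative assumptions and are not taken from MASM, TASM, or the macro package.

```c
#include <assert.h>
#include <stddef.h>

/* Model 4 ("simple model"): every object carries its own function pointers. */
typedef struct SimpleShape {
    void (*destructor)(struct SimpleShape *self);
    int  (*getArea)(struct SimpleShape *self);
    void (*setColor)(struct SimpleShape *self, int color);
    int  color;
} SimpleShape;

/* Model 2 (single table): the object holds one pointer to a per-class table. */
struct TableShape;
typedef struct ShapeVtbl {
    void (*destructor)(struct TableShape *self);
    int  (*getArea)(struct TableShape *self);
    void (*setColor)(struct TableShape *self, int color);
} ShapeVtbl;

typedef struct TableShape {
    const ShapeVtbl *vptr;
    int color;
} TableShape;

/* Placeholder methods that just report the stored color. */
int simple_area_impl(SimpleShape *self) { return self->color; }
int table_area_impl(TableShape *self)   { return self->color; }

/* Model 4 call: load the slot from the object, then call indirectly. */
int simple_get_area(SimpleShape *s) {
    assert(s != NULL && s->getArea != NULL);
    return s->getArea(s);
}

/* Model 2 call: load vptr, load the slot from the table, then call. */
int table_get_area(TableShape *t) {
    assert(t != NULL && t->vptr != NULL);
    return t->vptr->getArea(t);
}
```

With three methods, each SimpleShape carries three pointers where a TableShape carries one, so the per-object size of the simple model grows with the number of methods while the table model pays an extra memory load per call instead.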

TASM is no longer commonly used, and its OOP practices are similar to those of MASM. This article mainly discusses the OOP approach for MASM written by NaN and Thomas Bleeker. It uses macros to do the background work that a compiler would normally do. The macros themselves rely on many tricks, but using them is quite simple. The macro definitions live in a file named Objects.inc; an .asm file that includes it can use this object model.

Although the macros are well developed, MASM lacks syntax that supports OOP, so many things are troublesome to do or cost space or time. For example, overriding a virtual function of a base class must be done by hand every time. That is, any virtual function declared further up the inheritance hierarchy that a subclass overrides must be overridden manually in that subclass. Despite such shortcomings, OOP still brings many benefits to assembly. For example:

1. Assembly can interact better with object-oriented technologies such as COM and C++. An example of calling COM from assembly + OOP is provided. Writing COM components in assembly + OOP can produce components that are fast and small.

2. It widens the range of problems that assembly can solve, and makes assembly programs easier to manage and to write collaboratively. The author of this object model wrote a neural-network-based handwritten letter recognition program in assembly + OOP; it is under 200 KB, most of which is taken up by image files.

--------------------------------------------------------------------------------

Usage

Define a base class.

    ; Prepare the function prototypes.
    Shape_Init          PROTO :DWORD
    Shap_destructorPto  TYPEDEF PROTO :DWORD
    Shap_getAreaPto     TYPEDEF PROTO :DWORD
    Shap_setColorPto    TYPEDEF PROTO :DWORD, :DWORD

--------------------------------------------------------------------------------

    ; A CLASS is actually a STRUC definition.
    CLASS Shape, Shap
        CMETHOD destructor
        CMETHOD getArea
        CMETHOD setColor
        Color   dd ?
    Shape ENDS

--------------------------------------------------------------------------------

    .data
    ; Initialization data
    BEGIN_INIT
        dd offset Shap_destructor_Funct
        dd offset Shap_getArea_Funct
        dd offset Shap_setColor_Funct
        dd NULL
    END_INIT

--------------------------------------------------------------------------------

    .code

    Shape_Init PROC uses edi esi lpTHIS:DWORD
        ; Performs the actual initialization
        SET_CLASS Shape
        ; ASSUME edi as a pointer to Shape
        SetObject edi, Shape
        ; The DPrint macro is defined separately.
        DPrint "Shape Created (Code in Shape.asm)"
        ; Cancel the ASSUME
        ReleaseObject edi
        ret
    Shape_Init ENDP

    Shap_destructor_Funct PROC uses edi lpTHIS:DWORD
        SetObject edi, Shape
        DPrint "Shape Destroyed (Code in Shape.asm)"
        ReleaseObject edi
        ret
    Shap_destructor_Funct ENDP

    Shap_setColor_Funct PROC uses edi lpTHIS:DWORD, DATA:DWORD
        SetObject edi, Shape
        mov eax, DATA
        mov [edi].Color, eax
        DPrint "Shape Color Set!! (Code in Shape.asm)"
        ReleaseObject edi
        ret
    Shap_setColor_Funct ENDP

    Shap_getArea_Funct PROC uses edi lpTHIS:DWORD
        SetObject edi, Shape
        DPrint ""
        DPrint "SuperClassing!!!!! This allows code re-use if you use this method!!"
        DPrint "Shape's getArea Method! (Code in Shape.asm)"
        mov eax, [edi].Color
        DPrint "Called from Shape.getArea, (Code in Shape.asm)"
        DPrintValH eax, "This objects color val is"
        DPrint ""
        ReleaseObject edi
        ret
    Shap_getArea_Funct ENDP

Inherit from this class.

    include Shape.asm           ; Inherited class info file

    Circle_Init         PROTO :DWORD
    Circ_destructorPto  TYPEDEF PROTO :DWORD
    Circ_setRadiusPto   TYPEDEF PROTO :DWORD, :DWORD
    Circ_getAreaPto     TYPEDEF PROTO :DWORD

--------------------------------------------------------------------------------

    CLASS Circle, Circ
        ; Inherit the base class's data and functions
        Shape <>                ; Inherited class
        CMETHOD setRadius
        Radius  dd ?
    Circle ENDS

--------------------------------------------------------------------------------

    .data
    BEGIN_INIT
        dd offset Circ_destructor_Funct
        dd offset Circ_setRadius_Funct
        dd NULL
    END_INIT

--------------------------------------------------------------------------------

    .code

    Circle_Init PROC uses edi esi lpTHIS:DWORD
        ; Initialize and implement inheritance
        SET_CLASS Circle INHERITS Shape
        SetObject edi, Circle
        ; Equivalent to the constructor resetting the vptr
        OVERRIDE getArea, CirleAreaProc
        DPrint "Circle Created (Code in Circle.asm)"
        ReleaseObject edi
        ret
    Circle_Init ENDP

    Circ_destructor_Funct PROC uses edi lpTHIS:DWORD
        SetObject edi, Circle
        DPrint "Circle Destroyed (Code in Circle.asm)"
        ; Call the base class's version of the function
        SUPER destructor
        ReleaseObject edi
        ret
    Circ_destructor_Funct ENDP

    Circ_setRadius_Funct PROC uses edi lpTHIS:DWORD, DATA:DWORD
        SetObject edi, Circle
        mov eax, DATA
        mov [edi].Radius, eax
        DPrint "Circle Radius Set (Code in Circle.asm)"
        ReleaseObject edi
        ret
    Circ_setRadius_Funct ENDP

    CirleAreaProc PROC uses edi lpTHIS:DWORD
        LOCAL TEMP
        SetObject edi, Circle
        SUPER getArea
        mov eax, [edi].Radius
        mov TEMP, eax
        finit
        fild TEMP               ; st(0) = radius
        fimul TEMP              ; st(0) = radius * radius
        fldpi
        fmul                    ; st(0) = pi * radius * radius
        fistp TEMP              ; store the area rounded to an integer
        mov eax, TEMP
        DPrint "Circle Area (integer Rounded) (Code in Circle.asm)"
        ReleaseObject edi
        ret
    CirleAreaProc ENDP

Create an object from the class and use it.

    DEBUGC equ 1

    .586
    .model flat, stdcall
    option casemap:none

    include \masm32\include\windows.inc
    include \masm32\include\masm32.inc
    include \masm32\include\kernel32.inc
    include \masm32\include\user32.inc
    includelib \masm32\lib\kernel32.lib
    includelib \masm32\lib\user32.lib
    includelib \masm32\lib\masm32.lib

    include Dmacros.inc
    include Objects.inc
    include Circle.asm

    .data
    .data?
    hCircle dd ?

    .code
    start:
        ; Recurse all inherited constructors... and do all inits
        DPrint ""
        DPrint ">>> main.asm <[mov hCircle, $NEW(Circle)]"
        mov hCircle, $NEW(Circle)

        DPrint ""
        DPrint ">>> main.asm <[METHOD hCircle, Circle, setColor, 7]"
        METHOD hCircle, Circle, setColor, 7

        DPrint ""
        DPrint ">>> main.asm <[METHOD hCircle, Circle, setRadius, 2]"
        METHOD hCircle, Circle, setRadius, 2

        DPrint ""
        DPrint "------------ test polymorphic method hCircle.getArea -------------"
        DPrint ""
        DPrint ">>> main.asm <[DPrintValD $EAX(hCircle, Circle, getArea), 'area of hcircle']"
        DPrintValD $EAX(hCircle, Circle, getArea), "Area of hCircle"

        DPrint ""
        DPrint "------------ test polymorphic method hCircle.getArea -------------"
        DPrint ""
        DPrint ">>> main.asm <[DPrintValD $EAX(hCircle, Shape, getArea), 'area of hcircle']"
        DPrint "Typing calling this Ojbect Instance as a SHAPE type only! This is the true value"
        DPrint "of Polymorphism. We dont need to know its a Circle object in order to get"
        DPrint "proper area of this instance object, that is inherited from Shape."
        DPrint ""
        DPrintValD $EAX(hCircle, Shape, getArea), "Area of hCircle"

        DPrint ""
        DPrint ""
        DPrint ">>> main.asm <[DESTROY hCircle]"
        DESTROY hCircle

        DPrint ""
        DPrint ""
        DPrint "NOTE: superclassing here, as each destructor call's the SUPER destructor"
        DPrint "To properly clean up after each class. To see SUPER classing in"
        DPrint "in the Polymorphic getArea Function. Uncomment the SUPER code in"
        DPrint "CircleAreaProc, and re-compile"

        ; ExitProcess takes the exit code as its single argument
        invoke ExitProcess, 0
    end start

It looks messy at first, but it is actually well organized. Each class file consists of four parts.

The first part is the declaration of each member function. In particular, there must be a function named "<ClassName>_Init", which is the class constructor. This name is fixed and cannot be changed.

The second part is the class definition introduced by CLASS, which is really a STRUC, that is, a struct. Including the base class definition in it gives structural inheritance. (Data inheritance is completed by calling SET_CLASS in the constructor.)

The third part is the initialization data (between BEGIN_INIT and END_INIT) placed in .data. It is the counterpart of the C++ vtbl, but it also holds the initial values of the object's data members.

The fourth part is the implementation of each member function. In particular, the SET_CLASS call in the constructor, and any OVERRIDE calls, carry out data inheritance and virtual function overriding.

In practice, you can simply adapt the existing examples. It really does bring great convenience.

--------------------------------------------------------------------------------

Principle

All the secrets are in Objects.inc, which defines the following macros.

    ; --==========================================================--
    ; Macro list index:
    ; --==========================================================--
    ; NEWOBJECT       Creates an object
    ; METHOD          Calls a function in an object
    ; DESTROY         Destroys an object (MUST)
    ; SetObject       Treats the pointer in a register as a pointer to a given structure
    ; ReleaseObject   Cancels that assumption
    ; OVERRIDE        Rewrites a function address in the table to achieve polymorphism
    ; SET_CLASS       Performs initialization and, if necessary, inheritance (MUST)
    ; SUPER           Support for calling the base class's version of a function
    ;
    ; $EAX()          Accelerated METHOD, returns in eax
    ; $EBX()          Accelerated METHOD, returns in ebx
    ; $ESI()          Accelerated METHOD, returns in esi
    ; $EDI()          Accelerated METHOD, returns in edi
    ; $NEW()          Accelerated NEWOBJECT, returns in eax
    ; $SUPER()        Accelerated SUPER, returns in eax
    ; $DESTROY()      Accelerated DESTROY, returns in eax
    ; $Invoke()       Accelerated invoke, returns in eax
    ;
    ; BEGIN_INIT      Marks the start of the initialization data in the data segment (MUST)
    ; END_INIT        Marks the end of the initialization data (MUST)
    ;
    ; CLASS           Is a STRUCT (MUST)
    ; SET_INTERFACE   Declares an abbreviated interface and abbreviated name (MUST)
    ; CMETHOD         Declares a function in a class or interface
    ;
    ; --==========================================================--

There are not many macros, but they do the background work that a compiler would otherwise do for us. I will only show and explain the code before and after macro expansion. Explaining the macro implementations in detail is impractical because they involve a lot of syntax and tricks (in fact, I only consulted the manual and read through them briefly).

Let's look at CLASS first, since it is the natural entry point.

It is very simple: CLASS is replaced with STRUC.

    CLASS Shape, Shap
        CMETHOD destructor
        CMETHOD getArea
        CMETHOD setColor
        Color   dd ?
    Shape ENDS

After replacement:

    Shape STRUC
        CMETHOD destructor
        CMETHOD getArea
        CMETHOD setColor
        Color   dd ?
    Shape ENDS

Naturally, the next thing to look at is how CMETHOD works.

    CMETHOD destructor

becomes

    destructor  PTR Shap_destructorPto ?

Fully expanded, the class is:

    Shape STRUC
        destructor  PTR Shap_destructorPto ?
        getArea     PTR Shap_getAreaPto ?
        setColor    PTR Shap_setColorPto ?
        Color       dd ?
    Shape ENDS

So the struct holds both function pointers and data. The trail goes cold here, though: a struct definition alone clearly does not make an object model. So let's start from object creation and see how new is implemented.

    NEWOBJECT Circle

    -->

    invoke GetProcessHeap
    invoke HeapAlloc, eax, NULL, SIZEOF Circle
    push eax
    invoke Circle_Init, eax
    pop eax

This shows an obvious limitation: the model must be used under Win32, because it calls the Win32 API. You could replace the API calls with an external function, and then swap that function to change dynamic memory allocation on different platforms.

The generated code is very simple: it allocates memory and then calls the object's constructor. The constructor is required to be named "<ClassName>_Init". This is not a big restriction, although it is not very elegant. It does have a rationale: by agreeing on the name as a convention, the call can be made directly, avoiding the overhead that the flexibility of a pointer would bring. As shown later, the destructor is called through a pointer, because destructors are virtual by default.
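The allocate-then-construct sequence, together with the idea of swapping the allocator per platform, can be sketched in C roughly as follows. The hook name object_alloc and the function new_Circle are hypothetical names chosen for this illustration; only the "<ClassName>_Init" convention comes from the macro package.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Circle {
    void (*destructor)(struct Circle *self);
    int radius;
} Circle;

/* Hypothetical allocator hook: point it at another function to port the
   object model to a platform without HeapAlloc. */
typedef void *(*alloc_fn)(size_t size);
alloc_fn object_alloc = malloc;

/* Constructor following the fixed "<ClassName>_Init" naming convention. */
void Circle_Init(Circle *self) {
    assert(self != NULL);
    self->destructor = NULL;
    self->radius = 0;
}

/* Rough equivalent of NEWOBJECT Circle: allocate, then run the constructor
   through a direct (non-pointer) call. */
Circle *new_Circle(void) {
    Circle *obj = object_alloc(sizeof(Circle));
    if (obj == NULL) {
        return NULL;            /* allocation failed, nothing to construct */
    }
    Circle_Init(obj);
    return obj;
}
```

Because the constructor is found by name, the call to Circle_Init is resolved at build time; only the allocator goes through a pointer here, and that indirection is an addition of this sketch.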

Now let's move on to the constructor and see how it is written:

    Shape_Init PROC uses edi esi lpTHIS:DWORD
        SET_CLASS Shape
        SetObject edi, Shape
        DPrint "Shape Created (Code in Shape.asm)"
        ReleaseObject edi
        ret
    Shape_Init ENDP

lpTHIS is a pointer to the object; here, an object is a struct. The first line is the key. SET_CLASS is the most involved and trickiest macro. Let's see how it expands.

    SET_CLASS Shape

    -->

    push esi
    push edi
    cld
    mov esi, offset @InitValLabel
    mov edi, lpTHIS
    mov ecx, @InitValSizeLabel
    shr ecx, 2
    rep movsd
    mov ecx, @InitValSizeLabel
    and ecx, 3
    rep movsb
    pop edi
    pop esi

The push and pop pairs are the usual way to preserve registers. The instruction mov esi, offset @InitValLabel is tied to BEGIN_INIT: offset @InitValLabel is the address marked by BEGIN_INIT. This code does nothing special. It copies the initialization data between BEGIN_INIT and END_INIT into the new object, whose address is lpTHIS. Since SET_CLASS always assumes that it is called in the constructor, lpTHIS certainly exists (as the constructor's parameter). cld, rep movsd and so on are the standard assembly idiom for fast block copies; check the manual for the details. In short, the code copies as much as it can one dword at a time, and then copies the remaining bytes one at a time.
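The dword-then-byte copy can be written out in C as a sketch. The function name copy_init_data is an assumption made for this illustration; the comments map each step to the instruction in the expansion it mirrors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `size` bytes the way the expansion does: whole dwords first,
   then the 0..3 leftover bytes. */
void copy_init_data(void *object, const void *init_data, size_t size) {
    assert(object != NULL && init_data != NULL);
    uint8_t *dst = object;
    const uint8_t *src = init_data;
    size_t dwords = size >> 2;   /* shr ecx, 2 : number of 4-byte moves */
    size_t tail   = size & 3;    /* and ecx, 3 : leftover byte count    */

    for (size_t i = 0; i < dwords; i++) {    /* rep movsd */
        memcpy(dst, src, 4);
        dst += 4;
        src += 4;
    }
    for (size_t i = 0; i < tail; i++) {      /* rep movsb */
        *dst++ = *src++;
    }
}
```

For a size of 11, for example, this performs two 4-byte moves followed by three single-byte moves. In portable C a single memcpy would do the same job; the split is shown only to mirror the assembly.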

With inheritance, things take a little more work.

    SET_CLASS Circle INHERITS Shape

    -->

    push esi
    push edi
    cld
    mov edi, lpTHIS
    mov esi, offset @InitValLabel
    mov eax, [esi]
    mov [edi], eax
    add esi, 4
    add edi, Inher
    mov ecx, (@InitValSizeLabel - 4)
    shr ecx, 2
    rep movsd
    mov ecx, (@InitValSizeLabel - 4)
    and ecx, 3
    rep movsb
    pop edi
    pop esi

Because the class inherits, the destructor must be reset; mov eax, [esi] and mov [edi], eax do this. Since the destructor slot has already been replaced, only the members after it remain to be copied, including the class's own virtual function pointers; the add edi, Inher step presumably moves the destination past the inherited part of the object.
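A C sketch of this inheriting initialization follows. It rests on two labeled assumptions: that the inherited part of the object is filled in before the derived class's data is copied (the expansion shown in the article does not include that step), and that the destination is advanced by the size of the base struct. The names Circle_set_class, CircleInit, and the stub methods are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*method_t)(void *self);

typedef struct Shape {
    method_t destructor;     /* slot 0 */
    method_t getArea;
    method_t setColor;
    int color;
} Shape;

typedef struct Circle {
    Shape base;              /* "Shape <>": the inherited part comes first */
    method_t setRadius;
    int radius;
} Circle;

void Shape_destructor(void *self)  { (void)self; }
void Circle_destructor(void *self) { (void)self; }
void Circle_setRadius(void *self)  { (void)self; }

const Shape Shape_initdata = { Shape_destructor, NULL, NULL, 0 };

/* Circle's initialization data: destructor slot, then Circle's own members. */
typedef struct CircleInit {
    method_t destructor;
    method_t setRadius;
    int radius;
} CircleInit;
const CircleInit Circle_initdata = { Circle_destructor, Circle_setRadius, 0 };

void Circle_set_class(Circle *self) {
    const unsigned char *src = (const unsigned char *)&Circle_initdata;
    unsigned char *dst = (unsigned char *)self;
    memcpy(dst, src, sizeof(method_t));   /* replace the destructor in slot 0 */
    src += sizeof(method_t);              /* skip that slot in the source     */
    dst += sizeof(Shape);                 /* skip the inherited part (assumed) */
    memcpy(dst, src, sizeof(CircleInit) - sizeof(method_t));
}

void Circle_Init(Circle *self) {
    assert(self != NULL);
    /* Assumption: the inherited part is initialized first. */
    memcpy(&self->base, &Shape_initdata, sizeof(Shape));
    Circle_set_class(self);
}
```

After Circle_Init, slot 0 holds Circle_destructor while the rest of the Shape part keeps the base values, and the Circle-only members sit directly behind it.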

Next, destroying the object.

    DESTROY hCircle

    -->

    mov eax, hCircle
    push eax
    call dword ptr [eax]        ; indirect call through the object's first dword
    push eax
    invoke GetProcessHeap
    invoke HeapFree, eax, NULL, hCircle
    pop eax

Because the destructor's address is the first member of the object (struct), the call invokes the destructor. The Win32 API is then used to release the allocated memory.
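The same destroy sequence can be sketched in C. Here malloc and free stand in for the Win32 heap calls, and destroyed_count exists only so the destructor call is observable; both are assumptions of this sketch, as are all the names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Object {
    void (*destructor)(struct Object *self);  /* must be the first member */
    int payload;
} Object;

int destroyed_count = 0;   /* illustration only: counts destructor calls */

void Object_destructor(Object *self) {
    (void)self;
    destroyed_count++;
}

Object *object_new(void) {
    Object *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->destructor = Object_destructor;
        obj->payload = 0;
    }
    return obj;
}

/* Rough equivalent of DESTROY: call through slot 0, then release the memory. */
void object_destroy(Object *obj) {
    assert(obj != NULL && obj->destructor != NULL);
    obj->destructor(obj);   /* indirect call, object passed as the argument */
    free(obj);              /* HeapFree in the Win32 version                */
}
```

Since the call goes through the pointer stored in the object, a derived class that replaced slot 0 in its constructor gets its own destructor run, which is what makes the destructor virtual by default.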

Next is the destructor.

    Circ_destructor_Funct PROC uses edi lpTHIS:DWORD
        SetObject edi, Circle
        DPrint "Circle Destroyed (Code in Circle.asm)"
        SUPER destructor
        ReleaseObject edi
        ret
    Circ_destructor_Funct ENDP

This is the destructor of the Circle class, which inherits from Shape. It uses SUPER, which calls the base class's version of a function. Let's look at its implementation.

    SUPER destructor

    -->

    invoke Circ_destructorPto PTR [(INHER_initdata + INHER.MethodName)], lpTHIS

Circ_destructorPto specifies the function pointer type. INHER is a global item inside the macros that holds the name of the class's base class. The expression INHER_initdata + INHER.MethodName is the address, within the base class's initialization data, of the slot for that method, so the call goes to the base class's original function.
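In C terms, the idea behind SUPER can be sketched like this: the overriding function fetches the base implementation from the base class's initialization data rather than from the object, whose slot has already been overwritten. The names and the arithmetic in Circle_getArea are placeholders invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape {
    int (*destructor)(struct Shape *self);
    int (*getArea)(struct Shape *self);
    int color;
} Shape;

int Shape_destructor(Shape *self) { (void)self; return 0; }
int Shape_getArea(Shape *self)    { return self->color; }

/* Base-class initialization data: still holds the original Shape methods. */
const Shape Shape_initdata = { Shape_destructor, Shape_getArea, 0 };

/* Overriding method that first calls the base version, like "SUPER getArea". */
int Circle_getArea(Shape *self) {
    assert(self != NULL);
    /* Read the slot from the base template, not from the object itself,
       because the object's own slot now points at Circle_getArea. */
    int base_result = Shape_initdata.getArea(self);
    return base_result + 100;   /* placeholder for the derived class's work */
}
```

Calling self->getArea(self) inside Circle_getArea would recurse forever, which is why the lookup has to go through the untouched base template.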

What remains is actually using the functions in an object. (You are "not authorized" to operate on the object's data directly, though only conceptually. In practice you can break the OOP discipline embodied here, because assembly provides no such protection.)

    METHOD hCircle, Shape, getArea

    -->

    mov edx, hCircle
    invoke (Shape PTR [edx]).getArea, edx

This is the payoff. This line embodies polymorphism: hCircle points to an object of the Circle class, but it is interpreted as a Shape when called. Work out for yourself where the polymorphism lies.

--------------------------------------------------------------------------------

My opinion

Looking at the whole, this object model has the following characteristics:

Object data and virtual function pointers are placed in the same table.

All functions are virtual.

When a derived class overrides a virtual function of its base class, the override must be done by hand after the data is initialized (in the constructor).

SUPER can only reach the version of a virtual function held by the immediate base class.

The Win32 API is used to allocate and release memory.

Support for the three features of OOP is as follows:

Encapsulation: data gets no special protection (there is no private). Data and function pointers are placed in the same struct to form an object. Accessing data only through the provided interfaces depends entirely on self-discipline.

Inheritance: structural inheritance is achieved by nesting struct definitions (the definition contains an already defined struct). SET_CLASS is used to inherit data. All inheritance is public.

Polymorphism: in the narrow sense, polymorphism means that calls to the same function behave differently. The following comparison shows why this object model supports polymorphism (because it lets a derived class override base class functions).

    class Shape
    {
        virtual float getArea();
        ......
    };

    class Circle : public Shape
    {
        float getArea();
        ......
    };

When you call a virtual function through an object pointer, you have already specified the pointer's type at compile time. For example:

    float getArea(Shape *shp)
    {
        return shp->getArea();
    }

The compiler can therefore use compile-time information to determine the index in the vtbl of the function you call, and it replaces the call with a vtbl lookup followed by a call. At run time, the pointer passed as shp need not point to a Shape; it may point to a type derived from it. The vtbl contents of the two classes may differ (the derived class overwrites the addresses in some slots). So the derived class's function can be called without knowing what the derived class is. The secret is that the derived class and the base class place their respective implementations at the same position in the vtbl. The position is determined at compile time, and the content at that position is determined at run time.

What about the object model of this assembly version? It is much the same. The pair mov edx, hCircle and invoke (Shape PTR [edx]).getArea, edx is a polymorphic call. hCircle actually points to a Circle object (here, the object both holds the data and takes on the role of the vtbl). The pointer is deliberately interpreted as a Shape during the call. That is, the call uses the index of getArea in the Shape type. Polymorphism arises when the same index leads to different functions.
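A small C sketch of that polymorphic call, under the simple model in which the function pointer lives in the object itself. The names are illustrative, and the area formula uses a crude integer stand-in for pi, so the values differ from what the assembly version prints.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape {
    int (*getArea)(void *self);   /* same slot position in every subclass */
    int color;
} Shape;

typedef struct Circle {
    Shape base;                   /* Shape's layout sits at the start */
    int radius;
} Circle;

int Shape_getArea(void *self) { (void)self; return 0; }

int Circle_getArea(void *self) {
    Circle *c = self;
    return 3 * c->radius * c->radius;   /* crude integer stand-in for pi*r*r */
}

void Shape_Init(Shape *s) {
    s->getArea = Shape_getArea;
    s->color = 0;
}

void Circle_Init(Circle *c) {
    Shape_Init(&c->base);
    c->base.getArea = Circle_getArea;   /* corresponds to OVERRIDE getArea */
    c->radius = 0;
}

/* Like "METHOD h, Shape, getArea": the caller only knows the Shape layout. */
int area_of(void *handle) {
    assert(handle != NULL);
    Shape *as_shape = handle;
    return as_shape->getArea(handle);
}
```

area_of never mentions Circle, yet for a Circle handle it ends up in Circle_getArea, because the slot position is fixed by the Shape layout and the slot content was set when the object was constructed.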

--------------------------------------------------------------------------------

Possible improvements

Virtual Functions

To be honest, this object model is really good, and it pushes macros to their limit. However, it requires every class to have a virtual destructor and makes all functions virtual. Within what macros can do, definitions and calls have been made as simple as possible, and they are pleasant to read. Still, I think that putting all functions into the object (forcing them to be virtual) adds cost.

C++ treats non-virtual member functions as ordinary functions and keeps them outside objects. I think this object model could do the same. Rather than forcing member function prototype names into the form "<abbreviation>_<function name>Pto", it would be better to provide a macro.

My suggestion is to put only some functions into the object (that is, declare them in the class with CMETHOD). The others are not put in the object but are written in the same file. Then, when METHOD is used to call a member function of an object, it determines at assembly time whether the member function exists in the virtual function table (that is, the series of function pointers stored in the object). If not, it is called like an ordinary function. If so, it is called as it is now.

The METHOD macro would have to be rewritten accordingly. Just as the SUPER macro can check whether the method first appears in the base class (and whether that is the immediate base class rather than one several levels up), METHOD can check whether the called method is declared in the class and then use the appropriate call mechanism.
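The two call mechanisms in this suggestion can be contrasted in a short C sketch. The macro names CALL_DIRECT and CALL_VIRTUAL are hypothetical, and here the caller picks the mechanism explicitly, whereas the proposal is for METHOD to pick it automatically at assembly time.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape {
    int (*getArea)(struct Shape *self);  /* virtual: stored in the object */
    int color;
} Shape;

/* Non-virtual member: an ordinary function with no slot in the object. */
int Shape_getColor(Shape *self) {
    assert(self != NULL);
    return self->color;
}

/* Implementation that a constructor would store in the getArea slot. */
int Shape_getArea_impl(Shape *self) { (void)self; return 0; }

/* Direct call: the target is resolved by name at build time. */
#define CALL_DIRECT(Class, method, obj)  Class##_##method(obj)

/* Indirect call: the target is read from the pointer stored in the object. */
#define CALL_VIRTUAL(Class, method, obj) (((Class *)(obj))->method(obj))
```

A non-virtual method costs neither a slot in every object nor an indirect call, which is the saving the suggestion is after; only methods that subclasses actually override need to stay in the object.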

About SUPER

In this model, SUPER can only call the overridden function one level up (that is, in the parent class); overridden functions further up the hierarchy cannot be reached with SUPER. The obstacle is that the macro cannot know the class in which the function first appears. If you supply that class name manually, SUPER can reach the overridden function at any level. Like this:

    SUPER getArea

    SUPER getArea, Shape

If no class name is given, SUPER targets the immediate parent. Otherwise, SUPER targets the named class. The secret of SUPER is to look up the data in "<ClassName>_initdata" in the .data segment to recover the function that was overridden.

TASM's TABLE, which is a better STRUC, can make up for the OVERRIDE shortcoming. But that is the result of Borland's syntax extensions. In any case, Thomas did this very well.
