The object-oriented idea is difficult to adapt to this distributed software model, so the idea of componentized programming has developed rapidly.
Under componentized programming, a complex application is designed as a set of small, single-purpose component modules. These modules can run on the same machine or on different machines.
To build such an application, the component programs must follow very detailed specifications for how they interact. The system works correctly only when every component program complies with these common specifications.
To this end, OMG proposed the Common Object Request Broker Architecture (CORBA) standard and Microsoft proposed the COM (Component Object Model) standard. Currently, CORBA is mainly used on UNIX operating system platforms, while COM is mainly used on the Microsoft Windows platform.
In the COM standard, a component program is also called a module. It can be a dynamic-link library (DLL), in which case it is called an in-process component, or an executable program (EXE), in which case it is called an out-of-process component.
COM objects are based on binary executable code, whereas objects in C++ and other languages are based on source code. COM objects are therefore language independent, which makes it possible for component objects developed in different programming languages to interact.
On the Microsoft Windows system platform, the COM technology is applied to all layers of the system. From the underlying COM Object Management to the upper-layer application interaction, the COM standard is used.
Overview
COM not only sets a standard for interaction between components but also provides an environment for that interaction. Because the interaction between component objects does not depend on any specific language, COM can also serve as a standard for collaborative development across languages.
OLE technology is based on the COM specification. OLE gives full play to the advantages of the COM standard, making applications on Windows operating systems highly interactive. Without the support of OLE, the Windows operating system would be inferior.
However, the COM specification is not limited to OLE. In fact, OLE is only one application of COM. Over the past few years, OLE has shown considerable limitations in network interconnection, whereas COM has proved highly adaptable.
The COM standard consists of two parts: a specification and an implementation. The specification defines the communication mechanism between components. It does not depend on any specific language or operating system, so it can be used from any language. The implementation part is the COM library, which provides core services for concrete implementations of the COM standard.
COM is an object-based software model, so objects are one of its basic elements. As in C++, an object is an instance of a class, and a class is a definition that groups related data and functions together. An application (or another object) that uses an object is called a client, and sometimes a user of the object.
An interface is a set of logically related functions. Its functions are also called interface member functions. Objects provide various forms of services to customers through interface member functions.
In the COM model, the object itself is invisible to the client. A client can request services only through an interface. Each interface is identified by a 128-bit GUID (Globally Unique Identifier). The client obtains an interface pointer by means of this GUID and then calls the interface's member functions through the pointer.
In general, an interface does not change. As long as the desired interface still exists in the component object, the client can continue to use the services it provides. An object can support multiple interfaces, so a component object can be upgraded by adding interfaces, and the new interfaces do not affect the use of the old ones.
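The lookup of an interface by its identifier can be sketched in portable C++. This is only an illustration under stated assumptions: the Guid struct is a simplified stand-in for the real 128-bit GUID type, and the interfaces IDictionary and ISpellCheck, the class DictionaryObject, and all three IID values are made up for the example. An older client that only knows IID_IDictionary keeps working after ISpellCheck is added.

```cpp
#include <cstdint>
#include <cstring>

// Simplified 128-bit identifier (assumption: stands in for the real GUID type).
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

inline bool operator==(const Guid& a, const Guid& b) {
    return a.data1 == b.data1 && a.data2 == b.data2 && a.data3 == b.data3 &&
           std::memcmp(a.data4, b.data4, sizeof a.data4) == 0;
}

// Hypothetical interface IDs, invented for this sketch.
const Guid IID_IDictionary = {0x11111111, 0x0001, 0x0001, {1, 2, 3, 4, 5, 6, 7, 8}};
const Guid IID_ISpellCheck = {0x22222222, 0x0002, 0x0002, {1, 2, 3, 4, 5, 6, 7, 8}};
const Guid IID_IThesaurus  = {0x33333333, 0x0003, 0x0003, {1, 2, 3, 4, 5, 6, 7, 8}};

// The original interface.
struct IDictionary {
    virtual bool LookupWord(const char* word) = 0;
};

// An interface added later; IDictionary itself is left untouched.
struct ISpellCheck {
    virtual bool CheckWord(const char* word) = 0;
};

// One object that supports both interfaces.
class DictionaryObject : public IDictionary, public ISpellCheck {
public:
    // Hands out a pointer to the requested interface, or reports failure.
    bool QueryInterface(const Guid& iid, void** ppv) {
        if (iid == IID_IDictionary) { *ppv = static_cast<IDictionary*>(this); return true; }
        if (iid == IID_ISpellCheck) { *ppv = static_cast<ISpellCheck*>(this); return true; }
        *ppv = nullptr;   // interface not supported by this object
        return false;
    }
    bool LookupWord(const char* word) override { return std::strcmp(word, "component") == 0; }
    bool CheckWord(const char* word) override { return word[0] != '\0'; }
};
```

A client would call QueryInterface with the IID it knows and cast the returned pointer to the matching interface type. A request for an unsupported interface (IID_IThesaurus here) simply fails.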
How do clients identify COM objects? As with interfaces, each COM object class is identified by a 128-bit GUID, called the CLSID (class identifier, or class ID). Using a CLSID guarantees, in a probabilistic sense, that the identification is globally unique.
As long as the system contains information about such COM objects, including the module files (DLL or EXE files) of COM objects and the COM object entry points in the code, the client program can create a COM object through the CLSID.
So how do customers use the services provided by COM objects? What do customers get?
In fact, after the client successfully creates an object, it obtains a pointer to one of the object's interfaces. Because a COM object implements at least one interface, the client can call all the services provided by that interface.
A COM object can also have its own state, which is what makes the client aware of the object's existence. If the client holds two objects with the same CLSID at the same time, the two objects can have different states. The client does not have to care how the COM object is implemented or how the state data of the two objects is organized (as an array or a linked list, for example). A COM object can also be stateless. Such an object mainly provides function services and can replace traditional API function interfaces, making an application programming interface more orderly and its organization more hierarchical.
In addition to the specification, COM has an implementation part consisting of core system-level code. It is this core code that allows objects and clients to interact through interfaces at the binary level.
On Microsoft Windows, these libraries exist as .DLL files and include the following:
(1) A small number of API functions for building COM applications on both the client and the server side. On the client side these are mainly creation functions, while on the server side they provide support for access to objects.
(2) A lookup service through which COM finds the local server, that is, the EXE program, and converts between program names and CLSIDs by means of the registry.
(3) Standard memory-control methods that let applications control memory allocation within a process.
The COM library is generally implemented at the operating system level rather than at the application layer, so an operating system has only one COM library. The implementation of the COM library also depends on the specific system platform, especially on the platform's underlying standards.
The COM library ensures that all components interact in a uniform way. When writing a COM application, we do not need to write the large amount of basic code that COM communication requires. Instead, we program directly against the APIs provided by the COM library, which greatly speeds up development. For example, the current version of the COM library supports remote components, that is, Distributed COM. Programs can communicate across the network without our writing any network or RPC (Remote Procedure Call) code.
If we implement COM objects in an object-oriented language, we can naturally use classes to define the objects. In C, an object is only a logical concept. If two objects exist at the same time, the interface implementation must know which object an operation is intended for. The definition of the COM interface guarantees this.
The idea of using GUIDs to identify COM objects comes from the UUID (Universally Unique Identifier) adopted by the OSF (Open Software Foundation). The UUID is defined as part of DCE (Distributed Computing Environment), where it is mainly used to identify the two sides of an RPC communication.
In addition to encapsulation and reusability, C++ objects have another important feature: polymorphism. It is polymorphism that lets C++ express a high level of abstraction. COM objects are also polymorphic, but their polymorphism is expressed only through the interfaces of the COM object, just as the polymorphism of a C++ object is expressed through its virtual functions.
From API to COM interface
Suppose we want to build a word processing application that needs a dictionary lookup function. Following the componentized design method, we would naturally implement the dictionary lookup function in a component program (.dll). If the lookup algorithm or the dictionary data changes in the future, the application can still use the new component program as long as the interface between the application and the component stays the same. This is the flexibility that component programs bring.
To connect the application and the component program so that they work together, the simplest approach is to define a group of dictionary functions first. These functions should be as general as possible and should not embed knowledge of any specific dictionary.
Function          Description
Initialize        Initialization
LoadLibrary       Load the dictionary library
InsertWord        Insert a word
DeleteWord        Delete a word
LookupWord        Look up a word
RestoreLibrary    Store the in-memory dictionary into the specified file
FreeLibrary       Release the dictionary library
The flat API interface layer can be used to connect two programs, but the following problems exist:
(1) When there are many API functions, they are inconvenient to use and need to be organized.
(2) API functions must be standardized and processed in a unified call method to adapt to programming implementation in different languages. The parameter transmission sequence, parameter type, and response processing must be standardized.
COM defines a complete set of interface specifications. These specifications not only make up for the shortcomings of a flat API as a component interface but also let component objects show their strengths, including polymorphism.
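As a rough illustration of the step from a flat API to an interface, the dictionary functions can be grouped into one abstract C++ class. The parameter lists below are assumptions, since the text gives only the function names, and the file-related functions (LoadLibrary, RestoreLibrary, FreeLibrary) are left out to keep the sketch short. MemoryDictionary and ClientKnowsWord are hypothetical names.

```cpp
#include <map>
#include <string>
#include <utility>

// The flat dictionary functions grouped into one interface.
// Signatures are assumed for this sketch, not taken from the text.
class IDictionary {
public:
    virtual ~IDictionary() {}
    virtual bool Initialize() = 0;
    virtual bool InsertWord(const std::string& word, const std::string& meaning) = 0;
    virtual void DeleteWord(const std::string& word) = 0;
    virtual bool LookupWord(const std::string& word, std::string* meaning) const = 0;
};

// One possible component behind the interface (hypothetical, in-memory only).
class MemoryDictionary : public IDictionary {
public:
    bool Initialize() override { words_.clear(); return true; }
    bool InsertWord(const std::string& word, const std::string& meaning) override {
        // insert() leaves an existing entry untouched and reports false.
        return words_.insert(std::make_pair(word, meaning)).second;
    }
    void DeleteWord(const std::string& word) override { words_.erase(word); }
    bool LookupWord(const std::string& word, std::string* meaning) const override {
        std::map<std::string, std::string>::const_iterator it = words_.find(word);
        if (it == words_.end()) return false;
        *meaning = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> words_;
};

// Client code depends only on IDictionary, so the component can be replaced
// without changing the client.
bool ClientKnowsWord(const IDictionary& dictionary, const std::string& word) {
    std::string meaning;
    return dictionary.LookupWord(word, &meaning);
}
```

Because the client sees only IDictionary, a component with a different lookup algorithm or storage format can be substituted as long as it implements the same member functions.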
Interface Definition and ID
Technically, an interface is a data structure that contains a group of functions. Through this data structure, client code can call the functions of a component object. The interface defines a group of member functions, which are all the information the component object exposes. The client program uses the services of the component object through these functions.
The client calls interface member functions through a pointer to the interface data structure. The interface pointer actually points to another pointer, and this second pointer points to a table of functions called the interface function table (virtual function table, or vtable). Each entry in the table is a function pointer (four bytes on 32-bit Windows) that refers to the object's concrete implementation of that function. Once the client has the interface pointer, it can therefore call the object's actual functions.
For a given interface, the vtable is fixed: the number of member functions and their order do not change, and the parameters and return value of each member function are fixed as well.
Everything in an interface definition must be determined at the binary level. Any language that can describe this memory layout, that is, any language that supports a "structure" or "record" type whose members can include a pointer to a table of function pointers, can describe interfaces and can therefore be used to write or use COM components.
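The memory layout described above can be written out by hand, without C++ virtual functions, which is roughly what a language with only structures and function pointers would do. All names in this sketch (DictionaryInterface, DictionaryVtbl, DictionaryObject, CreateDictionary) are hypothetical. The explicit self parameter is how the implementation knows which object an operation is intended for.

```cpp
struct DictionaryInterface;

// The function table: one pointer per member function, in a fixed order.
struct DictionaryVtbl {
    int (*LookupWord)(DictionaryInterface* self, const char* word);
    int (*WordCount)(DictionaryInterface* self);
};

// What an interface pointer points to: a pointer to the function table.
struct DictionaryInterface {
    const DictionaryVtbl* vtbl;
};

// The concrete object: the interface part comes first, private state follows.
struct DictionaryObject {
    DictionaryInterface iface;   // first member, so the two pointers convert
    int wordCount;               // per-object state, invisible to the client
};

static int Dict_LookupWord(DictionaryInterface* self, const char* word) {
    (void)self;                                  // toy rule, no state needed
    return word != nullptr && word[0] == 'c';
}

static int Dict_WordCount(DictionaryInterface* self) {
    // Recover the whole object from the pointer to its first member.
    return reinterpret_cast<DictionaryObject*>(self)->wordCount;
}

// One shared table for all objects of this class.
static const DictionaryVtbl kDictionaryVtbl = { Dict_LookupWord, Dict_WordCount };

// Initializes caller-provided storage and returns the interface pointer.
DictionaryInterface* CreateDictionary(DictionaryObject* storage, int wordCount) {
    storage->iface.vtbl = &kDictionaryVtbl;
    storage->wordCount = wordCount;
    return &storage->iface;
}
```

A client holding a DictionaryInterface* p calls a member function as p->vtbl->WordCount(p). Two objects of the same class share one table but keep separate state.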
Interface Description Language (IDL)
The COM specification extends the IDL that the OSF DCE specification uses to describe remote call interfaces, producing the description language for COM interfaces.
The IDL used by the COM standard can define not only COM interfaces but also common data types and user-defined data structures. For interface member functions, we can specify the type of each parameter, whether it is an input or an output, and even describe variable-length arrays. IDL supports pointer types, much as C/C++ does.
Microsoft Visual C++ provides the MIDL tool, which compiles an IDL interface description file into a C/C++ compatible interface header file (.h).
Definition of IUnknown (IDL):
interface IUnknown
{
    HRESULT QueryInterface([in] REFIID iid, [out] void **ppv);
    ULONG AddRef(void);
    ULONG Release(void);
}
Definition of IUnknown (C++):
class IUnknown
{
public:
    virtual HRESULT __stdcall QueryInterface(REFIID iid, void **ppv) = 0;
    virtual ULONG __stdcall AddRef(void) = 0;
    virtual ULONG __stdcall Release(void) = 0;
};
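To show how an object might implement these three member functions, here is a minimal portable sketch of reference counting. It does not use the Windows headers: IUnknownLike is a stand-in for IUnknown, integer IDs replace GUIDs, and Dictionary, CreateDictionary, and g_liveObjects are hypothetical names introduced only for the example.

```cpp
// Portable stand-in for IUnknown; integer IDs replace GUIDs for brevity.
class IUnknownLike {
public:
    virtual bool QueryInterface(int iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

const int IID_IUnknownLike = 0;   // hypothetical interface ID

int g_liveObjects = 0;            // lets the example observe destruction

class Dictionary : public IUnknownLike {
public:
    Dictionary() : refCount_(0) { ++g_liveObjects; }

    bool QueryInterface(int iid, void** ppv) override {
        if (iid != IID_IUnknownLike) { *ppv = nullptr; return false; }
        *ppv = static_cast<IUnknownLike*>(this);
        AddRef();                 // every pointer handed out owns one reference
        return true;
    }
    unsigned long AddRef() override { return ++refCount_; }
    unsigned long Release() override {
        unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;   // last reference gone: self-destruct
        return remaining;
    }

private:
    // Private destructor: the object can only be destroyed through Release().
    virtual ~Dictionary() { --g_liveObjects; }
    unsigned long refCount_;
};

// Creates an object and returns it holding one reference for the caller.
IUnknownLike* CreateDictionary() {
    Dictionary* object = new Dictionary();
    object->AddRef();
    return object;
}
```

The client never deletes the object directly. It calls Release once for every pointer it received, and the object destroys itself when the count reaches zero. The counter here is not thread-safe, which a real component would need to consider.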
In-process Components
Because an in-process component and the client program run in the same process address space, once the client program has established communication with the component, the interface pointer it obtains leads directly to the interface's vtable in the component. The vtable contains the addresses of all the member functions, so client code can call them directly, which is very efficient.
Because the DLL program is loaded into the memory by the customer at runtime, the DLL module itself is also independent and does not depend on the customer program.
In C++, to make a compiled DLL more widely usable, the __stdcall calling convention is generally specified for the DLL's exported functions. If the __cdecl calling convention is used, some programming language environments cannot use the DLL. The C++ compiler also generates a decorated name for each exported function, and these decorated names are not compatible across compilers. For the sake of generality, we therefore add the extern "C" specifier before each function definition. In the Visual C++ development environment, a function can be declared as follows:
extern "C" int __stdcall MyFunction(int n);
To build the DLL, follow these steps:
(1) Create a DLL project.
(2) Use the extern "C" specifier and the __stdcall modifier, as shown above for MyFunction.
(3) In the traditional approach, we would also write a DEF file to describe the module information of the DLL. On the Win32 platform, we can use the __declspec(dllexport) specifier directly instead of a DEF file. For example:
extern "C" __declspec(dllexport) int __stdcall MyFunction(int n);
A DLL module created in this way can be called by other programs, because the C++ linker places all the information about the exported function in the final binary.
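The declaration above is often wrapped in macros so that the same source also compiles outside Visual C++. The following is a small sketch under that assumption. DICT_API and DICT_CALL are hypothetical macro names, and the body of MyFunction is invented, since the text shows only its declaration.

```cpp
// Export and calling-convention macros (hypothetical names).
// The Microsoft-specific keywords are used only when building for Windows.
#if defined(_WIN32)
#  define DICT_API  extern "C" __declspec(dllexport)
#  define DICT_CALL __stdcall
#else
#  define DICT_API  extern "C"
#  define DICT_CALL
#endif

// Declaration, as it might appear in the component's header file.
DICT_API int DICT_CALL MyFunction(int n);

// Definition compiled into the DLL. The body is a placeholder.
DICT_API int DICT_CALL MyFunction(int n) {
    return n * 2;
}
```

With extern "C", the exported name carries no C++ decoration. Note that on 32-bit x86 a __stdcall function exported this way may still carry a calling-convention decoration (for example a leading underscore and an @ suffix with the argument size) unless a DEF file assigns a plain name.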
From the client program side, there are three system functions that can be used to operate DLL programs, LoadLibrary, GetProcAddress, and FreeLibrary.
Generally, the use of DLL programs follows these steps:
First, the client program uses the LoadLibrary function to load the DLL. This function returns the instance handle of the module for future operations on this module.
Then the client program calls the GetProcAddress function to obtain the address of a function exported by the DLL. The function can be specified either by its ordinal (as given in the DEF file) or by its name. Because the client program and the DLL are in the same memory address space, the client program can call the function directly.
Finally, FreeLibrary unloads the DLL program out of memory to release resources.
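The three steps above might look as follows in client code. This is a sketch, not a complete client: CallMyFunction is a hypothetical helper, the function pointer type assumes the MyFunction signature shown earlier, and the exported name is passed in as a parameter because it depends on how the DLL was built. On platforms other than Windows the sketch compiles to a stub that only reports failure.

```cpp
#if defined(_WIN32)
#include <windows.h>

// Pointer type matching: int __stdcall MyFunction(int n);
typedef int (__stdcall *MyFunctionPtr)(int);

// Loads the DLL, calls one exported function, and unloads the DLL again.
// Returns false if the DLL or the export cannot be found.
bool CallMyFunction(const char* dllPath, const char* exportName,
                    int arg, int* result) {
    HMODULE module = LoadLibraryA(dllPath);          // step 1: load the DLL
    if (module == NULL) return false;

    MyFunctionPtr fn = reinterpret_cast<MyFunctionPtr>(
        GetProcAddress(module, exportName));         // step 2: find the export
    bool found = (fn != NULL);
    if (found) *result = fn(arg);                    // direct in-process call

    FreeLibrary(module);                             // step 3: unload the DLL
    return found;
}
#else
// Not Windows: the Win32 loader functions are unavailable, so this stub
// only reports failure and leaves *result untouched.
bool CallMyFunction(const char*, const char*, int, int*) {
    return false;
}
#endif
```

A real client would usually keep the module handle for as long as it uses the function pointer and call FreeLibrary only when it is done with the component.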
Note:
(1) A DLL can export global variables as well as functions. Because the client program and the DLL are in the same address space, exporting a global variable from the DLL to the client program is meaningful. Doing so is not complicated: place the variable name in the EXPORTS section of the DEF file and add the DATA option, or add the __declspec(dllexport) specifier before the variable declaration.
(2) DumpBin with the /EXPORTS option lists everything a DLL exports.
(3) The client program can itself be a DLL, but it must first be loaded into the process space before it can call the system functions to operate on the DLL module that acts as the server.
Out-of-process Components
Because an out-of-process component and the client program live in different processes with different address spaces, communication between the component and the client must cross the process boundary. This raises the following issues:
(1) How does one process call functions in another process?
(2) How are parameters transmitted from one process to another?
On Windows, there are many ways for processes to communicate, including DDE, named pipes, and shared memory. COM uses LPC (Local Procedure Call) and RPC (Remote Procedure Call).
RegEdit can be used to inspect the registered COM objects under the CLSID subkey.
Microsoft Visual C++ provides OleView.exe, which lists all the categories on the current machine and the component objects under each category.
RegSvr32 D:/DicComp/DictComp.dll
RegSvr32 /u D:/DicComp/DictComp.dll
To be registered with RegSvr32, a DLL component must export two entry functions: DllRegisterServer and DllUnregisterServer.
COM requires out-of-process components that support self-registration to accept two command-line parameters, /RegServer and /UnregServer, which perform registration and unregistration respectively.
Class Factory
In fact, the client program does not call the component program's functions directly. It calls a COM library function to create a component object, and the COM library's creation function uses the information in the registry to call the component program's entry function, which creates the object. The component program must therefore provide a standard entry function, DllGetClassObject, which supplies the component information of the component program.
Class Factory and the DllGetClassObject Function
A class factory is the production base for COM objects: the COM library creates COM objects through class factories. Each COM class has a class factory that creates objects of that class. The class factory is itself a COM object, and it supports a special interface, IClassFactory, defined as follows:
class IClassFactory : public IUnknown
{
public:
    virtual HRESULT __stdcall CreateInstance(IUnknown *pUnknownOuter, const IID& iid, void **ppv) = 0;
    virtual HRESULT __stdcall LockServer(BOOL bLock) = 0;
};
The IClassFactory interface has an important member function, CreateInstance, which creates the corresponding COM object. Because each class factory serves one specific COM class, CreateInstance knows what kind of COM object to create. Its first parameter, pUnknownOuter, is used when the object is being aggregated, and is set to NULL when aggregation is not used. The other member function of IClassFactory, LockServer, is used to control the lifetime of the component.
Since the class factory is itself a COM object that creates other COM objects, who creates the class factory? The answer is DllGetClassObject. DllGetClassObject is not a COM library function but an exported function implemented by the component program. Its prototype is as follows:
HRESULT DllGetClassObject(const CLSID& clsid,
                          const IID& iid,
                          void **ppv
);
After receiving a request to create an object, the COM library calls the DllGetClassObject function of the in-process component. That function creates the class factory object and returns an interface pointer to it. Once the COM library or the client holds the class factory's interface pointer, it can create the corresponding COM object through the CreateInstance member function of IClassFactory.
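The creation sequence just described can be simulated in portable C++. This is a simplified model, not the real COM headers: Result stands in for HRESULT, integer IDs replace GUIDs, GetClassObject plays the role of the component's DllGetClassObject entry point, and every type name (IUnknownLike, IClassFactoryLike, RefCounted, Dictionary, DictionaryFactory) is hypothetical.

```cpp
typedef long Result;                        // stand-in for HRESULT
const Result kOk = 0, kNoInterface = -1, kClassNotAvailable = -2, kNoAggregation = -3;

// Integer IDs keep the sketch short; real COM uses 128-bit GUIDs.
enum Id { IID_Unknown = 1, IID_ClassFactory = 2, IID_Dictionary = 3, CLSID_Dictionary = 100 };

struct IUnknownLike {
    virtual Result QueryInterface(Id iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IUnknownLike() {}              // clients must use Release()
};
struct IDictionary : IUnknownLike {
    virtual int WordCount() = 0;
};
struct IClassFactoryLike : IUnknownLike {
    virtual Result CreateInstance(IUnknownLike* outer, Id iid, void** ppv) = 0;
    virtual Result LockServer(bool lock) = 0;
};

// Shared reference-counting code for both the object and its factory.
template <class Interface, Id InterfaceId>
class RefCounted : public Interface {
public:
    Result QueryInterface(Id iid, void** ppv) override {
        if (iid == IID_Unknown)      *ppv = static_cast<IUnknownLike*>(this);
        else if (iid == InterfaceId) *ppv = static_cast<Interface*>(this);
        else { *ppv = nullptr; return kNoInterface; }
        AddRef();                           // the returned pointer owns a reference
        return kOk;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
private:
    unsigned long refs_ = 0;
};

class Dictionary : public RefCounted<IDictionary, IID_Dictionary> {
public:
    int WordCount() override { return 0; }  // placeholder behavior
};

class DictionaryFactory : public RefCounted<IClassFactoryLike, IID_ClassFactory> {
public:
    Result CreateInstance(IUnknownLike* outer, Id iid, void** ppv) override {
        if (outer != nullptr) { *ppv = nullptr; return kNoAggregation; }
        Dictionary* object = new Dictionary();
        Result r = object->QueryInterface(iid, ppv);
        if (r != kOk) delete object;        // no reference was handed out
        return r;
    }
    Result LockServer(bool lock) override { lockCount += lock ? 1 : -1; return kOk; }
    static int lockCount;                   // a real server would use this to stay loaded
};
int DictionaryFactory::lockCount = 0;

// Plays the role of the component's exported DllGetClassObject function.
Result GetClassObject(Id clsid, Id iid, void** ppv) {
    if (clsid != CLSID_Dictionary) { *ppv = nullptr; return kClassNotAvailable; }
    DictionaryFactory* factory = new DictionaryFactory();
    Result r = factory->QueryInterface(iid, ppv);
    if (r != kOk) delete factory;
    return r;
}
```

In this model the caller first asks GetClassObject for the factory, then calls CreateInstance on it to obtain the object, and finally releases both pointers when it no longer needs them.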
Interaction between the COM library and the class factory
In the COM library, three API functions can be used to create objects: CoGetClassObject, CoCreateInstance, and CoCreateInstanceEx. Generally, a client program calls one of them to create an object and obtain the object's initial interface pointer. The COM library also interacts with the class factory through these three functions.
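How the creation functions relate to each other can be modeled in a few lines: creating an instance amounts to obtaining the class factory and then asking it for an object. The sketch below is an analogy only. A std::map stands in for the registry, std::unique_ptr replaces COM reference counting, integer class IDs replace CLSIDs, and SimGetClassObject, SimCreateInstance, and the other names are hypothetical, not COM library functions.

```cpp
#include <map>
#include <memory>
#include <string>

struct IObject {
    virtual ~IObject() {}
    virtual std::string Name() const = 0;
};

struct IFactory {
    virtual ~IFactory() {}
    virtual std::unique_ptr<IObject> CreateInstance() = 0;
};

// Signature of a component's entry function (the role of DllGetClassObject).
typedef std::unique_ptr<IFactory> (*GetClassObjectFn)();

// Stand-in for the registry: maps a class ID to a component entry function.
std::map<int, GetClassObjectFn>& Registry() {
    static std::map<int, GetClassObjectFn> registry;
    return registry;
}

// Analogue of CoGetClassObject: look up the class, return its factory.
std::unique_ptr<IFactory> SimGetClassObject(int clsid) {
    std::map<int, GetClassObjectFn>::const_iterator it = Registry().find(clsid);
    if (it == Registry().end()) return nullptr;   // class is not registered
    return it->second();
}

// Analogue of CoCreateInstance: get the factory, create one object,
// then let go of the factory.
std::unique_ptr<IObject> SimCreateInstance(int clsid) {
    std::unique_ptr<IFactory> factory = SimGetClassObject(clsid);
    if (!factory) return nullptr;
    return factory->CreateInstance();   // factory is destroyed on return
}

// A hypothetical dictionary component.
const int kClsidDictionary = 1;

struct Dictionary : IObject {
    std::string Name() const override { return "Dictionary"; }
};

struct DictionaryFactory : IFactory {
    std::unique_ptr<IObject> CreateInstance() override {
        return std::unique_ptr<IObject>(new Dictionary());
    }
};

std::unique_ptr<IFactory> DictionaryGetClassObject() {
    return std::unique_ptr<IFactory>(new DictionaryFactory());
}
```

A client that needs only one object would use the one-step call, while a client that creates many objects of the same class could keep the factory returned by SimGetClassObject and call CreateInstance on it repeatedly.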