Item 34: Minimize compilation dependencies between files (Effective C++, Second Edition)

Source: Internet
Author: User

Suppose that one day you open your C++ program and make a minor change to the implementation of a class. Mind you, the change is not to the interface, only to the implementation, that is, to the private details. You then rebuild the program, expecting compilation and linking to take only a few seconds. After all, only one class has changed. You click "Rebuild" or type make (or some equivalent), and you are first astonished, then mortified, as you realize that the whole world is being recompiled and relinked!

Don't you just hate it when that happens?

The problem is that C++ does not do a very good job of separating interfaces from implementations. In particular, a class definition includes not only the interface specification but also a fair number of implementation details. For example:

class Person {
public:
  Person(const string& name, const Date& birthday,
         const Address& addr, const Country& country);

  virtual ~Person();

  ...                      // copy constructor and assignment
                           // operator omitted for simplicity
  string name() const;
  string birthDate() const;
  string address() const;
  string nationality() const;

private:
  string name_;            // implementation detail
  Date birthDate_;         // implementation detail
  Address address_;        // implementation detail
  Country citizenship_;    // implementation detail
};

This is hardly a brilliant design, although it does illustrate an interesting naming convention: when private data and a public function want the same name, the data member is distinguished by a trailing underscore. The important point is that Person is implemented in terms of other classes, namely string, Date, Address, and Country. To compile Person, the compiler must have access to the definitions of those classes. Such definitions are typically provided through #include directives, so at the top of the header file defining the Person class you are likely to find something like this:

#include <string>          // for type string (see Item 49)
#include "date.h"
#include "address.h"
#include "country.h"

Unfortunately, this sets up a compilation dependency between the file defining Person and these header files. As a result, if any of the auxiliary classes (string, Date, Address, and Country) changes its implementation, or if any class on which those auxiliary classes depend changes its implementation, the file containing Person must be recompiled, as must every file that uses the Person class. For clients of Person this is more than annoying, because there is nothing they can do about it.

You might wonder why C++ insists on putting a class's implementation details in the class definition. For example, why can't you define Person this way, specifying the implementation details separately?

class string;              // "conceptual" forward declaration of the
                           // string type; see Item 49 for details

class Date;                // forward declaration
class Address;             // forward declaration
class Country;             // forward declaration

class Person {
public:
  Person(const string& name, const Date& birthday,
         const Address& addr, const Country& country);

  virtual ~Person();

  ...                      // copy constructor, operator=

  string name() const;
  string birthDate() const;
  string address() const;
  string nationality() const;
};

If that were possible, clients of Person would have to recompile only when the interface to the class changed. In large systems, interfaces tend to stabilize before implementations do, so this separation of interface from implementation could save a great deal of recompilation and relinking time.

Unfortunately, this is where idealism runs up against reality, as you will see when you consider something like this:

int main()
{
  int x;                   // define an int

  Person p(...);           // define a Person
                           // (arguments omitted for simplicity)
  ...

}

When the compiler sees the definition of x, it knows it must allocate enough memory to hold an int. No problem: every compiler knows how big an int is. When it sees the definition of p, it knows it must allocate enough memory for a Person, but how is it supposed to know how big a Person object is? The only way is to consult the class definition. If a class definition could legally omit the implementation details, how would the compiler know how much memory to allocate?

In principle, this is not an insurmountable problem. Languages such as Smalltalk, Eiffel, and Java deal with it all the time. When an object is defined in those languages, the compiler allocates only enough space for a pointer to the object. In effect, they treat the code above as if it had been written like this:

int main()
{
  int x;                   // define an int

  Person *p;               // define a pointer to a Person

  ...
}

This code may look familiar, because it is perfectly legal C++. It shows that programmers can "hide the implementation of an object behind a pointer" themselves.
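The following is a minimal sketch of why the pointer version compiles with nothing more than a declaration. Widget, Holder, makeWidget, valueOf, destroyWidget, and demo are hypothetical names used only for illustration, and everything is shown in one file even though the lower half would normally live in a separate implementation file.

```cpp
#include <cassert>

// Forward declaration only: the compiler knows the name Widget,
// but not its size or its members. (Widget is a hypothetical class.)
class Widget;

// A pointer to an incomplete type is fine, because the size of a pointer
// does not depend on the size of the object it points to.
struct Holder {
    Widget* w = nullptr;
};

// Widget w;   // would NOT compile here: Widget is still incomplete

// Declaring functions that mention Widget needs no definition either.
Widget* makeWidget(int value);
int valueOf(const Widget& w);
void destroyWidget(Widget* w);

// ---- In a real project, everything below would sit in a .cpp file, ----
// ---- so changing it would not force clients of Holder to recompile. ----
class Widget {
public:
    explicit Widget(int v) : value_(v) {}
    int value() const { return value_; }
private:
    int value_;
};

Widget* makeWidget(int value) { return new Widget(value); }
int valueOf(const Widget& w) { return w.value(); }
void destroyWidget(Widget* w) { delete w; }

// Small usage check that goes only through the pointer-based interface.
int demo() {
    Holder h;
    h.w = makeWidget(7);
    assert(h.w != nullptr);
    int v = valueOf(*h.w);
    destroyWidget(h.w);
    return v;
}
```

Uncommenting the `Widget w;` line above is an easy way to see the compiler's complaint about an incomplete type.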

Here is how to use that technique to decouple Person's interface from its implementation. First, put only the following in the header file that declares the Person class:

// The compiler still needs to know about these type names,
// because the Person constructor uses them
class string;              // this is incorrect for the standard string;
                           // see Item 49 for the reason
class Date;
class Address;
class Country;

// class PersonImpl will contain the implementation details of a
// Person object; this is only a forward declaration of the class name
class PersonImpl;

class Person {
public:
  Person(const string& name, const Date& birthday,
         const Address& addr, const Country& country);

  virtual ~Person();

  ...                      // copy constructor, operator=

  string name() const;
  string birthDate() const;
  string address() const;
  string nationality() const;

private:
  PersonImpl *impl;        // pointer to the implementation
};

Clients of Person are now completely divorced from the implementation details of strings, dates, addresses, countries, and persons. Those classes can be modified at will, and clients of Person can remain blissfully unaware. More to the point, they need not recompile. In addition, because clients cannot see the details of Person's implementation, they are unlikely to write code that depends on those details. This is a true separation of interface and implementation.

The key to this separation is replacing dependencies on class definitions with dependencies on class declarations. That is all you need to know about minimizing compilation dependencies: make your header files self-sufficient whenever that is practical, and when it is not, depend on class declarations rather than class definitions. Everything else follows from this simple design strategy.

Several guidelines follow directly from this idea:

· Avoid using objects when object references and pointers will do. You can define references and pointers to a type with only a declaration of that type. Defining objects of a type requires the type's definition.

· Use class declarations instead of class definitions whenever you can. Note that you never need a class definition to declare a function that uses that class, not even if the function passes or returns the class by value:

class Date;                // class declaration

Date returnADate();        // fine: no definition of Date is needed
void takeADate(Date d);

Of course, pass-by-value is generally a bad idea (see Item 22), but if you find yourself forced to use it for some reason, there is still no justification for introducing unnecessary compilation dependencies.

If you are surprised that the declarations of returnADate and takeADate compile without a definition of Date, it is not as mysterious as it seems: anyone who calls those functions must have Date's definition visible at the point of the call. "Oh," I know you are thinking, "so why bother declaring functions that nobody calls?" Not so. It is not that nobody calls them, it is that not everybody calls them. If you have a library containing hundreds of function declarations (possibly spread over several namespaces, see Item 28), it is unlikely that every client calls every function. By moving the responsibility for providing class definitions (via #include directives) from the header file that declares the functions to the client files that contain the calls, you eliminate artificial client dependencies on type definitions they do not really need.

· Do not #include header files in your header files unless your headers will not compile without them. Instead, manually declare the classes you need, and let clients of your header files #include the additional headers necessary to make their code compile. Some clients may grumble that this is inconvenient, but rest assured that you are saving them much more pain than you are inflicting. In fact, this technique is so well regarded that it is enshrined in the standard C++ library (see Item 49): the header <iosfwd> contains declarations (and only declarations) for the types in the iostream library.
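The guidelines above can be sketched in a single file as follows. The first part stands in for a lean header, the second for the implementation and calling code that actually need the full definitions. The Date members, the lastYearTaken variable, and the demo function are assumptions made purely for illustration, and the file split is only simulated by comments.

```cpp
// ---- What a lean header might contain (declarations only) ----
#include <iosfwd>            // declares std::ostream without defining it

class Date;                  // class declaration, no definition

Date returnADate();          // by-value return: a declaration is enough
void takeADate(Date d);      // by-value parameter: a declaration is enough
std::ostream& operator<<(std::ostream& os, const Date& d);

// ---- What only the implementation and real callers need ----
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Date {                 // hypothetical, minimal definition
public:
    Date(int y, int m, int d) : y_(y), m_(m), d_(d) {}
    int year() const { return y_; }
    int month() const { return m_; }
    int day() const { return d_; }
private:
    int y_, m_, d_;
};

int lastYearTaken = 0;       // records what takeADate last received

Date returnADate() { return Date(1970, 1, 1); }
void takeADate(Date d) { lastYearTaken = d.year(); }

std::ostream& operator<<(std::ostream& os, const Date& d) {
    return os << d.year() << '-' << d.month() << '-' << d.day();
}

// A caller: at this point Date's definition is visible, as it must be.
std::string demo() {
    takeADate(returnADate());
    assert(lastYearTaken == 1970);
    std::ostringstream out;
    out << returnADate();
    return out.str();
}
```

Only code that calls the functions or streams a Date pays for the full headers; a file that merely passes the declarations along does not.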

Classes like Person that contain only a pointer to an unspecified implementation are often called Handle classes or Envelope classes. (The classes they point to are called Body classes in the former case and Letter classes in the latter.) Occasionally you may hear such classes called Cheshire Cat classes, an allusion to the cat in Alice in Wonderland that could, when it chose, make the rest of its body vanish and leave only its smile behind.

You may be wondering what a Handle class actually does. The answer is simple: it forwards every function call to the corresponding Body class, and the Body class does the real work. For example, here is how two of Person's member functions would be implemented:

#include "Person.h"        // we are implementing the Person class, so we
                           // must include its class definition

#include "PersonImpl.h"    // we must also include PersonImpl's class
                           // definition, otherwise we could not call its
                           // member functions. Note that PersonImpl has
                           // exactly the same member functions as Person:
                           // their interfaces are identical

Person::Person(const string& name, const Date& birthday,
               const Address& addr, const Country& country)
{
  impl = new PersonImpl(name, birthday, addr, country);
}

string Person::name() const
{
  return impl->name();
}

Note how the Person constructor calls the PersonImpl constructor (implicitly, through new; see Items 5 and M8) and how Person::name calls PersonImpl::name. This is important. Making Person a Handle class does not change what Person does, it only changes where it does it.
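For reference, here is a compact, self-contained sketch of the Handle/Body arrangement. It is a simplification under stated assumptions rather than the full class from the text: Person stores only a name and a birth date (both as std::string), its destructor is non-virtual, and the copy operations the text omits are filled in with a simple deep copy. The two halves would normally be split between a header and an implementation file.

```cpp
#include <cassert>
#include <string>

// ---- person.h (sketch): all that clients ever see ----
class PersonImpl;                         // declaration only

class Person {
public:
    Person(const std::string& name, const std::string& birthDate);
    Person(const Person& rhs);
    Person& operator=(const Person& rhs);
    ~Person();

    std::string name() const;
    std::string birthDate() const;

private:
    PersonImpl* impl;                     // pointer to the hidden Body object
};

// ---- person.cpp (sketch): free to change without touching clients ----
class PersonImpl {
public:
    PersonImpl(const std::string& name, const std::string& birthDate)
        : name_(name), birthDate_(birthDate) {}
    std::string name() const { return name_; }
    std::string birthDate() const { return birthDate_; }
private:
    std::string name_;
    std::string birthDate_;
};

Person::Person(const std::string& name, const std::string& birthDate)
    : impl(new PersonImpl(name, birthDate)) {}

// Deep copy, so each Handle owns its own Body (an assumption of this sketch).
Person::Person(const Person& rhs) : impl(new PersonImpl(*rhs.impl)) {}

Person& Person::operator=(const Person& rhs) {
    if (this != &rhs) {
        PersonImpl* copy = new PersonImpl(*rhs.impl);  // copy first, then swap in
        delete impl;
        impl = copy;
    }
    return *this;
}

Person::~Person() { delete impl; }

// Every member function simply forwards to the Body.
std::string Person::name() const {
    assert(impl != nullptr);
    return impl->name();
}

std::string Person::birthDate() const {
    assert(impl != nullptr);
    return impl->birthDate();
}
```

Adding a data member to PersonImpl changes only the lower half, and sizeof(Person) stays the size of one pointer.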

An alternative to the Handle class approach is to make Person a special kind of abstract base class called a Protocol class. By definition, a Protocol class has no implementation; its purpose is to specify an interface for derived classes (see Item 36). As a result, it typically has no data members and no constructors, only a virtual destructor (see Item 14) and a set of pure virtual functions that specify the interface. A Protocol class for Person might look like this:

class Person {
public:
  virtual ~Person();

  virtual string name() const = 0;
  virtual string birthDate() const = 0;
  virtual string address() const = 0;
  virtual string nationality() const = 0;
};

Clients of this Person class must program in terms of Person pointers and references, because it is not possible to instantiate a class that contains pure virtual functions. (It is, however, possible to instantiate classes derived from Person, as shown below.) Like clients of Handle classes, clients of Protocol classes need not recompile unless the Protocol class's interface is modified.

Of course, clients of a Protocol class must have some way of creating new objects. They typically do so by calling a function that plays the role of the constructor for the hidden derived classes that are actually instantiated. Such functions go by several names (among them factory functions and virtual constructors), but they all behave the same way: they return a pointer to a dynamically allocated object that supports the Protocol class's interface (see also Item M25). Such a function might be declared like this:

// makePerson is a "virtual constructor" (a "factory function")
// for objects that support the Person interface
Person*
  makePerson(const string& name,       // initialize a new
             const Date& birthday,     // Person object, then
             const Address& addr,      // return a pointer to it
             const Country& country);

Clients would use it like this:

string name;
Date dateOfBirth;
Address address;
Country nation;

...

// create an object that supports the Person interface
Person *pp = makePerson(name, dateOfBirth, address, nation);

...

cout << pp->name()             // use the object through the Person interface
     << " was born on "
     << pp->birthDate()
     << " and now lives at "
     << pp->address();

...

delete pp;                     // delete the object

Because functions like makePerson are closely associated with the Protocol class whose interface the objects they create support, it is good style to declare them static inside the Protocol class:

class Person {
public:
  ...                          // as above

  // makePerson is now a member of the class
  static Person* makePerson(const string& name,
                            const Date& birthday,
                            const Address& addr,
                            const Country& country);
};

This avoids cluttering the global namespace (or any other namespace) with the many functions of this nature that may exist (see Item 28).

Of course, concrete classes that support the Protocol class's interface must be defined somewhere, and real constructors must be called. That all happens behind the scenes, inside the implementation files. For example, the Protocol class might have a concrete derived class RealPerson that provides implementations for the virtual functions it inherits:

class RealPerson: public Person {
public:
  RealPerson(const string& name, const Date& birthday,
             const Address& addr, const Country& country)
  : name_(name), birthday_(birthday),
    address_(addr), country_(country)
  {}

  virtual ~RealPerson() {}

  string name() const;         // implementations of these functions
  string birthDate() const;    // are not shown here, but they
  string address() const;      // are all easy to write
  string nationality() const;

private:
  string name_;
  Date birthday_;
  Address address_;
  Country country_;
};

Given RealPerson, writing Person::makePerson is truly trivial:

Person* Person::makePerson(const string& name,
                           const Date& birthday,
                           const Address& addr,
                           const Country& country)
{
  return new RealPerson(name, birthday, addr, country);
}
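Putting the pieces together, a Protocol class, its hidden concrete class, and the factory function can be sketched in one file as below. This is a reduced illustration under assumptions: Person exposes only name and birthDate (both std::string), the describe function is hypothetical, and std::unique_ptr (C++11) replaces the manual delete shown in the text.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- person.h (sketch): the Protocol class, all that clients see ----
class Person {
public:
    virtual ~Person() {}

    virtual std::string name() const = 0;
    virtual std::string birthDate() const = 0;

    // Factory function ("virtual constructor"); the caller owns the result.
    static Person* makePerson(const std::string& name,
                              const std::string& birthDate);
};

// ---- person.cpp (sketch): the concrete class stays out of sight ----
class RealPerson : public Person {
public:
    RealPerson(const std::string& name, const std::string& birthDate)
        : name_(name), birthDate_(birthDate) {}

    std::string name() const override { return name_; }
    std::string birthDate() const override { return birthDate_; }

private:
    std::string name_;
    std::string birthDate_;
};

Person* Person::makePerson(const std::string& name,
                           const std::string& birthDate) {
    return new RealPerson(name, birthDate);
}

// ---- client code: works only through the Person interface ----
std::string describe() {
    // unique_ptr deletes the object through Person's virtual destructor.
    std::unique_ptr<Person> pp(Person::makePerson("Ada", "1815-12-10"));
    assert(pp != nullptr);
    return pp->name() + " was born on " + pp->birthDate();
}
```

The client code never names RealPerson, so the concrete class can change, or be replaced by another derived class, without the client being recompiled.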

RealPerson demonstrates one of the two most common mechanisms for implementing a Protocol class: it inherits its interface specification from the Protocol class (Person), then implements the functions in that interface. A second way to implement a Protocol class involves multiple inheritance, which is the topic of Item 43.

So Handle classes and Protocol classes decouple interfaces from implementations, thereby reducing compilation dependencies between files. "But what does all this trickery cost?" I know you are waiting for the catch. The answer is the usual one in computer science: it costs some speed at runtime, plus some additional memory per object.

With Handle classes, member functions must go through the implementation pointer to reach the object's data. That adds one level of indirection per access. You must also add the size of this pointer to the memory required to store each object. Finally, the pointer has to be initialized (in the Handle class's constructors) to point to a dynamically allocated implementation object, so you incur the overhead inherent in dynamic memory allocation and the subsequent deallocation (see Item 10).

With Protocol classes, every function is virtual, so you pay the cost of an indirect jump on every function call (see Items 14 and M24). In addition, every object of a class derived from the Protocol class must contain a virtual pointer (again, see Items 14 and M24). This pointer may increase the amount of memory needed to store an object, depending on whether the Protocol class is the only source of virtual functions for the object.
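The per-object memory costs described above can be observed with sizeof. The sketch below uses hypothetical types (Plain, WithVirtual, Impl, Handle); the exact numbers are implementation-defined, so the code only reports sizes and makes no claim about specific values. On common compilers a class with a virtual function is larger than the same class without one, and a Handle occupies the size of one pointer.

```cpp
#include <cassert>
#include <cstddef>

// A class with no virtual functions: just its data.
struct Plain {
    int id;
};

// The same data plus a virtual function. Typical compilers add a
// hidden virtual pointer to every object of such a class.
struct WithVirtual {
    virtual ~WithVirtual() {}
    int id;
};

// A Handle holds only a pointer to its separately allocated Body.
class Impl;                  // hypothetical Body class, never defined here
struct Handle {
    Impl* impl;
};

// Extra bytes per object attributable to the virtual machinery
// (implementation-defined; no particular value is guaranteed).
std::size_t virtualOverhead() {
    assert(sizeof(WithVirtual) >= sizeof(Plain));
    return sizeof(WithVirtual) - sizeof(Plain);
}

// Size of the Handle itself; the Body's memory is paid for separately,
// through dynamic allocation.
std::size_t handleSize() {
    return sizeof(Handle);
}
```

Printing virtualOverhead() and handleSize() on a given platform gives a concrete feel for the trade-off, keeping in mind that the Handle's Body also costs a heap allocation.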

Finally, neither Handle classes nor Protocol classes can get much use out of inline functions. Any practical use of an inline function requires access to implementation details, and that is exactly what Handle classes and Protocol classes are designed to avoid.

It would be a serious mistake, however, to dismiss Handle classes and Protocol classes simply because they have a cost. So do virtual functions, and you would not want to give those up, would you? (If you would, you are reading the wrong book!) Instead, consider using these techniques in an evolutionary manner. Use Handle classes and Protocol classes during development to minimize the impact on clients when implementations change. When the program becomes a product, replace them with concrete classes wherever the gain in speed and/or size clearly outweighs the increased coupling between classes. Someday, we may hope, tools will be available to perform this kind of transformation automatically.

A judicious mixture of Handle classes, Protocol classes, and concrete classes lets you develop software that both runs efficiently and is easy to evolve, with one serious drawback: you may have to cut back on the long breaks you have been taking while your programs recompile.
