Item 4: Make sure that objects are initialized before they're used
By Scott Meyers
Translator: fatalerror99 (itepub's nirvana)
Release: http://blog.csdn.net/fatalerror99/
C++ can seem rather fickle about initializing the values of objects. For example, if you write this,
int x;
in some contexts, x is guaranteed to be initialized (to zero), but in others, it's not. If you write this,
class Point {
  int x, y;
};
...
Point p;
p's data members are sometimes guaranteed to be initialized (to zero), but sometimes they're not. If you're coming to C++ from a language where uninitialized objects can't exist, pay attention, because this is important.
Reading uninitialized values yields undefined behavior. On some platforms, the mere act of reading an uninitialized value can halt your program. More typically, the read yields semi-random bits from the location you read, which eventually leads to inscrutable program behavior and a lot of unpleasant debugging.
Now, there are rules that describe when object initialization is guaranteed to take place and when it isn't. Unfortunately, the rules are complicated; too complicated to be worth memorizing, in my opinion. In general, if you're in the C part of C++ (see Item 1) and initialization would probably incur a runtime cost, it's not guaranteed to take place. If you cross into the non-C parts of C++, things sometimes change. This explains why an array (from the C part of C++) isn't necessarily guaranteed to have its contents initialized, but a vector (from the STL part of C++) is.
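The array/vector contrast above can be sketched as follows. This is only an illustration; the helper names (allZero, makeVector, sumOfInitializedArray) are hypothetical and not part of the original text. Note that the uninitialized array is deliberately never read, since reading it would be undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if every element of v is zero.
bool allZero(const std::vector<int>& v)
{
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] != 0) return false;
    }
    return true;
}

// A vector created with a size value-initializes its elements,
// so every int starts out as 0.
std::vector<int> makeVector(std::size_t n)
{
    std::vector<int> v(n);
    return v;
}

// A local built-in array is initialized only if you ask for it.
int sumOfInitializedArray()
{
    int raw[4];          // elements have indeterminate values; never read them
    int zeroed[4] = {};  // explicit initializer: every element is 0
    (void)raw;           // silence "unused variable" warnings
    int sum = 0;
    for (int i = 0; i < 4; ++i) sum += zeroed[i];
    return sum;
}
```

The vector pays a small runtime cost to zero its elements; the plain array skips that cost unless an initializer is supplied.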
The best way to deal with this seemingly indeterminate state of affairs is to always initialize your objects before you use them. For non-member objects of built-in types, you'll need to do this manually. For example:
int x = 0;                               // manual initialization of an int
const char * text = "A C-style string";  // manual initialization of a
                                         // pointer (see also Item 3)
double d;                                // "initialization" by reading from
std::cin >> d;                           // an input stream
For almost everything else, the responsibility for initialization falls on constructors. The rule there is simple: make sure that all constructors initialize everything in the object.
This rule is easy to follow, but it's important not to confuse assignment with initialization. Consider a constructor for a class representing entries in an address book:
#include <list>
#include <string>
class PhoneNumber { ... };
class ABEntry {                       // ABEntry = "Address Book Entry"
public:
  ABEntry(const std::string& name, const std::string& address,
          const std::list<PhoneNumber>& phones);
private:
  std::string theName;
  std::string theAddress;
  std::list<PhoneNumber> thePhones;
  int numTimesConsulted;
};
ABEntry::ABEntry(const std::string& name, const std::string& address,
                 const std::list<PhoneNumber>& phones)
{
  theName = name;                     // these are all assignments,
  theAddress = address;               // not initializations
  thePhones = phones;
  numTimesConsulted = 0;
}
This will yield ABEntry objects with the values you expect, but it's still not the best approach. The rules of C++ stipulate that the data members of an object are initialized before the body of a constructor is entered. Inside the ABEntry constructor, theName, theAddress, and thePhones aren't being initialized; they're being assigned. Initialization took place earlier, when their default constructors were automatically called before entering the body of the ABEntry constructor. This isn't true for numTimesConsulted, because it's a built-in type. For it, there's no guarantee that it was initialized at all before its assignment.
A better way to write the ABEntry constructor is to use the member initialization list instead of assignments:
ABEntry::ABEntry(const std::string& name, const std::string& address,
                 const std::list<PhoneNumber>& phones)
: theName(name),
  theAddress(address),                // these are now all initializations
  thePhones(phones),
  numTimesConsulted(0)
{}                                    // the ctor body is now empty
This constructor yields the same end result as the one above, but it will often be more efficient. The assignment-based version first called default constructors to initialize theName, theAddress, and thePhones, then promptly assigned new values on top of the default-constructed ones. All the work performed in those default constructions was therefore wasted. The member initialization list approach avoids that problem, because the arguments in the initialization list are used as constructor arguments for the various data members. In this case, theName is copy-constructed from name, theAddress is copy-constructed from address, and thePhones is copy-constructed from phones. For most types, a single call to a copy constructor is more efficient (sometimes much more efficient) than a call to the default constructor followed by a call to the copy assignment operator.
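To make the difference in work visible, here is a minimal sketch that counts which special member functions run in each style. Tracked, ByAssignment, and ByInitList are hypothetical names invented for this illustration; they do not appear in the original text.

```cpp
#include <cassert>

// Hypothetical member type that counts which special functions run.
struct Tracked {
    static int defaultCtors;
    static int copyCtors;
    static int copyAssigns;

    Tracked() { ++defaultCtors; }
    Tracked(const Tracked&) { ++copyCtors; }
    Tracked& operator=(const Tracked&) { ++copyAssigns; return *this; }

    static void reset() { defaultCtors = copyCtors = copyAssigns = 0; }
};
int Tracked::defaultCtors = 0;
int Tracked::copyCtors = 0;
int Tracked::copyAssigns = 0;

// Assignment-based constructor: the member is default-constructed
// before the body runs, then assigned inside the body.
class ByAssignment {
public:
    explicit ByAssignment(const Tracked& t) { member = t; }
private:
    Tracked member;
};

// Initialization-list constructor: the member is copy-constructed directly.
class ByInitList {
public:
    explicit ByInitList(const Tracked& t) : member(t) {}
private:
    Tracked member;
};
```

Constructing a ByAssignment costs one default construction plus one copy assignment, while constructing a ByInitList costs a single copy construction.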
For objects of built-in type like numTimesConsulted, there is no difference in cost between initialization and assignment, but for consistency, it's often best to initialize everything via member initialization. Similarly, you can use the member initialization list even when you only want to default-construct a data member; just specify nothing as the initialization argument. For example, if ABEntry had a constructor taking no parameters, it could be implemented like this:
ABEntry::ABEntry()
: theName(),                          // call theName's default ctor;
  theAddress(),                       // do the same for theAddress;
  thePhones(),                        // and for thePhones;
  numTimesConsulted(0)                // but explicitly initialize
{}                                    // numTimesConsulted to zero
Because compilers automatically call default constructors for data members of user-defined types when those members have no initializers on the member initialization list, some programmers consider the above approach overkill. That's understandable, but a policy of always listing every data member on the initialization list avoids having to remember which data members may go uninitialized if they are omitted. For example, because numTimesConsulted is of a built-in type, leaving it off a member initialization list could open the door to undefined behavior.
Sometimes the initialization list must be used, even for built-in types. For example, data members that are const or are references must be initialized; they can't be assigned (see also Item 5). To avoid having to memorize when data members must be initialized in the member initialization list and when it's optional, the easiest choice is to always use the initialization list. It's sometimes required, and it's often more efficient than assignment.
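A small sketch of the const and reference case follows. The Counter class and its members are hypothetical, chosen only to show that such members have to be set up in the initialization list; trying to assign them in the constructor body would not compile.

```cpp
#include <cassert>

// Hypothetical class with a const member and a reference member.
class Counter {
public:
    Counter(int limit, int& total)
        : maxCount(limit),    // const member: must be initialized here
          sharedTotal(total)  // reference member: must be bound here
    {}

    // Adds one to the shared total unless the limit has been reached.
    bool increment()
    {
        if (sharedTotal >= maxCount) return false;
        ++sharedTotal;
        return true;
    }

    int limit() const { return maxCount; }

private:
    const int maxCount;
    int& sharedTotal;
};
```

Because sharedTotal is a reference, each call to increment() updates the caller's variable directly.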
Many classes have multiple constructors, and each constructor has its own member initialization list. If there are many data members and/or base classes, the multiple initialization lists introduce undesirable repetition (in the lists) and boredom (in the programmers). In such cases, it's not unreasonable to omit from the lists those data members for which assignment works as well as true initialization, and to move the assignments to a single (typically private) function that all the constructors call. This approach can be especially helpful when the true initial values for the data members are to be read from a file or looked up in a database. In general, however, true member initialization (via an initialization list) is preferable to pseudo-initialization via assignment.
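One way the shared-function approach might look is sketched below. The Settings class, its members, and the values 3 and 30 are all assumptions made for illustration. Only the built-in members, for which assignment costs the same as initialization, are moved into the private function; the std::string member is still truly initialized in each list.

```cpp
#include <cassert>
#include <string>

// Hypothetical class whose constructors share assignment-based setup.
class Settings {
public:
    Settings() : name("default") { init(); }
    explicit Settings(const std::string& n) : name(n) { init(); }

    int retries() const { return retryCount; }
    int timeout() const { return timeoutSeconds; }
    const std::string& label() const { return name; }

private:
    // Shared pseudo-initialization for the built-in members.
    void init()
    {
        retryCount = 3;
        timeoutSeconds = 30;
    }

    std::string name;
    int retryCount;
    int timeoutSeconds;
};
```

The risk of this design is that a future constructor forgets to call init(), which is one reason true member initialization remains the preferred default.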
One aspect of C++ that isn't fickle is the order in which an object's data is initialized. This order is always the same: base classes are initialized before derived classes (see also Item 12), and within a class, data members are initialized in the order in which they are declared. In ABEntry, for example, theName will always be initialized first, theAddress second, thePhones third, and numTimesConsulted last. This is true even if they are listed in a different order on the member initialization list. To avoid confusing readers, and to avoid the possibility of some truly obscure behavioral bugs, always list members in the initialization list in the same order as they're declared in the class.
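The declaration-order rule can be observed with a short sketch. Noisy, Widget, and initLog are hypothetical names used only for this demonstration. The initialization list is deliberately written in the wrong order to show that it has no effect; many compilers warn about such a mismatch.

```cpp
#include <cassert>
#include <string>

// Records the order in which members are constructed.
std::string& initLog()
{
    static std::string log;
    return log;
}

// Hypothetical member type that appends its tag to the log when constructed.
struct Noisy {
    explicit Noisy(char tag) { initLog() += tag; }
};

class Widget {
public:
    // The list names c, b, a, but construction still follows the
    // declaration order below: a, then b, then c.
    Widget() : c('c'), b('b'), a('a') {}
private:
    Noisy a;
    Noisy b;
    Noisy c;
};
```

After a Widget is constructed, the log holds the tags in declaration order rather than list order, which is why keeping the two orders identical spares readers a surprise.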
Once you've taken care of explicitly initializing non-member objects of built-in types, and you've ensured that your constructors initialize their base classes and data members using the member initialization list, there's only one more thing to worry about. That thing is (take a deep breath) the order of initialization of non-local static objects defined in different translation units.
Let's pick that phrase apart bit by bit.
A static object is one that exists from the time it's constructed until the end of the program. Stack and heap-based objects are thus excluded. Included are global objects, objects defined at namespace scope, objects declared static inside classes, objects declared static inside functions, and objects declared static at file scope. Static objects inside functions are known as local static objects (because they're local to a function), and the other kinds of static objects are known as non-local static objects. Static objects are automatically destroyed when the program exits, i.e., their destructors are automatically called when main finishes executing.
A translation unit is the source code giving rise to a single object file. It's basically a single source file, plus all of its #include files.
The problem we're concerned with, then, involves at least two separately compiled source files, each of which contains at least one non-local static object (i.e., an object that's global, at namespace scope, or static in a class or at file scope). The actual problem is this: if initialization of a non-local static object in one translation unit uses a non-local static object in a different translation unit, the object it uses could be uninitialized, because the relative order of initialization of non-local static objects defined in different translation units is undefined.
An example will help. Suppose you have a FileSystem class that makes files on the Internet look like they're local. Since your class makes the world look like a single file system, you might create a special object at global or namespace scope representing that single file system:
class FileSystem {                    // from your library
public:
  ...
  std::size_t numDisks() const;       // one of many member functions
  ...
};
extern FileSystem tfs;                // object for clients to use;
                                      // "tfs" = "the file system"
A FileSystem object is decidedly non-trivial, so using the tfs object before it has been constructed would be disastrous. (The original text of this paragraph and the next contained an error; it has been corrected according to the errata on the author's website. Translator's note.)
Now suppose some client creates a class for directories in a file system. Naturally, their class uses the tfs object:
class Directory {                     // created by library client
public:
  Directory( params );
  ...
};
Directory::Directory( params )
{
  ...
  std::size_t disks = tfs.numDisks(); // use the tfs object
  ...
}
Further suppose this client decides to create a single Directory object for temporary files:
Directory tempDir( params );          // directory for temporary files
Now the importance of initialization order becomes apparent: unless tfs is initialized before tempDir, tempDir's constructor will attempt to use tfs before it's been initialized. But tfs and tempDir were created by different people at different times in different source files; they're non-local static objects defined in different translation units. How can you be sure that tfs will be initialized before tempDir?
You can't. Again, the relative order of initialization of non-local static objects defined in different translation units is undefined. There is a reason for this: determining the "proper" order in which to initialize non-local static objects is very, very hard. In its most general form, with multiple translation units and non-local static objects generated through implicit template instantiations (which may themselves arise via implicit template instantiations), it's not only impossible to determine the right order of initialization, it's typically not even worth looking for special cases where the right order can be determined.
Fortunately, a small design change eliminates the problem entirely. All that has to be done is to move each non-local static object into its own function, where it's declared static. These functions return references to the objects they contain. Clients then call the functions instead of referring to the objects directly. In other words, non-local static objects are replaced with local static objects. (Aficionados of design patterns will recognize this as a common implementation of the Singleton pattern.) (Actually, it's only part of a Singleton implementation. An essential part of Singleton that I ignore in this Item is preventing the creation of multiple objects of a particular type.) (The text in the preceding parentheses is not in the original book; it was added with reference to the errata on the author's website. Translator's note.)
This approach is founded on C++'s guarantee that local static objects are initialized when the object's definition is first encountered during a call to that function. So if you replace direct accesses to non-local static objects with calls to functions that return references to local static objects, you're guaranteed that the references you get back will refer to initialized objects. As a bonus, if you never call a function emulating a non-local static object, you never incur the cost of constructing and destructing the object, something that can't be said for true non-local static objects.
Here's the technique applied to both tfs and tempDir:
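The first-call guarantee can be checked with a minimal sketch. Service, theService, and constructionCount are hypothetical stand-ins introduced for this illustration; they are not the FileSystem code from the text.

```cpp
#include <cassert>

// Counts how many Service objects have been constructed.
int& constructionCount()
{
    static int count = 0;
    return count;
}

// Hypothetical type standing in for something like a file system object.
class Service {
public:
    Service() : ready(true) { ++constructionCount(); }
    bool isReady() const { return ready; }
private:
    bool ready;
};

// Reference-returning function: the object is constructed the first
// time control passes through its definition, and never again.
Service& theService()
{
    static Service instance;
    return instance;
}
```

Before the first call to theService() no Service exists; every later call returns a reference to the same, already initialized object.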
class FileSystem { ... };             // as before
FileSystem& tfs()                     // this replaces the tfs object; it could be
{                                     // static in the FileSystem class
  static FileSystem fs;               // define and initialize a local static object
  return fs;                          // return a reference to it
}
class Directory { ... };              // as before
Directory::Directory( params )        // as before, except references to tfs are
{                                     // now to tfs()
  ...
  std::size_t disks = tfs().numDisks();
  ...
}
Directory& tempDir()                  // this replaces the tempDir object; it
{                                     // could be static in the Directory class
  static Directory td;                // define/initialize local static object
  return td;                          // return reference to it
}
Clients of this modified system program exactly as they used to, except they now refer to tfs() and tempDir() instead of tfs and tempDir. That is, they use functions returning references to objects instead of using the objects themselves.
The reference-returning functions dictated by this scheme are always simple: define and initialize a local static object on line 1, return it on line 2. This simplicity makes them excellent candidates for inlining, especially if they're called frequently (see Item 30). On the other hand, the fact that these functions contain static objects makes them problematic in multithreaded systems. Then again, any kind of non-const static object, local or non-local, is trouble waiting to happen in the presence of multiple threads. One way to deal with such trouble is to manually invoke all the reference-returning functions during the single-threaded startup portion of the program. This eliminates initialization-related race conditions.
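The startup workaround might be sketched like this. Logger, Config, and initializeStatics are hypothetical names, and the sketch starts no threads itself; it only shows the step of forcing every local static to be constructed up front.

```cpp
#include <cassert>

// Hypothetical stand-ins for objects exposed through reference-returning functions.
class Logger {
public:
    Logger() { ++instances(); }
    static int& instances() { static int n = 0; return n; }
};

class Config {
public:
    Config() { ++instances(); }
    static int& instances() { static int n = 0; return n; }
};

Logger& theLogger() { static Logger l; return l; }
Config& theConfig() { static Config c; return c; }

// Intended to be called once at the start of main(), before any threads
// exist, so every local static is constructed before concurrent use begins.
void initializeStatics()
{
    theLogger();
    theConfig();
}
```

Calling initializeStatics() more than once is harmless, because each local static is constructed only on the first pass through its definition.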
Of course, the idea of using reference-returning functions to prevent initialization order problems depends on there being a reasonable initialization order for your objects in the first place. If you have a system where object A must be initialized before object B, but A's initialization depends on B having already been initialized, you are going to have problems. Frankly, you will be in serious trouble. If you steer clear of such situations, however, the approach described here should serve you well, at least in single-threaded applications.
To avoid using objects before they're initialized, then, you need to do only three things. First, manually initialize non-member objects of built-in types. Second, use member initialization lists to initialize all parts of an object. Finally, design around the initialization order uncertainty that afflicts non-local static objects defined in separate translation units.
Things to remember
- Manually initialize objects of built-in type, because C++ only sometimes initializes them itself.
- In a constructor, prefer the member initialization list to assignment inside the body of the constructor. List data members in the initialization list in the same order they're declared in the class.
- Avoid initialization order problems across translation units by replacing non-local static objects with local static objects.