Chen Shuo (giantchen_AT_gmail)
http://blog.csdn.net/Solstice http://weibo.com/giantchen
Chen Shuo's series of articles on C++ engineering practice: http://blog.csdn.net/Solstice/category/802325.aspx
Properly typeset version: http://www.cnblogs.com/Solstice/category/287661.html
Download Chen Shuo's blog collection: http://blog.csdn.net/Solstice/archive/2011/02/24/6206154.aspx
This document is released under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License (CC BY-NC-ND): http://creativecommons.org/licenses/by-nc-nd/3.0/
This article is a follow-up to the previous one, "C++ Engineering Practice (7): iostream's purpose and limitations". In that article's section "How iostream interacts with other components of the standard library", I briefly mentioned that iostream objects have different semantics from other objects in the C++ standard library (mainly containers and string), chiefly because an iostream cannot be copied or assigned. Today I will explain my understanding of this issue in full.
The definition of "object" in this article is fairly broad: a region of memory that has a type. By this definition, variables of type int, double, and bool are all objects.
What is value semantics?
Value semantics means that a copy of an object is independent of the original, just like copying an int. The built-in types of C++ (bool/int/double/char) all have value semantics, and so do complex<>, pair<>, vector<>, map<>, string, and other types in the standard library: once copied, the copy is detached from the original object. The primitive types in Java also have value semantics.
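As a minimal sketch of the independence described above, the following copies a vector<string> and modifies only the copy. The function name copyIsIndependent is made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns true if a copied vector is independent
// of the vector it was copied from.
bool copyIsIndependent()
{
  std::vector<std::string> original;
  original.push_back("alpha");

  std::vector<std::string> copy = original;  // value semantics: a full copy
  copy.push_back("beta");                    // modify only the copy
  copy[0] = "changed";

  assert(copy.size() == 2);
  // The original is untouched, just as when copying an int.
  return original.size() == 1 && original[0] == "alpha";
}
```

Calling copyIsIndependent() is expected to return true, since nothing done to the copy reaches the original.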
The counterpart of value semantics is "object semantics", also called "reference semantics". Because the word "reference" has a special meaning in C++, I use the term "object semantics" in this article. Object semantics refers to objects in the object-oriented sense, for which copying is prohibited. For example, Thread in muduo has object semantics: copying a Thread is meaningless and forbidden, because a Thread object represents a thread, and copying the object does not make the system create an identical thread.
By the same reasoning, copying an Employee object is meaningless: one employee does not become two, and does not get paid twice. Copying a TcpConnection object makes no sense either: there is only one TCP connection in the system, and copying the object does not give us two connections. A Printer cannot be copied either: the system is connected to only one printer, and copying the Printer object cannot conjure up another one. In general, an "object" in the object-oriented sense is non-copyable.
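One way to express "non-copyable" in code is sketched here. It assumes a C++11 compiler and uses deleted copy operations instead of the boost::noncopyable base class that appears later in this article; the Printer class is a made-up stand-in, not a muduo class.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class with object semantics.
class Printer
{
 public:
  Printer() {}
  Printer(const Printer&) = delete;             // copying is forbidden
  Printer& operator=(const Printer&) = delete;  // so is assignment
};

// The compiler now rejects any attempt to copy a Printer.
static_assert(!std::is_copy_constructible<Printer>::value,
              "Printer must not be copyable");
static_assert(!std::is_copy_assignable<Printer>::value,
              "Printer must not be assignable");

// For comparison, int has value semantics and copies freely.
static_assert(std::is_copy_constructible<int>::value, "int is copyable");

// Such objects are handed around by reference (or pointer) instead;
// identity, not content, is what distinguishes them.
bool sameObject(const Printer& a, const Printer& b)
{
  return &a == &b;
}
```

Two distinct Printer objects compare as different through sameObject, while passing the same object twice compares as equal.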
Class objects in Java have object semantics, that is, reference semantics. After ArrayList<Integer> a = new ArrayList<Integer>(); ArrayList<Integer> b = a;, both a and b refer to the same ArrayList object, so modifying a also affects b.
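For readers more at home in C++, roughly the same aliasing behavior can be reproduced with a shared handle. This is only an analogy: it assumes C++11's std::shared_ptr, and the function name sizeAfterAliasedPush is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: appends through one handle and reports the size
// observed through the other handle.
std::size_t sizeAfterAliasedPush()
{
  std::shared_ptr<std::vector<int> > a = std::make_shared<std::vector<int> >();
  std::shared_ptr<std::vector<int> > b = a;  // copies the handle, not the vector

  b->push_back(42);                          // modifies the one shared vector

  assert(a.get() == b.get());                // both handles name the same object
  return a->size();
}
```

The element pushed through b is visible through a, which is exactly what happens with the two Java references above.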
Value semantics has nothing to do with immutability. Java has value objects, which according to the definition in PoEAA (p. 486) are actually immutable objects, such as String, Integer, BigInteger, and joda.time.DateTime (because Java cannot implement classes with true value semantics, it has to simulate them with immutable objects). Immutable objects have their uses, but they are not the topic of this article. Date and Timestamp in muduo are also immutable.
Objects with value semantics in C++ can also be mutable, for example complex<>, pair<>, vector<>, map<>, and string. InetAddress and Buffer in muduo both have value semantics and both can be modified.
An object with value semantics is not necessarily a POD. For example, string is not a POD, but it has value semantics.
An object with value semantics is not necessarily small. For example, a vector<int> may hold many elements or few, but it always has value semantics. Of course, many value-semantic objects are small, such as complex<>, muduo::Date, and muduo::Timestamp.
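The points about mutability and POD-ness can be checked with a short sketch. It assumes C++11 type traits; std::is_trivial is used here as a stand-in for "is a POD" (a POD must be trivial), and stringIsMutableValue is a name made up for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// std::string has non-trivial constructors, so it cannot be a POD.
static_assert(!std::is_trivial<std::string>::value,
              "std::string is not trivial, hence not a POD");

// Hypothetical helper: a string is mutable, yet copies remain independent.
bool stringIsMutableValue()
{
  std::string a = "hello";
  std::string b = a;   // independent copy
  b += ", world";      // mutate the copy in place

  assert(b == "hello, world");
  return a == "hello"; // the original did not change
}
```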
Value semantics and lifetime
One major benefit of value semantics is that lifetime management is simple, just as with int: you never need to worry about the lifetime of an int. A value-semantic object is either a stack object or a direct member of another object, so we do not have to worry about its lifetime (a function uses objects on its own stack, and a member function uses its own data members). In contrast, because an object with object semantics cannot be copied, we can only use it through pointers or references.
Once you operate on objects through pointers and references, you have to worry about whether the object pointed to has already been released. This was once a major source of bugs in C++ programs. Moreover, because C++ offers polymorphism only through pointers or references, resource management is unavoidable when doing object-oriented programming based on inheritance and polymorphism in C++.
Consider a simple object model of Parent and Child: a Parent has a Child, and a Child knows his/her Parent. This is easy to write in Java, with no need to worry about memory leaks or dangling pointers:
public class Parent
{
    private Child myChild;
}

public class Child
{
    private Parent myParent;
}
As long as myChild and myParent are correctly initialized, Java programmers do not have to worry about access errors. To determine whether a handle is valid, you only need to check that it is non-null.
In C++, we have to worry about resource management. Both Parent and Child represent real people and must not be copied, so they have object semantics. Does Parent hold Child directly, or do Parent and Child hold pointers to each other? Is Child's lifetime controlled by Parent? Suppose there are two more classes, ParentClub and School, representing a parents' club and a school: a ParentClub has many Parent(s), and a School has many Child(ren). How can we ensure that they always hold valid Parent and Child objects? When can we safely release a Parent or a Child?
A direct but error-prone way to write it:
class Child;

class Parent : boost::noncopyable
{
 private:
  Child* myChild;
};

class Child : boost::noncopyable
{
 private:
  Parent* myParent;
};
If raw pointers are used as members, how do we ensure that they are valid? How do we prevent dangling pointers? Who is responsible for releasing Child and Parent? When releasing a Parent object, how do we ensure that no pointer in the program still points to it? When releasing a Child object, how do we ensure that no pointer in the program still points to it?
This series of problems was once a headache in C++ object-oriented programming. With smart pointers, however, we can convert object semantics into value semantics and thereby handle object lifetime with ease: let Parent hold a smart pointer to Child, and let Child hold a smart pointer to Parent, so that when one refers to the other there is no need to worry about dangling pointers. Of course, one of the smart pointers should be a weak reference; otherwise a circular reference arises and leaks memory. Which one should be weak depends on the specific application.
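The circular-reference problem mentioned above can be observed directly. The sketch below assumes C++11's std::shared_ptr and std::weak_ptr (the article's own code uses the Boost equivalents); all type and function names are invented for the illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical pair of types that refer to each other with strong references.
struct StrongB;
struct StrongA { std::shared_ptr<StrongB> other; };
struct StrongB { std::shared_ptr<StrongA> other; };

// Hypothetical pair where the back reference is weak.
struct Owner;
struct Owned { std::weak_ptr<Owner> owner; };
struct Owner { std::shared_ptr<Owned> owned; };

// Returns true if the strong cycle kept the objects alive after the last
// outside handle went away.
bool strongCycleKeepsObjectsAlive()
{
  std::weak_ptr<StrongA> watch;
  {
    std::shared_ptr<StrongA> a = std::make_shared<StrongA>();
    std::shared_ptr<StrongB> b = std::make_shared<StrongB>();
    a->other = b;
    b->other = a;  // cycle of strong references
    watch = a;
  }
  bool stillAlive = !watch.expired();
  // Break the cycle by hand so that this demonstration does not itself leak.
  if (std::shared_ptr<StrongA> a = watch.lock())
    a->other.reset();
  return stillAlive;
}

// Returns true if the objects were destroyed once the last outside handle
// went away, thanks to the weak back reference.
bool weakBackReferenceLetsObjectsDie()
{
  std::weak_ptr<Owner> watch;
  {
    std::shared_ptr<Owner> o = std::make_shared<Owner>();
    o->owned = std::make_shared<Owned>();
    o->owned->owner = o;  // weak: does not keep the Owner alive
    watch = o;
    assert(!watch.expired());
  }
  return watch.expired();
}
```

Both functions are expected to return true: the strong cycle survives its scope, while the weak back reference lets everything be reclaimed.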
If Parent owns Child, so that Child's lifetime is controlled by its Parent and is shorter than the Parent's, the code is relatively simple:
class Parent;

class Child : boost::noncopyable
{
 public:
  explicit Child(Parent* myParent_)
    : myParent(myParent_)
  {
  }

 private:
  Parent* myParent;
};

class Parent : boost::noncopyable
{
 public:
  Parent()
    : myChild(new Child(this))
  {
  }

 private:
  boost::scoped_ptr<Child> myChild;
};
In the design above, the pointer to Child must not be exposed to the outside world; otherwise dangling pointers are still possible.
If the lifetimes of Parent and Child are independent of each other, it takes a bit more work:
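#include <boost/enable_shared_from_this.hpp>
#include <boost/noncopyable.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/weak_ptr.hpp>

class Parent;
typedef boost::shared_ptr<Parent> ParentPtr;

class Child : boost::noncopyable
{
 public:
  explicit Child(const ParentPtr& myParent_)
    : myParent(myParent_)
  {
  }

 private:
  boost::weak_ptr<Parent> myParent;  // weak back reference, avoids a cycle
};
typedef boost::shared_ptr<Child> ChildPtr;

class Parent : public boost::enable_shared_from_this<Parent>,
               private boost::noncopyable
{
 public:
  Parent()
  {
  }

  // Must be called on a Parent that is already owned by a shared_ptr.
  void addChild()
  {
    myChild.reset(new Child(shared_from_this()));
  }

 private:
  ChildPtr myChild;
};

int main()
{
  ParentPtr p(new Parent);
  p->addChild();
}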
The shared_ptr + weak_ptr approach above may look like overkill for such a small model.
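What the weak back reference buys is a safe way to ask "is my Parent still there?". A minimal sketch follows, assuming the C++11 std:: versions of the smart pointers; the name member and the parentName function are additions for the sake of the example and do not appear in the code above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for Parent with a hypothetical name field.
struct Parent
{
  explicit Parent(const std::string& n) : name(n) {}
  std::string name;
};

class Child
{
 public:
  explicit Child(const std::shared_ptr<Parent>& parent)
    : myParent(parent)
  {
  }

  // Returns the parent's name, or an empty string if the parent is gone.
  std::string parentName() const
  {
    // lock() yields a strong reference, or a null pointer if the Parent
    // has already been destroyed; there is no dangling pointer to follow.
    std::shared_ptr<Parent> p = myParent.lock();
    return p ? p->name : std::string();
  }

 private:
  std::weak_ptr<Parent> myParent;
};

// Hypothetical helper exercising both cases.
bool childSurvivesParent()
{
  std::shared_ptr<Parent> mom = std::make_shared<Parent>("mom");
  Child child(mom);
  assert(child.parentName() == "mom");
  mom.reset();                        // the Parent is destroyed here
  return child.parentName().empty();  // the Child notices, safely
}
```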
Consider a slightly more complex object model: a Child has two parents, mom and dad; a Parent has one or more Child(ren); a Parent knows his/her spouse. This object model is not hard to express in Java, where garbage collection takes care of object lifetime for us.
public class Parent
{
    private Parent mySpouser;
    private ArrayList<Child> myChildren;
}

public class Child
{
    private Parent myMom;
    private Parent myDad;
}
If we implement this in C++, how do we avoid dangling pointers and memory leaks? By using shared_ptr to turn raw pointers into value-semantic handles, we no longer have to worry about either problem:
#include <vector>
#include <boost/noncopyable.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/weak_ptr.hpp>

class Parent;
typedef boost::shared_ptr<Parent> ParentPtr;

class Child : boost::noncopyable
{
 public:
  explicit Child(const ParentPtr& myMom_,
                 const ParentPtr& myDad_)
    : myMom(myMom_),
      myDad(myDad_)
  {
  }

 private:
  boost::weak_ptr<Parent> myMom;  // weak: a Child does not own its parents
  boost::weak_ptr<Parent> myDad;
};
typedef boost::shared_ptr<Child> ChildPtr;

class Parent : boost::noncopyable
{
 public:
  Parent()
  {
  }

  void setSpouser(const ParentPtr& spouser)
  {
    mySpouser = spouser;
  }

  void addChild(const ChildPtr& child)
  {
    myChildren.push_back(child);
  }

 private:
  boost::weak_ptr<Parent> mySpouser;  // weak: spouses do not own each other
  std::vector<ChildPtr> myChildren;   // strong: parents keep children alive
};

int main()
{
  ParentPtr mom(new Parent);
  ParentPtr dad(new Parent);
  mom->setSpouser(dad);
  dad->setSpouser(mom);
  {
    ChildPtr child(new Child(mom, dad));
    mom->addChild(child);
    dad->addChild(child);
  }
  {
    ChildPtr child(new Child(mom, dad));
    mom->addChild(child);
    dad->addChild(child);
  }
}
Without smart pointers, object-oriented programming in C++ would be very difficult.
Value semantics and the standard library
C++ requires every type that can be put into a standard container to have value semantics. More precisely, the type must be a model of the SGI Assignable concept. However, because the C++ compiler provides a copy constructor and assignment operator for a class by default unless they are explicitly disabled, a class can always be used as the element type of a standard container. The program compiles, but a resource management bug lies hidden in it.
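To make the hidden hazard concrete, here is a small sketch assuming C++11 type traits. RawHolder is a made-up class; it deliberately has no destructor, so the example itself is safe to run. If its destructor were to delete the pointer, the two copies would free the same memory twice.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical class that merely stores a raw pointer.
struct RawHolder
{
  int* data;
};

// The compiler-generated copy operations make it a legal element type...
static_assert(std::is_copy_constructible<RawHolder>::value,
              "implicitly copyable");
static_assert(std::is_copy_assignable<RawHolder>::value,
              "implicitly assignable");

// ...but each copy is shallow: every copy points at the same int.
bool copiesShareThePointee()
{
  int value = 1;
  RawHolder original = { &value };

  std::vector<RawHolder> holders;
  holders.push_back(original);   // compiles without complaint
  *holders[0].data = 2;          // write through the copy

  assert(original.data == holders[0].data);
  return value == 2;             // the original's pointee changed too
}
```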
Therefore, when writing a class, making it inherit from boost::noncopyable first is almost always right.
In modern C++, you generally do not need to write a copy constructor or assignment operator yourself, because as long as every data member has value semantics, the member-wise copying and assignment generated by the compiler work correctly. If a smart pointer member is used to hold another object, copying and assignment are automatically enabled or disabled accordingly. The exception: you still need to implement copy control yourself when writing low-level libraries such as a HashMap.
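The following sketch illustrates the paragraph above under the assumption of a C++11 compiler. std::unique_ptr plays the role that boost::scoped_ptr plays elsewhere in this article, and all type names are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

// Every member has value semantics, so no copy control needs to be written.
struct Person
{
  std::string name;
  std::vector<int> scores;
};

// A shared_ptr member leaves copying enabled (copies share the Person).
struct SharedView { std::shared_ptr<Person> person; };

// A unique_ptr member disables copying automatically.
struct SoleOwner { std::unique_ptr<Person> person; };

static_assert(std::is_copy_constructible<Person>::value, "copyable");
static_assert(std::is_copy_constructible<SharedView>::value, "copyable");
static_assert(!std::is_copy_constructible<SoleOwner>::value, "not copyable");

// Hypothetical helper: the generated member-wise copy is a deep copy
// because each member copies itself deeply.
bool memberwiseCopyIsDeep()
{
  Person a;
  a.name = "Ann";
  a.scores.push_back(90);

  Person b = a;   // compiler-generated copy constructor
  b.name = "Bob";
  b.scores[0] = 60;

  assert(b.scores.size() == 1);
  return a.name == "Ann" && a.scores[0] == 90;
}
```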
Value semantics and the C++ language
A C++ class has value semantics by nature, which gives rise to the problem of object slicing, unique to C++, and requires programmers to weigh pass-by-value against pass-by-const-reference. In other object-oriented programming languages, this is not something to lose sleep over.
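Object slicing is easiest to see side by side with the pass-by-const-reference alternative. The classes and functions below are a made-up minimal illustration.

```cpp
#include <cassert>
#include <string>

struct Base
{
  virtual ~Base() {}
  virtual std::string name() const { return "Base"; }
};

struct Derived : Base
{
  virtual std::string name() const { return "Derived"; }
};

// Pass-by-value copies only the Base part: the Derived part is sliced off.
std::string nameByValue(Base b) { return b.name(); }

// Pass-by-const-reference makes no copy and keeps the dynamic type.
std::string nameByConstRef(const Base& b) { return b.name(); }

// Hypothetical helper comparing the two calling conventions.
bool slicingChangesBehavior()
{
  Derived d;
  assert(nameByConstRef(d) == "Derived");
  return nameByValue(d) == "Base";
}
```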
Value semantics is one of the three major constraints on the C++ language. A design goal of C++ from the start was to let user-defined types (classes) work like built-in types (int) and enjoy the same status. To this end, C++ made the following design decisions (compromises):
- The layout of a class is the same as that of a C struct, with no extra overhead. An object of a class with a single int member costs exactly as much as an int.
- Even the data members of a class are uninitialized by default, because an int local variable in a function is uninitialized.
- A class object can be created on the stack as well as on the heap, because an int can be a stack variable.
- An array of a class type is simply the class objects laid out one after another, with no extra indirection, because that is how an array of int works.
- The compiler generates a copy constructor and assignment operator for a class by default. Other languages have no such thing as a copy constructor, and their assignment operator cannot be overloaded. That C++ objects are copyable by default is an awkward feature.
- When a class type is passed to a function, the default is to make a copy (unless the parameter is declared as a reference), because an int is passed to a function by making a copy.
- When a function returns a class type, it can only do so by making a copy (C++ had to define RVO to solve the performance problem), because an int is returned from a function by making a copy.
- When a class type is used as a data member, the member is embedded in the enclosing object. For example, the layout of pair<complex<double>, size_t> is a complex<double> immediately followed by a size_t.
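A few of the layout claims in the list above can be checked by the compiler. This is a sketch under assumptions: the size equality for IntBox is not spelled out by the standard but holds on mainstream compilers, and IntBox, Entry, and membersAreEmbedded are names invented here.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>

// Hypothetical class with a single int member.
struct IntBox { int value; };

// No hidden header: on mainstream compilers the class is as big as an int.
static_assert(sizeof(IntBox) == sizeof(int), "no per-object overhead expected");

// An array of class objects is the objects themselves, back to back.
static_assert(sizeof(IntBox[4]) == 4 * sizeof(IntBox), "no indirection in arrays");

typedef std::pair<std::complex<double>, std::size_t> Entry;

// Returns true if both members live inside the pair object itself,
// with first placed before second.
bool membersAreEmbedded()
{
  Entry e;
  const char* begin = reinterpret_cast<const char*>(&e);
  const char* end = begin + sizeof(Entry);
  const char* first = reinterpret_cast<const char*>(&e.first);
  const char* second = reinterpret_cast<const char*>(&e.second);

  assert(begin <= first);
  return first + sizeof(e.first) <= second
      && second + sizeof(e.second) <= end;
}
```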
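These design decisions bring a performance benefit thanks to memory locality. For example, if we define a complex<double> class, an array of complex<double>, and a vector<complex<double> > in C++, their layouts are compact: a complex<double> is just re followed by im, an array stores these pairs back to back, and a vector points to one contiguous block of them (re and im are the real and imaginary parts of the complex number).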
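The contiguous layout of complex numbers can be verified in code. The sketch assumes C++11, where the standard allows the storage of std::complex<double> to be accessed as an array of two doubles; isInterleaved is a name made up for this example.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// A complex<double> is exactly two doubles, with nothing added.
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "complex<double> is re followed by im");

// Returns true if the elements of v are stored as re, im, re, im, ...
// in one contiguous block of doubles.
bool isInterleaved(const std::vector<std::complex<double> >& v)
{
  const double* raw = reinterpret_cast<const double*>(v.data());
  assert(v.empty() || raw != 0);

  for (std::size_t i = 0; i < v.size(); ++i)
  {
    if (raw[2 * i] != v[i].real() || raw[2 * i + 1] != v[i].imag())
      return false;
  }
  return true;
}
```

A vector holding (1, 2) and (3, 4) is therefore stored as the four doubles 1, 2, 3, 4 in a row, which is what makes iterating over it cache-friendly.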
If we do the same thing in Java, the layout is very different and memory locality is much worse, because an array of Complex holds references to separately allocated objects.
Every object in Java has a header, costing at least two words of overhead. Compared with Java, the object model of C++ is much more compact.
To be continued
In the next article, I will talk about data abstraction, which is closely related to value semantics, and explain why it is a programming paradigm on a par with object-oriented programming, and why languages that support object-oriented programming do not necessarily support data abstraction. Data abstraction was C++'s original selling point, but as time has passed, it seems that many people know only object-oriented programming and not data abstraction. The strength of C++ is precisely that abstraction does not come at the cost of performance. We will see a concrete example in the next article.