A common C++ interview question asks you to implement a String class. Given the time limit, the class cannot offer everything std::string does, but it must at least manage its resources correctly. The concrete requirements are listed after the background discussion below.
Understanding the string class
Before looking at what is wrong with the string class, let me first explain how to understand a C++ class in general. There are usually three aspects to consider.
1. Intention. What is the string class for? Only by understanding its intention can we understand the ideas behind it. This is the most important and fundamental part of understanding anything; without it, you will find that the class does not behave as you expect. The string class serves two purposes. The first is to manage arrays of char and to encapsulate the string-handling functions of standard C. The second, added when string entered the C++ standard, is to act as a container. The two purposes do not conflict, and understanding how string works requires keeping both in mind.
2. Specification. The string class has a very large interface, which is a much-discussed topic in the C++ community. A small string class ends up with 106 member functions. How could the C++ standards committee tolerate something so "ugly"? Two reasons are commonly given for this apparent lapse of judgment: efficiency, and convenience for common operations.
1) Efficiency first. Consider the overloads of operator== for the string class:

    bool operator==(const string& lhs, const string& rhs);
    bool operator==(const string& lhs, const char* rhs);
    bool operator==(const char* lhs, const string& rhs);

The first overload is the standard one, and the other two look redundant. Without them, a call such as str == "string" would first use the string constructor to convert the const char* "string" into a temporary string object and then call the first overload, that is, operator==(str, string("string")). The "redundant" overloads exist purely for efficiency: they save the constructor and destructor calls and the memory allocation and release that the temporary would need, which can be a significant saving. They can instead compare the characters directly, in the manner of C's strcmp, so the design is justified after all. There are many other efficiency-oriented algorithms and design decisions in the string class, such as Copy-on-Write, which some pre-C++11 implementations used (see my article "standard C++ class string Copy-On-Write Technology"). These make string very efficient, but they also set traps: if you are unaware of them, you will run into unexpected problems that leave you confused.
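The cost of the temporary described above can be made visible with a small sketch. The Str type and the two comparison functions below are hypothetical names chosen for illustration, not part of std::string. Str simply counts how often its converting constructor runs, on the assumption that each conversion stands in for one allocation.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for a string class; counts const char* -> Str conversions.
struct Str {
    static int conversions;
    std::string value;
    Str(const char* s) : value(s) { ++conversions; }
};
int Str::conversions = 0;

// Only the object/object form: a literal operand must be converted first.
bool equal_via_conversion(const Str& lhs, const Str& rhs) {
    return lhs.value == rhs.value;
}

// Dedicated form: compares against the C string directly, no temporary.
bool equal_direct(const Str& lhs, const char* rhs) {
    return std::strcmp(lhs.value.c_str(), rhs) == 0;
}

// Number of temporaries created by one comparison through the conversion.
int temporaries_with_conversion() {
    Str s("string");
    int before = Str::conversions;
    equal_via_conversion(s, "string");   // builds a temporary Str from the literal
    return Str::conversions - before;
}

// Number of temporaries created by one comparison through the direct form.
int temporaries_with_direct_overload() {
    Str s("string");
    int before = Str::conversions;
    equal_direct(s, "string");           // no Str is constructed
    return Str::conversions - before;
}
```

Under these assumptions the first comparison creates one temporary and the second creates none, which is the saving the extra overloads are after.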
2) The other reason for such a large interface is convenience for common operations. Take substr(), which extracts a substring. Strictly speaking it is not required, because string has a constructor that builds a new string from another string given a start position and a length, which does the same job. The copy() function is also unnecessary. It copies the contents of the string into a memory buffer, and in practice it is rarely used. Possible reasons for its existence are: 1) for safety, a member function is wanted for copying the contents out; 2) the designers felt that copy() looks nicer than strcpy or memcpy. copy() is even less necessary than substr().
3. Implementation. The C++ standard says little about how things are implemented, so different vendors implement them differently. Each vendor weighs whether a given part of the standard meets market demand and whether its compiler is capable of supporting it, and as a result vendors deviate from the standard, sometimes slightly and sometimes substantially. The differences between C++ compilers are painful. If you do not know the specific implementation you are using, you will again find that C++ does not work the way you think.
Only by looking at all three aspects can you really understand a C++ class and use C++ well. C++ experts analyze C++ classes from these three angles and use them to evaluate a class's design.
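The two "convenience" members mentioned above can be compared with their alternatives in a short sketch. The helper names substr_matches_constructor and copy_to_buffer, and the 64-byte buffer size, are assumptions made for this example only.

```cpp
#include <cstddef>
#include <string>

// substr() and the substring constructor produce the same result.
bool substr_matches_constructor(const std::string& s,
                                std::size_t pos, std::size_t len) {
    std::string a = s.substr(pos, len);
    std::string b(s, pos, len);          // constructor taking (string, pos, len)
    return a == b;
}

// copy() writes up to len characters into a caller-supplied buffer.
// It does not append a terminating '\0', so the caller must do that.
// Like substr(), it throws std::out_of_range if pos > s.size().
std::string copy_to_buffer(const std::string& s,
                           std::size_t pos, std::size_t len) {
    char buf[64];
    if (len > sizeof(buf) - 1) len = sizeof(buf) - 1;   // keep room for '\0'
    std::size_t n = s.copy(buf, len, pos);   // returns the number of chars copied
    buf[n] = '\0';
    return std::string(buf);
}
```

For example, copy_to_buffer("hello world", 6, 5) yields "world", the same characters that substr(6, 5) would return.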
Concretely, the String class must meet three requirements:
Variables of type String can be defined just like int variables, and they support assignment and copying.
String can be used as a function parameter type and as a return type.
String can be used as the element type of a standard library container, that is, as the value_type of vector/list/deque. (Serving as the key_type of std::map is a further requirement that this article omits.)
In other words, your String must let the following code compile and run without memory errors.

    #include <vector>

    void foo(String x)
    {
    }

    void bar(const String& x)
    {
    }

    String baz()
    {
        String ret("world");
        return ret;
    }

    int main()
    {
        String s0;
        String s1("hello");
        String s2(s0);
        String s3 = s1;
        s2 = s1;

        foo(s1);
        bar(s1);
        foo("temporary");
        bar("temporary");
        String s4 = baz();

        std::vector<String> svec;
        svec.push_back(s0);
        svec.push_back(s1);
        svec.push_back(baz());
        svec.push_back("good job");
    }

This article gives an answer that I think suits an interview: it emphasizes correctness and ease of implementation (so that you will not get it wrong when writing on a whiteboard) rather than efficiency. In a sense, it trades time (running speed) for space (code simplicity).
Choosing the data members. The simplest String contains a single char* member variable. The advantage is that it is easy to implement; the disadvantage is that some operations cost more (for example, size() takes linear time). To avoid coding mistakes during the interview, the String designed in this article has only one data member, char* data_. It also maintains the following invariant: the data_ of a valid String object is never NULL, and data_ is terminated by '\0' so that it works with the C str*() family of functions.
Next, decide which operations to support: construction, destruction, copy construction, and assignment (formerly called the big three, now called copy control). If the interviewer digs deeper, C++11 move construction and move assignment can be added as well. To keep the focus on the key points, this article does not cover operator overloads such as operator[].
In this way, the code is basically finalized:

    #include <utility>
    #include <string.h>

    class String
    {
    public:
        String()
            : data_(new char[1])
        {
            *data_ = '\0';
        }

        String(const char* str)
            : data_(new char[strlen(str) + 1])
        {
            strcpy(data_, str);
        }

        String(const String& rhs)
            : data_(new char[rhs.size() + 1])
        {
            strcpy(data_, rhs.c_str());
        }
        /* Delegating constructor in C++11
        String(const String& rhs)
            : String(rhs.data_)
        {
        }
        */

        ~String()
        {
            delete[] data_;
        }

        /* Traditional:
        String& operator=(const String& rhs)
        {
            String tmp(rhs);
            swap(tmp);
            return *this;
        }
        */
        String& operator=(String rhs) // yes, pass-by-value
        {
            swap(rhs);
            return *this;
        }

        // C++11 move constructor: steals the buffer and leaves rhs with nullptr
        String(String&& rhs)
            : data_(rhs.data_)
        {
            rhs.data_ = nullptr;
        }

        /* C++11 move assignment. Pair it with the traditional operator= above,
           not with the pass-by-value version: declaring both makes assignment
           from an rvalue ambiguous.
        String& operator=(String&& rhs)
        {
            swap(rhs);
            return *this;
        }
        */

        // Accessors

        size_t size() const
        {
            return strlen(data_);
        }

        const char* c_str() const
        {
            return data_;
        }

        void swap(String& rhs)
        {
            std::swap(data_, rhs.data_);
        }

    private:
        char* data_;
    };

Note the following points about the code:
new char[] is called only in the constructors, and delete[] is called only in the destructor.
The assignment operator uses the modern copy-and-swap idiom recommended by C++ coding standards.
Each function has only one or two lines of code and contains no conditional branches.
The destructor does not need to check whether data_ is NULL.
The constructor String(const char* str) does not check whether str is valid, which is an endlessly debated topic. Because str is used in the initializer list, an assert() in the function body would come too late to be useful.
This is probably the most concise String implementation.
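The claim that the assignment operator needs no conditional branch can be checked with a compact restatement of the class above. This is a sketch limited to the C++98 subset, and the two helper functions, self_assignment_keeps_value and assignment_makes_deep_copy, are hypothetical names added for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Compact restatement of the String class (no move operations).
class String {
public:
    String() : data_(new char[1]) { *data_ = '\0'; }
    String(const char* str) : data_(new char[std::strlen(str) + 1]) {
        std::strcpy(data_, str);
    }
    String(const String& rhs) : data_(new char[rhs.size() + 1]) {
        std::strcpy(data_, rhs.c_str());
    }
    ~String() { delete[] data_; }

    // Copy-and-swap: the by-value parameter is the copy, swap() installs it,
    // and the old buffer is released when rhs goes out of scope.
    String& operator=(String rhs) {
        swap(rhs);
        return *this;
    }

    std::size_t size() const { return std::strlen(data_); }
    const char* c_str() const { return data_; }
    void swap(String& rhs) { std::swap(data_, rhs.data_); }

private:
    char* data_;
};

// Self-assignment needs no special case: the copy is made before the swap.
bool self_assignment_keeps_value() {
    String s("hello");
    String& alias = s;   // alias avoids compiler self-assignment warnings
    s = alias;
    return std::strcmp(s.c_str(), "hello") == 0;
}

// After assignment the two objects own separate buffers.
bool assignment_makes_deep_copy() {
    String a("hello");
    String b;
    b = a;
    return a.c_str() != b.c_str() && std::strcmp(b.c_str(), "hello") == 0;
}
```

Because the copy is completed before anything in *this changes, a failed allocation would also leave the target object untouched.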
Exercise 1: Add operator overloads such as operator==, operator<, and operator[].
Exercise 2: Implement a version with an int size_; member, trading space for time.
Exercise 3: Thanks to rvalue references and move semantics, a straight insertion sort over String objects runs faster in C++11 than in C++98/03. Write a program to verify this. (The g++ standard library also uses this technique.)
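As one possible starting point for Exercises 1 and 2, and not the only reasonable answer, the sketch below caches the length in a size_ member and adds three operators. The class name SizedString is hypothetical, and std::size_t is used for size_ instead of the int suggested in the exercise, which is a choice made here.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical variant for Exercises 1 and 2: caches the length in size_.
class SizedString {
public:
    SizedString() : size_(0), data_(new char[1]) { *data_ = '\0'; }
    SizedString(const char* str)
        : size_(std::strlen(str)), data_(new char[size_ + 1]) {
        std::memcpy(data_, str, size_ + 1);   // copies the '\0' as well
    }
    SizedString(const SizedString& rhs)
        : size_(rhs.size_), data_(new char[size_ + 1]) {
        std::memcpy(data_, rhs.data_, size_ + 1);
    }
    ~SizedString() { delete[] data_; }

    SizedString& operator=(SizedString rhs) {  // copy-and-swap, as before
        swap(rhs);
        return *this;
    }

    std::size_t size() const { return size_; }  // constant time now
    const char* c_str() const { return data_; }

    // Unchecked element access, like the built-in array subscript.
    char& operator[](std::size_t i) { return data_[i]; }
    const char& operator[](std::size_t i) const { return data_[i]; }

    void swap(SizedString& rhs) {
        std::swap(size_, rhs.size_);
        std::swap(data_, rhs.data_);
    }

private:
    std::size_t size_;   // declared before data_ so it is initialized first
    char* data_;
};

// Equality can reject strings of different length without touching the data.
inline bool operator==(const SizedString& lhs, const SizedString& rhs) {
    return lhs.size() == rhs.size() &&
           std::memcmp(lhs.c_str(), rhs.c_str(), lhs.size()) == 0;
}

// Ordering follows strcmp, which relies on the '\0' terminator invariant.
inline bool operator<(const SizedString& lhs, const SizedString& rhs) {
    return std::strcmp(lhs.c_str(), rhs.c_str()) < 0;
}
```

The extra member costs one word per object, and swap() now has two fields to exchange, but size() no longer walks the whole buffer.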