C/C++ string processing (2): String, the constant string


Xu Shiwei
2008-3-23

Table of Contents
  • Summary
  • Understanding String (BasicString)
  • TempString base class
  • Source code
  • Read more

Summary

We know that the C++ standard library (STL) provides the string (basic_string) class for string operations. Apart from the memory allocator (Allocator) [1], string is probably the most frequently used class in the STL. Yet the C++ community has never stopped criticizing string.

To sum up, the string class of STL has the following controversial points:

  • Too many interfaces, and conventions that are inconsistent with the other STL containers. For example, string::find reports a position as a subscript (index) rather than an iterator, unlike the other containers.
  • Memory fragmentation. Frequent construction and destruction of strings fragments the system's memory badly.
  • Copy-on-write and thread safety. string (basic_string) is often implemented with copy-on-write so that assignment is cheap. Once thread safety is taken into account, however, copy-on-write spends a lot of time on locking. Some newer STL implementations (such as SGI STL) have dropped the copy-on-write string implementation.
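The first point can be seen with standard library calls alone. The following is a small sketch; the helper names find_in_string and find_in_vector are made up for illustration. std::string::find returns an index (or npos), while a std::vector is searched with std::find, which returns an iterator.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// std::string::find reports a position as an index, or npos if not found.
inline std::size_t find_in_string(const std::string& s, char c) {
    return s.find(c);
}

// Other containers are searched with std::find, which yields an iterator.
// The iterator is converted to an index here only to make the two comparable.
inline std::size_t find_in_vector(const std::vector<char>& v, char c) {
    std::vector<char>::const_iterator it = std::find(v.begin(), v.end(), c);
    return it == v.end() ? std::string::npos
                         : static_cast<std::size_t>(it - v.begin());
}
```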

I agree with these criticisms. The best design is to split string into a constant string (std::String) and a string-building class (StringBuilder). Our stdext library does exactly this.

Understanding String (BasicString)

The String (BasicString) of stdext is unlike any string class you have seen before. It is simpler than you might imagine, with only two member variables:

template <class _E>
class BasicString
{
    const _E* m_pszBuf;  // start of the character data (not owned)
    size_t m_length;     // number of characters
};

It differs from std::string (basic_string) in that:

  • It is a constant string and never tries to modify the string content (the data that m_pszBuf points to).
  • It has no destructor. You can think of it as a plain struct. For convenience, of course, BasicString still has constructors.
  • Its m_pszBuf is not NUL-terminated. The m_length member delimits the string.
  • It does not manage the lifetime of the string content (m_pszBuf). As mentioned above, it has no destructor. It only ever accepts or produces string content and is never responsible for destroying it.

The last point is the important one, and also the unusual one: String does not manage the lifetime of the string content. This may surprise you: a string class that does not manage the lifetime of its own string.

But that is what we did, and it brings a lot of convenience. For example:

  • Assignment (copy) and substring (substr) are very lightweight operations. Copy-on-write becomes completely unnecessary.
  • Any linear container (such as std::vector or std::basic_string) can be temporarily converted to a String, very cheaply. See the description of the String::cast method.
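A rough sketch of why such conversions are cheap follows. View and view_of are hypothetical names used only for illustration (the real entry point named in the text is String::cast, whose signature is not shown here). The view simply records where the container's data lives, so the container must outlive the view and must not reallocate in the meantime.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical non-owning view; stands in for BasicString<E>.
template <class E>
struct View {
    const E* data;
    std::size_t length;
};

// Viewing a container costs no allocation and no character copy.
template <class E>
View<E> view_of(const std::vector<E>& v) {
    return View<E>{v.data(), v.size()};
}

template <class E>
View<E> view_of(const std::basic_string<E>& s) {
    return View<E>{s.data(), s.size()};
}
```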

Why can the String class afford not to manage lifetime itself? This follows from the rethinking of memory management that stdext advocates.

If you browse the reference manual for the String class, you will notice two constructors:

BasicString(const value_type* pszVal, size_type cch);

template <class AllocT>
BasicString(AllocT& alloc, const value_type* pszVal, size_type cch);

The first constructor implies that the pszVal passed in outlives the BasicString (it is still valid when the BasicString goes away). The second implies that pszVal is only temporarily valid, so the constructor makes its own copy of the pszVal string.
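The difference between the two constructors can be sketched as follows. Arena and Str are hypothetical names, and the assumption that the allocator exposes an allocate(n) member returning raw memory is mine, not taken from the stdext documentation. The point is only that the borrowing form stores the pointer, while the allocator form copies the characters into memory whose lifetime belongs to the allocator.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical arena: memory is released only when the arena is destroyed.
class Arena {
    std::vector<std::unique_ptr<char[]> > blocks_;
public:
    char* allocate(std::size_t n) {
        std::unique_ptr<char[]> block(new char[n]);
        char* raw = block.get();
        blocks_.push_back(std::move(block));
        return raw;
    }
};

// Hypothetical stand-in for BasicString<char>.
class Str {
    const char* buf_;
    std::size_t len_;
public:
    // Borrowing: the caller guarantees psz stays valid while the Str is used.
    Str(const char* psz, std::size_t cch) : buf_(psz), len_(cch) {}

    // Copying: psz may be temporary; the characters are copied into
    // memory obtained from alloc, which owns it from then on.
    template <class AllocT>
    Str(AllocT& alloc, const char* psz, std::size_t cch)
        : buf_(0), len_(cch) {
        char* p = static_cast<char*>(alloc.allocate(cch));
        std::memcpy(p, psz, cch);
        buf_ = p;
    }

    const char* data() const { return buf_; }
    std::size_t size() const { return len_; }
};
```

In this sketch, changing the source buffer afterwards is visible through the borrowed Str but not through the copied one.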

Why is a constructor of the form BasicString(const value_type* pszVal) not supported?

Such a constructor is too dangerous: it gives no way to tell which of the two behaviors you intend.

TempString base class

As the name suggests, this is a temporary string class. Why is it a base class of String (BasicString)? That is purely an implementation matter. Conceptually, TempString is a String (just with a particular lifetime) and follows the same specification as BasicString. It ended up as the base class of BasicString only because that was easy to implement.

Taking BasicString::compare as an example, consider the following function:

int BasicString::compare(const TempString<_E> b) const; 

This single declaration carries a lot of meaning. It is equivalent to defining the following series of functions:

int BasicString::compare(const _E& b) const; // compare with a string consisting of the single character b
int BasicString::compare(const _E* b) const; // compare with the C-style string b
int BasicString::compare(const basic_string<_E>& b) const; // compare with an STL string
int BasicString::compare(const BasicString<_E>& b) const; // compare with another constant string
int BasicString::compare(const vector<_E>& b) const; // compare with the string b held in a vector
int BasicString::compare(const BasicStringBuilder<_E>& b) const;

One function does the work of six!

As you can see, TempString is used throughout the BasicString interface. Code written this way extends very well: each time TempString gains a constructor for another linear string representation, every related BasicString operation immediately supports that representation.
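The mechanism is ordinary implicit conversion through converting constructors. The following is a minimal sketch and not the stdext source: sketch::TempStr and the free function sketch::compare are hypothetical, fixed to char, and cover only four of the representations listed above.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical sketch of the idea behind TempString<char>: one implicit
// constructor per string representation, each storing pointer and length.
class TempStr {
    const char* buf_;
    std::size_t len_;
public:
    TempStr(const char& c) : buf_(&c), len_(1) {}                      // single character
    TempStr(const char* s) : buf_(s), len_(std::strlen(s)) {}          // C-style string
    TempStr(const std::string& s) : buf_(s.data()), len_(s.size()) {}  // STL string
    TempStr(const std::vector<char>& v) : buf_(v.data()), len_(v.size()) {}
    const char* data() const { return buf_; }
    std::size_t size() const { return len_; }
};

// One function covers every representation TempStr can be built from.
// Returns a negative value, zero, or a positive value.
inline int compare(TempStr a, TempStr b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    const int r = n ? std::memcmp(a.data(), b.data(), n) : 0;
    if (r != 0) return r;
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}

} // namespace sketch
```

Because a TempStr built from a temporary only points at it, it is meant to be used as a function parameter, where the argument stays alive for the duration of the call.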

To further illustrate the benefit, take concat as an example:

template <class AllocT, class _E> // concatenate multiple strings
BasicString<_E> concat(AllocT& alloc, TempString<_E> a1, TempString<_E> a2, ...);

concat is not a member function of the BasicString class; it is a global function closely related to BasicString. With the STL string class, strings are normally concatenated with operator+ or string::append. For example:

std::string a = std::string("Hello") + " " + "world" + "!!!"; 

BasicString, by contrast, has neither operator+ nor append. It uses the global std::concat function to concatenate strings, as follows:

std::String a = std::concat(alloc, "Hello", " ", "world", "!!!"); 

Interestingly, std::concat can not only efficiently concatenate any number of strings, it can also efficiently concatenate various linear string representations, including char*, std::string, std::vector<char>, std::String, std::StringBuilder, and so on. For example:

std::string hello = "Hello";
std::String space(" ", 1);
std::vector<char> excalmatory_mark(3, '!');
std::String a = std::concat(alloc, hello, space, "world", excalmatory_mark);

All of this magic lies in TempString. For more information about std::TempString, see TempString.
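One way such a concat could be put together is sketched below. This is an illustration under my own assumptions rather than the stdext implementation: the names in namespace sketch are hypothetical, the allocator is assumed to expose allocate(n), and a variadic template stands in for whatever overload set the library really uses. Each argument is first viewed as a pointer/length piece, the total length is computed, and the characters are copied exactly once into memory from the allocator.

```cpp
#include <cstddef>
#include <cstring>
#include <initializer_list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical arena: memory is released only when the arena is destroyed.
class Arena {
    std::vector<std::unique_ptr<char[]> > blocks_;
public:
    char* allocate(std::size_t n) {
        std::unique_ptr<char[]> block(new char[n]);
        char* raw = block.get();
        blocks_.push_back(std::move(block));
        return raw;
    }
};

// Temporary view of one argument (plays the role of TempString<char>).
struct Piece {
    const char* data;
    std::size_t len;
    Piece(const char& c) : data(&c), len(1) {}
    Piece(const char* s) : data(s), len(std::strlen(s)) {}
    Piece(const std::string& s) : data(s.data()), len(s.size()) {}
    Piece(const std::vector<char>& v) : data(v.data()), len(v.size()) {}
};

// Result: a non-owning string whose characters live in the allocator.
struct Str {
    const char* data;
    std::size_t length;
};

template <class AllocT>
Str concat_pieces(AllocT& alloc, std::initializer_list<Piece> pieces) {
    std::size_t total = 0;
    for (const Piece& p : pieces) total += p.len;      // pass 1: measure
    char* out = static_cast<char*>(alloc.allocate(total));
    char* cur = out;
    for (const Piece& p : pieces) {                    // pass 2: copy once
        if (p.len) std::memcpy(cur, p.data, p.len);
        cur += p.len;
    }
    return Str{out, total};
}

// Accepts any mix of representations that Piece can view.
template <class AllocT, class... Args>
Str concat(AllocT& alloc, const Args&... args) {
    return concat_pieces(alloc, {Piece(args)...});
}

} // namespace sketch
```

Unlike a chain of operator+ calls, this shape creates no intermediate strings: there is a single allocation for the final result.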

Source code
  • stdext/text/TempString.h
  • stdext/text/BasicString.h

Project home page of the stdext library: http://code.google.com/p/stdext/

Read more
  • String
  • TempString
  • concat
  • std::string

Footnotes
[1] The memory allocator (Allocator) of C++ is an odd thing: it is used everywhere, yet most programmers are unfamiliar with it. For more information, see C++ memory management.
