[Repost] Digging into the STL string in GCC 4.4 on Linux

Last Update:2015-07-31 Source: Internet

Author: User

This article studies the libstdc++ source code to analyze how std::string from the C++ Standard Template Library works, focusing on its reference counting and copy-on-write techniques.

Platform: X86_64-redhat-linux
GCC version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)

1. Questions raised

Recently, in our project, there have been two problems related to the use of string.

1.1. Issue 1: Bug introduced by new code

Some time ago an old project received a new requirement, and we added some logic to handle it. Nothing went wrong during testing, but once the code went live it occasionally produced incorrect output or even crashed. The problem plagued us for a long time. Unit tests and integration tests revealed nothing wrong with the new code, which finally forced us to review the large body of original code whose logic we had not changed. The project uses string heavily, and the following logic in the original code raised our suspicions:

    string string_info;
    // ... assignment to string_info
    char* p = (char*)string_info.data();

After rigorous checks and logical judgments, some branches modify the content that p points to. This is dangerous, but it had been working fine. It reminded us of our recent change: we copy the string_info object and then do some processing on the copy. We realized that, because string uses copy-on-write and reference counting, copying the string does not actually copy the data. After some testing and research we were convinced that this was the cause. The code above was changed to:

    char* p = &string_info[0];

We then applied the same change to similar places in the project, tested it, and went live. Everything has worked since.
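The fix can be illustrated with a minimal sketch. The helper names `overwrite_first` and `copy_is_isolated` are hypothetical and not taken from the project. The idea is that taking the address through the non-const operator[] gives a copy-on-write implementation the chance to unshare the buffer before the write; on implementations without copy-on-write the code behaves the same way.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: overwrite the first character of `s` in place.
// Going through the non-const operator[] lets a copy-on-write
// implementation unshare the buffer before we write to it.
void overwrite_first(std::string& s, char ch) {
    if (s.empty()) return;
    char* p = &s[0];   // instead of (char*)s.data()
    p[0] = ch;
}

// Returns true when modifying a copy leaves the original untouched.
bool copy_is_isolated() {
    std::string original = "0123456789abcdef";
    std::string copy = original;   // may share storage under copy-on-write
    overwrite_first(copy, 'X');
    return original == "0123456789abcdef" && copy == "X123456789abcdef";
}
```

Calling `copy_is_isolated()` should return true regardless of whether the library shares buffers between copies.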

1.2. Issue 2: Performance optimization

During a recent refactoring we profiled the relevant code and found that memcpy accounted for as much as 8.7% of CPU time. A careful look at the code revealed a large number of map lookups. The map is defined as follows:

    typedef std::map<std::string, std::string> SSMAP;
    SSMAP info_map;

The lookups are written like this:

     info_map["Some_key"] = some_value;

It is easy to write code like the above without thinking. Changing it to the following performs much better:

    static const std::string __s_some_key = "Some_key";
    info_map[__s_some_key] = some_value;

This is because the first version constructs a temporary string object on every lookup, which also copies the characters of "Some_key". The modified version constructs the key only once, on first initialization, and makes no copy on subsequent calls, so it is much more efficient. After this optimization, memcpy's share of CPU time dropped to 4.3%.
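The two variants can be placed side by side in a small sketch. The function names are hypothetical, and the value type of the map is an assumption; only the key type matters for the argument.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::string> SSMap;  // assumed key/value types

// Builds a temporary std::string from the literal on every call.
void set_with_literal(SSMap& m, const std::string& value) {
    m["Some_key"] = value;
}

// Builds the key once, on the first call; later calls reuse the same object.
void set_with_static_key(SSMap& m, const std::string& value) {
    static const std::string kSomeKey = "Some_key";
    m[kSomeKey] = value;
}
```

Both functions leave the map in the same state; the difference is only in how often a key string is constructed on the way there.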

Below we dig into the string source code to explain how the two problems above were solved.

2. std::string definition

The string classes in the STL are defined as follows:

    template<typename _CharT, typename _Traits, typename _Alloc>
    class basic_string;

    typedef basic_string<char, char_traits<char>, allocator<char> > string;

Looking at the class, a string object itself holds only one pointer (_CharT* _M_p), so sizeof(string) == 8 on this platform. All other information is stored on the heap.

Question 1:
Suppose we have the following C++ statement:

    string name;

What is the total memory cost of the variable name? We will answer this question later.

3. std::string Memory Space layout

Let's look at the internal memory layout of a string object through a common use case.
The most common way to use string is to construct an object from a C-style string, for example:

    string name("Zieckey");

The constructor this invokes is defined as follows:

    basic_string(const _CharT* __s, const _Alloc& __a)
    : _M_dataplus(_S_construct(__s, __s ? __s + traits_type::length(__s) :
                               __s + npos, __a), __a)
    { }

The constructor calls _S_construct directly to build the object. It is defined as follows:

    template<typename _CharT, typename _Traits, typename _Alloc>
    template<typename _InIterator>
    _CharT*
    basic_string<_CharT, _Traits, _Alloc>::
    _S_construct(_InIterator __beg, _InIterator __end, const _Alloc& __a,
                 input_iterator_tag)
    {
        // Avoid reallocation for common case.
        _CharT __buf[128];
        size_type __len = 0;
        while (__beg != __end && __len < sizeof(__buf) / sizeof(_CharT))
        {
            __buf[__len++] = *__beg;
            ++__beg;
        }
        // Construct a _Rep structure and allocate enough space in one go
        // (see the memory layout below)
        _Rep* __r = _Rep::_S_create(__len, size_type(0), __a);
        // Copy the data into the string object
        _M_copy(__r->_M_refdata(), __buf, __len);
        __try
        {
            while (__beg != __end)
            {
                if (__len == __r->_M_capacity)
                {
                    // Allocate more space.
                    _Rep* __another = _Rep::_S_create(__len + 1, __len, __a);
                    _M_copy(__another->_M_refdata(), __r->_M_refdata(), __len);
                    __r->_M_destroy(__a);
                    __r = __another;
                }
                __r->_M_refdata()[__len++] = *__beg;
                ++__beg;
            }
        }
        __catch(...)
        {
            __r->_M_destroy(__a);
            __throw_exception_again;
        }
        // Set the string length and reference count, and store the
        // terminator char_type() after the last character
        __r->_M_set_length_and_sharable(__len);
        // Finally, return the address of the first character of the string
        return __r->_M_refdata();
    }

    template<typename _CharT, typename _Traits, typename _Alloc>
    typename basic_string<_CharT, _Traits, _Alloc>::_Rep*
    basic_string<_CharT, _Traits, _Alloc>::_Rep::
    _S_create(size_type __capacity, size_type __old_capacity,
              const _Alloc& __alloc)
    {
        // The space to allocate includes:
        //   an array char_type[__capacity]
        //   one extra terminator char_type()
        //   enough space to hold a struct _Rep
        // Whew. Seemingly so needy, yet so elemental.
        size_type __size = (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep);
        void* __place = _Raw_bytes_alloc(__alloc).allocate(__size);  // allocate the space
        _Rep* __p = new (__place) _Rep;  // construct the object directly at __place (placement new)
        __p->_M_capacity = __capacity;
        __p->_M_set_sharable();  // set the reference count to 0: the object is owned by itself only
        return __p;
    }

_Rep derives from _Rep_base, which is defined as follows:

    struct _Rep_base
    {
        size_type       _M_length;
        size_type       _M_capacity;
        _Atomic_word    _M_refcount;
    };
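The single-allocation idea behind _S_create can be sketched with a toy type. `ToyRep` and `toy_construct` are hypothetical names, the growth policy and allocator handling of the real code are left out, and plain `int` stands in for `_Atomic_word`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Toy stand-in for _Rep: a header followed directly by the characters.
struct ToyRep {
    std::size_t length;
    std::size_t capacity;
    int         refcount;   // 0 means "owned by exactly one string"

    // The character data starts right after the header.
    char* refdata() { return reinterpret_cast<char*>(this + 1); }

    // One allocation holds the header, `capacity` chars and a terminator.
    static ToyRep* create(std::size_t capacity) {
        std::size_t size = sizeof(ToyRep) + capacity + 1;
        void* place = ::operator new(size);
        ToyRep* rep = new (place) ToyRep;   // placement new into the raw block
        rep->length = 0;
        rep->capacity = capacity;
        rep->refcount = 0;
        rep->refdata()[0] = '\0';
        return rep;
    }

    static void destroy(ToyRep* rep) {
        rep->~ToyRep();
        ::operator delete(rep);
    }
};

// Build a ToyRep from a C string, in the spirit of string name("Zieckey").
ToyRep* toy_construct(const char* s) {
    std::size_t len = std::strlen(s);
    ToyRep* rep = ToyRep::create(len);
    std::memcpy(rep->refdata(), s, len);
    rep->length = len;
    rep->refdata()[len] = '\0';
    return rep;
}
```

A string handle then only needs to store the pointer returned by `refdata()`; the header is found again by stepping back `sizeof(ToyRep)` bytes.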

At this point we can answer Question 1 above.
For the statement "string name;", the total space occupied by the name object is 33 bytes, made up as follows:

    sizeof(std::string) + 0 + sizeof('\0') + sizeof(std::string::_Rep)
    = 8 + 0 + 1 + 24
    = 33

Here sizeof(std::string) is the space taken by the object itself (on the stack for a local variable); the rest is on the heap.
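The arithmetic can be checked with a short sketch. `RepHeader` is an assumed mirror of the three _Rep_base fields, with `int` standing in for `_Atomic_word`, and `cow_string_footprint` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Assumed mirror of the three _Rep_base fields.
struct RepHeader {
    std::size_t length;
    std::size_t capacity;
    int         refcount;
};

// Bytes attributed to a string holding `chars` characters under this layout:
// the pointer member, the header, the characters and the terminator.
std::size_t cow_string_footprint(std::size_t chars) {
    return sizeof(char*) + sizeof(RepHeader) + chars + 1;
}
```

On x86_64, sizeof(char*) is 8 and sizeof(RepHeader) is 24 (the int is padded to 8 bytes), so `cow_string_footprint(0)` is 33 and `cow_string_footprint(16)` is 49.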

The other C++ statement mentioned above, string name("Zieckey");, defines a string variable name whose memory layout is as follows:

4. A closer look at the string source code

4.1. String copy and strncpy

Comparisons of the efficiency of std::string assignment and strncpy have been a recurring topic of discussion. Let's run a basic test:

    #include <iostream>
    #include <cstdlib>
    #include <string>
    #include <ctime>
    #include <cstring>

    using namespace std;

    const int array_size = 200;
    const int loop_count = 1000000;

    void test_strncpy()
    {
        char s1[array_size];
        char* s2 = new char[array_size];
        memset(s2, 'c', array_size);
        clock_t start = clock();
        for (int i = 0; i != loop_count; ++i)
            strncpy(s1, s2, array_size);
        cout << __func__ << ":" << clock() - start << endl;
        delete[] s2;    // allocated with new[], so release with delete[]
        s2 = NULL;
    }

    void test_string_copy()
    {
        string s1;
        string s2;
        s2.append(array_size, 'c');
        clock_t start = clock();
        for (int i = 0; i != loop_count; ++i)
            s1 = s2;
        cout << __func__ << ":" << clock() - start << endl;
    }

    int main()
    {
        test_strncpy();
        test_string_copy();
        return 0;
    }

Compiled with g++ -O3, the running times are as follows:

test_strncpy:40000
test_string_copy:10000

strncpy takes 4 times as long as the string copy. The reason is that string copy is based on reference counting, so each copy costs very little.
In our tests we also found that when array_size is within 10 bytes the difference between the two is small, and that the gap widens as array_size grows. For example, with array_size=1000, strncpy is 13 times slower.

4.2. View reference count changes through GDB debugging

The test results above are very good and remove the concern about string performance. Next we use a small program to verify how the reference count changes during copying and what effect it has.
Here is the test code:

    #include <assert.h>
    #include <iostream>
    #include <string>

    using namespace std;

    int main()
    {
        string a = "0123456789abcdef";
        string b = a;
        cout << "a.data() =" << (void*)a.data() << endl;
        cout << "b.data() =" << (void*)b.data() << endl;
        assert(a.data() == b.data());
        cout << endl;

        string c = a;
        cout << "a.data() =" << (void*)a.data() << endl;
        cout << "b.data() =" << (void*)b.data() << endl;
        cout << "c.data() =" << (void*)c.data() << endl;
        assert(a.data() == c.data());
        cout << endl;

        c[0] = '1';
        cout << "after write:\n";
        cout << "a.data() =" << (void*)a.data() << endl;
        cout << "b.data() =" << (void*)b.data() << endl;
        cout << "c.data() =" << (void*)c.data() << endl;
        assert(a.data() != c.data() && a.data() == b.data());
        return 0;
    }

After running, output:

    a.data() =0xc22028
    b.data() =0xc22028

    a.data() =0xc22028
    b.data() =0xc22028
    c.data() =0xc22028

    after write:
    a.data() =0xc22028
    b.data() =0xc22028
    c.data() =0xc22068

The output shows that after a is assigned to b and c, the internal data of all three string objects a, b, and c lives at the same memory address. Why, then, does the internal data address of c change once we modify c?

We use gdb to watch how the reference count changes while the code above executes:

    (gdb) b 10
    Breakpoint 1 at 0x400c35: file string_copy1.cc, line 10.
    (gdb) b 16
    Breakpoint 2 at 0x400d24: file string_copy1.cc, line 16.
    (gdb) b 23
    Breakpoint 3 at 0x400e55: file string_copy1.cc, line 23.
    (gdb) r
    Starting program: [...] /unixstudycode/string_copy/string_copy1
    [Thread debugging using libthread_db enabled]

    Breakpoint 1, main () at string_copy1.cc:10
    10          string b = a;
    (gdb) x/16ub a._M_dataplus._M_p-8
    0x602020:       0       0       0       0       0       0       0       0
    0x602028:       48      49      50      51      52      53      54      55

At this point the reference count of object a is 0.

    (gdb) n
    11          cout << "a.data() =" << (void*)a.data() << endl;

The statement string b = a has now executed: a has been copied to b.

    (gdb) x/16ub a._M_dataplus._M_p-8
    0x602020:       1       0       0       0       0       0       0       0
    0x602028:       48      49      50      51      52      53      54      55

The reference count of object a is now 1, which means that one other object shares a's data.

    (gdb) c
    Continuing.
    a.data() =0x602028
    b.data() =0x602028

    Breakpoint 2, main () at string_copy1.cc:16
    16          string c = a;
    (gdb) x/16ub a._M_dataplus._M_p-8
    0x602020:       1       0       0       0       0       0       0       0
    0x602028:       48      49      50      51      52      53      54      55
    (gdb) n
    17          cout << "a.data() =" << (void*)a.data() << endl;

The statement string c = a has now executed: a has been copied to c.

    (gdb) x/16ub a._M_dataplus._M_p-8
    0x602020:       2       0       0       0       0       0       0       0
    0x602028:       48      49      50      51      52      53      54      55

The reference count of object a is now 2, which means that two other objects share a's data.

    (gdb) c
    Continuing.
    a.data() =0x602028
    b.data() =0x602028
    c.data() =0x602028

    Breakpoint 3, main () at string_copy1.cc:23
    23          c[0] = '1';
    (gdb) n
    24          cout << "after write:\n";

The value of c has now been modified.

    (gdb) x/16ub a._M_dataplus._M_p-8
    0x602020:       1       0       0       0       0       0       0       0
    0x602028:       48      49      50      51      52      53      54      55

The reference count of object a has dropped back to 1.

    (gdb) p a._M_dataplus._M_p
    $3 = 0x602028 "0123456789abcdef"
    (gdb) p b._M_dataplus._M_p
    $4 = 0x602028 "0123456789abcdef"
    (gdb) p c._M_dataplus._M_p
    $5 = 0x602068 "1123456789abcdef"

The internal data address of object c now differs from that of a and b. This is copy-on-write.

The gdb session above clearly shows that the three string objects a, b, and c are linked through reference counting.

4.3. Source Analysis String Copy

Below we read the source code to analyze the process above.
First, the source code involved in copying a string:

    // Copy constructor
    basic_string(const basic_string& __str)
    : _M_dataplus(__str._M_rep()->_M_grab(_Alloc(__str.get_allocator()),
                                          __str.get_allocator()),
                  __str.get_allocator())
    { }

    _CharT*
    _M_grab(const _Alloc& __alloc1, const _Alloc& __alloc2)
    {
        return (!_M_is_leaked() && __alloc1 == __alloc2)
                ? _M_refcopy() : _M_clone(__alloc1);
    }

    _CharT*
    _M_refcopy() throw()
    {
    #ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
        if (__builtin_expect(this != &_S_empty_rep(), false))
    #endif
            __gnu_cxx::__atomic_add_dispatch(&this->_M_refcount, 1);
        return _M_refdata();
    }

The code above is fairly easy to follow. The copy constructor basic_string(const basic_string& __str) calls _M_grab, which in turn calls _M_refcopy.
_M_refcopy uses the atomic operation __atomic_add_dispatch (which keeps the count thread-safe) to increment the reference count by 1, and then returns the data address of the original object.
From this you can see that copying or assigning one string object to another costs very little.
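The copy path can be sketched with a toy class. `SharedBuf` and `ToyString` are hypothetical names, `std::atomic<int>` stands in for the `_Atomic_word` helpers, and the fixed 32-byte buffer is a simplification. As in the gdb session above, a count of 0 means the buffer has a single owner.

```cpp
#include <atomic>
#include <cassert>
#include <cstring>

// Toy shared buffer: `refcount` counts the additional owners,
// so 0 means the buffer has exactly one owner.
struct SharedBuf {
    std::atomic<int> refcount;
    char data[32];
};

// Toy handle that shares its buffer on copy instead of duplicating it.
class ToyString {
public:
    explicit ToyString(const char* s) : buf_(new SharedBuf) {
        buf_->refcount.store(0);
        std::strncpy(buf_->data, s, sizeof(buf_->data) - 1);
        buf_->data[sizeof(buf_->data) - 1] = '\0';
    }
    // Copy: bump the count atomically and reuse the same buffer.
    ToyString(const ToyString& other) : buf_(other.buf_) {
        buf_->refcount.fetch_add(1);
    }
    ~ToyString() {
        // fetch_sub returns the previous value; 0 means we were the last owner.
        if (buf_->refcount.fetch_sub(1) == 0) delete buf_;
    }
    const char* data() const { return buf_->data; }
    int refcount() const { return buf_->refcount.load(); }
private:
    ToyString& operator=(const ToyString&);  // assignment left out of the sketch
    SharedBuf* buf_;
};
```

Copying a `ToyString` touches only a pointer and one atomic counter, independent of the string length, which is why the copy loop in the benchmark is so cheap.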

After these assignment statements, the memory layout of the objects a, b, and c is as follows:

4.4. Copy-on-write

Let's see what "c[0] = '1';" actually does:

    reference
    operator[](size_type __pos)
    {
        _M_leak();
        return _M_data()[__pos];
    }

    void
    _M_leak()    // for use in begin() & non-const op[]
    {
        // As seen earlier, object c and object a point to the same
        // memory area at this moment, so _M_leak_hard() is called
        if (!_M_rep()->_M_is_leaked())
            _M_leak_hard();
    }

    void
    _M_leak_hard()
    {
        if (_M_rep()->_M_is_shared())
            _M_mutate(0, 0, 0);
        _M_rep()->_M_set_leaked();
    }

    void
    _M_mutate(size_type __pos, size_type __len1, size_type __len2)
    {
        const size_type __old_size = this->size();                  // 16
        const size_type __new_size = __old_size + __len2 - __len1;  // 16
        const size_type __how_much = __old_size - __pos - __len1;   // 16

        if (__new_size > this->capacity() || _M_rep()->_M_is_shared())
        {
            // Construct a new object
            const allocator_type __a = get_allocator();
            _Rep* __r = _Rep::_S_create(__new_size, this->capacity(), __a);

            // Then copy the data
            if (__pos)
                _M_copy(__r->_M_refdata(), _M_data(), __pos);
            if (__how_much)
                _M_copy(__r->_M_refdata() + __pos + __len2,
                        _M_data() + __pos + __len1, __how_much);

            // Decrement the reference count of the original object
            _M_rep()->_M_dispose(__a);
            // Bind to the new object
            _M_data(__r->_M_refdata());
        }
        else if (__how_much && __len1 != __len2)
        {
            // Work in-place.
            _M_move(_M_data() + __pos + __len2,
                    _M_data() + __pos + __len1, __how_much);
        }
        // Finally, set the length and reference count of the new object
        _M_rep()->_M_set_length_and_sharable(__new_size);
    }

The source code above is a little more complex. Modifying c takes the following two steps:

Step one: determine whether the object is shared (reference count greater than 0). If it is, make a new copy of the data and decrement the reference count of the old data by 1.
Step two: make the modification in the new address space, so that the data of other objects is not contaminated.

It follows that forcibly modifying a string object without going through the interface string provides is potentially unsafe and destructive. For example:

    char* p = const_cast<char*>(s1.data());
    p[0] = 'a';

After the code above has modified c ("c[0] = '1';"), the memory layout of the objects a, b, and c is as follows:

The analysis above makes the benefits of copy-on-write obvious, but it also has side effects. If the modification "c[0] = '1';" is instead made by force from outside, it may produce unexpected results. Take a look at the following code:

    char* pc = const_cast<char*>(c.c_str());
    pc[0] = '1';

By forcibly changing the internal data of object c, this code is more efficient than operator[], but it also changes the values of objects a and b, which is probably not what we want. This is something we need to be particularly vigilant about.
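Both the two-step write and the hazard of bypassing it can be reproduced with a toy class. `CowString` is a hypothetical name, and `std::shared_ptr` is swapped in for the _Rep reference count, so this is a sketch of the technique and not how libstdc++ implements it. The `use_count()` check is only meaningful in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy copy-on-write string. std::shared_ptr stands in for the _Rep
// reference count; libstdc++ does not use shared_ptr for this.
class CowString {
public:
    explicit CowString(const std::string& s)
        : buf_(std::make_shared<std::vector<char> >(s.begin(), s.end())) {}

    // Checked write: unshare first (step one), then modify (step two).
    void set(std::size_t i, char ch) {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::vector<char> >(*buf_);
        (*buf_)[i] = ch;
    }

    // Unchecked access, comparable to const_cast<char*>(c.c_str()).
    char* unsafe_raw() { return buf_->data(); }

    std::string str() const { return std::string(buf_->begin(), buf_->end()); }
    bool shares_with(const CowString& o) const { return buf_ == o.buf_; }

private:
    std::shared_ptr<std::vector<char> > buf_;
};
```

With `CowString a("0123456789abcdef"); CowString b = a; CowString c = a;`, calling `c.set(0, '1')` changes only c, whereas `b.unsafe_raw()[0] = 'X'` changes a as well, because a and b still share one buffer.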

5. Examples of inappropriate use of string

Our project team has a distributed in-memory key-value system in which keys are typically MD5 digests and values are arbitrary binary data. Because memory capacity is always limited, we chose at design time not to use string, and instead developed separate key and value structures. Here is the definition of the key structure we designed:

    struct Key
    {
        uint64_t low;
        uint64_t high;
    };

The structure takes 16 bytes of memory and holds the 16-byte binary MD5 digest. Compared with using string as the key, it saves 33 bytes per key (see the string memory layout above). For example, one of our projects uses a distributed cluster of this system holding 10 billion records in total. Saving 33 bytes per record saves 33 bytes × 10 billion = 330 GB of memory overall. A small improvement to the key saves this much memory, which is well worth it.
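A minimal sketch of how such a key might be filled and compared follows. The helper `make_key` and the two operators are hypothetical, and which half of the digest goes into which field is an arbitrary choice made for the illustration.

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

struct Key {
    uint64_t low;
    uint64_t high;
};

// Hypothetical helper: pack a raw 16-byte MD5 digest into a Key.
// The first 8 bytes go to `low`, the last 8 to `high` (arbitrary choice).
Key make_key(const unsigned char digest[16]) {
    Key k;
    std::memcpy(&k.low, digest, 8);
    std::memcpy(&k.high, digest + 8, 8);
    return k;
}

// Ordering so that Key can be used in std::map or sorted containers.
bool operator<(const Key& a, const Key& b) {
    return a.high != b.high ? a.high < b.high : a.low < b.low;
}

bool operator==(const Key& a, const Key& b) {
    return a.low == b.low && a.high == b.high;
}
```

The structure has no pointer, header, or terminator, so sizeof(Key) is exactly the 16 bytes of the digest.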

6. Comparison with the STL shipped with Microsoft Visual Studio

The string implementation in VC6.0 is also based on reference counting, but it is not thread-safe. Later versions of VC removed reference counting, and copying a string performs a deep memory copy directly.
Because the implementation details of string differ between platforms, porting cross-platform programs carries a potential risk. Such cases call for extra attention.

7. Summary

Even an empty string object occupies 33 bytes of memory, so think carefully before using string in scenarios with strict memory requirements, such as memcached.
Because string uses reference counting and copy-on-write, copying a string is significantly faster than copying with strncpy.
With reference counting, multiple string objects point to the same block of memory, so forcibly modifying the contents of one string affects the others.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.