"Reprint" STL "source code" analysis-Key Knowledge summary

Source: Internet
Author: User

STL is one of the most important parts of C++. I read the book "STL Source Code Analysis" at university and have been reviewing it over the past few days. Below is a summary of the points I consider most important; it is a little long :)

1. STL Overview

STL provides six components that can be combined with each other:

    • Containers: various data structures such as vector, list, deque, set, and map, used to store data. From an implementation point of view, an STL container is a class template.
    • Algorithms: various common algorithms such as sort, search, copy, and erase. From an implementation point of view, an STL algorithm is a function template.
    • Iterators: the glue between containers and algorithms, the so-called "generic pointers". There are five categories, plus other derived variants. From an implementation point of view, an iterator is a class template that overloads operator*, operator->, operator++, operator--, and other pointer-related operations. Every STL container comes with its own iterator, since only the container itself knows how to traverse its elements. A native pointer is also an iterator.
    • Functors: objects that behave like functions and can serve as a policy for an algorithm. From an implementation point of view, a functor is a class or class template that overloads operator(). An ordinary function pointer can be regarded as a functor in the narrow sense.
    • Adapters: things that modify the interface of a container, a functor, or an iterator. For example, the queue and stack provided by STL look like containers but are really container adapters, because they are built entirely on an underlying deque, which supplies all of their operations. An adapter that changes a functor interface is called a function adapter, one that changes a container interface is a container adapter, and one that changes an iterator interface is an iterator adapter.
    • Allocators: responsible for allocating and managing space. From an implementation point of view, an allocator is a class template that implements dynamic space allocation, space management, and space release.

Figure: interaction of the six major STL components

Some C++ language points that may cause confusion:

    1. A static constant member of integral type (double is not allowed) can be initialized directly inside the class.
    2. Other static members can only be initialized outside the class, and the static keyword is not repeated in that definition.
    3. When a base-class constructor calls a virtual function, it is the base-class version that actually runs (this differs from Java).
    4. Every STL algorithm takes a pair of iterators (generic pointers) that mark the range to operate on. The pair denotes a half-open interval, closed at the front and open at the back: [first, last).

Generic pointers, native pointers, and smart pointers
    • "Generic pointer" has several meanings. It can refer to the void* pointer, which can point to any data type and is "generic" in that sense. It can also refer to generic data structures with pointer-like behavior, including generalized iterators, smart pointers, and so on. A generalized iterator is an opaque pointer that supports traversal. The iterators usually referred to are iterators in the narrow sense: instances of the C++ STL classes that are built on the generic iterator_traits mechanism. Broadly speaking, generic pointers and iterators are two different concepts whose intersection is the familiar iterator class.
    • A native pointer is an ordinary pointer, as opposed to an object that behaves like a pointer but is not one. "Native" means "the simplest and most basic". Now that so much has been abstracted, the simple, basic pointer of the past is just one manifestation of an abstract concept (such as the iterator).
    • Smart pointers are a C++ concept. Because C++ has no automatic memory reclamation, programmers have to deal with memory problems themselves every time, and smart pointers effectively alleviate this. They were introduced to prevent dangling pointers. Typically the raw pointer is wrapped in a class called a smart pointer, which also holds a use counter: copying the smart pointer increments the counter by 1, destroying a copy decrements it by 1, and when the counter reaches 0 the managed object is released.

2. Iterators

The central idea of STL is to separate data containers from algorithms, design them independently, and then bring them together with a piece of glue. The genericity of containers and algorithms is achieved with C++ class templates and function templates, and the glue between the two is the iterator.

An iterator is a smart pointer

Rather than saying that an iterator is a pointer, it is better to say that an iterator is a smart pointer: it wraps a pointer in a layer that keeps the flexibility and power of a native pointer while adding important features that make it more convenient to use. Iterators overload basic pointer operations such as *, ->, ++, ==, !=, and =, which lets them traverse complex data structures; how they do so depends on the data structure being traversed. Consider the following code for such a "smart" iterator:

    template <typename T>
    class Iterator
    {
    public:
        Iterator& operator++();
        // ...
    private:
        T *m_ptr;
    };

For different data containers, the implementation of the member function operator++ in the Iterator class above differs. For an array, a possible implementation is:

    // Implementation for an array
    template <typename T>
    Iterator<T>& Iterator<T>::operator++()
    {
        ++m_ptr;
        return *this;
    }

For a linked list, there would be a member function such as next() to get the next node, so the implementation might be:

    // Implementation for a linked list
    template <typename T>
    Iterator<T>& Iterator<T>::operator++()
    {
        m_ptr = m_ptr->next();  // next() returns the next node of the list
        return *this;
    }

An iterator needs detailed knowledge of how the object it points to is implemented. To avoid exposing those details, the implementation of the iterator is simply left to the designer of each container. STL hands the implementation of each iterator to its container, and every container defines its own iterator internally as a nested type. The iterators all share the same interface but differ in their internal implementation, which directly embodies the idea of generic programming.

Iterator usage example

    #include <cstdlib>
    #include <iostream>
    #include <vector>
    #include <list>
    #include <algorithm>
    using namespace std;

    int main(int argc, const char *argv[])
    {
        int arr[5] = {1, 2, 3, 4, 5};
        vector<int> iVec(arr, arr + 5);  // define a vector container
        list<int> iList(arr, arr + 5);   // define a list container

        // Search for the integer 3 between the head and tail of iVec
        vector<int>::iterator iter1 = find(iVec.begin(), iVec.end(), 3);
        if (iter1 == iVec.end())
            cout << "3 not found" << endl;
        else
            cout << "3 found" << endl;

        // Search for the integer 4 between the head and tail of iList
        list<int>::iterator iter2 = find(iList.begin(), iList.end(), 4);
        if (iter2 == iList.end())
            cout << "4 not found" << endl;
        else
            cout << "4 found" << endl;

        system("pause");  // Windows only: keeps the console window open
        return 0;
    }

As the example shows, iterators are tied to specific containers: different containers implement their iterators differently. At the same time, the find algorithm can search different containers as long as it is handed the corresponding iterators. With iterators threading the needle, an algorithm can access different containers, which is exactly what iterators were designed for.

3. Sequence Containers

In a sequence container the elements are ordered (each has a position) but not necessarily sorted. C++ itself has a built-in sequence container, the array; STL additionally provides vector, list, deque, stack, queue, priority_queue, and others. Technically, stack and queue are classified as adapters, because each is just a deque in disguise.

Vector

The data structure used by vector is very simple: a linear contiguous space. Two iterators, start and finish, delimit the range currently in use within the allocated contiguous space, and the iterator end_of_storage points to the end of the entire contiguous space (including spare capacity).

    template <class T, class Alloc = alloc>
    class vector {
        ...
    protected:
        iterator start;           // start of the space currently in use
        iterator finish;          // end of the space currently in use
        iterator end_of_storage;  // end of the available space
        ...
    };

  Note: "growing dynamically" does not mean appending new space right after the original space (there is no guarantee that space is available there). Instead, a larger space, twice the original size in SGI STL, is allocated elsewhere; the original contents are copied over, new elements are constructed after them, and the original space is released. Therefore, once any operation on a vector causes reallocation, all iterators pointing into the original vector become invalid.

Analysis of the size of a vector object

The vector class has three iterator members (that is, pointer members), so with 4-byte pointers (a 32-bit build) its size is at least 12 bytes.

Test environment: Win7 64-bit VS2013

Test code:

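    #include <cstddef>
    #include <cstdint>
    #include <cstdlib>
    #include <iostream>
    #include <vector>
    using namespace std;

    int main(void)
    {
        vector<int> a(5, 0);

        cout << sizeof(a) << endl;                                      // size of the vector object itself
        cout << reinterpret_cast<uintptr_t>(&a) << endl;                // address of the vector object
        cout << reinterpret_cast<uintptr_t>(&a[0]) << endl;             // address of the first element
        cout << reinterpret_cast<uintptr_t>(&a[a.size() - 1]) << endl;  // address of the last element
        cout << endl;

        // Dump the vector object as pointer-sized words (implementation-specific peek)
        uintptr_t *words = reinterpret_cast<uintptr_t *>(&a);
        for (size_t i = 0; i < sizeof(a) / sizeof(uintptr_t); ++i)
            cout << words[i] << endl;
        cout << endl;

        cout << a.size() << endl;
        cout << a.capacity() << endl;

        system("pause");  // Windows only: keeps the console window open
        return 0;
    }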

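The test shows that the vector object is 16 bytes here. Twelve of them are the start, finish, and end_of_storage members; I do not yet know what the remaining 4 bytes are for :( (One possible explanation is the extra bookkeeping pointer that MSVC containers carry in debug builds for iterator checking.)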

Test environment: Ubuntu12.04 codeblocks10.05

Test code:

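    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <vector>
    using namespace std;

    int main(void)
    {
        vector<int> a(5, 0);

        cout << sizeof(a) << endl;                                      // size of the vector object itself
        cout << reinterpret_cast<uintptr_t>(&a) << endl;                // address of the vector object
        cout << reinterpret_cast<uintptr_t>(&a[0]) << endl;             // address of the first element
        cout << reinterpret_cast<uintptr_t>(&a[a.size() - 1]) << endl;  // address of the last element
        cout << endl;

        // Dump the vector object as pointer-sized words (implementation-specific peek)
        uintptr_t *words = reinterpret_cast<uintptr_t *>(&a);
        for (size_t i = 0; i < sizeof(a) / sizeof(uintptr_t); ++i)
            cout << words[i] << endl;
        cout << endl;

        cout << a.size() << endl;
        cout << a.capacity() << endl;

        return 0;
    }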

The test shows that the vector object is 12 bytes here: exactly the start, finish, and end_of_storage members.

Summary

Windows and Ubuntu ship different STL implementations, and each implements the vector class, and manages its storage, in its own way.

List

Compared with vector's contiguous linear space, list is much more complex. Its benefit is that each insertion or deletion allocates or releases the space for exactly one element. Inserting or removing an element at any position always takes constant time.

The list itself and its nodes are different structures and are designed separately. The following is the node structure of the STL list:

    template <class T>
    struct __list_node {
        typedef void* void_pointer;
        void_pointer prev;
        void_pointer next;
        T data;
    };

This is a doubly linked list.

List data structure

SGI list is not only a doubly linked list, but also a circular doubly linked list. You can traverse the entire list with just one pointer.

Deque

The biggest differences between deque and vector are that deque allows insertion and removal at the head in constant time, and that deque has no notion of capacity, because it is composed of segments of contiguous space and a new segment can be added at any time.

A deque is made up of segments of contiguous space. Whenever new space is needed at the front or back of the deque, a new contiguous segment is allocated and linked to the front or back. The deque's main task is to maintain the illusion of overall continuity across these segments and to provide a random access interface. This avoids vector's cycle of "reallocate, copy, release", at the cost of a complex iterator architecture.

Deque iterators

A deque iterator must first be able to indicate where the segments (buffers) are, and second be able to tell whether it is already at the edge of its buffer; if so, moving forward or backward must jump to the next or previous buffer. To jump correctly, the iterator must keep hold of the control center (the map) at all times.

Iterator structure:

    template <class T, class Ref, class Ptr, size_t BufSiz>
    struct __deque_iterator {  // does not inherit from std::iterator
        // Keep the link with the container
        T* cur;            // current element in the buffer this iterator refers to
        T* first;          // head of the buffer this iterator refers to
        T* last;           // tail of the buffer this iterator refers to (including spare space)
        map_pointer node;  // points to the control center
        ...
    };

If a deque already contains 20 elements and the buffer size is 8, it occupies three buffers: the first two are full and the third holds the remaining four elements.

Note: in its initial state (no elements) a deque keeps one buffer, so after clear() completes it returns to that initial state and likewise keeps one buffer.

Stack

A stack is a first-in, last-out (FILO) data structure with only one opening. It lets you add an element, remove an element, and read the topmost element. Apart from the top, there is no way to access the other elements; in other words, a stack does not allow traversal. By default, stack uses deque as its underlying container.

Queue

A queue is a first-in, first-out (FIFO) data structure with two openings: elements are added at the bottom (back) and removed from the top (front), and the frontmost element can be read. Apart from adding at the bottom and removing from the top, there is no way to access the other elements; in other words, a queue does not allow traversal. By default, queue uses deque as its underlying container.

Heap

Heap is not an STL container component; it is a behind-the-scenes hero that serves as the assistant of priority_queue. A priority queue lets the user push elements into the container in any order, but elements must come out in priority order, highest first. A binary max-heap has exactly these properties, which makes it a natural underlying mechanism for the priority queue. By default, the heap algorithms build a max-heap.

Heap test case:

    #include <cstdlib>
    #include <iostream>
    #include <queue>
    #include <vector>
    #include <algorithm>
    using namespace std;

    template <class T>
    struct display
    {
        void operator()(const T &x) const
        {
            cout << x << " ";
        }
    };

    // The heap algorithms build a max-heap by default; this comparator builds a min-heap
    template <typename T>
    struct greator
    {
        bool operator()(const T &x, const T &y) const
        {
            return x > y;
        }
    };

    int main(void)
    {
        int ia[9] = {0, 1, 2, 3, 4, 8, 9, 3, 5};
        vector<int> ivec(ia, ia + 9);

        make_heap(ivec.begin(), ivec.end(), greator<int>());
        for_each(ivec.begin(), ivec.end(), display<int>());
        cout << endl;

        ivec.push_back(7);
        // Note: when push_heap is called, the new element must already be
        // at the end of the underlying container
        push_heap(ivec.begin(), ivec.end(), greator<int>());
        for_each(ivec.begin(), ivec.end(), display<int>());
        cout << endl;

        pop_heap(ivec.begin(), ivec.end(), greator<int>());
        cout << ivec.back() << endl;
        ivec.pop_back();
        for_each(ivec.begin(), ivec.end(), display<int>());
        cout << endl;

        sort_heap(ivec.begin(), ivec.end(), greator<int>());
        for_each(ivec.begin(), ivec.end(), display<int>());
        cout << endl;

        system("pause");  // Windows only: keeps the console window open
        return 0;
    }

Priority_queue

A priority_queue is a queue with weights. It allows adding new elements, removing old elements, and inspecting element values. Because it is a queue, elements can only be added at the bottom and removed from the top; there is no other way to access the elements. The elements of a priority_queue are not arranged in the order in which they were pushed but are ordered automatically by weight, with the highest weight at the front.

By default, priority_queue is implemented with a max-heap, that is, a complete binary tree stored in a vector as the underlying container.

Priority_queue Test Cases:

    #include <cstdlib>
    #include <iostream>
    #include <queue>
    #include <vector>
    #include <algorithm>
    using namespace std;

    int main(void)
    {
        int ia[9] = {0, 1, 2, 3, 4, 8, 9, 3, 5};
        vector<int> ivec(ia, ia + 9);
        priority_queue<int> ipq(ivec.begin(), ivec.end());

        ipq.push(7);
        // ipq.push(...);  // argument lost in the source

        while (!ipq.empty())
        {
            cout << ipq.top() << " ";
            ipq.pop();
        }
        cout << endl;

        system("pause");  // Windows only: keeps the console window open
        return 0;
    }

4. Associative containers

The underlying data structure of both set and map is a red-black tree (RB-tree). In map, the value field of each tree node is a pair<key, value>; in set, the node stores the key itself.

Set

All elements of a set are automatically sorted by key. Unlike map elements, set elements do not have a separate value and key: the key of a set element is its value, and its value is its key. A set does not allow two identical elements. Set elements cannot be modified: in the set source code, set<T>::iterator is defined as the const_iterator of the underlying RB-tree, which rules out writes. In other words, a set iterator is a constant iterator (as opposed to a mutable iterator).

Test case (make set store its elements from large to small):

    #include <cstdlib>
    #include <iostream>
    #include <set>
    #include <functional>
    using namespace std;

    // set sorts from small to large by default; this comparator sorts from large to small
    template <typename T>
    struct greator
    {
        bool operator()(const T &x, const T &y) const
        {
            return x > y;
        }
    };

    int main(void)
    {
        set<int, greator<int> > iset;
        // iset.insert(...);  // argument lost in the source
        iset.insert(1);
        // iset.insert(...);  // argument lost in the source

        for (set<int, greator<int> >::const_iterator iter = iset.begin(); iter != iset.end(); iter++)
        {
            cout << *iter << " ";
        }
        cout << endl;

        system("pause");  // Windows only: keeps the console window open
        return 0;
    }

Map

All elements of a map are automatically sorted by key. Every element of a map is a pair that has both a value and a key: the first member of the pair is the key and the second is the value. A map does not allow two elements with the same key.

Can a map iterator be used to change an element's key? No, because the key determines where the element sits in the map's ordering, and changing keys arbitrarily would break the map's organization. Can it be used to change an element's value? Yes, because the value does not affect the ordering. Therefore, a map iterator is neither a constant iterator nor a mutable iterator.

Test case (make map store its elements from large to small):

    #include <cstdlib>
    #include <iostream>
    #include <string>
    #include <map>
    #include <functional>
    using namespace std;

    // map sorts from small to large by default; this comparator sorts from large to small
    template <typename T>
    struct greator
    {
        bool operator()(const T &x, const T &y) const
        {
            return x > y;
        }
    };

    int main(void)
    {
        map<int, string, greator<int> > imap;
        imap[3] = "333";
        imap[1] = "333";
        imap[2] = "333";

        for (map<int, string, greator<int> >::const_iterator iter = imap.begin(); iter != imap.end(); iter++)
        {
            cout << iter->first << ":" << iter->second << endl;
        }

        system("pause");  // Windows only: keeps the console window open
        return 0;
    }

Multiset/multimap

The features and usage of multiset are identical to those of set, except that it allows duplicate keys; its insert operation therefore uses the underlying RB-tree's insert_equal() rather than insert_unique().

The features and usage of multimap are identical to those of map, except that it allows duplicate keys; its insert operation therefore uses the underlying RB-tree's insert_equal() rather than insert_unique().

Hashtable (underlying data structure)

A binary search tree offers logarithmic average time, but that performance rests on the assumption that the input data is sufficiently random. A hashtable offers constant average time for insertion, deletion, and lookup; this performance is statistical in nature and does not depend on the randomness of the input elements.

The hashtable is implemented as a hash table with separate chaining.

The buckets of the hashtable are stored in a vector. To insert an element, find the bucket slot it belongs to, then traverse the linked list that the slot points to; if an equal element already exists, return without inserting, otherwise insert the element at the head of the list. (For the multi versions, duplicate elements may be inserted, and the process becomes: find the bucket slot, then traverse its linked list; if an equal element exists, insert the new node right after it; if not, create a new node and insert it at the head of the list.)

When the member function clear() is called, the buckets vector does not release its space and keeps its original size; only the linked lists attached to the buckets are deleted.

Figure: insertion into a hash_multimap

Hash_set

A set is used to search for elements quickly. Either RB-tree or hashtable can do this job as the underlying mechanism, but RB-tree sorts automatically while hashtable does not. Consequently, the elements of a set are automatically sorted and those of a hash_set are not.

Hash_map

hash_map uses hashtable as its underlying structure. Every operation in the hash_map interface is also provided by hashtable, so almost all hash_map operations simply forward to the corresponding hashtable operations. RB-tree sorts automatically while hashtable does not; consequently, the elements of a map are automatically sorted and those of a hash_map are not.

Hash_multiset/hash_multimap

The features of hash_multiset are identical to those of multiset, except that its underlying mechanism is hashtable, so the elements of a hash_multiset are not automatically sorted.

The features of hash_multimap are identical to those of multimap, except that its underlying mechanism is hashtable, so the elements of a hash_multimap are not automatically sorted.

Hash_multimap Test Cases:

#include <iostream> #include 

Result under VS2013 (Windows 7 64-bit):

Result under Kali 2.0 (the program needs an added using namespace __gnu_cxx):

The results show that different systems use different STL implementations, and that their hashtable collision handling differs as well.

References:

1. "STL Source Code Analysis"

2. http://blog.csdn.net/shudou/article/details/11099931

3. http://www.cplusplus.com/search.do?q=slist
