C++_benchmark_vector_list_deque

Source: Internet
Author: User
Tags: benchmark, Intel Core i7

Date: 2015-08-01 22:32:39

Author: Titer1 + Zhangyu
Source: www.drysaltery.com + CSDN blog sync
Contact: 13073161968 (SMS only)
License: this document is released under Creative Commons BY-NC-ND 3.0 (free to reprint, non-commercial, no derivatives, attribution required). Please credit the author and source when reprinting.


Translated from: the C++ Benchmark series.


Last week, I benchmarked std::vector against std::list under several different workloads.

That article received many comments and suggestions for improvement; this article is a further expansion of it.

In this post, we will compare the performance of std::vector, std::list, and std::deque under several different workloads and with data types of different sizes. Throughout this article, list, vector, and deque refer to the corresponding standard-library containers.

By common wisdom, a linked list should be used when random insertions and deletions dominate: for arrays (vector) and double-ended queues (deque) the complexity of these operations is O(n), while for a list it is O(1). If we only look at complexity, a linear search should cost the same in all of these data structures: O(n). When a random insert or removal is performed in a vector or a deque, part of the data may have to be moved, so elements are copied. That is why the size of the data type is an important factor when comparing these data structures: it drives the cost of copying elements.

However, in practice there is a huge difference: the use of the CPU cache. All the data in a vector is contiguous, while a list allocates memory separately for each element. We will see how that plays out in practice.

A double-ended queue is a data structure meant to combine the advantages of both without their drawbacks; we will see how it behaves in practice. Complexity analysis does not take the memory hierarchy into account, and I believe that in practice the memory hierarchy matters just as much as algorithmic complexity.

Keep in mind that all the tests here compare only the vector, the linked list (list), and the double-ended queue (deque), even if another data structure would perform better under a given workload.

In the graphs and the text below, n refers to the number of elements in the collection.

All tests were run on an Intel Core i7 Q820 @ 1.73GHz. The code was compiled with GCC in a 64-bit environment, with -O2 and -march=native. The code uses the C++11 standard.

For each curve, the vertical axis shows the time needed to run the operation, so smaller is better. The horizontal axis is always the number of elements in the collection. For some graphs, a logarithmic scale is clearer; a button below each figure switches its vertical axis to a logarithmic scale.

Data types of different sizes are obtained by holding an array of longs and varying the array's size. The non-trivial data type is made up of two longs, with a very dumb assignment operator and copy constructor that just do some math (entirely meaningless, but expensive). One could argue that a type with a non-trivial copy constructor and a non-trivial assignment operator like this is not realistic, but what matters here is that these operations are expensive, which is enough for this benchmark.

Fill operation

The first test populates the data structure by adding elements at the back of the container (using push_back). There are two vectors: vector_pre is a standard vector that calls vector::reserve up front, so it performs essentially a single allocation over the whole run.

First, let's look at the results of filling with a very small data type (8 bytes):

Filling the pre-allocated vector is the fastest. There is only a very small gap between vector and deque. The list is about 3 times slower than the other three.

Now let's consider a larger data type (4KB):

This time vector and list behave similarly, and deque is a bit faster than both. The pre-allocated vector is clearly the winner. The gap between deque and vector is likely caused by my system not being able to allocate that much contiguous memory at once.


Finally, we will try a non-trivial type:

Overall the results do not differ much; vector_pre is the fastest.
For push_back, a pre-allocated vector is a very good choice if the final size is known in advance; the other data structures show little difference.
I would still prefer the pre-allocated vector. If anyone finds the cause of these small gaps, please tell me; I am very interested.

Linear search

The next operation tested is search: the container is filled with random numbers between 0 and n, and then every number from 0 to n is looked up with a simple linear search using std::find.

In theory, all three data structures should have the same complexity for this operation.

8 bytes


Obviously, the list performs very poorly at search; its time grows much faster than that of vector and deque.


The only explanation is cache lines. When a piece of data is accessed, it is fetched from main memory into the cache.

And not just the requested data is fetched: the whole cache line containing it is fetched. Since the data in a vector is contiguous, when one element is accessed, its neighbors are automatically brought into the cache.

As main memory is an order of magnitude slower than the cache, this makes a huge difference.

In the case of a linked list, the processor spends all its time waiting for data to be fetched from main memory into the cache, and for each element it also fetches a cache line full of mostly useless data.

The deque is a bit slower than the vector, which is logical: because of its segmented structure, it causes more cache misses.

Let's try a larger data type:

The list is much slower than the other types, but interestingly, the gap between deque and vector is getting smaller. Next, let's try a data type of 4KB.

4096 bytes


The list still performs very poorly, but the gap with the others is getting smaller. The interesting point is that the deque is now faster than the vector.
I am not sure why; it is very likely specific to this particular data size.

One thing is certain: the larger the data type, the more cache lines the processor has to fetch, because fewer elements fit in each cache line.

For search, the linked list is markedly slower, while deque and vector perform similarly, with deque apparently faster than vector for large data types.

Random insert (with linear search)

[Figures: results for 8 bytes, 128 bytes, 4096 bytes, and the non-trivial 16-byte type]

Random remove

8 bytes: http://drysaltery.qiniudn.com/2015-08-02_002644.png
128 bytes: http://drysaltery.qiniudn.com/2015-08-02_002658.png
4096 bytes: http://drysaltery.qiniudn.com/2015-08-02_002710.png
Non-trivial 16 bytes: http://drysaltery.qiniudn.com/2015-08-02_002723.png

Front insertion (push_front)

8 bytes: http://drysaltery.qiniudn.com/2015-08-02_002823.png

Sort

8 bytes: http://drysaltery.qiniudn.com/2015-08-02_002859.png
128 bytes: http://drysaltery.qiniudn.com/2015-08-02_002908.png
1024 bytes (no 4096-byte results here): http://drysaltery.qiniudn.com/2015-08-02_002923.png
Non-trivial 16 bytes: http://drysaltery.qiniudn.com/2015-08-02_002945.png

Destruction

8 bytes: http://drysaltery.qiniudn.com/2015-08-02_003055.png
bytes: http://drysaltery.qiniudn.com/2015-08-02_003105.png
4096 bytes: http://drysaltery.qiniudn.com/2015-08-02_003121.png

...

This time, we can see that deque is three times slower than vector, and list is three orders of magnitude slower than vector! However, the performance differences shrink as the data type gets larger.

When testing with the non-trivial data type:

The performance of list and deque is not much different, and vector is still about twice as fast as both.

Although vector is always faster than list and deque here, the absolute cost of destruction is very small; it completes in microseconds.

This matters only for time-sensitive programs, and most programs are not that sensitive. Moreover, each data structure is destroyed only once, so destruction is not a very important operation.

Numeric operation (number crunching)

Finally, we will test number crunching. Here, random elements are inserted into a sorted container:

the insertion position is found by searching before inserting. This test uses 8-byte elements.


Even with only 100,000 elements, list is already an order of magnitude slower than the other two data structures, and the curve shows that the larger the data set, the worse list behaves.

The list's spatial locality is too poor, so it is not suitable for number crunching operations.

Conclusion

Finally, from the benchmarks above, we can draw the following facts:

    • std::list is very slow to traverse due to its very poor spatial locality.
    • std::vector and std::deque always perform better than std::list for small data types.
    • std::list handles large data types well.
    • std::deque performs better than std::vector for random insertion, especially for lots of push_front insertions.
    • std::deque and std::vector do not handle complex data types well, because the cost of copying and assignment is too high.

We can draw the ideal use cases for these data structures:

    • Number crunching: use std::vector or std::deque
    • Linear search: use std::vector or std::deque
    • Random insertions/removals:
      • Small data types: use std::vector
      • Large data types: use std::list, unless the workload is dominated by searches.
    • Complex data types (objects): use std::list, unless you need to search a lot.

      But when the container has to be modified many times, the list becomes very slow.

    • Insertion at the front: use std::deque or std::list

I confess that before writing this article, I did not know std::deque well.

However, std::deque is a very good data structure. Whether elements are inserted at the front, the back, or the middle of the queue, std::deque performs very well and has very good spatial locality. So even if it is sometimes slower than vector, it is generally reasonable to prefer deque over vector, especially when working with medium-sized data types.

If you have time, the best way to decide in practice is always to benchmark each version, and even to try other data structures.

Two operations with the same big-O complexity can have completely different performance in practice.

I hope this article has been of some help to you. If you have any comments or suggestions, please do not hesitate to leave a comment.

Source: https://github.com/wichtounet/articles/blob/master/src/vector_list/bench.cpp

