This article is original. When reprinting, please credit: http://www.cnblogs.com/gistao/background
The previous article analyzed the source code in detail, but said nothing about the design ideas behind it. Those ideas are often the most important part: without them, you can hardly optimize the code or even use it correctly. Inferring the design rationale from the finished result is difficult, and it is easy to get it wrong, so here I will "stumble" through my own understanding, and also point out a "problem" I found in the source.
Simplicity
Debugging a multithreaded program is a headache, and writing a correct concurrent data structure with atomics is harder still. When something goes wrong, it is usually not a bug you can reproduce on demand; often all you can do is wait for it to happen again and read the logs. So simplicity should come first in the design.
AtomicHashMap's key only supports int. Why doesn't it support custom key types the way TBB's concurrent_hash_map does? It would be entirely possible to treat the existing int key as a pure state machine and add a field to store the custom key. I think the answer is simplicity: the user can simply run a hash algorithm over the custom key to convert it into an int. This also saves a pointer's worth of space per entry, and it is simple enough.
The 80/20 Rule
The 80/20 rule here means that 80% of CPU time is spent executing 20% of the code. Rehash is an essential feature of a hashmap, but it clearly does not fall within that 20% of code. So AtomicHashMap does not support traditional rehashing: partly because atomics make it hard, and partly because rehashing is complex enough and slow enough that it is not worth it. Facebook's engineers chose to make the 80% of CPU time fast enough, and to accept somewhat lower efficiency for the remaining 20%.
AtomicHashMap's "rehash" resembles a deque's growth strategy: when a map fills up, a new submap is appended rather than the existing one being rehashed. Two conclusions follow:
- When capacity is not full, the 80% case, lookups are still O(1).
- When capacity is full, the 20% case, lookups become O(2), O(3), O(4), ..., one extra step per additional submap.
Simplicity + the 80/20 Rule
AtomicHashMap resolves conflicts with linear probing, where one key's collision can affect the insertion of other keys; chaining (the "zipper" method) does not have this problem. Why did Facebook's engineers choose probing anyway? I think, first, a collision is itself a 20% event, so the efficiency of that code path is acceptable. Second, a lock-free chain is in effect a multi-producer multi-consumer queue (see Boost's lock-free implementation), which is far more complex than linear probing. And linear probing has the advantage of good cache locality.
O(1) Turns into O(N)
Consider a scenario:
- Step 1: 100 concurrent threads each perform random insertions; AtomicHashMap's size is set to 100,000, and about 140,000 entries are inserted in total.
- Step 2: 2 concurrent threads query keys that do not exist.
- Step 3: CPU idle drops to 0.
How to Solve It
From the source analysis in the previous article, we can identify three suspects:
- The queried key exists but collided. Best case, the lookup traverses one or a few slots (depending on the hash algorithm and the size) until it finds the element or hits an empty slot, which ends the lookup. Worst case, when every slot in the map is occupied, it traverses the entire table. That is unlikely, of course, and depends on the load factor and insert concurrency.
- The queried key was never inserted. Best case, the first empty slot encountered ends the lookup; worst case, again a full traversal.
- The queried key was erased. Same as the case above.
The conclusion is that the distribution of elements across the slots matters a great deal. There are three possible distributions:
- Every slot is used; there are no empty slots.
- The used slots are clustered together, and so, correspondingly, are the empty slots.
- Used and empty slots are interleaved/evenly distributed.
The latter two distributions depend mainly on the hash algorithm; a good hash ensures that a change in any input bit is reflected in the output. The test scenario uses the common murmurhash, whose distribution is quite good, and in practice those two cases do not occur. So the problem can only be the first distribution: no empty slots.
```cpp
insertInternal(KeyT key_in, T&& value) {
  ...
  if (isFull_.load(std::memory_order_acquire)) {
    return false;              // full: no further inserts into this map
  }
  ++numEntries_;               // count the insertion
  if (numEntries_.readFast() >= maxEntries_) {
    isFull_.store(true, std::memory_order_relaxed);  // mark the map full
  }
  ...
}
```
This is AtomicHashMap's insert logic: once the map is full, inserts are rejected. Shouldn't that guarantee there are always empty slots left? No.
If maxEntries_ is 100,000 and the load factor is 0.8, then the slot capacity is 125,000 (100,000 / 0.8), which leaves a capacity of 25,000 empty slots (125,000 − 100,000).
So why can the number of empty slots still drop to 0?
numEntries_ has the type ThreadCachedInt. The general idea of this class is that you configure a cacheSize; every thread accessing the numEntries_ object increments a thread-local counter, and those increments are not synchronized to the other threads (that is, made visible to readFast) until the local count reaches cacheSize.
For example, with a cacheSize of 1000 and 100 threads, readFast can in theory lag behind by up to 100 × 1000 = 100,000.
100,000 is far more than the 25,000 empty slots, so it is entirely possible for the map to be genuinely full (zero empty slots) while isFull_ is still false.
Faced with this problem, the practical fix is to increase the capacity, leaving enough headroom to absorb the counter's lag.
folly::AtomicHashMap Source Analysis (II)