Notes compiled from material found on the Internet.
Source: https://www.zhihu.com/question/30527705
AVL tree: one of the earliest balanced binary trees. It has relatively few applications compared with other data structures; Windows uses an AVL tree to manage the process address space.
Red-black tree: a balanced binary tree, widely used in the C++ STL, where both map and set are implemented with red-black trees. The familiar STL map container is backed by an rbtree; this does not apply to unordered_map, which is a hash table.
B/B+ tree: used for on-disk file organization, data indexes and database indexes.
Trie (dictionary tree): used for counting and sorting large numbers of strings.
------
AVL is a strictly height-balanced binary tree, so the usual result is that the cost of maintaining this degree of balance outweighs the efficiency gained from it, and it is therefore not used much in practice.
Red-black trees, which pursue local rather than strict global balance, are used in more places. Of course, if insertions and deletions are infrequent and only lookups matter, AVL is better than red-black.
Red-black trees have many applications. Besides the STL mentioned above:
epoll's kernel implementation uses a red-black tree to manage event blocks;
Nginx uses a red-black tree to manage timers, among other things;
Java's TreeMap is implemented with one;
the Completely Fair Scheduler, the well-known Linux process scheduler, uses a red-black tree to manage process control blocks.
B and B+ trees are used for indexing in file systems and databases; a typical example is the B-tree index in MySQL.
A typical application of the trie is prefix matching. A very common scenario: as we type, a search engine offers suggestions. IP routing is also prefix matching and uses tries to some extent.
------
Skip list: Redis uses a skip list instead of a red-black tree to store the elements of its sorted sets (other value types use different data structures). First, to be clear, the "jump table" here is the skiplist, not the ziplist. The ziplist in Redis is a very memory-efficient linked list (at the cost of slightly lower performance), so when a hash has very few elements (for example, only a few dozen), storing it in this structure saves a lot of memory with little performance loss (Redis is an in-memory database, so it saves memory wherever it can).
With the terminology settled, the real question is how to choose the right data structure on the server side, given concurrency and performance requirements (here, skip list versus red-black tree). In a plain performance comparison the two differ little, but under concurrency the picture changes.
When data is updated, a skip list needs to modify fewer parts, so fewer things have to be locked and the cost of contention between threads is relatively small.
A red-black tree has a rebalancing step that can involve many nodes, so contention is more expensive and performance is worse. Put another way, the rebalancing a red-black tree may need on insert and delete can touch other parts of the tree,
whereas a skiplist operation is clearly more local and needs to lock fewer nodes, so it performs better in such situations.
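To make the locality argument concrete, here is a minimal single-threaded skip list sketch. It is not Redis's implementation; the class names, `MAX_LEVEL`, `P` and `range_query` are assumptions chosen for illustration. The point to notice is that `insert` only rewires the predecessor pointers collected in `update` and never rebalances anything else.

```python
import random


class SkipNode:
    def __init__(self, key, level):
        self.key = key
        # forward[i] is the next node on level i (level 0 is the full linked list)
        self.forward = [None] * level


class SkipList:
    MAX_LEVEL = 16  # assumed cap on tower height
    P = 0.5         # assumed promotion probability; a lower P means fewer pointers

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.head = SkipNode(None, self.MAX_LEVEL)  # sentinel, its key is never compared
        self.level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < self.P:
            level += 1
        return level

    def _find_predecessors(self, key):
        # update[i] ends up as the last node on level i whose key is < key
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key):
        update = self._find_predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            return False  # key already present
        level = self._random_level()
        self.level = max(self.level, level)
        node = SkipNode(key, level)
        # Only the predecessors on the new node's own levels are touched.
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
        return True

    def contains(self, key):
        nxt = self._find_predecessors(key)[0].forward[0]
        return nxt is not None and nxt.key == key

    def range_query(self, lo, hi):
        # Locate lo via the upper levels, then walk level 0 as a plain linked list.
        node = self._find_predecessors(lo)[0].forward[0]
        out = []
        while node is not None and node.key <= hi:
            out.append(node.key)
            node = node.forward[0]
        return out
```

A concurrent version would only need to protect the nodes in `update`, which is the property the paragraph above relies on; this sketch itself takes no locks and is not thread-safe.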
In addition, the Redis author has described his reasons for using a skip list:
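Here is what the developer said about why he chose the skip list:
There are a few reasons:
1) They are not very memory intensive. It's up to you basically. Changing parameters about the probability of a node to have a given number of levels will make them less memory intensive than btrees.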
Note: one drawback often attributed to the skip list is its memory consumption (because a node is repeated across several levels), but the author says the parameters can be tuned to reduce memory use to roughly that of balanced tree structures.
2) A sorted set is often target of many ZRANGE or ZREVRANGE operations, that is, traversing the skip list as a linked list.
With this operation the cache locality of skip lists is at least as good as with other kind of balanced trees.
Note: Redis frequently performs range queries, and these are convenient with the doubly linked list inside the skip list. Its cache locality is also no worse than that of a balanced tree.
3) They are simpler to implement, debug, and so forth. For instance thanks to the skip list simplicity I received a patch
(already in Redis master) with augmented skip lists implementing ZRANK in O(log(N)). It required little changes to the code.
Note: simple to implement. The ZRANK operation can be done in O(log(N)).
About the Append Only durability & speed, I don't think it is a good idea to optimize Redis at cost of more code
and more complexity for a use case that IMHO should be rare for the Redis target (fsync() at every command).
Almost no one is using this feature even with ACID SQL databases, as the performance hit is big anyway.
About threads: our experience shows that Redis is mostly I/O bound. I'm using threads to serve things from Virtual Memory.
The long term solution to exploit all the cores, assuming your link is so fast that you can saturate a single core,
is running multiple instances of Redis (no locks, almost fully scalable linearly with number of cores),
and using the "Redis Cluster" solution that I plan to develop in the future.
The quotation above contains some English abbreviations. These and a few related ones are collected here:
IMHO, IMO (in my humble opinion, in my opinion): "as I see it"; common on forums.
IDK (I don't know): I don't know.
ROFL (rolling on the floor laughing): laughing so hard you fall on the floor.
ROFLMAO (rolling on the floor laughing my ass off): an intensified form of ROFL, meaning extremely funny.
STH (something): something.
NTH (nothing): nothing.
PLZ (please): please. The word ends in a "z" sound, hence the abbreviation plz.
THX (thanks): thank you. By pronunciation, the "ks" at the end of "thanks" can be replaced with the letter x.
Red-black tree versus B(+) tree from an engineering implementation perspective:
Some answers analyze this from the algorithmic angle; I will try to analyze the applications of red-black trees and B+ trees from the engineering angle. A red-black tree node stores a single key-value pair, so the tree can be implemented intrusively, in the same way as an embedded (intrusive) linked list.
The data structure itself does not manage memory, so it is relatively lightweight, more flexible to use and saves memory; for example, one node can belong to several trees or lists at the same time. This is fairly common in the kernel.
A B+ tree node, by contrast, stores many key-value pairs, so node memory is generally managed by the data structure itself; it is a container in the true sense. Compared with an intrusively implemented red-black tree,
its advantages are that it is simple to use, that managing its own memory makes a lock-free implementation easier, and that storing many KV pairs in one node gives a higher CPU cache hit rate. This is why user-space high-concurrency indexes generally choose a B+ tree.
As for B-tree versus B+ tree: a B-tree's internal nodes store values in addition to keys, so with the same fan-out its nodes are larger and its CPU cache hit rate is relatively worse than a B+ tree's.
In addition, the scan feature of the B+ tree (the linked list of leaf nodes) is hard to make lock-free (I have not yet seen a solution), so in the lock-free B+ trees I have seen so far the leaf nodes are not linked together.
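The scan feature mentioned above can be sketched as follows. This models only the leaf level of a B+ tree: the flat `first_keys` list stands in for the internal nodes, keys are assumed unique, and `Leaf`, `build_leaves` and `range_scan` are hypothetical names, not taken from any real engine.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    def __init__(self, keys, values):
        self.keys = keys      # sorted keys stored in this leaf
        self.values = values  # values[i] belongs to keys[i]
        self.next = None      # link to the right sibling leaf


def build_leaves(items, capacity=4):
    """Pack (key, value) pairs into linked leaves; return (leaves, first_keys)."""
    items = sorted(items)
    leaves = []
    for i in range(0, len(items), capacity):
        chunk = items[i:i + capacity]
        leaf = Leaf([k for k, _ in chunk], [v for _, v in chunk])
        if leaves:
            leaves[-1].next = leaf
        leaves.append(leaf)
    first_keys = [leaf.keys[0] for leaf in leaves]
    return leaves, first_keys


def range_scan(leaves, first_keys, lo, hi):
    """Return all (key, value) pairs with lo <= key <= hi."""
    if not leaves:
        return []
    # "Index lookup": the last leaf whose first key is <= lo (or the first leaf).
    idx = max(bisect_right(first_keys, lo) - 1, 0)
    leaf = leaves[idx]
    out = []
    # From here on the index is not consulted again; only sibling links are followed.
    while leaf is not None:
        start = bisect_left(leaf.keys, lo)
        for k, v in zip(leaf.keys[start:], leaf.values[start:]):
            if k > hi:
                return out
            out.append((k, v))
        leaf = leaf.next
    return out
```

The `next` pointers are what make the scan cheap, and they are also the shared state that a lock-free design would have to keep consistent when a leaf splits or merges.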
The application scenarios of each data structure, analyzed from its characteristics:
Red-black trees and AVL trees are, simply put, used for searching.
AVL tree: a balanced binary tree, generally checked through the balance factor and maintained through rotations. The heights of the left and right subtrees differ by no more than 1, so compared with a red-black tree it is a strictly balanced binary tree with a very strict balance condition (a height difference of at most 1).
Whenever an insertion or deletion violates this condition, the tree is rotated to restore balance. Because this rebalancing makes updates more expensive, we can infer that an AVL tree suits cases with few insertions and deletions but many lookups.
Red-black tree: a balanced binary tree that constrains the color of each node on every simple path from the root to a leaf, ensuring that no path is more than twice as long as any other, so the tree is approximately balanced.
As a result, compared with the AVL tree and its strict balance requirement, a red-black tree needs fewer rotations to maintain balance. For searching with many insertions and deletions, we use a red-black tree instead of an AVL tree. (Some scenarios now use a skip list to replace the red-black tree; search for "Why does Redis use a skip list (skiplist) instead of a red-black tree?")
B-tree, B+ tree: they share the same characteristic of being multi-way search trees, and they are generally used in database systems. Why? Because they have many branches and few levels.
Disk I/O is very time-consuming, and large amounts of data are stored on disk, so we have to reduce the number of disk I/O operations effectively and avoid frequent disk lookups. The B+ tree is a variant of the B-tree: a node with n subtrees contains n keys, the keys in internal nodes do not hold data and serve only as an index, and all data is stored in the leaf nodes. It was designed for file systems.
Trie: also known as a word search tree, a tree structure commonly used for string operations. Different strings that share a prefix store that prefix only once. Compared with storing the strings directly this saves space, but storing a very large number of strings can still consume a lot of memory.
Related structures include the prefix tree, the suffix tree, the radix tree (Patricia tree, compact prefix tree), the crit-bit tree (which address the memory consumption problem),
and the double-array trie. A brief summary of the applications as I understand them. Prefix tree: fast string retrieval, string sorting, longest common prefix, automatic completion (showing the suffixes that match a typed prefix). Suffix tree: finding string S1 in S2, counting the occurrences of S1 in S2, the longest common substring of S1 and S2, the longest palindromic substring. Radix tree: Linux kernel, Nginx.
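The prefix-tree applications listed above (shared prefixes stored once, prefix lookup, completion) can be illustrated with a minimal sketch. The names `TrieNode`, `Trie` and `starts_with` are assumptions for this example, and the dictionary-based children are chosen for brevity rather than memory efficiency.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps one character to the child node
        self.is_word = False  # True if a stored string ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def starts_with(self, prefix):
        """Return all stored strings that begin with prefix, in sorted order."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        out = []
        stack = [(node, prefix)]
        # Depth-first walk of the subtree below the prefix node.
        while stack:
            cur, text = stack.pop()
            if cur.is_word:
                out.append(text)
            for ch, child in cur.children.items():
                stack.append((child, text + ch))
        return sorted(out)
```

For example, after inserting "red", "redis" and "reduce", the three strings share the nodes for "r", "e" and "d", and `starts_with("red")` returns all three.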
For an introduction to red-black trees, see these two articles: "The clearest red-black tree explanation ever" (part 1) + (part 2)
http://mt.sohu.com/20161014/n470317653.shtml
http://mt.sohu.com/20161018/n470610910.shtml
When the structure of the search tree changes, the red-black tree conditions may be violated, and the tree has to be adjusted so that it satisfies them again. Adjustments fall into two categories: color adjustment, which changes the color of a node, and structural adjustment, which changes the parent-child relationships in the search tree. Structural adjustment consists of two basic operations: left rotation (rotate left) and right rotation (rotate right). Remember: no matter how many cases there are, there are only two concrete adjustments: 1. change the color of some nodes; 2. rotate some nodes.
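The two rotations can be sketched on a plain binary search tree node. This is only the structural half of the adjustment: colors and parent pointers are omitted, and `Node`, `rotate_left`, `rotate_right` and `inorder` are names assumed for illustration. Each function returns the new root of the rotated subtree, which the caller must re-attach.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(x):
    """Lift x's right child above x; return the new subtree root."""
    y = x.right
    x.right = y.left  # y's left subtree holds keys between x and y
    y.left = x
    return y


def rotate_right(y):
    """Lift y's left child above y; return the new subtree root."""
    x = y.left
    y.left = x.right  # x's right subtree holds keys between x and y
    x.right = y
    return x


def inorder(node):
    """In-order key list, used to check that a rotation keeps the ordering."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Rotating the right-leaning chain 1 -> 2 -> 3 to the left makes 2 the root with children 1 and 3. The in-order sequence is unchanged, which is why rotations are safe to use for rebalancing.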
Another article analyzes various string-related algorithms and the data structures they use, including prefix trees, suffix trees and other kinds of trees.
Red-black tree, B(+) tree, skip list, AVL and other data structures: application scenarios and analysis, plus some English abbreviations