Java TreeMap Source Code Analysis

Source: Internet
Author: User
Tags comparable


This article introduces TreeMap, another important class in the Map family. You may have noticed that there are many articles about HashMap online but far fewer about TreeMap. There are two reasons: HashMap is used in more scenarios, and the data structure behind TreeMap is more complex than HashMap's. Without further ado, let's get to the point.

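Signature

TreeMap is declared as public class TreeMap<K,V> extends AbstractMap<K,V> implements NavigableMap<K,V>, Cloneable, java.io.Serializable. Compared with HashMap, TreeMap additionally implements the NavigableMap interface, and it is this interface that accounts for the difference between the two:

HashMap keys are unordered; TreeMap keys are ordered.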

Interface NavigableMap

First, look at NavigableMap's signature.

public interface NavigableMap<K,V> extends SortedMap<K,V>

NavigableMap extends SortedMap, so next look at SortedMap's signature.

SortedMap

public interface SortedMap<K,V> extends Map<K,V>

As its name suggests, a SortedMap is an ordered map. The order is normally the natural ordering of the keys, provided by the Comparable interface, or it can be determined by a Comparator supplied when the SortedMap instance is created. The key order shows up when we iterate over a SortedMap through its collection views (as with HashMap, these are returned by the entrySet, keySet, and values methods). A brief digression on the difference between Comparable and Comparator:

    • Comparable generally represents the natural ordering of a class. For example, when a Student class is defined, its compareTo method gives the default ordering.

    • Comparator generally represents a special-purpose ordering needed in a particular situation, which requires a custom sort. For example, sorting Student objects by age.
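The difference can be sketched with a small example. The Student class below is hypothetical, and choosing the name as its natural ordering is an assumption made only for illustration; the point is that a TreeMap uses compareTo when no comparator is given and the supplied Comparator otherwise.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical Student class, used only for illustration.
class Student implements Comparable<Student> {
    final String name;
    final int age;

    Student(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Natural ordering (an assumption for this sketch): by name.
    @Override
    public int compareTo(Student other) {
        return name.compareTo(other.name);
    }
}

class OrderingDemo {
    // Inserts three students and returns their names in the map's iteration order.
    static List<String> namesInOrder(TreeMap<Student, String> map) {
        map.put(new Student("Carol", 19), "c");
        map.put(new Student("Alice", 22), "a");
        map.put(new Student("Bob", 20), "b");
        List<String> names = new ArrayList<>();
        for (Student s : map.keySet()) {
            names.add(s.name);
        }
        return names;
    }

    // No comparator: keys are ordered by Student.compareTo (by name).
    static List<String> byNaturalOrder() {
        return namesInOrder(new TreeMap<Student, String>());
    }

    // Explicit comparator: keys are ordered by age instead.
    static List<String> byAge() {
        Comparator<Student> byAge = Comparator.comparingInt(s -> s.age);
        return namesInOrder(new TreeMap<Student, String>(byAge));
    }

    public static void main(String[] args) {
        System.out.println(byNaturalOrder()); // [Alice, Bob, Carol]
        System.out.println(byAge());          // [Carol, Bob, Alice]
    }
}
```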

The class of any key inserted into a SortedMap must implement the Comparable interface (or a Comparator must be specified) so that two keys can be compared, via k1.compareTo(k2) or comparator.compare(k1, k2). Otherwise a ClassCastException is thrown on insertion. In addition, the key ordering in a SortedMap should be consistent with the equals method: k1.compareTo(k2) == 0 (or comparator.compare(k1, k2) == 0) should hold exactly when k1.equals(k2) is true.

With SortedMap covered, let's return to NavigableMap. NavigableMap was added in JDK 1.6. On top of SortedMap, it adds a number of navigation methods that return the entry closest to a search target. For example:

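    • lowerEntry, returns the entry with the greatest key strictly less than the given key, or null if there is none

    • floorEntry, returns the entry with the greatest key less than or equal to the given key, or null if there is none

    • ceilingEntry, returns the entry with the least key greater than or equal to the given key, or null if there is none

    • higherEntry, returns the entry with the least key strictly greater than the given key, or null if there is none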
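A minimal sketch of these navigation methods on a TreeMap with Integer keys, together with the ClassCastException mentioned earlier for keys that cannot be compared. The class name, method names, and sample data are illustrative only.

```java
import java.util.TreeMap;

class NavigationDemo {
    // Sample map with keys 10, 20, 30.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(10, "ten");
        map.put(20, "twenty");
        map.put(30, "thirty");
        return map;
    }

    // Returns true if inserting keys that are not Comparable (with no
    // Comparator supplied) fails with a ClassCastException.
    static boolean rejectsNonComparableKeys() {
        TreeMap<Object, String> map = new TreeMap<>();
        try {
            map.put(new Object(), "first");
            map.put(new Object(), "second");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.lowerEntry(20));   // 10=ten (strictly less than 20)
        System.out.println(map.floorEntry(20));   // 20=twenty (less than or equal to 20)
        System.out.println(map.ceilingEntry(25)); // 30=thirty (greater than or equal to 25)
        System.out.println(map.higherEntry(30));  // null (nothing greater than 30)
        System.out.println(rejectsNonComparableKeys()); // true
    }
}
```

Each method returns a single entry (or null), not a collection of entries.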

Design concept: the red-black tree

TreeMap is implemented with a red-black tree, and a red-black tree is a kind of binary search tree, so let's first recall some properties of binary search trees.

Binary search tree

Let's first look at a binary search tree (BST).

Figure: binary search tree

I believe you are familiar with this kind of diagram. The key point is:

Every value in the left subtree is less than the root node, and every value in the right subtree is greater than the root node.
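The ordering property above can be sketched as a minimal, unbalanced binary search tree. This is only an illustration with hypothetical names, not TreeMap's internal code; each step of the search discards one whole subtree.

```java
class BstDemo {
    // Minimal node type for illustration; not TreeMap's internal Entry class.
    static final class Node {
        final int key;
        Node left, right;

        Node(int key) {
            this.key = key;
        }
    }

    // Inserts key and returns the (possibly new) subtree root. Duplicates are ignored.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Builds a tree by inserting the keys in the given order.
    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    // Returns the number of nodes visited while searching, or -1 if key is absent.
    static int stepsToFind(Node root, int key) {
        int steps = 0;
        Node cur = root;
        while (cur != null) {
            steps++;
            if (key == cur.key) return steps;
            // Smaller keys can only be on the left, larger keys only on the right.
            cur = key < cur.key ? cur.left : cur.right;
        }
        return -1;
    }

    public static void main(String[] args) {
        Node root = build(8, 3, 10, 1, 6, 14, 4, 7, 13);
        System.out.println(stepsToFind(root, 8)); // 1
        System.out.println(stepsToFind(root, 7)); // 4 (8 -> 3 -> 6 -> 7)
        System.out.println(stepsToFind(root, 5)); // -1 (not present)
    }
}
```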

The advantage of a binary search tree is that each comparison cuts the problem size in half. So if the binary search tree is balanced, the time complexity of finding an element is proportional to the height of the tree, log(n). This raises a natural question: if a binary search tree cuts the problem size by half, wouldn't a ternary search tree cut it by two-thirds, and wouldn't that be better? By the same reasoning we could have 4-ary search trees, 5-ary search trees, and so on. In the general case:

For n elements, what value of k makes a k-ary search tree most efficient? Is it k = 2?

K-ary search tree

If you follow the analysis above, it is easy to fall into a trap:

A ternary search tree does cut the problem size by two-thirds, but it needs two comparisons to do so (a binary search tree cuts the problem size by half with only one comparison).

We cannot ignore those two comparisons. In the general case:

For n elements, a k-ary search tree needs roughly k*log_k(n) comparisons on average, where log_k is the logarithm to base k.

In the extreme case k = n, the k-ary tree degenerates into a linear list and the complexity is O(n). Stated mathematically, the question is:

For a fixed n, what value of k minimizes k*log_k(n)?

By the change-of-base rule for logarithms, k*log_k(n) can be rewritten as ln(n)*k/ln(k). Since ln(n) is a constant, this is equivalent to minimizing k/ln(k). This is simple for anyone who has taken first-year calculus: the derivative of k/ln(k) is (ln(k) - 1)/ln(k)^2, which is zero when ln(k) = 1. So the result is:

k/ln(k) takes its minimum value when k = e.

The constant e is about 2.718, so a binary tree looks like it is close to the optimum. Evaluating k/ln(k) for small integers (for example in the Node.js REPL) gives about 2.885 for k = 2 and about 2.731 for k = 3.
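A small Java sketch of the same calculation (Java is used to match the rest of this article; the class and method names are illustrative):

```java
class BranchingCost {
    // Cost factor k / ln(k) from the estimate above.
    static double cost(int k) {
        return k / Math.log(k);
    }

    public static void main(String[] args) {
        for (int k = 2; k <= 5; k++) {
            System.out.printf("k=%d  k/ln(k)=%.4f%n", k, cost(k));
        }
    }
}
```

Among integers, k = 3 gives the smallest value, and k = 2 and k = 4 give the same value, since 4/ln(4) = 4/(2*ln(2)) = 2/ln(2).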

So the result for k = 3 appears to be smaller than for k = 2, which suggests a ternary search tree should beat a binary search tree. Why, then, are binary trees more popular? I later found an answer on StackOverflow. The gist is:

Modern CPUs can optimize code built on two-way (binary) decisions, whereas a three-way decision gets decomposed into multiple two-way decisions.

That probably explains why binary trees are so popular: a single comparison is enough to cut the problem size in half. That was a bit of a digression, so let's get back to red-black trees.

Red-black tree properties

First, look at a red-black tree:

Figure: red-black tree example

The figure is from Wikipedia. One point needs explaining:

The leaf nodes are the nil nodes. Some textbooks in China leave out these nil nodes, and we sometimes omit them when drawing the tree, but it must be clear that when we say "leaf node" we mean these nil nodes.

A red-black tree keeps itself balanced through the following 5 rules:

    1. Every node is either red or black.

    2. The root node is black.

    3. Every leaf (nil) node is black.

    4. The children of a red node must be black.

    5. For any node, every path from that node down to a descendant leaf contains the same number of black nodes.

When these 5 conditions are satisfied, the following is guaranteed: the longest path from the root to a leaf is no more than twice as long as the shortest path from the root to a leaf. This is easy to understand and follows mainly from rules 4 and 5. Briefly:

Suppose the shortest path from the root to a leaf contains B black nodes. By rule 5, the longest path from the root to a leaf also contains B black nodes. By rule 4, a red node cannot follow another red node, so the longest possible path alternates red and black nodes and contains at most B red nodes. The longest path therefore has at most 2B nodes, while the shortest has at least B, which proves the conclusion.
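The five rules and the resulting path bound can be checked mechanically. The sketch below is a hypothetical validator written for this article, not code taken from TreeMap. It represents nil leaves as null, models the color as a boolean (which covers rule 1 by construction), and does not check the binary-search ordering of the keys.

```java
class RedBlackCheck {
    static final boolean RED = false;
    static final boolean BLACK = true;

    // Minimal node type for illustration; null stands for a black nil leaf.
    static final class Node {
        final int key;
        final boolean color;
        final Node left, right;

        Node(int key, boolean color, Node left, Node right) {
            this.key = key;
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.color == RED;
    }

    // Returns the black height of the subtree (nil leaves count as one black
    // node, rule 3), or -1 if rule 4 or rule 5 is violated somewhere below.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        if (isRed(n) && (isRed(n.left) || isRed(n.right))) return -1; // rule 4
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l == -1 || r == -1 || l != r) return -1;                  // rule 5
        return l + (n.color == BLACK ? 1 : 0);
    }

    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) != -1;               // rule 2
    }

    // Number of nodes on the longest and shortest root-to-nil paths.
    static int longest(Node n) {
        return n == null ? 0 : 1 + Math.max(longest(n.left), longest(n.right));
    }

    static int shortest(Node n) {
        return n == null ? 0 : 1 + Math.min(shortest(n.left), shortest(n.right));
    }

    // A valid tree: black 10, black 5 on the left, red 20 with black children on the right.
    static Node validSample() {
        Node right = new Node(20, RED,
                new Node(15, BLACK, null, null), new Node(30, BLACK, null, null));
        return new Node(10, BLACK, new Node(5, BLACK, null, null), right);
    }

    // An invalid tree: red 20 has a red child 30 (violates rule 4).
    static Node redRedSample() {
        Node right = new Node(20, RED, null, new Node(30, RED, null, null));
        return new Node(10, BLACK, null, right);
    }

    public static void main(String[] args) {
        Node ok = validSample();
        System.out.println(isValid(ok));                      // true
        System.out.println(longest(ok) + " " + shortest(ok)); // 3 2
        System.out.println(isValid(redRedSample()));          // false
    }
}
```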

Red-black tree operations

Figure: red-black tree rotation example (nil nodes not drawn)

Insertion, deletion, left rotation, and right rotation are best understood visually, since describing them in words is cumbersome. I won't attempt it here; plenty of material is available online, such as V_july_v's "Teach you a thorough understanding of the red and black tree". I also recommend an SWF teaching video (it is in English, but don't worry, the point is to watch the pictures), about 7 minutes long, for reference. There is also an interactive red-black tree visualization web page where you can insert and delete a few nodes yourself and watch how the left and right rotations play out.
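For readers who prefer code to pictures, here is a minimal sketch of a left rotation in the style of the textbook pseudocode. The Node and Tree types are hypothetical and colors are omitted; this is not TreeMap's actual implementation. A rotation rearranges three links but leaves the in-order sequence of keys unchanged.

```java
class RotationDemo {
    // Minimal node with a parent link; colors are omitted for brevity.
    static final class Node {
        final int key;
        Node left, right, parent;

        Node(int key) {
            this.key = key;
        }
    }

    static final class Tree {
        Node root;
    }

    // Rotates left around x. Assumes x.right is not null.
    static void rotateLeft(Tree t, Node x) {
        Node y = x.right;
        x.right = y.left;                 // y's left subtree becomes x's right subtree
        if (y.left != null) y.left.parent = x;
        y.parent = x.parent;              // y takes x's place under x's parent
        if (x.parent == null) t.root = y;
        else if (x == x.parent.left) x.parent.left = y;
        else x.parent.right = y;
        y.left = x;                       // x becomes y's left child
        x.parent = y;
    }

    // Keys in sorted (in-order) sequence, separated by spaces.
    static String inOrder(Node n) {
        if (n == null) return "";
        String l = inOrder(n.left);
        String r = inOrder(n.right);
        return (l.isEmpty() ? "" : l + " ") + n.key + (r.isEmpty() ? "" : " " + r);
    }

    // Root 10 with left child 5 and right child 20; 20 has children 15 and 30.
    static Tree sample() {
        Node x = new Node(10), a = new Node(5), y = new Node(20);
        Node b = new Node(15), c = new Node(30);
        x.left = a;  a.parent = x;
        x.right = y; y.parent = x;
        y.left = b;  b.parent = y;
        y.right = c; c.parent = y;
        Tree t = new Tree();
        t.root = x;
        return t;
    }

    public static void main(String[] args) {
        Tree t = sample();
        System.out.println(inOrder(t.root)); // 5 10 15 20 30
        rotateLeft(t, t.root);
        System.out.println(t.root.key);      // 20
        System.out.println(inOrder(t.root)); // 5 10 15 20 30
    }
}
```

A right rotation is the mirror image, with left and right swapped.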

Source analysis

Since I am not covering the red-black tree operations here, there is not much left to say about the source code itself, because the important algorithms all come from CLR. Here CLR refers to Cormen, Leiserson, and Rivest, the authors of Introduction to Algorithms. In other words, the algorithms inside TreeMap follow the pseudocode in Introduction to Algorithms. Because a red-black tree is a balanced binary search tree, put (including updates), get, and remove all run in O(log(n)) time.

Summary

So far, the implementations of both TreeMap and HashMap have been introduced. The differences in their implementations determine their different application scenarios:

    • TreeMap keys are ordered. Insert, delete, update, and lookup operations take O(log(n)) time. To keep the red-black tree balanced, rotations are performed when necessary.

    • HashMap keys are unordered. Insert, delete, update, and lookup operations take O(1) time on average. To support dynamic growth, the table is resized when necessary.
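The ordering difference summarized above can be seen with a short sketch. The class name and sample keys are illustrative. Only the TreeMap order is guaranteed; the HashMap iteration order is unspecified, so no particular output should be relied on for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapComparisonDemo {
    // Puts the keys into the given map and returns them in iteration order.
    static List<String> keysInIterationOrder(Map<String, Integer> map, String... keys) {
        for (String k : keys) {
            map.put(k, k.length());
        }
        return new ArrayList<>(map.keySet());
    }

    // TreeMap: always the sorted order of the keys.
    static List<String> sortedKeys(String... keys) {
        return keysInIterationOrder(new TreeMap<String, Integer>(), keys);
    }

    public static void main(String[] args) {
        String[] keys = {"pear", "apple", "fig"};
        System.out.println(sortedKeys(keys)); // [apple, fig, pear]
        // HashMap: contains the same keys, but the order printed is unspecified.
        System.out.println(keysInIterationOrder(new HashMap<String, Integer>(), keys));
    }
}
```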

Also, since I did not walk through the actual code here, the title is admittedly somewhat misleading. Please forgive me; I will come back and fill in this gap once my understanding is deeper.
