An in-depth look at Java collection classes: TreeMap and TreeSet

Last Update:2018-07-27 Source: Internet

Author: User


Reprint Address: http://shmilyaw-hotmail-com.iteye.com/blog/1836431

Brief Introduction

TreeMap and TreeSet are among the more complex data structures in the Java collection classes. In an ordinary HashMap, operations on an element generally take O(1) time, whereas in TreeMap they take O(log n). TreeMap has no advantage in raw access speed, but its elements are kept sorted by key, which pays off when searching for elements by order and when producing elements in sequence. In that sense TreeMap can be thought of as a HashMap whose entries are kept sorted. This article describes how TreeMap is implemented and the red-black tree it is built on, and walks through the relevant JDK source in detail.

The relationship between TreeMap and TreeSet

As in the previous article, a map and a set come as a pair, and the set is little more than a thin wrapper around the map. When we looked at the implementation of HashMap and HashSet, we found that HashSet works by encapsulating a HashMap member variable. TreeSet is no exception. Let's look at some of the code first:

The member variable is declared as follows:

private transient NavigableMap<E,Object> m;

NavigableMap is an interface that TreeMap implements. Next, let's look at the constructors and the related addAll method:

TreeSet(NavigableMap<E,Object> m) {
    this.m = m;
}

public TreeSet() { // no-argument constructor
    this(new TreeMap<E,Object>());
}

public TreeSet(Comparator<? super E> comparator) { // constructor that takes a comparator
    this(new TreeMap<>(comparator));
}

public TreeSet(Collection<? extends E> c) {
    this();
    addAll(c);
}

public TreeSet(SortedSet<E> s) {
    this(s.comparator());
    addAll(s);
}

public boolean addAll(Collection<? extends E> c) {
    // Use linear-time version if applicable
    if (m.size() == 0 && c.size() > 0 &&
        c instanceof SortedSet &&
        m instanceof TreeMap) {
        SortedSet<? extends E> set = (SortedSet<? extends E>) c;
        TreeMap<E,Object> map = (TreeMap<E, Object>) m;
        Comparator<? super E> cc = (Comparator<? super E>) set.comparator();
        Comparator<? super E> mc = map.comparator();
        if (cc == mc || (cc != null && cc.equals(mc))) {
            map.addAllForTreeSet(set, PRESENT);
            return true;
        }
    }
    return super.addAll(c);
}

That looks like a lot of constructor code, but there are really only two main constructors: the default no-argument constructor and the one that takes a comparator. Both create a TreeMap internally, and the other constructors are implemented by calling these two. So TreeMap does the real work underneath. Since TreeSet is only a wrapper around TreeMap, we can concentrate on how TreeMap is implemented.
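The wrapping idea can be sketched in a few lines. The class below, MiniSortedSet, is a hypothetical illustration and not the JDK's TreeSet: it stores each element as a key of a TreeMap and uses one shared dummy object as the value, in the spirit of the PRESENT constant passed to addAllForTreeSet above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: a sorted set that delegates everything to a TreeMap.
class MiniSortedSet<E> {
    // Dummy value shared by every key; only the keys matter.
    private static final Object PRESENT = new Object();
    private final TreeMap<E, Object> m;

    MiniSortedSet() { this.m = new TreeMap<>(); }

    MiniSortedSet(Comparator<? super E> comparator) { this.m = new TreeMap<>(comparator); }

    // put returns null when the key was absent, so null means "newly added".
    boolean add(E e) { return m.put(e, PRESENT) == null; }

    boolean contains(E e) { return m.containsKey(e); }

    int size() { return m.size(); }

    // Keys come back in sorted order because the backing map is sorted.
    List<E> toList() { return new ArrayList<>(m.keySet()); }
}

class MiniSortedSetDemo {
    public static void main(String[] args) {
        MiniSortedSet<Integer> set = new MiniSortedSet<>();
        set.add(30);
        set.add(10);
        set.add(20);
        System.out.println(set.add(10));   // false: 10 is already a key of the map
        System.out.println(set.toList());  // [10, 20, 30]

        MiniSortedSet<Integer> desc = new MiniSortedSet<>(Comparator.reverseOrder());
        desc.add(1);
        desc.add(3);
        desc.add(2);
        System.out.println(desc.toList()); // [3, 2, 1]
    }
}
```

Every set operation becomes a one-line call on the map, which is why the analysis can focus on the map alone.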

Red-black trees

A red-black tree is essentially a binary search tree that keeps itself approximately balanced. Why does that matter? As the previous article on binary search trees showed, search and the other element operations ideally take O(log n) time, but only if the tree is balanced. If we insert elements in the plain way, the tree can lose its balance in extreme cases. For example, inserting a run of keys in ascending or descending order produces a structure equivalent to a linked list, and access to its elements degrades to O(n) instead of O(log n).
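The loss of balance is easy to reproduce. The sketch below is a hypothetical, deliberately unbalanced binary search tree of int keys (NaiveBst is not a JDK class); it compares the height reached by ascending insertions with the height reached by a friendlier insertion order of the same keys.

```java
// Hypothetical sketch: a plain (unbalanced) binary search tree of int keys.
class NaiveBst {
    private static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // Standard BST insertion with no rebalancing; duplicates are ignored.
    void insert(int key) {
        if (root == null) { root = new Node(key); return; }
        Node t = root;
        while (true) {
            if (key < t.key) {
                if (t.left == null) { t.left = new Node(key); return; }
                t = t.left;
            } else if (key > t.key) {
                if (t.right == null) { t.right = new Node(key); return; }
                t = t.right;
            } else {
                return;
            }
        }
    }

    // Number of nodes on the longest path from the root down to a leaf.
    int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }
}

class NaiveBstDemo {
    public static void main(String[] args) {
        NaiveBst sorted = new NaiveBst();
        for (int i = 1; i <= 15; i++) sorted.insert(i);
        System.out.println(sorted.height()); // 15: every node has only a right child

        NaiveBst mixed = new NaiveBst();
        for (int k : new int[] {8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15}) mixed.insert(k);
        System.out.println(mixed.height());  // 4: the same keys form a perfectly balanced tree
    }
}
```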

So we want a data structure that is a binary search tree and is also guaranteed to stay balanced. The red-black tree is one such structure. To meet both goals at once it imposes some specific constraints, which make its implementation more complex, as the definition below shows.

The formal definition of a red-black tree is as follows:

A red-black tree is a binary search tree that also satisfies the following 5 properties:

1. Every node is either red or black.

2. The root node is black.

3. Every leaf node is black. (Here the empty left and right children at the bottom of the tree are treated as special leaf nodes and are considered black.)

4. If a node is red, both of its children must be black.

5. For any node, every path from that node down to a leaf contains the same number of black nodes.
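The properties can also be read as a checklist that code can verify. The following is a minimal sketch on a hypothetical hand-built node type (RbCheck and its Node are illustrations, not JDK classes). Property 1 is enforced by the boolean color field, property 3 by treating null as a black leaf, and the key ordering of the search tree is deliberately not checked.

```java
// Hypothetical sketch: checks properties 2, 4 and 5 on a hand-built tree.
class RbCheck {
    static final boolean RED = false, BLACK = true;

    static final class Node {
        final boolean color;
        final Node left, right;
        Node(boolean color, Node left, Node right) {
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) { return n != null && n.color == RED; }

    // Returns the number of black nodes on every path from n down to an empty
    // leaf (the leaf included), or -1 if property 4 or 5 is violated below n.
    static int blackHeight(Node n) {
        if (n == null) return 1;                                       // property 3
        if (isRed(n) && (isRed(n.left) || isRed(n.right))) return -1;  // property 4
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l == -1 || r == -1 || l != r) return -1;                   // property 5
        return l + (isRed(n) ? 0 : 1);
    }

    static boolean isRedBlack(Node root) {
        return !isRed(root) && blackHeight(root) != -1;                // property 2 first
    }

    public static void main(String[] args) {
        Node ok = new Node(BLACK, new Node(RED, null, null), new Node(RED, null, null));
        Node redRoot = new Node(RED, null, null);
        Node redRed = new Node(BLACK, new Node(RED, new Node(RED, null, null), null), null);
        Node lopsided = new Node(BLACK, new Node(BLACK, null, null), null);
        System.out.println(isRedBlack(ok));        // true
        System.out.println(isRedBlack(redRoot));   // false: property 2
        System.out.println(isRedBlack(redRed));    // false: property 4
        System.out.println(isRedBlack(lopsided));  // false: property 5
    }
}
```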

The definition is a little abstract, so let's look at a sample red-black tree first:

Assuming the shaded nodes are red, the tree in the figure is a red-black tree. Starting from the root, every path to a leaf passes through 3 black nodes. Because only the number of black nodes on each path has to match, this is a relaxed measure of balance.

Node definition

Now let's combine the earlier discussion of balanced binary search trees with what TreeMap requires. TreeMap is essentially a map, so every key-value pair has to be stored in the data structure. The key of a pair is what we use to locate it. In HashMap the key is mapped directly to a slot by a hash function; here the key is placed at a suitable position in a binary tree found by comparisons, and that position plays the role of the slot. So each node must hold a key and a value.

In addition, because nodes are colored red or black, each node needs a field that records its color. And as we saw when discussing insertion and deletion in binary search trees, it is very convenient for each node to keep a reference to its parent. A red-black tree is a binary search tree, so it needs this reference as well.

To sum up, a node should contain the following 6 parts:

1. Left child node reference

2. Right child node reference

3. Parent node reference

4. Key

5. Value

6. Color

This structure can be pictured as follows:

In the JDK it is defined as follows:

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left = null;
    Entry<K,V> right = null;
    Entry<K,V> parent;
    boolean color = BLACK;

    /**
     * Make a new cell with given key, value, and parent, and with
     * {@code null} child links, and BLACK color.
     */
    Entry(K key, V value, Entry<K,V> parent) {
        this.key = key;
        this.value = value;
        this.parent = parent;
    }

    // ... remaining members omitted
}

The node type is Entry, a static nested class of TreeMap.

Adding elements

Adding an element takes roughly two steps. The first is the same as in the binary search tree discussed earlier: compare keys to find the position where the element belongs. This part is basically unchanged. The second step is an adjustment. Because this is a red-black tree, adding a new element may break its invariants in two places: every path to a leaf must still contain the same number of black nodes after the insertion, and the children of a red node must still be black.

One more point follows from TreeMap being a map: the key being added may already exist in the tree. In that case no new node is created; the value of the existing entry is updated instead.

With these ideas in mind, let's look at the code that adds an element:

public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        compare(key, key); // type (and possibly null) check

        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        do {
            parent = t;
            cmp = cpr.compare(key, t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    else {
        if (key == null)
            throw new NullPointerException();
        Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}

The code is long but not complicated. The opening if block handles the case where the root is null: put simply creates a node and makes it the root. The two large if/else branches that follow depend on whether a comparator was supplied. Many common data types implement the Comparable interface, so by default their compareTo method is used to compare keys. For custom classes, or when a special ordering is needed, a Comparator can be supplied instead, which is why the two paths are kept separate. Both branches do the same job: they search for the position where the element belongs, and if an equal key is found they replace its value and return.

The final section, starting where the new Entry is created, handles the actual insertion. If the loop found no matching key, it has found the position where the element must be inserted, and the node is linked in there. After that the tree is adjusted, the size and modification count are updated, and the method returns.
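The behavior described above can be observed from the outside with the standard java.util.TreeMap API. The sketch below (PutDemo is just a demonstration class) shows the return value of put for new and existing keys, the comparator path treating keys that compare as equal as the same key, and the null check on the Comparable path.

```java
import java.util.TreeMap;

class PutDemo {
    public static void main(String[] args) {
        // Natural ordering: keys are compared with compareTo (the Comparable path).
        TreeMap<String, Integer> natural = new TreeMap<>();
        System.out.println(natural.put("b", 1)); // null: the key was absent
        System.out.println(natural.put("a", 2)); // null
        System.out.println(natural.put("b", 3)); // 1: the old value, the entry is updated in place
        System.out.println(natural);             // {a=2, b=3}

        // Comparator path: keys that compare as equal count as the same key.
        TreeMap<String, Integer> ignoreCase = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        ignoreCase.put("apple", 1);
        System.out.println(ignoreCase.put("APPLE", 2)); // 1
        System.out.println(ignoreCase);                 // {apple=2}: only the value was replaced

        // With natural ordering a null key is rejected.
        try {
            natural.put(null, 0);
        } catch (NullPointerException e) {
            System.out.println("null key rejected");
        }
    }
}
```

Note that in the case-insensitive map the original key object stays in place, which matches the call to setValue in the source: only the value of the existing entry changes.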

That leaves the fixAfterInsertion() method mentioned earlier. Every node we insert carries a color, and whether it is red or black, it can break the properties that define a red-black tree. An adjustment step is therefore needed so that the whole tree still satisfies the requirements after the insertion. This process is fairly involved, so we will go through it piece by piece.

Before looking at the implementation of fixAfterInsertion, let's look at left and right rotations of a tree, which fixAfterInsertion relies on heavily.

Rotation

Left and right rotations are easiest to understand from a diagram:

As the diagram shows, a rotation swaps the positions of two nodes and hands one child of one node over to the other. Take the left rotation: before rotating, X is the parent of Y. After rotating, Y is the parent of X, and Y's former left child becomes X's right child. X's original parent becomes Y's parent. This is a left rotation. The right rotation is the mirror image.

The implementation of the left and right rotations is as follows:

private void rotateLeft(Entry<K,V> p) {
    if (p != null) {
        Entry<K,V> r = p.right;
        p.right = r.left;
        if (r.left != null)
            r.left.parent = p;
        r.parent = p.parent;
        if (p.parent == null)
            root = r;
        else if (p.parent.left == p)
            p.parent.left = r;
        else
            p.parent.right = r;
        r.left = p;
        p.parent = r;
    }
}

private void rotateRight(Entry<K,V> p) {
    if (p != null) {
        Entry<K,V> l = p.left;
        p.left = l.right;
        if (l.right != null) l.right.parent = p;
        l.parent = p.parent;
        if (p.parent == null)
            root = l;
        else if (p.parent.right == p)
            p.parent.right = l;
        else p.parent.left = l;
        l.right = p;
        p.parent = l;
    }
}
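To see a rotation at work, here is a small standalone sketch. RotateDemo and its Node type are hypothetical stand-ins for TreeMap and Entry; the pointer moves follow rotateLeft above, with one extra guard for a missing right child. The in-order traversal is printed before and after to show that a rotation changes the shape of the tree but not the sorted order of its keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a left rotation on a bare node type with parent links.
class RotateDemo {
    static final class Node {
        final int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;

    void rotateLeft(Node p) {
        if (p == null || p.right == null) return; // right-child guard added for this sketch
        Node r = p.right;
        p.right = r.left;                  // r's left subtree moves under p
        if (r.left != null) r.left.parent = p;
        r.parent = p.parent;               // r takes p's place under p's old parent
        if (p.parent == null) root = r;
        else if (p.parent.left == p) p.parent.left = r;
        else p.parent.right = r;
        r.left = p;                        // p becomes r's left child
        p.parent = r;
    }

    // Attaches the given children to parent and fixes their parent links.
    static Node link(Node parent, Node left, Node right) {
        parent.left = left;
        parent.right = right;
        if (left != null) left.parent = parent;
        if (right != null) right.parent = parent;
        return parent;
    }

    // Builds x=10 with left child 5 and right child y=20; y has children 15 and 30.
    static RotateDemo sample() {
        RotateDemo t = new RotateDemo();
        Node y = link(new Node(20), new Node(15), new Node(30));
        t.root = link(new Node(10), new Node(5), y);
        return t;
    }

    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<Integer> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.key);
        collect(n.right, out);
    }

    public static void main(String[] args) {
        RotateDemo t = sample();
        System.out.println(t.inOrder());  // [5, 10, 15, 20, 30]
        t.rotateLeft(t.root);
        System.out.println(t.root.key);   // 20: y has moved up
        System.out.println(t.inOrder());  // [5, 10, 15, 20, 30]: the order is unchanged
    }
}
```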

Read alongside the diagram, rotateLeft and rotateRight are fairly simple. They mainly move one child subtree and update the parent links, much like relinking elements in the middle of a doubly linked list.

Adjustment process

In a red-black tree, inserting a black node adds 1 to the number of black nodes on every path through that node, which is certain to break the rule that all paths to the leaves contain the same number of black nodes. The simplest approach is therefore to color the inserted node red, which leaves that rule intact. This brings a different problem: what if the parent of the new node is already red? That breaks the rule that a red node has black children. In this case we need a series of adjustments to end up with a valid red-black tree again, but the scope of the adjustment is narrower than it would be if we had inserted a black node. Let's see how the adjustment works.

We assume that the node to be added is N.

Scenario 1: N's parent P and P's sibling U are both red, and its grandparent G is black

In this case, simply color the parent P and its sibling U black and color the grandparent G red. The number of black nodes on every path through G to the leaves below stays the same as before, and the red node N no longer has a red parent. The only precondition for this scenario is that the new node, its parent, and the parent's sibling are all red; it makes no difference whether N is the left or the right child of its parent. Because G is now red, its own parent may be red too, so the adjustment continues upward from G.

Scenario 2: N's parent P is red, but its grandparent G and its parent's sibling U are black.

Here the handling depends on the position of N. If N is the right child of P, first rotate left around P, which converts the tree into the other arrangement, in which the lower of the two red nodes is the left child of its parent. After this transformation we make the next adjustment, as shown in the following figure:

This step rotates right around G, so that G becomes the right child of P. Then P is colored black and G red. The subtree still has the same number of black nodes on every path as before.

These two scenarios cover one broad case: the parent P of the new node N is the left child of the grandparent G. Are they complete under that premise? An adjustment is needed only when N and its parent P are both red. When they are both red and P is the left child of G, P's sibling can only be red or black, and those are exactly the two scenarios discussed above.

If the parent P is the right child of the grandparent G, the situation is the mirror image of the left-child case and is handled symmetrically. The complete adjustment code is as follows:

private void fixAfterInsertion(Entry<K,V> x) {
    x.color = RED;

    while (x != null && x != root && x.parent.color == RED) {
        if (parentOf(x) == leftOf(parentOf(parentOf(x)))) {
            Entry<K,V> y = rightOf(parentOf(parentOf(x))); // the uncle of the current node
            if (colorOf(y) == RED) {
                // The uncle is also red, so this is the first scenario: the parent and
                // the uncle are set to black, and the grandparent is set to red.
                setColor(parentOf(x), BLACK);
                setColor(y, BLACK);
                setColor(parentOf(parentOf(x)), RED);
                x = parentOf(parentOf(x));
            } else {
                if (x == rightOf(parentOf(x))) {
                    // Second scenario, and the node is the right child of its parent:
                    // rotate left first.
                    x = parentOf(x);
                    rotateLeft(x);
                }
                // Second scenario: the parent becomes black, the grandparent becomes red,
                // and the grandparent is rotated right.
                setColor(parentOf(x), BLACK);
                setColor(parentOf(parentOf(x)), RED);
                rotateRight(parentOf(parentOf(x)));
            }
        } else {
            Entry<K,V> y = leftOf(parentOf(parentOf(x)));
            if (colorOf(y) == RED) {
                setColor(parentOf(x), BLACK);
                setColor(y, BLACK);
                setColor(parentOf(parentOf(x)), RED);
                x = parentOf(parentOf(x));
            } else {
                if (x == leftOf(parentOf(x))) {
                    x = parentOf(x);
                    rotateRight(x);
                }
                setColor(parentOf(x), BLACK);
                setColor(parentOf(parentOf(x)), RED);
                rotateLeft(parentOf(parentOf(x)));
            }
        }
    }
    root.color = BLACK;
}
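To check that these adjustments really keep the tree balanced, the following is a compact, self-contained sketch of the same insertion scheme for int keys. MiniRbTree, its Node type, and the height and blackHeight helpers are hypothetical names introduced for illustration; the structure follows the two scenarios and their mirror images, but it is a simplified sketch rather than the JDK code. It inserts ascending keys, the order that turns a plain binary search tree into a list, and reports the resulting height.

```java
// Hypothetical sketch: int-keyed red-black insertion following the two scenarios.
class MiniRbTree {
    static final boolean RED = false, BLACK = true;

    static final class Node {
        final int key;
        Node left, right, parent;
        boolean color = BLACK;
        Node(int key, Node parent) { this.key = key; this.parent = parent; }
    }

    Node root;

    // A missing (null) node counts as black.
    static boolean colorOf(Node n) { return n == null ? BLACK : n.color; }

    void insert(int key) {
        if (root == null) { root = new Node(key, null); return; }
        Node t = root, parent;
        do {                                    // plain binary-search-tree descent
            parent = t;
            if (key < t.key) t = t.left;
            else if (key > t.key) t = t.right;
            else return;                        // key already present, nothing to add
        } while (t != null);
        Node e = new Node(key, parent);
        if (key < parent.key) parent.left = e; else parent.right = e;
        fixAfterInsertion(e);
    }

    private void rotateLeft(Node p) {
        Node r = p.right;
        p.right = r.left;
        if (r.left != null) r.left.parent = p;
        r.parent = p.parent;
        if (p.parent == null) root = r;
        else if (p.parent.left == p) p.parent.left = r;
        else p.parent.right = r;
        r.left = p;
        p.parent = r;
    }

    private void rotateRight(Node p) {
        Node l = p.left;
        p.left = l.right;
        if (l.right != null) l.right.parent = p;
        l.parent = p.parent;
        if (p.parent == null) root = l;
        else if (p.parent.right == p) p.parent.right = l;
        else p.parent.left = l;
        l.right = p;
        p.parent = l;
    }

    private void fixAfterInsertion(Node x) {
        x.color = RED;                          // new nodes start out red
        while (x != root && x.parent.color == RED) {
            Node p = x.parent, g = p.parent;    // g exists: a red parent is never the root
            boolean parentIsLeft = (p == g.left);
            Node uncle = parentIsLeft ? g.right : g.left;
            if (colorOf(uncle) == RED) {        // scenario 1: recolor, then continue from g
                p.color = BLACK;
                uncle.color = BLACK;
                g.color = RED;
                x = g;
            } else if (parentIsLeft) {          // scenario 2
                if (x == p.right) { x = p; rotateLeft(x); }
                x.parent.color = BLACK;
                g.color = RED;
                rotateRight(g);
            } else {                            // mirror image of scenario 2
                if (x == p.left) { x = p; rotateRight(x); }
                x.parent.color = BLACK;
                g.color = RED;
                rotateLeft(g);
            }
        }
        root.color = BLACK;
    }

    int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Black height of the subtree, or -1 if a red node has a red child
    // or two paths disagree on their number of black nodes.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        if (n.color == RED && (colorOf(n.left) == RED || colorOf(n.right) == RED)) return -1;
        int l = blackHeight(n.left), r = blackHeight(n.right);
        if (l == -1 || l != r) return -1;
        return l + (n.color == BLACK ? 1 : 0);
    }

    public static void main(String[] args) {
        MiniRbTree tree = new MiniRbTree();
        for (int i = 1; i <= 1000; i++) tree.insert(i);   // ascending keys
        System.out.println(tree.height());                // stays within 2 * log2(n + 1)
        System.out.println(blackHeight(tree.root) != -1); // true
    }
}
```

A plain binary search tree fed the same 1000 keys would reach height 1000; here the height is bounded by 2 * log2(n + 1), which is just under 20 for n = 1000.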

The condition of the while loop in fixAfterInsertion checks that the current node has a parent and that the parent is red, like the node itself. Newly added elements are colored red by default. The comments in the code cover the case where the parent is the left child of the grandparent; the other case can be analyzed in the same way.

Deleting elements

Deleting an element follows the same general process as in an ordinary binary search tree, and depends on the node to be deleted:

1. The node to be deleted has no children: delete it directly, as in the following figure:

2. The node to be deleted has one child: replace the node with that child:

3. The node to be deleted has two children: replace its contents with those of its successor, then delete the successor from its original position. Two situations are possible, depending on where the successor sits.
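The three cases in the list above can be sketched for a plain binary search tree as follows. BstDelete is a hypothetical, recursive illustration without parent links or colors, so it is not how TreeMap itself is written, but it applies the same three rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: deletion from a plain binary search tree (no colors, no rebalancing).
class BstDelete {
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Returns the root of the subtree after removing key from it.
    static Node delete(Node n, int key) {
        if (n == null) return null;
        if (key < n.key) { n.left = delete(n.left, key); return n; }
        if (key > n.key) { n.right = delete(n.right, key); return n; }
        // Cases 1 and 2: zero or one child, so the (possibly null) child takes the node's place.
        if (n.left == null) return n.right;
        if (n.right == null) return n.left;
        // Case 3: two children. Copy the successor's key here, then delete the successor.
        Node s = n.right;
        while (s.left != null) s = s.left;   // successor = smallest key in the right subtree
        n.key = s.key;
        n.right = delete(n.right, s.key);
        return n;
    }

    static List<Integer> inOrder(Node root) {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<Integer> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.key);
        collect(n.right, out);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {50, 30, 70, 20, 40, 60, 80, 65}) root = insert(root, k);
        root = delete(root, 20);            // case 1: a leaf
        root = delete(root, 60);            // case 2: one child (65)
        root = delete(root, 50);            // case 3: two children, the successor is 65
        System.out.println(root.key);       // 65
        System.out.println(inOrder(root));  // [30, 40, 65, 70, 80]
    }
}
```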

Those are the steps for deleting an element from a binary search tree, and we will not repeat the details here. We focus on what is different in a red-black tree. Here is the main code of the deletion method:

private void deleteEntry(Entry<K,V> p) {
    modCount++;
    size--;

    // If strictly internal, copy successor's element to p and then make p
    // point to successor.
    if (p.left != null && p.right != null) {
        Entry<K,V> s = successor(p);
        p.key = s.key;
        p.value = s.value;
        p = s;
    } // p has 2 children

    // Start fixup at replacement node, if it exists.
    Entry<K,V> replacement = (p.left != null ? p.left : p.right);

    if (replacement != null) {
        // Link replacement to parent
        replacement.parent = p.parent;
        if (p.parent == null)
            root = replacement;
        else if (p == p.parent.left)
            p.parent.left = replacement;
        else
            p.parent.right = replacement;

        // Null out links so they are OK to use by fixAfterDeletion.
        p.left = p.right = p.parent = null;

        // Fix replacement
        if (p.color == BLACK)
            fixAfterDeletion(replacement);
    } else if (p.parent == null) { // return if we are the only node.
        root = null;
    } else { // No children. Use self as phantom replacement and unlink.
        if (p.color == BLACK)
            fixAfterDeletion(p);

        if (p.parent != null) {
            if (p == p.parent.left)
                p.parent.left = null;
            else if (p == p.parent.right)
                p.parent.right = null;
            p.parent = null;
        }
    }
}

The first if block handles the case where the node to delete has two children: it finds the successor and overwrites the current node's key and value with the successor's. After that, what remains is to delete the original successor node, and the code from the computation of replacement onward does exactly that. Because this is a red-black tree, there are two places that check whether the node being removed is black. If the node being removed is red, removing it does no damage to the tree. If it is black, further adjustment is needed to keep the tree valid, which is why fixAfterDeletion is called.

Adjustment after deletion

The adjustment after deleting an element is more complex than the adjustment after insertion, and it is not simply the insertion process in reverse. Let's start from a basic point. An adjustment is needed only because removing the node breaks the properties of the red-black tree, and removing a red node never does. From the deletion process above, the node actually removed is already near the bottom of the tree: it has at most one child. If it is red, then after it is removed the number of black nodes on each path from the nodes above it to the leaves is unchanged. So the precondition here is that the removed node is black.

Under that premise, the goal of the adjustment is to transform the tree, after the black node has been removed, so that it is again a valid red-black tree. Once a black node is removed, every path from the nodes above it to the leaves beneath it has one black node fewer. We need adjustments that make up for this somewhere.

With this understanding, let's look at how the nodes are adjusted in several cases.

1. The current node and its parent are black, and its sibling is red:

Because the sibling is red, the properties of the red-black tree imply that it has two black children. We rotate left around the parent B of node X, color B red, and color the original sibling D black. This turns the tree into one of the situations of the second case. The number of black nodes on each path is the same after this transformation as before it.

2.1) The parent of the current node is red, and its sibling and all of the sibling's children are black.

In this case we color the sibling red and then make X point to its parent. The hard part is seeing why the tree is balanced after this change. Suppose node A is the node being adjusted all the way along. The node originally removed was black, so once it was deleted the paths through A have 1 black node fewer. After A's sibling is colored red, all paths from their parent downward are uniformly short by 1, so the two sides balance each other.

A natural worry is that this adjusts only a subtree in the middle of the tree: B may itself have a parent, and so on up to the root, and a subtree that is one black node short does not yet give a valid tree overall. The code takes care of this. B is red here, so the loop in the code exits, and the final step colors node B, which x now points to, black. That restores the black node that was missing.

2.2) The parent of the current node is black, and its sibling and all of the sibling's children are black.

This is similar to the previous case. Continuing the earlier reasoning, after the sibling is colored red, every path starting at the parent B is 1 black node short. In the code, x now points to the parent, which is still black, so the subtree rooted at B is one black node short as a whole, and the loop continues with x in one of these cases.

3. The parent of the current node is red and its sibling is black, while the sibling has a red child.
