Worst-case performance analysis of balanced binary trees and red-black trees

Source: Internet
Author: User

1. Classic balanced binary tree

The balanced binary tree (also known as the AVL tree) is a binary search tree with a balance condition. The most common definition is: a balanced binary tree is a binary search tree in which the heights of the left and right subtrees of every node differ by at most 1. Because it is a special case of the binary tree, it also has the properties of a binary tree. For example, a full binary tree has at most $2^{k-1}$ nodes on level k, counting the root as level 1 (Property 1). The height of a tree is the number of edges on the path from the root node to the deepest node; for example, a tree with only one node has a height of 0 (Property 2). It has also been shown that a balanced binary tree with N nodes has a height of at most roughly $1.44\log_2(N+2) - 1.328$.
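As a rough illustration of the height bound, the sketch below counts the minimum number of nodes S(h) that a balanced binary tree of height h must contain, using the recurrence S(h) = S(h-1) + S(h-2) + 1 with S(0) = 1 (a single node has height 0, as in Property 2). It then evaluates the commonly quoted approximate bound $1.44\log_2(N+2) - 1.328$ for that node count. The names `AvlHeightBound`, `minNodes`, and `heightBound` are hypothetical and do not come from the article's code. Because the constants are rounded, the computed bound only tracks h approximately.

```java
// Sketch only: relates the height of a balanced binary tree to its minimum node count.
final class AvlHeightBound {
    // Minimum number of nodes in a balanced binary tree of height h
    // (an empty tree has height -1 and 0 nodes).
    static long minNodes(int h) {
        if (h < 0) return 0;
        long prev = 0;  // S(-1)
        long cur = 1;   // S(0)
        for (int i = 1; i <= h; i++) {
            long next = cur + prev + 1;  // root + tallest subtree + next-tallest subtree
            prev = cur;
            cur = next;
        }
        return cur;
    }

    // Approximate upper bound on the height of a balanced binary tree with n nodes.
    static double heightBound(long n) {
        return 1.44 * (Math.log(n + 2) / Math.log(2)) - 1.328;
    }

    public static void main(String[] args) {
        for (int h = 0; h <= 20; h += 5) {
            long n = minNodes(h);
            System.out.printf("h=%d minNodes=%d approxBound=%.3f%n", h, n, heightBound(n));
        }
    }
}
```

For the sparsest trees the approximate bound stays within a small fraction of a level of the true height, which is why the bound is stated "roughly".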

Let us work out how to construct a balanced binary tree whose subtrees differ in height as much as possible, that is, how to find its worst case.

By the definition of a balanced binary tree, the left and right subtrees of a node can differ in height by at most 1 level. Across several subtrees on the same level, the height can therefore decrease by 1 level from one subtree to the next. Figure 1 shows the maximum height difference for a tree with 4 subtrees:



Figure 1

In the tree inside the dashed box in the figure, the leftmost subtree has a height of 0 and the rightmost subtree has a height of 2, so the maximum height difference between the inner subtrees of this balanced binary tree is 2.

Using this property, we can proceed recursively: a tree with 2 subtrees can have a height difference of 1 level, a tree with 4 subtrees a difference of at most 2 levels, a tree with 8 subtrees 3 levels, a tree with 16 subtrees 4 levels, and so on.

This gives:

The maximum height difference among n subtrees is $\log_2 n$.

Going further, suppose that the maximum height difference in a balanced binary tree of height h is m (taking the minimum subtree height to be 0), and that this difference m is achieved among n subtrees. Properties 1 and 2 of the balanced binary tree then relate h, m, and n.


Surprisingly, the result is a simple conclusion:

$m = \lfloor (h+1)/2 \rfloor$

That is, in a balanced binary tree of height h, the maximum height difference between its inner subtrees can reach $\lfloor (h+1)/2 \rfloor$ (the result is truncated to an integer, not rounded). For example, in a balanced binary tree of height 8 the maximum height difference between inner subtrees is 4, and similarly, in a balanced binary tree of height 9 it is 5, as shown below (Figure 2):

Figure 2
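The two examples above can be reproduced with a small experiment. The sketch below builds the sparsest balanced binary tree of a given height (every node has one subtree of height h-1 and one of height h-2) and reports the largest height difference between two subtrees rooted at the same depth. It is an illustration under stated assumptions, not the author's derivation: empty subtrees are counted as height -1, and the names `AvlLevelSpread`, `sparsest`, and `maxSpread` are hypothetical.

```java
// Sketch only: measures how far subtree heights can spread on one level of a sparse AVL tree.
final class AvlLevelSpread {
    static final class Node {
        Node left, right;
        int height;  // height of the subtree rooted here; a single node has height 0
    }

    // Sparsest balanced binary tree of height h: subtrees of height h-2 and h-1.
    static Node sparsest(int h) {
        if (h < 0) return null;
        Node n = new Node();
        n.left = sparsest(h - 2);   // shorter side
        n.right = sparsest(h - 1);  // taller side
        n.height = h;
        return n;
    }

    // Largest difference between subtree heights found at any single depth.
    // Assumption: an empty subtree counts as height -1.
    static int maxSpread(Node root) {
        int h = (root == null) ? -1 : root.height;
        int[] min = new int[h + 2];  // depths 0 .. h+1 (null children of the deepest node)
        int[] max = new int[h + 2];
        java.util.Arrays.fill(min, Integer.MAX_VALUE);
        java.util.Arrays.fill(max, Integer.MIN_VALUE);
        collect(root, 0, min, max);
        int best = 0;
        for (int d = 0; d < min.length; d++) {
            if (max[d] != Integer.MIN_VALUE) best = Math.max(best, max[d] - min[d]);
        }
        return best;
    }

    // Records the smallest and largest subtree height seen at each depth.
    private static void collect(Node n, int depth, int[] min, int[] max) {
        int h = (n == null) ? -1 : n.height;
        min[depth] = Math.min(min[depth], h);
        max[depth] = Math.max(max[depth], h);
        if (n != null) {
            collect(n.left, depth + 1, min, max);
            collect(n.right, depth + 1, min, max);
        }
    }
}
```

Under these assumptions, `maxSpread(sparsest(8))` returns 4 and `maxSpread(sparsest(9))` returns 5, which agrees with the two examples in the text.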

2. Red-black trees

A historically popular variant of the AVL tree is the red-black tree. Red-black trees are also among the most widely adopted data structures in many programming languages (for example, the Java TreeSet and TreeMap implementations). A red-black tree is a binary search tree with the following coloring properties:

1. Each node is either black or red.

2. The root node is black.

3. The children of a red node must all be black.

4. Every path from any node to a null reference must contain the same number of black nodes.
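The four coloring rules above can be checked mechanically. The following is a minimal sketch, not taken from the article or from any library: the `RedBlackCheck` class, its `Node` type, and the method names are hypothetical. Rule 1 is enforced by storing the color as a boolean; the sketch does not check the binary search tree ordering of the keys.

```java
// Sketch only: validates the four coloring rules on a hand-built tree.
final class RedBlackCheck {
    static final class Node {
        final int key;
        final boolean red;  // rule 1: every node is either red (true) or black (false)
        final Node left, right;

        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red;
    }

    // Returns the number of black nodes on every path from n down to null,
    // or -1 if rule 3 or rule 4 is violated somewhere in the subtree.
    static int blackHeight(Node n) {
        if (n == null) return 0;
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1;  // rule 3
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || r < 0 || l != r) return -1;                    // rule 4
        return l + (n.red ? 0 : 1);
    }

    static boolean isValid(Node root) {
        if (isRed(root)) return false;  // rule 2: the root is black
        return blackHeight(root) >= 0;
    }
}
```

A black root with two red leaves passes the check, while a red node with a red child, or a black node with a black child on only one side, is rejected.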

One consequence of the coloring rules above is that the height of a red-black tree is at most $2\log_2(N+1)$, a bound that looks worse than that of the classic balanced binary tree. Let us analyze the worst case of a red-black tree defined by these coloring rules.

By rule 4, the height difference between the left and right subtrees can be made as large as possible by using as many red nodes as possible in the subtree on one side and as few red nodes as possible, or none at all, on the other side, as shown in Figure 3:

Figure 3

This clearly leads to a conclusion: in a red-black tree containing k red nodes, the maximum height difference between inner subtrees can theoretically reach k.
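One way to see this is to build such a lopsided tree explicitly. The sketch below is one possible construction and not necessarily the tree drawn in Figure 3: the right spine alternates black and red nodes, and everything hanging off it is a perfect all-black tree, so the four coloring rules still hold. The names `LopsidedRedBlack`, `allBlack`, `tall`, and `rootSpread` are hypothetical.

```java
// Sketch only: a valid red-black coloring with k red nodes and a root height difference of k.
final class LopsidedRedBlack {
    static final class Node {
        final boolean red;
        final Node left, right;

        Node(boolean red, Node left, Node right) {
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    // Perfect tree of black nodes with the given number of black levels (null when 0).
    static Node allBlack(int blackLevels) {
        if (blackLevels == 0) return null;
        return new Node(false, allBlack(blackLevels - 1), allBlack(blackLevels - 1));
    }

    // Black root whose right spine alternates black, red, black, red, ...
    // The tree contains exactly k red nodes, and every path to null passes k black nodes.
    static Node tall(int k) {
        if (k == 0) return null;
        Node redChild = new Node(true, allBlack(k - 1), tall(k - 1));
        return new Node(false, allBlack(k - 1), redChild);
    }

    // Height in edges; an empty tree has height -1.
    static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int countRed(Node n) {
        return n == null ? 0 : (n.red ? 1 : 0) + countRed(n.left) + countRed(n.right);
    }

    // Height difference between the right and left subtrees of the root.
    static int rootSpread(Node root) {
        return height(root.right) - height(root.left);
    }
}
```

In this construction `tall(k)` uses k red nodes and the subtrees of its root differ in height by k, counting an empty subtree as height -1.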

Since red-black trees look so flawed in theory, why are they used more often in practice? A close study of concrete red-black tree implementations shows that the form they take in real applications goes beyond the rules of the original definition. The incompleteness of the red-black tree theory is one of the reasons why it is hard to understand and hard for beginners to implement on their own. (Note: this article draws on the red-black tree implementation in Mark Allen Weiss's book "Data Structures and Algorithm Analysis in Java", which is also the most widely used approach in actual implementations.)

By my summary, real red-black tree implementations add the following restrictions:

5. A newly inserted node must be red.

6. The left and right subtrees of any node differ by at most 2 levels of red nodes.

7. During insertion (only along the path to the insertion point), no node is allowed to have 2 red children.

With these three restrictions added (rules 5 to 7, continuing the numbering of the coloring rules), the worst case of the red-black tree hardly needs further analysis. Rules 6 and 7 follow from rule 5 and the complex insertion adjustments of the red-black tree. Rule 6 happens to rule out directly the worst case of the classic balanced binary tree, and rule 7 is even a step in the transition from red-black trees to AA trees (the AA tree implementation is not yet complete and currently performs poorly, so this article does not discuss it in depth). Figures 4 and 5 show how the red-black tree changes as nodes are inserted on the right in sequence (inserting 30, 40, 50, 60, 70, 80, 90, 100 in turn):


Figure 4
Figure 5

3. Performance Testing

The test environment for this article is the Windows 7 operating system. All code is written in Java, the JDK version is 1.8.0, and the tests use nine-digit numeric data produced by a random number generator. To eliminate the influence of compiler optimizations and operating system scheduling as far as possible, each test is run multiple times and the times are averaged. For example, in the million-level test, several different batches of random nine-digit numbers are generated; each batch is used to construct a balanced binary tree, a red-black tree, and an AA tree, and once construction is complete, lookups are run in a loop of 10000 iterations to collect the average lookup time. During the lookup phase, additional data amounting to one-tenth of the node count is generated (this data ensures that unsuccessful lookups occur, and they are also counted in the lookup time). In other words, the million-level result is obtained after 1001000 runs. The final test results are as follows:

Balanced binary tree:

Constructing a 10,000-node tree takes up to nine milliseconds. The 10,000-level lookup time is 460 nanoseconds; average tree height: [value missing].

Constructing a 100,000-node tree takes a few milliseconds. The 100,000-level lookup time is 670 nanoseconds; average tree height: [value missing].

Constructing a 1 million-node tree takes 1850 milliseconds. The million-level lookup time is 840 nanoseconds; average tree height: [value missing].

Red-black tree:

Constructing a 10,000-node tree takes a few milliseconds. The 10,000-level lookup time is 430 nanoseconds; average tree height: [value missing].

Constructing a 100,000-node tree takes nine milliseconds. The 100,000-level lookup time is [value missing] nanoseconds; average tree height: [value missing].

Constructing a 1 million-node tree takes [value missing] milliseconds. The million-level lookup time is 740 nanoseconds; average tree height: [value missing].

AA tree:

Constructing a 10,000-node tree takes a few milliseconds. The 10,000-level lookup time is 470 nanoseconds; average tree height: [value missing].

Constructing a 100,000-node tree takes nine milliseconds. The 100,000-level lookup time is 650 nanoseconds; average tree height: [value missing].

Constructing a 1 million-node tree takes 1900 milliseconds. The million-level lookup time is [value missing] nanoseconds; average tree height: [value missing].

Note: the performance of the AA tree lies between that of the red-black tree and the balanced binary tree. However, the depth of the AA tree fluctuates strongly with the randomly generated numbers (it hits its worst case most often and has the tallest worst-case tree).
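The measurement procedure described at the start of this section can be sketched roughly as follows. This is not the author's test code: it uses `java.util.TreeSet` (a red-black tree in the JDK) as a stand-in for the hand-written trees, performs a single run without averaging or JIT warm-up, and the names `TreeLookupBenchmark` and `run` are hypothetical. The extra one-tenth of random probes follows the description in the text, so that unsuccessful lookups are also timed.

```java
// Sketch only: times construction and lookups on a TreeSet filled with random nine-digit numbers.
final class TreeLookupBenchmark {
    // Returns {build time in ms, average lookup time in ns, number of successful lookups}.
    static long[] run(int n, long seed) {
        java.util.Random rnd = new java.util.Random(seed);
        int[] keys = new int[n];
        for (int i = 0; i < n; i++) {
            keys[i] = 100_000_000 + rnd.nextInt(900_000_000);  // nine-digit number
        }

        long t0 = System.nanoTime();
        java.util.TreeSet<Integer> tree = new java.util.TreeSet<>();
        for (int k : keys) tree.add(k);
        long buildMs = (System.nanoTime() - t0) / 1_000_000;

        // Probe every inserted key plus n/10 fresh random values (mostly misses).
        int extra = n / 10;
        int[] probes = new int[n + extra];
        System.arraycopy(keys, 0, probes, 0, n);
        for (int i = n; i < probes.length; i++) {
            probes[i] = 100_000_000 + rnd.nextInt(900_000_000);
        }

        long found = 0;
        long t1 = System.nanoTime();
        for (int p : probes) {
            if (tree.contains(p)) found++;
        }
        long avgNs = (System.nanoTime() - t1) / Math.max(1, probes.length);
        return new long[] { buildMs, avgNs, found };
    }

    public static void main(String[] args) {
        long[] r = run(1_000_000, 42L);
        System.out.println("build ms=" + r[0] + ", avg lookup ns=" + r[1] + ", hits=" + r[2]);
    }
}
```

Absolute numbers from such a sketch depend heavily on the machine, the JVM, and warm-up, so they are only comparable between trees measured in the same run.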

The results can be summarized as follows:

Performance: red-black tree > balanced binary tree > AA tree;

Programming difficulty: red-black tree > balanced binary tree > AA tree.

Although the implementations and specific applications of these trees vary widely, there is no best data structure, only the most suitable one. The three search trees compared here differ only slightly in real-world performance (the probability of hitting the worst case is very low) and cause no significant performance problems in production. The most important factors in choosing among them are therefore a reasonable implementation approach and the correctness of the program.

