Skip list: a probabilistic alternative to balanced trees

Last Update:2018-07-26 Source: Internet

Author: User

A skip list is a data structure that can be used in place of a balanced tree. Skip lists pursue probabilistic balance rather than strictly enforced balance. As a result, insertion and deletion in a skip list are much simpler, and considerably faster, than the equivalent operations on a balanced binary tree.

Binary trees can be used to represent abstract data types such as dictionaries and ordered lists. They work well when elements are inserted in random order. When elements are inserted in sorted order, however, a plain binary search tree degenerates into a linked list and performs very poorly. If the items to be inserted could be randomly permuted first, the tree would almost certainly perform well for any input sequence. In most cases, though, operations must be handled online, so randomly permuting the input is impractical. Balanced tree algorithms rearrange the tree as operations are performed so that certain balance conditions hold and good performance is assured.

Skip lists are a probabilistic alternative to balanced trees. A skip list is balanced by consulting a random number generator. Although skip lists have poor worst-case performance, no input sequence consistently produces that worst case (much like quicksort when the pivot is chosen at random). A skip list is very unlikely to be badly unbalanced: for a dictionary of 250 elements, the probability that a search takes more than 3 times the expected time is less than one in 10,000. Skip lists have balance properties similar to those of search trees built by random insertions, yet they do not require the insertion order to be random.

Balancing a data structure probabilistically is much simpler than explicitly maintaining balance. For many applications, a skip list is a more natural representation than a balanced tree, and it leads to simpler algorithms. That simplicity makes skip lists easier to implement and gives them a constant-factor speed advantage over balanced trees and self-adjusting trees. Skip lists are also space efficient. They need an average of 2 pointers per element (or even fewer), and they do not need balance or priority information stored in each node.

Structure

When searching a linked list, we may have to examine every node (Figure 1a). If the list is sorted and every second node also has a pointer to the node two ahead of it (Figure 1b), we have to examine at most (n/2) + 1 nodes, where n is the length of the list. If, in addition, every fourth node has a pointer four nodes ahead, at most (n/4) + 2 nodes are examined. If every 2^i-th node has a pointer 2^i nodes ahead, the number of nodes to examine drops to about log2 n. This data structure supports fast searching, but insertion and deletion are impractical because the regular pattern of pointers would have to be rebuilt.
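As a rough check of these counts, the following Python sketch simulates a search over an idealized sorted list in which node j stores key j and a level-i pointer jumps 2^(i-1) positions ahead. The names nodes_examined and worst_case, and the use of position arithmetic instead of real pointers, are assumptions made for illustration only.

```python
def nodes_examined(n, levels, target):
    """Count the distinct nodes whose key is compared while searching for target.

    Assumed layout: nodes sit at positions 1..n and node j stores key j.
    Position 0 is the header. A level-i pointer (i = 1..levels) leaves every
    position that is a multiple of 2**(i-1) and jumps 2**(i-1) positions ahead.
    """
    examined = set()
    pos = 0
    for i in range(levels, 0, -1):
        step = 2 ** (i - 1)
        while pos + step <= n:      # beyond n the pointer would reach the end of the list
            nxt = pos + step
            examined.add(nxt)
            if nxt >= target:       # moving would overshoot: drop down one level
                break
            pos = nxt
    return len(examined)


def worst_case(n, levels):
    """Largest count over all targets, including one past the end of the list."""
    return max(nodes_examined(n, levels, t) for t in range(1, n + 2))


if __name__ == "__main__":
    for levels in (1, 2, 3, 5):
        print(levels, worst_case(16, levels))
```

For n = 16 this counts 16, 9, 6 and 5 nodes for 1, 2, 3 and 5 levels of pointers, that is n, n/2 + 1, n/4 + 2 and, under this particular way of counting, log2 n + 1.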

A node that has k forward pointers is called a level-k node. If every 2^i-th node has a pointer 2^i nodes ahead, the levels of the nodes follow a simple pattern: 50% of the nodes are level 1, 25% are level 2, 12.5% are level 3, and so on. What happens if the levels are chosen at random, but in the same proportions (Figure 1e)? A node's i-th forward pointer then no longer jumps exactly 2^(i-1) nodes ahead; it points to the next node of level i or higher, however far away that is. Because no special pattern has to be maintained and the level of a new node is generated at random, insertions and deletions require only local modifications. Some arrangements of levels give very poor performance, but we will see that they are rare. Because the data structure is a linked list with extra pointers that skip over intermediate nodes, it is called a skip list.

Algorithm

This section describes the algorithms for searching, inserting, and deleting. The search operation returns the value associated with the given key, or fails if the key is not present. The insert operation associates the given key with a new value, inserting a new node if the key is not already present. The delete operation removes the given key. Additional operations, such as finding the minimum key or the next key, are easy to support.

Each element is represented by a node. The level of a node is chosen at random when the node is inserted, without regard to the elements already in the list. A level-i node has i forward pointers, indexed 1 through i. The level of a node does not need to be stored in the node. Levels are capped at a suitable constant MaxLevel. The level of a skip list is the maximum level of the nodes currently in it, or 1 if the list is empty. The header of a list holds forward pointers at levels 1 through MaxLevel. The header's forward pointers above the current level of the list point to NIL.

Initialization

By convention, a NIL element is allocated and given a key greater than any legal key. Every level of every skip list is terminated with NIL. A new skip list is initialized with level 1, and all forward pointers of its header point to NIL.

Search

To search for an element, we follow forward pointers that do not overshoot the node containing the given key. When no more progress can be made at the current level, the search drops down to the next level. When no more progress can be made at level 1, the next node is the target node (if it is in the list).

Search(list, searchKey)
    x := list->header
    -- loop invariant: x->key < searchKey
    for i := list->level downto 1 do
        while x->forward[i]->key < searchKey do
            x := x->forward[i]
    -- x->key < searchKey <= x->forward[1]->key
    x := x->forward[1]
    if x->key = searchKey then
        return x->value
    else
        return failure
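The initialization conventions and the Search pseudocode can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names Node, SkipList, NIL, MAX_LEVEL and build_example are illustrative, keys are assumed to be finite numbers so that math.inf can serve as the NIL key, failure is reported as None, and the example list is wired by hand because insertion is covered separately.

```python
import math

MAX_LEVEL = 4  # assumed cap on node levels for this small example


class Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        # forward[0] is unused so that indices 1..level match the pseudocode
        self.forward = [None] * (level + 1)


# NIL sentinel: its key is greater than any legal (finite numeric) key
NIL = Node(math.inf, None, 0)


class SkipList:
    def __init__(self):
        self.level = 1  # an empty list has level 1
        self.header = Node(-math.inf, None, MAX_LEVEL)
        for i in range(1, MAX_LEVEL + 1):
            self.header.forward[i] = NIL  # every level ends with NIL


def search(lst, search_key):
    """Return the value stored under search_key, or None on failure."""
    x = lst.header
    for i in range(lst.level, 0, -1):
        # advance along level i while the next key does not overshoot
        while x.forward[i].key < search_key:
            x = x.forward[i]
    x = x.forward[1]
    return x.value if x.key == search_key else None


def build_example():
    """Hand-wire 3 -> 6 -> 9 -> 12; 6 is a level-3 node and 9 a level-2 node."""
    lst = SkipList()
    n3, n6 = Node(3, "c", 1), Node(6, "f", 3)
    n9, n12 = Node(9, "i", 2), Node(12, "l", 1)
    lst.header.forward[1], n3.forward[1] = n3, n6
    n6.forward[1], n9.forward[1], n12.forward[1] = n9, n12, NIL
    lst.header.forward[2], n6.forward[2], n9.forward[2] = n6, n9, NIL
    lst.header.forward[3], n6.forward[3] = n6, NIL
    lst.level = 3
    return lst
```

With this list, search(build_example(), 9) follows the level-3 pointer from the header to node 6, drops down to level 1, and finds node 9 as the next node. Returning None for failure is a simplification; it cannot distinguish a missing key from a key whose stored value is None.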

Insert/Delete

To insert or delete a node, simply perform a search (Figure 3) and then splice the pointers as appropriate. The pseudocode for insertion looks like this:

Insert(list, searchKey, newValue)
    local update[1..MaxLevel]
    x := list->header
    for i := list->level downto 1 do
        while x->forward[i]->key < searchKey do
            x := x->forward[i]
        update[i] := x
    x := x->forward[1]
    if x->key = searchKey then
        x->value := newValue
    else
        lvl := randomLevel()
        if lvl > list->level then
            for i := list->level + 1 to lvl do
                update[i] := list->header
            list->level := lvl
        x := makeNode(lvl, searchKey, newValue)
        for i := 1 to lvl do
            x->forward[i] := update[i]->forward[i]
            update[i]->forward[i] := x

Figure 3 shows the search process. Note that a vector called update is maintained during the search and is filled in each time the search drops down a level. When the search is complete, update[i] holds the rightmost node at level i that lies to the left of the position being operated on (circled in the figure):

Entry	Node
update[1]	12
update[2]	9
update[3]	6
update[4]	6

If an insertion generates a node whose level is greater than the current level of the list, the level of the list is raised and the corresponding entries of the update vector are initialized to point to the header.
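The insertion procedure, including the update vector and the level adjustment, can be sketched in Python as follows. This is a sketch under stated assumptions rather than a reference implementation: MAX_LEVEL = 16, P = 0.5, the class and function names, the use of math.inf as the NIL key (so keys must be finite numbers), and the items helper are all choices made for illustration.

```python
import math
import random

MAX_LEVEL = 16  # assumed cap on node levels
P = 0.5         # assumed fraction of level-i nodes that also reach level i+1


class Node:
    def __init__(self, key, value, level):
        self.key, self.value = key, value
        self.forward = [None] * (level + 1)  # index 0 unused, 1..level as in the pseudocode


NIL = Node(math.inf, None, 0)  # sentinel with a key greater than any legal key


class SkipList:
    def __init__(self):
        self.level = 1
        self.header = Node(-math.inf, None, MAX_LEVEL)
        self.header.forward[1:] = [NIL] * MAX_LEVEL


def random_level():
    lvl = 1
    while random.random() < P and lvl < MAX_LEVEL:
        lvl += 1
    return lvl


def insert(lst, search_key, new_value):
    update = [None] * (MAX_LEVEL + 1)
    x = lst.header
    for i in range(lst.level, 0, -1):
        while x.forward[i].key < search_key:
            x = x.forward[i]
        update[i] = x  # rightmost node at level i to the left of the position
    x = x.forward[1]
    if x.key == search_key:
        x.value = new_value  # key already present: overwrite the value
        return
    lvl = random_level()
    if lvl > lst.level:
        # new levels start at the header
        for i in range(lst.level + 1, lvl + 1):
            update[i] = lst.header
        lst.level = lvl
    x = Node(search_key, new_value, lvl)
    for i in range(1, lvl + 1):
        # splice the new node in after update[i] at every level it occupies
        x.forward[i] = update[i].forward[i]
        update[i].forward[i] = x


def items(lst):
    """Walk level 1 and return the (key, value) pairs in key order."""
    out, x = [], lst.header.forward[1]
    while x is not NIL:
        out.append((x.key, x.value))
        x = x.forward[1]
    return out
```

Whatever levels the random generator happens to produce, walking level 1 with items always yields the keys in sorted order; the random levels affect only how quickly a search reaches its target.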

Next, look at the pseudo-code for the delete operation:

Delete(list, searchKey)
    local update[1..MaxLevel]
    x := list->header
    for i := list->level downto 1 do
        while x->forward[i]->key < searchKey do
            x := x->forward[i]
        update[i] := x
    x := x->forward[1]
    if x->key = searchKey then
        for i := 1 to list->level do
            if update[i]->forward[i] != x then break
            update[i]->forward[i] := x->forward[i]
        free(x)
        while list->level > 1 and list->header->forward[list->level] = NIL do
            list->level := list->level - 1
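The Delete pseudocode can be sketched in Python as follows. The example list is wired by hand so that the sketch stays self-contained; the names Node, SkipList, NIL, MAX_LEVEL, build_example and level1_keys, the boolean return value, and the use of math.inf as the NIL key (so keys must be finite numbers) are assumptions for illustration. Python's garbage collector takes the place of free(x).

```python
import math

MAX_LEVEL = 4  # assumed cap on node levels for this small example


class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)  # index 0 unused


NIL = Node(math.inf, 0)  # sentinel with a key greater than any legal key


class SkipList:
    def __init__(self):
        self.level = 1
        self.header = Node(-math.inf, MAX_LEVEL)
        self.header.forward[1:] = [NIL] * MAX_LEVEL


def delete(lst, search_key):
    """Unlink the node holding search_key; return True if it was present."""
    update = [None] * (MAX_LEVEL + 1)
    x = lst.header
    for i in range(lst.level, 0, -1):
        while x.forward[i].key < search_key:
            x = x.forward[i]
        update[i] = x
    x = x.forward[1]
    if x.key != search_key:
        return False
    for i in range(1, lst.level + 1):
        if update[i].forward[i] is not x:
            break  # x does not reach level i or above
        update[i].forward[i] = x.forward[i]
    # lower the list level while its top level is empty
    while lst.level > 1 and lst.header.forward[lst.level] is NIL:
        lst.level -= 1
    return True


def build_example():
    """Hand-wire 3 -> 6 -> 9 -> 12; 6 is a level-3 node and 9 a level-2 node."""
    lst = SkipList()
    n3, n6, n9, n12 = Node(3, 1), Node(6, 3), Node(9, 2), Node(12, 1)
    lst.header.forward[1], n3.forward[1] = n3, n6
    n6.forward[1], n9.forward[1], n12.forward[1] = n9, n12, NIL
    lst.header.forward[2], n6.forward[2], n9.forward[2] = n6, n9, NIL
    lst.header.forward[3], n6.forward[3] = n6, NIL
    lst.level = 3
    return lst


def level1_keys(lst):
    keys, x = [], lst.header.forward[1]
    while x is not NIL:
        keys.append(x.key)
        x = x.forward[1]
    return keys
```

In this example, deleting key 6 removes the only level-3 node, so the level of the list drops from 3 to 2; deleting key 9 afterwards drops it to 1.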

After each deletion, we check whether the deleted node was the one with the maximum level in the list. If it was, the level of the list is decreased accordingly.

Random function

Next, we need a function that generates random levels. So far, the distribution has been such that half of the nodes with a level-i pointer also have a level-(i+1) pointer. To generalize, let p be the fraction of the nodes with level-i pointers that also have level-(i+1) pointers (p = 1/2 in the discussion above). The following function generates levels with this distribution; note that the level is generated without reference to the elements or the size of the skip list:

randomLevel()
    lvl := 1
    while random() < p and lvl < MaxLevel do
        lvl := lvl + 1
    return lvl
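A Python sketch of this function, together with a quick empirical check of the resulting distribution, might look like this. The values P = 0.5 and MAX_LEVEL = 16, the optional rng parameter, and the level_fractions helper are assumptions made for illustration.

```python
import random
from collections import Counter

MAX_LEVEL = 16  # assumed cap on node levels
P = 0.5         # assumed fraction p


def random_level(rng=random):
    """Return level k with probability (1 - P) * P**(k - 1), capped at MAX_LEVEL."""
    lvl = 1
    while rng.random() < P and lvl < MAX_LEVEL:
        lvl += 1
    return lvl


def level_fractions(samples=100_000, seed=1):
    """Draw many levels and report the fraction of draws that landed on each level."""
    rng = random.Random(seed)
    counts = Counter(random_level(rng) for _ in range(samples))
    return {lvl: counts[lvl] / samples for lvl in sorted(counts)}


if __name__ == "__main__":
    for lvl, frac in level_fractions().items():
        print(lvl, round(frac, 4))
```

With p = 1/2, roughly half of the generated nodes are level 1, a quarter are level 2, an eighth are level 3, and so on, which matches the proportions described in the Structure section.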

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
