Indexing technology in the in-memory database

Last Update:2017-08-04 Source: Internet

Author: User

A traditional database management system keeps all of its data on disk, so it is called a disk-resident database (DRDB). A disk-resident database has to access the disk frequently to manipulate data, and disk reads and writes are far slower than CPU processing, so its bottleneck is disk I/O.

This motivated the in-memory database. An in-memory database, also called a main memory database (MMDB) [1], is a database management system that keeps all or most of its data in memory and redesigns the algorithms and data structures for query processing, concurrency control, and recovery so that CPU cycles and memory are used more efficiently. Memory reads and writes are several orders of magnitude faster than disk, so keeping data in memory rather than fetching it from disk can greatly improve application performance.

Over the last ten-plus years, memory development has followed Moore's Law [2]: prices have kept falling and capacities have kept growing. Mainstream servers now commonly carry hundreds of gigabytes or even a few terabytes of memory, which makes in-memory databases practical.

Because the design and architecture of an in-memory database differ greatly from those of a traditional disk database, traditional database indexes are not well suited to it. Researchers have done a great deal of work to improve index structures for in-memory databases. The most influential are the early T-tree, the cache-sensitive (cache-conscious) CSS-tree and CSB+-tree, the trie, and hashing. This article studies and analyzes these representative index algorithms as a foundation for further improving index algorithms and index performance in in-memory databases.

2 T-tree

2.1 T-tree

T-tree is an indexing technique optimized for main memory access [3]. It is a balanced binary tree whose nodes each hold multiple index entries, and its index entries are much leaner than those of a B-tree in both size and algorithm. The search algorithm works the same way whether the value being sought is in the current node or elsewhere in memory: each time a new index node is visited, the range still to be searched is cut in half.

Figure 2-1 Node of a T-tree

The T-tree index supports range queries on keys. A T-tree is a special balanced binary tree (AVL tree) in which every node stores a group of keys sorted by key value. Besides high node space occupancy, the T-tree search algorithm also has an advantage in complexity and execution time when traversing the tree. The T-tree has become one of the most important indexes in in-memory databases.

T-tree has the following characteristics:

1) the heights of the left and right subtrees of any node differ by no more than 1;

2) a node can hold multiple key values; its leftmost and rightmost key values are the node's minimum and maximum, its left subtree contains only records whose key values are less than or equal to that minimum, and its right subtree contains only records whose key values are greater than or equal to that maximum;

3) a node with both a left and a right subtree is called an internal node, a node with only one subtree is called a half-leaf node, and a node with no subtree is called a leaf node;

4) to maintain space utilization, each internal node must contain at least a minimum number of key values.

In short, a T-tree is a balanced binary tree with multiple keys in each node; the keys within a node are kept in order, the keys in the left subtree are smaller than those of the node, and the keys in the right subtree are larger.

The T-tree node structure above contains the following information:

(1) balance (the balance factor), whose absolute value is at most 1; balance = height of right subtree - height of left subtree;

(2) left_child_ptr and right_child_ptr are the pointers to the left and right subtrees of the current node;

(3) max_item is the maximum number of key values a node can hold;

(4) key[0] to key[max_item-1] are the keys stored in the node;

(5) nitem is the number of keys the current node actually stores.

In terms of this structure, a T-tree has the following properties:

(1) As in an AVL tree, the heights of the left and right subtrees of any node in a T-tree differ by at most 1;

(2) Unlike an AVL tree, a T-tree node can store multiple key values, and these key values are kept in sorted order;

(3) The key values in the left subtree of a T-tree node are not greater than the leftmost key value in the node, and the key values in the right subtree are not less than the rightmost key value in the node;

(4) To keep space occupancy high, each internal node must contain at least a specified number of key values, usually (max_item-2), where max_item is the maximum number of key values in a node.

2.2 Operation of the T-tree index

Using a T-tree as an index involves three main operations: find, insert, and delete. Insert and delete are both built on find. The following sections describe the three operations in turn.

2.2.1 Find

A T-tree lookup is similar to a binary tree lookup, except that the comparison at each node is not made against every element in the node. Instead, the search first checks whether the target key value falls within the range bounded by the leftmost and rightmost key values of the current node. If it does, a binary search is run over the key values of the current node. If the target key value is less than the leftmost key value, the search continues in the same way in the left child; if it is greater than the rightmost key value, the search continues in the right child.
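The lookup just described can be sketched in a few lines of Python. This is a minimal illustration, not a full T-tree: the names `TTreeNode` and `ttree_find` are hypothetical, nodes are assumed to be non-empty, and balancing is not handled.

```python
from bisect import bisect_left


class TTreeNode:
    """A T-tree node: a sorted list of keys plus left/right child links."""

    def __init__(self, keys, left=None, right=None):
        self.keys = keys      # sorted, non-empty list of key values
        self.left = left      # subtree with keys <= keys[0]
        self.right = right    # subtree with keys >= keys[-1]


def ttree_find(node, target):
    """Return True if target is stored in the T-tree rooted at node."""
    while node is not None:
        if target < node.keys[0]:
            node = node.left            # below this node's range
        elif target > node.keys[-1]:
            node = node.right           # above this node's range
        else:
            # target lies inside [min, max] of this node: binary search here
            i = bisect_left(node.keys, target)
            return node.keys[i] == target
    return False


root = TTreeNode(
    [40, 50, 60],
    left=TTreeNode([10, 20, 30]),
    right=TTreeNode([70, 80, 90]),
)
```

Only one comparison against the minimum and one against the maximum is needed per node visited; the binary search runs once, in the bounding node.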

2.2.2 Insert

Insertion into a T-tree is built on find. The find operation locates where the target key value belongs and remembers the last node visited during the lookup. If the lookup succeeds, check whether that node has enough free space. If it does, insert the target key value into the node. If it does not, insert the target key value into the node anyway, then take the leftmost key value of the node and insert it into the node's left subtree (a recursive insert), and finish. If the lookup fails, allocate a new node and insert the target key value into it; then, depending on how the target key value compares with the minimum and maximum key values of the last node visited, link the new node as that node's left or right child. Finally, check whether the balance factors of the T-tree still satisfy the balance condition, and perform a rotation if they do not.

2.2.3 Delete

Deletion from a T-tree is also built on lookup: the find operation is used to locate the target key value. If the lookup fails, the operation ends. Otherwise, let n be the node that holds the target key value and remove the target key value from n. If node n becomes empty, delete n and check the balance factors of the tree to decide whether a rotation is required. If the number of key values in n falls below the minimum, then, depending on the balance factor of n, either the largest key value is moved up from the left subtree of n or the smallest key value is moved up from its right subtree.

2.3 Key technologies for T-tree indexing

Implementing a T-tree index means implementing T-tree lookup, insertion, and deletion. Insertion and deletion are themselves built on lookup, and the key to maintaining a T-tree is rotation. A rotation is performed when inserting or deleting a key value leaves the tree unbalanced, in order to bring it back into balance.

After an insert, all nodes on the path from the new node to the root must be checked in order, and the check stops when either of two cases occurs: the two subtrees of a checked node have equal height, in which case no rotation is needed; or the heights of the two subtrees of a checked node differ by more than 1, in which case a single rotation at that node is enough.

After a delete, all nodes on the path from the parent of the deleted node to the root must likewise be checked, and a rotation is performed whenever the height difference between the left and right subtrees of a node is found to be out of bounds. Unlike insertion, the check cannot stop after a rotation; it must continue up to the root.

As you can see, an insert needs at most one rotation to restore balance, whereas a delete can cause a ripple effect that propagates upward and rotates higher-level nodes, so it may need several rotations.

Rotation is the most critical and most difficult operation in keeping a T-tree balanced. There are four cases: a rotation caused by an insertion into (or deletion from) the left subtree of the left child is called an LL rotation, and the LR, RR, and RL rotations are named in the same way. The cases for insertion and deletion are similar.
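As a rough illustration of the simplest case, the following Python sketch performs an LL rotation on nodes that each hold several keys. The names `Node`, `height`, `balance`, and `rotate_ll` are hypothetical; the sketch recomputes heights on demand instead of storing a balance field, and it does not cover the LR and RL cases, which in a T-tree may also need to move keys between nodes.

```python
class Node:
    def __init__(self, keys, left=None, right=None):
        self.keys = keys
        self.left = left
        self.right = right


def height(node):
    """Height of a subtree; an empty subtree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance(node):
    """Balance factor as defined in the text: right height - left height."""
    return height(node.right) - height(node.left)


def rotate_ll(a):
    """LL case: the left subtree of a's left child is too tall.

    The left child b becomes the new subtree root, and a becomes
    b's right child. Returns the new root of the subtree.
    """
    b = a.left
    a.left = b.right
    b.right = a
    return b


# A left-leaning chain: a -> b -> c, so balance(a) == -2
c = Node([10, 20])
b = Node([30, 40], left=c)
a = Node([50, 60], left=b)
new_root = rotate_ll(a)
```

After the rotation the subtree rooted at `new_root` has balance factor 0 and the in-order sequence of keys is unchanged.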

3 CSS/CSB+ Tree

3.1 CSS-trees

3.1.1 Introduction

CSS-trees (cache-sensitive search trees) [4] provide faster lookups than binary search without needing much extra space. The technique stores a directory structure on top of a sorted array, with the node size of the directory matched to the cache-line size of the machine. The directory is itself stored in an array, so, unlike B+-trees, it does not need to store pointers to internal nodes; child nodes are located by array offsets instead.

3.2 Full CSS-tree

Construct a search tree in which every node contains m key values. If the depth of the tree is d, the tree is a complete (m+1)-ary search tree up to depth d-1, and the leaf nodes at depth d are filled from left to right. Figure 3-1 shows an example tree with m=4, where each numbered block is a node and each node has four key values.

The nodes of a CSS-tree can be stored in an array, as shown in Figure 3-2:

3.2.1 Constructing a full CSS-tree

To construct the full CSS-tree for a sorted array, first divide the array logically into two parts and establish the mapping between leaf nodes and array elements. Then, starting from the last internal node, fill each entry with the largest key value in its immediate left subtree. Some internal nodes, namely the ancestors of the last leaf node at the deepest level, may not have a full set of key values; their remaining entries are filled with the last element of the first half of the array, so some internal nodes contain duplicate key values. Although a full CSS-tree is hard to update incrementally, it is cheap to build from scratch. Experiments show that for an array of 25 million keys, building the corresponding full CSS-tree takes less than a second.

3.2.2 Querying a full CSS-tree

A query starts at the root node and visits one internal node per level, using a binary search within the node to decide which branch to follow. This is repeated until a leaf node is reached, and the leaf node is finally mapped back to the sorted array.

All searches within a node are hard-coded as if-else statements. During the binary search of an internal node, the search keeps checking whether the key values to the left are greater than or equal to the search key; when it finds the first key value that is smaller than the search key, it stops comparing and follows the branch to the right of it (if no such value is found, it follows the leftmost branch). This guarantees that when a node contains duplicate values, the leftmost of the duplicated key values is found.
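The pointer-free idea can be sketched in Python as follows. This is a simplified illustration rather than the exact full CSS-tree layout: each level is kept in its own list instead of one shared array, incomplete nodes are left short instead of being padded with duplicates, and the names `build_directory` and `css_search` are hypothetical. What it does preserve is that a child is located purely by arithmetic on indexes, with no stored pointers.

```python
from bisect import bisect_left


def build_directory(sorted_keys, m):
    """Build directory levels over a sorted array, root level first.

    The last level holds the leaves (chunks of m keys). Every directory
    node above stores the largest key of each child except its last one.
    """
    leaves = [sorted_keys[i:i + m] for i in range(0, len(sorted_keys), m)]
    levels = [leaves]
    maxes = [leaf[-1] for leaf in leaves]   # largest key under each node
    while len(levels[0]) > 1:
        nodes, parent_maxes = [], []
        for i in range(0, len(maxes), m + 1):
            group = maxes[i:i + m + 1]      # up to m+1 children per node
            nodes.append(group[:-1])
            parent_maxes.append(group[-1])
        levels.insert(0, nodes)
        maxes = parent_maxes
    return levels


def css_search(levels, m, key):
    """Return the position of key in the sorted array, or -1 if absent."""
    if not levels[-1]:
        return -1
    i = 0                                   # node index within its level
    for level in levels[:-1]:
        branch = bisect_left(level[i], key)
        i = i * (m + 1) + branch            # child located by offset only
    leaf = levels[-1][i]
    k = bisect_left(leaf, key)
    if k < len(leaf) and leaf[k] == key:
        return i * m + k
    return -1


keys = list(range(0, 200, 2))               # 100 sorted even numbers
levels = build_directory(keys, 4)
```

With m = 4 this yields three levels of 1, 5, and 25 nodes, and `css_search(levels, 4, 42)` walks root, node 1, leaf 5 before matching inside the leaf.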

3.3 Level CSS-tree

A full CSS-tree with m entries per node has exactly m key values per node, so every entry is used. For $m = 2^t$, we define a tree that uses only m-1 entries per node and has a branching factor of m. A level CSS-tree is deeper than the corresponding full CSS-tree, because its branching factor is m rather than m+1, but it needs fewer comparisons per node. If N is the number of nodes that the elements of the sorted array can form, a level CSS-tree has $\log_m N$ levels, whereas a full CSS-tree has $\log_{m+1} N$ levels. The number of comparisons per node is $t$ for a level CSS-tree and $t \cdot (1 + 2/(m+1))$ for a full CSS-tree. The total number of comparisons is therefore $\log_m N \cdot t = \log_2 N$ for a level CSS-tree and $\log_{m+1} N \cdot t \cdot (1 + 2/(m+1)) = \log_2 N \cdot \log_{m+1} m \cdot (1 + 2/(m+1))$ for a full CSS-tree. As a result, a level CSS-tree needs fewer comparisons than a full CSS-tree. On the other hand, a level CSS-tree needs $\log_m N$ cache accesses and traverses $\log_m N$ nodes, whereas a full CSS-tree needs only $\log_{m+1} N$.
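To make the trade-off concrete, the comparison counts can be evaluated numerically. The sketch below simply plugs numbers into the two totals, log_m N * t for a level CSS-tree and log_(m+1) N * t * (1 + 2/(m+1)) for a full CSS-tree; the function names and the sample values N = 2^20 and m = 16 are illustrative assumptions.

```python
import math


def level_css_comparisons(n, m):
    """Total comparisons for a level CSS-tree: log_m(n) * t, with m = 2**t."""
    t = math.log2(m)
    return math.log(n, m) * t


def full_css_comparisons(n, m):
    """Total comparisons for a full CSS-tree: log_(m+1)(n) * t * (1 + 2/(m+1))."""
    t = math.log2(m)
    return math.log(n, m + 1) * t * (1 + 2 / (m + 1))


def levels(n, branching):
    """Number of levels (node visits) for a given branching factor."""
    return math.log(n, branching)


n, m = 2 ** 20, 16
level_cmp = level_css_comparisons(n, m)     # equals log2(n)
full_cmp = full_css_comparisons(n, m)
```

For these values the level CSS-tree needs about 20 comparisons against roughly 22 for the full CSS-tree, while visiting 5 levels instead of slightly fewer than 5.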

A level CSS-tree is built in much the same way as a full CSS-tree. The spare slot in each node can also be used to store the maximum value of the last branch, which avoids traversing the whole subtree to obtain its largest element. Querying a level CSS-tree is likewise similar to querying a full CSS-tree; the only difference is how the offset of the child node is calculated.

3.4 CSB+-tree

3.4.1 Introduction

Although CSS-trees answer queries faster than binary search and T-trees, they are designed for decision-support workloads with relatively static data. The CSB+-tree (cache-sensitive B+-tree) [4] is a variant of the B+-tree that stores all the child nodes of a given node contiguously and keeps only the address of the first child in the node; the addresses of the other children are computed as offsets from the first child. Because only one child pointer is stored, cache utilization is high, and, like a B+-tree, a CSB+-tree supports incremental updates.

There are two variants of the CSB+-tree: the segmented CSB+-tree and the full CSB+-tree. A segmented CSB+-tree divides the child nodes into segments, stores the child nodes of the same segment contiguously, and keeps only the starting address of each segment in the node. When a split occurs, a segmented CSB+-tree reduces copying overhead because only one segment needs to be moved. A full CSB+-tree preallocates space for the entire node group, which reduces the cost of a split.

3.4.2 Operations on the CSB+-tree

1) Bulk load

An efficient way to bulk load a CSB+-tree is to build the index structure level by level. Allocate space for all the leaf nodes, calculate how many nodes are needed at the next level up, and allocate one contiguous block of storage for that level. Fill each higher-level node with the largest key value of each of its nodes in the level below, and set the first-child pointer of each higher-level node. Repeat until the higher level has only one node; that node is the root. Because all nodes on the same level are contiguous, no additional copying is needed to form node groups.

2) Search

Searching a CSB+-tree is similar to searching a B+-tree. Once the rightmost key value K in the node that is smaller than the search key has been found, the offset of K is added to the first-child pointer to obtain the address of the child node. For example, if K is the third key value of a node, the child can be found with the C statement child = first_child + 3, where child and first_child are pointers to nodes.
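The level-by-level bulk load and the first-child offset calculation can be sketched together in Python. This is an illustrative sketch under simplifying assumptions: list indexes stand in for memory addresses, each level is one contiguous list, and the names `CsbNode`, `bulk_load`, and `csb_search` are hypothetical.

```python
from bisect import bisect_left


class CsbNode:
    def __init__(self, keys, first_child=None):
        self.keys = keys
        # Index of the first child in the next level's contiguous array;
        # None for leaf nodes. This stands in for the single child pointer.
        self.first_child = first_child


def bulk_load(sorted_keys, m):
    """Build the tree level by level; returns the levels, root level first."""
    level = [CsbNode(sorted_keys[i:i + m]) for i in range(0, len(sorted_keys), m)]
    maxes = [node.keys[-1] for node in level]
    levels = [level]
    while len(level) > 1:
        parents, parent_maxes = [], []
        for i in range(0, len(level), m + 1):
            group = maxes[i:i + m + 1]
            # Keys are the largest keys of all children except the last one.
            parents.append(CsbNode(group[:-1], first_child=i))
            parent_maxes.append(group[-1])
        level, maxes = parents, parent_maxes
        levels.insert(0, level)
    return levels


def csb_search(levels, key):
    """Return True if key is stored in the tree."""
    if not levels[0]:
        return False
    node = levels[0][0]
    for depth in range(1, len(levels)):
        offset = bisect_left(node.keys, key)   # number of keys smaller than key
        node = levels[depth][node.first_child + offset]   # child = first_child + offset
    return key in node.keys


tree = bulk_load(list(range(1, 200, 2)), 3)    # odd numbers 1..199
```

Because the children of a node are adjacent in the level below, the only per-node bookkeeping is `first_child`; every other child position follows from the offset found by the in-node search.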

3) Insertion

Insertion into a CSB+-tree is similar to insertion into a B+-tree. First search for the position of the key value to insert. Once the corresponding leaf node has been located, check whether it has enough space. If it does, simply place the key value in the leaf node; otherwise, the leaf node has to be split.

When a leaf node is split, there are two cases, depending on whether the parent node has enough space to hold the new key value. Suppose the parent node p has enough space. Let f be the first-child pointer of p and g the node group that f points to. A new node group g' is created with one more node than g, and all the nodes in g are copied to g', with the node being split in g becoming two nodes in g'. The first-child pointer f in p is then updated to point to g', and g is deallocated.

The problem is more complicated when the parent node has no spare space and has to split itself. Again let f be the first-child pointer of p. A new node group g' is created, the nodes in g are divided evenly between g and g', and half of the key values in p are transferred to a new node p', whose first-child pointer is set to g'. To split p into p and p', the node group containing p has to be copied as in the first case; or, if that node group is also full, the parent node of p has to be split recursively, repeating the same procedure.

4) Deletion

Deletion is similar to insertion. In general, the data entry is simply located and removed; the tree does not need to be adjusted to guarantee 50% occupancy [5].

3.4.3 Segmented CSB+-tree

Consider a 128-byte cache line. Each CSB+-tree node can then hold at most 30 key values, which means each node can have up to 31 child nodes, and a node group can be as large as 31*128 bytes, nearly 4 KB. Every split therefore has to copy about 4 KB of data to create a new node group, and with a larger cache line the cost of splitting a node would be even greater.

Modifying the node structure can reduce the copying done at split time. The child nodes are divided into segments, and the address of each segment is stored in the node. Each segment forms a node group, and only child nodes in the same segment are stored contiguously.

The first approach fixes the size of each segment. Nodes are placed in the first segment until it is full, and after that in the second segment. If a new node falls in the second segment, only the nodes of the second segment are copied to a new segment and the first segment is not touched. If the new node falls in the first segment (which is already full), data has to be moved from the first segment to the second. In the example above, assuming random insertions, the data copied per split is reduced to 1/2 * (1/2 + 3/4) * 4 KB = 2.5 KB.

The second approach allows the segments to have different sizes, with the nodes distributed evenly between two segments. When a node splits, a new segment is created for the segment the node belongs to, and the size of that segment is updated. Strictly speaking, each insertion then touches only one segment (except when the parent node also has to split, in which case both segments are copied). If a new node is equally likely to fall into either segment, a split copies 1/2 * 4 KB = 2 KB of data, which reduces the amount of copying further. Figure 3-3 shows a segmented CSB+-tree with two segments (only two key values per leaf node):
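The figures in this section can be checked with a few lines of arithmetic. The sketch below assumes the fixed-size case is to be read as 1/2 * (1/2 + 3/4) * 4 KB, which is the grouping that produces the stated 2.5 KB; the function and constant names are hypothetical.

```python
CACHE_LINE_BYTES = 128   # cache-line size used in the example
MAX_CHILDREN = 31        # 30 keys per node, so up to 31 children


def node_group_bytes():
    """Maximum size of one unsegmented node group."""
    return MAX_CHILDREN * CACHE_LINE_BYTES


def expected_copy_fixed_segments(group_kb):
    """Average data copied per split with two fixed-size segments.

    Assumed reading of the text: 1/2 * (1/2 + 3/4) * group size.
    """
    return 0.5 * (0.5 + 0.75) * group_kb


def expected_copy_variable_segments(group_kb):
    """Average data copied per split when one of two segments is copied."""
    return 0.5 * group_kb


group = node_group_bytes()                        # 3968 bytes, nearly 4 KB
fixed_kb = expected_copy_fixed_segments(4)        # 2.5
variable_kb = expected_copy_variable_segments(4)  # 2.0
```

Compared with copying the whole 4 KB node group, the two segmented schemes copy 2.5 KB and 2 KB on average.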

A segmented CSB+-tree supports all tree operations in much the same way as a non-segmented CSB+-tree. However, finding the right child of a node is more expensive than in a non-segmented CSB+-tree, because the segment that contains the child has to be found first.

3.4.4 Full CSB+-tree

In a full CSB+-tree, splitting a node is cheaper than in a CSB+-tree. In a CSB+-tree, a split has to copy the entire node group to a new group, whereas in a full CSB+-tree only half of the node group is accessed on average. The source and destination addresses of this shift operation overlap heavily, so the number of cache lines accessed is bounded by s. On average a split in a full CSB+-tree accesses 0.5s cache lines, compared with 2s for a CSB+-tree.

3.5 Time and space analysis

Assume that a key value, a child node pointer, and a tuple ID all have the same size K. Let N be the number of leaf nodes, C the number of bytes in a cache line, and T the number of segments in a segmented CSB+-tree. The number of slots per node is m, where m = C/K, assuming that the node size equals the cache-line size. Figure 3-4 lists each parameter and its corresponding value:

Figure 3-5 compares the methods in terms of branching factor, number of key comparisons, number of cache misses, and number of other comparisons per node. The branching factor of a B+-tree is smaller than that of a CSS-tree, while a CSB+-tree stores fewer child pointers, so its branching factor is close to that of a CSS-tree. This leads to a different number of cache misses for each method: the larger the branching factor of a node, the fewer the cache misses. Each additional segment in a CSB+-tree reduces the branching factor by 2, because one slot is needed to store the child node pointer and another to store the size of the new segment. In general, the nodes of a B+-tree are about 70% full, and the branching factor has to be adjusted accordingly [6].

Figure 3-5 CSB+-tree query time analysis

Figure 3-6 shows the number of cache lines expected to be accessed during a split. Because the source and destination addresses overlap during copying, the number required by a full CSB+-tree is small. Split overhead is one part of the total cost of an insert; the other part is the search overhead of locating the right leaf node. Split overhead is relatively independent of the depth of the tree, because most splits occur at the leaf level. As the tree grows, however, the cost of the corresponding search increases. The split overhead of a CSB+-tree is larger than that of a B+-tree, but the total cost of insertion also depends on the size of the tree.

Figure 3-7 shows the space requirements of the different algorithms. Assume that all nodes are 70% full [6], and calculate the space taken by internal nodes and by leaf nodes separately, assuming that each leaf node has 2 sibling pointers. The space taken by internal nodes equals the space taken by leaf nodes multiplied by 1/(q-1), where q is the branching factor. The CSS-tree is not included in this comparison, because a CSS-tree cannot be partially full.

4 Trie-tree Index

4.1 Trie-tree

The trie [7], also known as a word search tree or key tree, is a tree structure and a variant of the hash tree. It is typically used to count and sort large numbers of strings (though it is not limited to strings), so search engine systems often use it for word frequency statistics. Its advantage is that it minimizes unnecessary string comparisons, which can make queries more efficient than with a hash table.

4.1.1 Trie-tree properties

It has three basic properties:

1) The root node does not contain a character, and every node other than the root contains exactly one character.

2) The characters on the path from the root node to a given node, concatenated in order, form the string that corresponds to that node.

3) All child nodes of a node contain different characters.

Figure 4-1 shows a basic trie structure.

Figure 4-1 Trie-tree

4.1.2 Basic implementation of the trie

Inserting into, deleting from, and searching a letter tree are all very simple and need only a single loop: the i-th iteration finds the subtree corresponding to the first i letters and then performs the corresponding operation. The tree can be implemented with an ordinary array (static memory allocation) or with dynamically allocated pointers. There are generally three ways for a node to point to its children:

1) each node holds an array the size of the alphabet; the index corresponds to the letter of the child, and the content is the position of that child in one large array, that is, its label;

2) each node has a linked list that records its children in some order;

3) the tree is recorded in left-child right-sibling representation.

Each of the three methods has its own characteristics. The first is easy to implement but needs a lot of space; the second is fairly easy to implement and needs relatively little space, but is more time-consuming; the third needs the least space but is relatively time-consuming and harder to write.

4.1.2.1 Implementation Method

The method for looking up an entry in the dictionary [8] is:

1) Start the search at the root node.

2) Take the first letter of the keyword being looked up, select the corresponding subtree according to that letter, and continue the search in that subtree.

3) In that subtree, take the second letter of the keyword and select the corresponding subtree to continue the search.

4) Repeat the process.

5) When all the letters of the keyword have been consumed at some node, read the information attached to that node to complete the lookup.

Other operations are handled similarly.
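The lookup steps above, together with insertion, can be sketched using the first child representation (one alphabet-sized array per node). This is a minimal sketch that assumes keywords consist only of lowercase letters a to z; the class and method names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26   # one slot per lowercase letter
        self.is_word = False          # information attached to the node


class Trie:
    def __init__(self):
        self.root = TrieNode()        # the root holds no character

    def insert(self, word):
        node = self.root
        for ch in word:
            i = ord(ch) - ord("a")
            if node.children[i] is None:
                node.children[i] = TrieNode()
            node = node.children[i]
        node.is_word = True

    def search(self, word):
        node = self.root              # step 1: start at the root
        for ch in word:               # steps 2-4: one subtree per letter
            node = node.children[ord(ch) - ord("a")]
            if node is None:
                return False
        return node.is_word           # step 5: read the attached information


trie = Trie()
for w in ["bad", "bade", "be", "bed"]:
    trie.insert(w)
```

Both operations use a single loop over the letters of the keyword, so their cost depends on the keyword length and not on how many words are stored.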

4.1.2.2 Trie principle

The core idea of the trie is to trade space for time. It uses the common prefixes of strings to reduce query time and so improve efficiency.

4.1.3 Advanced implementation of the trie

Double-array implementation

A trie can be implemented with a double array (double-array trie), which greatly reduces the amount of memory used. The implementation is as follows.

Two arrays are used, base[] and check[]. Let i be an array index. If base[i] and check[i] are both 0, the position is empty. If base[i] is negative, the state is a terminal state (that is, it completes a word). check[i] records the previous state of state i.

Definition 1. For an input character c that causes a transition from state s to state t, the double-array trie satisfies the following conditions (Figure 4-2):

check[base[s] + c] = s
base[s] + c = t

From Definition 1, we obtain the lookup algorithm for a given state s and input character c:

t := base[s] + c;
if check[t] = s then
    next state := t
else
    fail
endif
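The transition rule of Definition 1 can be tried out with a small Python sketch. It departs from the conventions described in the text in ways that should be treated as assumptions: free slots are marked with check = -1 instead of 0, word ends are represented by an explicit end marker with code 1 instead of a negative base value, and the arrays are built statically with a brute-force scan for each base value. The names `build_double_array`, `step`, and `contains` are hypothetical.

```python
from collections import deque

END = 1  # assumed code for the end-of-word marker


def code(ch):
    """Character code: 'a' -> 2, ..., 'z' -> 27 (lowercase words only)."""
    return ord(ch) - ord("a") + 2


def build_double_array(words):
    # First build an ordinary trie of nested dicts keyed by character code.
    trie = {}
    for w in words:
        node = trie
        for ch in w:
            node = node.setdefault(code(ch), {})
        node[END] = {}
    base, check = [0], [0]            # state 0 is the root

    def ensure(n):
        while len(base) <= n:
            base.append(0)
            check.append(-1)          # -1 marks a free slot

    queue = deque([(0, trie)])
    while queue:
        s, node = queue.popleft()
        if not node:
            continue
        codes = sorted(node)
        b = 0
        while True:                   # brute-force search for a base value
            ensure(b + codes[-1])
            if all(check[b + c] == -1 for c in codes):
                break
            b += 1
        base[s] = b
        for c in codes:
            check[b + c] = s          # check[base[s] + c] = s
            queue.append((b + c, node[c]))
    return base, check


def step(base, check, s, c):
    """One transition: t = base[s] + c is valid only if check[t] == s."""
    t = base[s] + c
    if t < len(check) and check[t] == s:
        return t
    return None


def contains(base, check, word):
    s = 0
    for c in [code(ch) for ch in word] + [END]:
        s = step(base, check, s, c)
        if s is None:
            return False
    return True


base, check = build_double_array(["bad", "bade", "be", "bed"])
```

Each lookup step costs one addition and one array comparison, which is the property that makes the double array attractive despite the more involved construction.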

A double array is built either by allocating space for a new state whenever a state gains a new transition, or by allocating space only for the states that actually need transitions. When the conditions above cannot be met, the base value is adjusted until they are. This adjustment only affects the placement of the nodes one level below the current node, because the addresses of all those nodes are determined by the starting index given in the base array. For insertion, suppose the base value for words beginning with some character is i, and the character codes of the second characters that follow it are c1, c2, c3, ..., cn; then base[i+c1], check[i+c1], base[i+c2], check[i+c2], base[i+c3], check[i+c3], ..., base[i+cn], check[i+cn] must all be 0.

Figure 4-3 Double Array Implementation

Suppose the trie has n nodes and the character set has size m. The space used by the double-array trie is then n + cm, where c is a factor that depends on how sparse the trie is. A multiway search tree takes space nm.
Note that this complexity is for the offline algorithm, in which the entire word list has already been processed. The space complexity of the online algorithm also depends on the order in which words arrive: the more ordered the words, the less space is used.
The complexity of the lookup algorithm depends on the length of the string being looked up and is the same as for a multiway search tree.
For the insertion algorithm, if relocation occurs, the time to scan the child nodes and the cost of finding a new base value must be added. If both are done by brute force (scanning with a for loop), the time complexity of insertion is $O(nm + cm^2)$.

In actual coding, a double-array trie is harder to write than a tree, mainly because states are not expressed as clearly as in a tree structure and indexes are easily confused.
One point to note is that a positive base value represents a starting offset, while a negative value marks a terminal state, so when searching for a new base value, make sure the value found is positive.
For example, in an empty trie, when d is inserted, the first free address is 1, which gives base = 1 - 4 = -3 and destroys the meaning of the sign of base.

4.1.4 Applications of the trie

The trie is a simple and efficient data structure with many applications.

(1) String retrieval

Store the information about a set of known strings (a dictionary) in a trie in advance, then check whether other, unknown strings occur in it or how often they occur. For example:

1) Given a list of n known words and an article written entirely in lowercase English, output all the new words in order of first appearance.

2) Given a dictionary of bad words, all in lowercase letters, and a text in which every line also consists of lowercase letters, determine whether the text contains any bad words. For example, if rob is a bad word, then the text problem contains a bad word.

(2) Longest common prefix of the string

The trie uses the common prefixes of strings to save storage space; conversely, once a large number of strings are stored in a trie, the common prefix of any of them can be obtained quickly.

For example: given N strings of lowercase English letters and Q queries, each query asks for the length of the longest common prefix of two of the strings.

Solution: first build the letter tree for all the strings. The length of the longest common prefix of two strings is then the number of common ancestors of their nodes, so the problem becomes an offline lowest common ancestor (LCA) problem.

The lowest common ancestor problem is itself a classic problem and can be solved in the following ways:

1) with a disjoint-set structure, using the classic Tarjan algorithm;

2) after computing the Euler sequence of the letter tree, by reducing it to the classic range minimum query (RMQ) problem.

(3) Sort

A trie is a multiway tree; traversing the whole tree in preorder and outputting the corresponding strings produces them in dictionary order. For example: given n distinct English names, each consisting of a single word, sort them in dictionary order from smallest to largest.
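A short Python sketch of this use: the words are inserted into a trie of nested dictionaries and read back with a preorder traversal that visits children in alphabetical order. The function name `trie_sort` and the use of an empty-string key as the end-of-word marker are assumptions made for illustration.

```python
def trie_sort(words):
    """Sort distinct words by inserting them into a trie and walking it in preorder."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node[""] = {}   # end-of-word marker; "" sorts before every letter

    result = []

    def walk(node, prefix):
        for ch in sorted(node):           # visit children in alphabetical order
            if ch == "":
                result.append(prefix)     # a stored word ends at this node
            else:
                walk(node[ch], prefix + ch)

    walk(root, "")
    return result


names = ["bob", "alice", "al", "carol"]
ordered = trie_sort(names)
```

Because the end marker is visited before any letter, a word always appears before the longer words that extend it, which matches dictionary order.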

(4) As an auxiliary structure for other data structures and algorithms, such as the suffix tree and the Aho-Corasick (AC) automaton.

4.1.5 Complexity analysis of the trie

(1) The time complexity of insertion and lookup is O(n), where n is the length of the string.

(2) The space complexity is on the order of 26^n, which is very large (it can be improved with a double array).

4.1.6 Summary

The trie is a very important data structure. It is widely used in information retrieval, string matching, and other fields, and it is also the foundation of many algorithms and more complex data structures, such as the suffix tree and the Aho-Corasick (AC) automaton.

4.2 Trie memory

Trie memory [9] is a way of storing and retrieving information in memory. Its advantages are fast access and the ability to exploit redundancy in the stored information; its main disadvantage is very low utilization of storage space.

4.2.1 the basic trie memory model

Suppose we need to keep track of a set of words, where each word is a sequence of letters. The words have various lengths, and we must remember which finite sequences of letters belong to the set. In short, we need to decide whether a given sequence is a member of the set.
Initially the trie is just a collection of registers, two of which are special: α and δ. Each register has one cell for every letter of the alphabet; if we also want to store "space", each register must have 27 cells.
Each cell has room to store the memory address of another register. A cell in the trie that is not yet used to store information usually contains the address of register α. If a cell contains the address of a register other than α, it holds stored information: the name of the cell ("A" for the A cell, "B" for the B cell) together with the address of the next register in the sequence.
An example (Figure 4-4) illustrates this. To keep the example simple, we use only the first 5 letters of the alphabet, plus a symbol that stands for "space". Suppose we want to store DAB, BAD, BADE, BE, BED, BEAD, CAB, CAD and A; the figure illustrates the whole process. Each row in the figure represents a register, and each register has 6 cells. The last row represents a third special register called the portal register, which is the entrance through which we enter the memory. Apart from being the entrance, it is the same as the other registers. The other registers are numbered, and register α is chosen from among them; initially, register α is register 2.

Figure 4-4 Basic trie memory model

To store DAB, we write the address "2" into the D cell of the portal register, then move to register 2 and write the address "3" into its A cell. We then move to register 3 and write the address "4" into its B cell. Finally we move to register 4 and write the address "1" into its space cell, which marks the end of the word; DAB is now stored. Next we store the second word, BAD: we write the address "5" into the B cell of the portal register (the cell that represents the letter B), then the address "6" into the A cell of register 5, the address "7" into the D cell of register 6, and finally the address "1" into the space cell of register 7. When we come to store BADE, we find that B, A, D are already in the trie, so we follow the existing BAD path to register 7, write the address "8" into its E cell, and then write the address "1" into the space cell of register 8.

4.2.2 Types of register

In the structure just described, the registers can be divided into 5 categories:

1) The α register, which points to the next location where information will be stored.

2) The δ (deletion) register.

3) The ν (next) register, in which the next piece of information will be stored (in an empty memory, it is the portal register).

4) The χ (exterior) type: a χ register is one that has not yet received stored information and is not the next storage location.

5) The ο (occupied) type: an ο register is one that holds stored information.

4.2.3 Trie the Read and write

All of the registers above except the χ registers are in the trie. The storage and read operations can now be defined simply and precisely as follows.

4.2.3.1 Write Operations

1) Take the i-th character of the argument to the next register; for the first character, this is the portal register.

2) Select the cell that corresponds to the character: if the i-th argument character is the j-th character of the alphabet, select cell j.

3) Examine the link leaving cell j.

4) If this link leads to register α:

a) Project the link through register α to the head of the chain, so that the information is stored.

b) Project a link from register α to the new head of the chain, which creates a new "next" register (ν).

c) Finally, point all the links leaving the ν register at register α.

5) If the link from cell j points to a register other than α, move to that register:

a) If it is register 1, the argument is a member of the stored set (end of process).

b) If it is not register 1, add 1 to i and go to step 2.

4.2.3.2 Read Operation

Use the same process, but do not project any links. If a link points to register 1, the argument is a member of the stored set; if at any point a link points to register α, the argument is not a member of the stored set.
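The register model and its write and read operations can be sketched as follows. This is a simplified illustration under stated assumptions, not a reconstruction of the method in [9]: an unused cell holds `None` as a stand-in for "points to register α", register α is modelled as a counter of the next free register number, and the names `TrieMemory`, `PORTAL` and `TERMINAL` are hypothetical. The alphabet is the five letters of the example plus "space".

```python
ALPHABET = "ABCDE "        # five letters plus "space", which ends a word
PORTAL, TERMINAL = 0, 1    # assumed ids: the portal register and register 1
EMPTY = None               # stand-in for "this cell points to register alpha"


class TrieMemory:
    def __init__(self):
        # Each register is a row of cells, one cell per alphabet symbol.
        self.registers = {PORTAL: [EMPTY] * len(ALPHABET)}
        self.next_free = 2  # the example starts register alpha at register 2

    def _new_register(self):
        address = self.next_free
        self.registers[address] = [EMPTY] * len(ALPHABET)
        self.next_free += 1
        return address

    def write(self, word):
        current = PORTAL
        for ch in word:
            cell = ALPHABET.index(ch)
            if self.registers[current][cell] is EMPTY:
                # Unused cell: store the address of a fresh register.
                self.registers[current][cell] = self._new_register()
            current = self.registers[current][cell]
        # The space cell of the last register points at register 1.
        self.registers[current][ALPHABET.index(" ")] = TERMINAL

    def read(self, word):
        current = PORTAL
        for ch in word + " ":
            link = self.registers[current][ALPHABET.index(ch)]
            if link is EMPTY:
                return False   # the link leads to alpha: not a member
            if link == TERMINAL:
                return True    # the link leads to register 1: a member
            current = link
        return False


memory = TrieMemory()
for w in ["DAB", "BAD", "BADE"]:
    memory.write(w)
print(memory.read("BAD"), memory.read("BA"))
```

Storing DAB, BAD and BADE in that order allocates registers 2 through 8 in the same sequence as the walkthrough in 4.2.1.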

5 Hash index

Hashing maps a key directly to a storage address in order to achieve fast addressing, that is, addr = H(key), where key is the key and H is the hash function. Several hash functions are commonly used:

1) Division (remainder) method: H(key) = key mod p, where p is generally prime;

2) Random number method: H(key) = random(key), where random is a pseudo-random function;

3) Mid-square method.
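The three hash functions listed above can be sketched as follows. This is illustrative only: the function names, the choice of decimal digits for the mid-square method, and the use of a key-seeded `random.Random` for the random number method are assumptions made for this example.

```python
import random


def division_hash(key, p):
    """Division method: H(key) = key mod p, with p usually prime."""
    return key % p


def random_hash(key, table_size):
    """Random number method: seed a generator with the key so that the
    same key always maps to the same address."""
    return random.Random(key).randrange(table_size)


def mid_square_hash(key, digits=2):
    """Mid-square method: square the key and keep the middle decimal digits."""
    squared = str(key * key).zfill(digits)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])


print(division_hash(1234, 97))   # 1234 mod 97
print(mid_square_hash(1234))     # middle two digits of 1522756
```

The mid-square variant here takes digits from the decimal representation for readability; a binary variant would take the middle bits of the product instead.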

A hash index requires no additional storage space and can locate the requested data in O(1) time, which minimizes the cost of data lookup in a disk database. Because of these advantages, hash indexes are widely used in disk databases. Over a long period of research, several variants were developed: the chained (linked) bucket hash [10], the extendible hash [11], the linear hash [12], and the modified linear hash [13]. Although these algorithms were optimized somewhat for the memory database, they are not significantly different from the hashing algorithms used in traditional databases. In 2007, Kenneth A. Ross proposed a hash prefetching algorithm based on modern processors [14]. It incorporates the SIMD instruction set into the hash algorithm and, by improving hashing from the standpoint of the memory index, makes data organization more efficient.

5.1 Chained bucket hash

The chained bucket hash (Figure 5-1) is a static structure that can be used both in memory and on disk. Because it is static and never has to be reorganized, it is fast. This is also its flaw: it is poorly suited to dynamic data, because the size of the hash table must be known before use, and this is very difficult to predict. If the predicted table size is too small, performance suffers badly; if it is too large, a great deal of space is wasted. Even in the best case, some space is wasted on the pointer that each item needs to reach the next item in its bucket.
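A minimal sketch of a chained bucket hash, assuming a table whose size is fixed at construction time and chains of single-item nodes. The class and method names are hypothetical; `longest_chain` is included only to show how an undersized table degrades into long chains.

```python
class Node:
    """One item in a bucket chain; `next` is the per-item pointer overhead."""

    def __init__(self, key, value, next_node=None):
        self.key, self.value, self.next = key, value, next_node


class ChainedBucketHash:
    def __init__(self, table_size):
        # Static structure: the table never grows or reorganizes.
        self.table = [None] * table_size

    def _slot(self, key):
        return hash(key) % len(self.table)

    def put(self, key, value):
        slot = self._slot(key)
        node = self.table[slot]
        while node is not None:
            if node.key == key:
                node.value = value      # overwrite an existing key
                return
            node = node.next
        # Prepend a new node to the chain of this bucket.
        self.table[slot] = Node(key, value, self.table[slot])

    def get(self, key, default=None):
        node = self.table[self._slot(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return default

    def longest_chain(self):
        longest = 0
        for node in self.table:
            length = 0
            while node is not None:
                length, node = length + 1, node.next
            longest = max(longest, length)
        return longest


index = ChainedBucketHash(table_size=4)   # deliberately too small
for k in range(20):
    index.put(k, k * k)
print(index.get(7), index.longest_chain())
```

With 20 keys in 4 buckets, every lookup has to walk a chain of several nodes, which is the performance penalty of guessing the table size too low.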

5.2 Extensible Hash

The extendible hash (Figure 5-2) introduces the concept of a directory and uses a dynamic hash table that grows with the data. It thus overcomes the flaw of the chained bucket hash: the size of the hash table does not need to be known beforehand. A hash node contains multiple items and is split into two nodes when it overflows. The directory grows in powers of 2, doubling whenever a node is full and has reached the maximum depth for the current directory size. The hash function computes a k-bit binary sequence for each key, and the bucket number always uses some number of bits taken from the beginning or the end of that sequence. One problem with extendible hashing, however, is that any single node can cause the directory to split, so when the hash function is not sufficiently random, the directory is likely to grow very large.
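The directory doubling and node splitting can be sketched as follows. This is a minimal illustration, assuming the bucket number is taken from the low-order bits of Python's built-in `hash`; the class names and the `bucket_capacity` parameter are hypothetical. It also assumes that no more than `bucket_capacity` keys share the same full hash value, since such keys could never be separated by splitting.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth   # number of hash bits this bucket uses
        self.items = {}


class ExtendibleHash:
    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _slot(self, key):
        # Directory index = low-order global_depth bits of the hash.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._slot(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._slot(key)]
            if key in bucket.items or len(bucket.items) < self.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)          # overflow: split, then retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # The bucket is already at maximum depth: double the directory.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        # Slots whose newly significant bit is 1 now point at the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Redistribute the old items between the bucket and its sibling.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._slot(k)].items[k] = v


index = ExtendibleHash(bucket_capacity=2)
for k in range(20):
    index.put(k, str(k))
print(index.global_depth, len(index.directory))
```

Note how a single overflowing bucket doubles the whole directory even when every other bucket still has room, which is the growth problem described above.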

5.3 Linear Hash

The linear hash (Figure 5-3) also uses a dynamic hash table, but differs greatly from the extendible hash. Linear hashing chooses the number of buckets so that the average number of records per storage block stays at a fixed fraction of the block capacity. Buckets cannot always be split, so overflow blocks are allowed. When the bucket for an inserted record does not yet exist, the leading bit of its hash value is changed to 0 and the record is inserted into that bucket instead; otherwise the record is inserted directly into its bucket or into the bucket's overflow block. When the ratio of the number of records to the capacity exceeds a threshold, one additional bucket is allocated. The linear hash grows slowly relative to the extendible hash, and reorganizations are few and cheap. At the same time, a linear hash does not require a separate directory of bucket pointers, it handles full buckets more naturally, and it allows a more flexible choice of when to split a bucket.
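The bucket selection and growth rule can be sketched as follows. This is a minimal illustration under stated assumptions: each bucket is a Python list that stands for a block together with its overflow blocks, the low-order bits of the built-in `hash` are used, and the names `LinearHash`, `block_capacity` and `max_load` are hypothetical.

```python
class LinearHash:
    def __init__(self, block_capacity=2, max_load=0.85):
        self.capacity = block_capacity   # records that fit in one block
        self.max_load = max_load         # allowed ratio of records to capacity
        self.bits = 1                    # number of hash bits currently used
        self.buckets = [[], []]          # each list models block + overflow
        self.count = 0                   # total number of records

    def _index(self, key):
        m = hash(key) & ((1 << self.bits) - 1)
        if m >= len(self.buckets):
            # Bucket m does not exist yet: change the leading bit from 1 to 0.
            m -= 1 << (self.bits - 1)
        return m

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for pos, (k, _) in enumerate(bucket):
            if k == key:
                bucket[pos] = (key, value)
                return
        bucket.append((key, value))      # may spill into an overflow block
        self.count += 1
        if self.count > self.max_load * self.capacity * len(self.buckets):
            self._add_bucket()

    def _add_bucket(self):
        n = len(self.buckets)
        if n == 1 << self.bits:
            self.bits += 1               # all bit patterns used: add one bit
        self.buckets.append([])
        # The new bucket n takes over part of the bucket that differs from
        # it only in the leading bit.
        old_index = n - (1 << (self.bits - 1))
        old_records, self.buckets[old_index] = self.buckets[old_index], []
        for k, v in old_records:
            self.buckets[self._index(k)].append((k, v))


index = LinearHash()
for k in range(50):
    index.put(k, k + 1000)
print(len(index.buckets), index.get(17))
```

Buckets are added one at a time and in a fixed order, so the bucket that is split is not necessarily the one that overflowed; this is why overflow blocks have to be tolerated.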

5.4 Modified linear hash

The modified linear hash adapts linear hashing to the memory environment. Ordinary linear hashing uses larger contiguous nodes instead of a directory, and therefore wastes space on empty nodes. Moreover, unless a clever scheme is worked out with the underlying virtual memory mapping mechanism, the contiguous nodes have to be copied into a larger block of memory each time the table grows. The modified linear hash uses a directory much like that of the extendible hash, except that the directory grows linearly, the chained nodes hold a single item each, and their memory is allocated from a general memory pool. The node splitting criterion of this algorithm is based on performance: monitoring the average length of the hash chains gives more direct control over the average search and update time than monitoring storage utilization [13].

5.5 Hash pre-fetching algorithm

The hash prefetch algorithm is intended for scenarios where the keys and hash values are 32-bit, and it is optimized specifically for the memory environment. The algorithm uses multiplicative hashing, which is very common and computationally efficient and, more importantly, lends itself to vectorization, so that multiple hash functions can be computed at once [14]. On the SIMD architecture of modern processors, key values and hash values are processed together in a single instruction, which greatly reduces the number of instructions. The data needed for each probe is sized to match an L2 cache line, which greatly reduces the performance cost and, in the memory environment, greatly improves cache performance.
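Multiplicative hashing on 32-bit keys can be sketched as follows. This scalar Python code only emulates what a SIMD unit would do in one vector multiply and one vector shift; it is not the implementation from [14]. The multiplier constants are arbitrary odd 32-bit values chosen for illustration, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

# Assumed multipliers for illustration: any odd 32-bit constants would do.
MULTIPLIERS = [0x9E3779B1, 0x85EBCA6B]


def multiplicative_hash(key, multiplier, bits):
    """Multiply modulo 2^32 and keep the top `bits` bits (1 <= bits <= 32)."""
    return ((key * multiplier) & MASK32) >> (32 - bits)


def hash_all(key, multipliers, bits):
    """Apply several hash functions to one key, one result per 'lane'.
    On a SIMD processor the lanes would be computed by a single
    vector multiply followed by a single vector shift."""
    return [multiplicative_hash(key, a, bits) for a in multipliers]


print(hash_all(12345, MULTIPLIERS, 10))
```

Because every lane performs the same multiply and shift with a different constant, the computation maps directly onto vector registers, which is what makes this family of hash functions attractive for SIMD.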

References:

[1] Garcia-Molina H, Salem K. Main memory database systems: an overview[J]. IEEE Transactions on Knowledge and Data Engineering, 1992, 4(6): 509-516.

[2] Moore G E. Cramming more components onto integrated circuits[J]. 1965.

[3] Lehman T J, Carey M J. A study of index structures for main memory database management systems[C]//Conference on Very Large Data Bases. 1986, 294.

[4] Rao J, Ross K A. Cache conscious indexing for decision-support in main memory[C]//VLDB. 1999: 78-89.

[5] Ramakrishnan R. Database Management Systems. McGraw-Hill, 1997.

[6] Yao A. On random 2-3 trees. Acta Informatica, 9: 159-170, 1978.

[7] Black, Paul E. (2009-11-16). "Trie". Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Archived from the original on 2010-05-19.

[8] Knuth, Donald (1997). "6.3: Digital Searching". The Art of Computer Programming, Volume 3: Sorting and Searching. Addison-Wesley. p. 492.

[9] Fredkin E. Trie memory[J]. Communications of the ACM, 1960, 3(9): 490-499.

[10] Knuth D. The Art of Computer Programming 1: Fundamental Algorithms 2: Seminumerical Algorithms 3: Sorting and Searching[J]. 1968.

[11] Fagin R, Nievergelt J, Pippenger N, et al. Extendible hashing: a fast access method for dynamic files[J]. ACM Transactions on Database Systems (TODS), 1979, 4(3): 315-344.

[12] Litwin W. Linear hashing: a new tool for file and table addressing[C]//VLDB. 1980, 80: 1-3.

[13] Lehman T J, Carey M J. A study of index structures for main memory database management systems[C]//Conference on Very Large Data Bases. 1986, 294.

[14] Ross K A. Efficient hash probes on modern processors[C]//ICDE. 2007: 1297-1301.
