Search - Data Structures
Several search algorithms: sequential search, binary search, block search, and hash tables.
First, the basic idea of sequential search:
Starting at one end of the table and moving toward the other, compare the given value kx with the key of each element in turn. If an element with an equal key is found, the search succeeds and returns the position of that element in the table. If the whole table has been scanned and no key equals kx, the search fails and a failure indication is returned.
Put plainly: check the elements one by one from beginning to end; a match means success, and no match means failure. The obvious disadvantage is low search efficiency.
"Applicability": linear tables in either a sequential (array) storage structure or a linked storage structure.
Average search length = (n+1)/2 for a successful search. With every element equally likely, finding the i-th element costs i comparisons, and the mean of 1, 2, ..., n is (n+1)/2.
"Pros and cons of sequential search":
Disadvantage: when n is large, the average search length is large and the efficiency is low.
Advantage: it places no requirement on how the data elements are stored or ordered in the table. In addition, for linked lists, sequential search is the only option.
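The scan described above can be sketched in a few lines of C. This is a minimal illustration, not code from the original: the name `Seq_search`, the `ElemType` typedef, and the convention that positions run from 1 with `a[0]` unused (so that 0 can signal failure) are all assumptions.

```c
#include <assert.h>

typedef int ElemType;   /* assumption: keys are plain integers */

/* Sequential search over a[1..length]; a[0] is unused.
   Returns the 1-based position of the first element equal to kx,
   or 0 if no element matches. */
int Seq_search(const ElemType a[], ElemType kx, int length)
{
    assert(length >= 0);
    for (int i = 1; i <= length; i++) {
        if (a[i] == kx)
            return i;   /* success: report the position */
    }
    return 0;           /* whole table scanned without a match */
}
```

Because the loop only ever moves from one element to the next, the same logic carries over to a linked list by following next pointers instead of incrementing an index.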
Second, the basic idea of binary search on an ordered table:
In an ordered table, take the middle element as the comparison object. If the given value equals the key of the middle element, the search succeeds. If the given value is less than the key of the middle element, the search continues in the left half of the table; if it is greater, the search continues in the right half. This process repeats until the search succeeds or the region being searched contains no data elements, in which case the search fails.
Steps
① low = 1; high = length; // set the initial interval
② If low > high, return a search-failure indication. // interval empty, search failed
③ Otherwise (low ≤ high), mid = (low + high) / 2; // take the midpoint of the interval
A. If kx < tbl.elem[mid].key, set high = mid - 1 and go to ②. // search the left half
B. If kx > tbl.elem[mid].key, set low = mid + 1 and go to ②. // search the right half
C. If kx = tbl.elem[mid].key, return the position of the data element in the table. // search succeeded
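Suppose the ordered table is arranged by key as follows:
7, 14, 18, 21, 23, 29, 31, 35, 38, 42, 46, 49, 52
Find the data element with key 14 in the table (positions numbered from 1). The first midpoint is position 7 (key 31), and 14 < 31, so high becomes 6. The next midpoint is position 3 (key 18), and 14 < 18, so high becomes 2. The next is position 1 (key 7), and 14 > 7, so low becomes 2. The final midpoint is position 2 (key 14), so the search succeeds after four comparisons.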
"Algorithm Implementation"
typedef int ElemType;   /* assumed here: keys are plain integers */

/* Binary search over a[1..length], sorted in ascending order; a[0] is unused.
   Returns the 1-based position of kx, or 0 if kx is not in the table. */
int Binary_search(ElemType a[], ElemType kx, int length)
{
    int mid, low, high, flag = 0;
    low = 1; high = length;                   /* ① set the initial interval */
    while (low <= high)                       /* ② loop while the interval is non-empty */
    {
        mid = low + (high - low) / 2;         /* ③ midpoint, written to avoid int overflow */
        if (kx < a[mid]) high = mid - 1;      /* continue in the left half */
        else if (kx > a[mid]) low = mid + 1;  /* continue in the right half */
        else {                                /* search succeeded: record the position */
            flag = mid;
            break;
        }
    }
    return flag;
}
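To replay a search on the 13-key example table, it can help to record which midpoints are examined. The following is a hedged sketch rather than part of the original listing: `Binary_search_trace`, its `probes` and `nprobes` parameters, and the 1-based layout with `a[0]` unused are assumptions made for illustration.

```c
#include <assert.h>

typedef int ElemType;   /* assumption: keys are plain integers */

/* Binary search over a[1..length] (a[0] unused, keys ascending) that also
   records every midpoint it examines. probes needs room for 32 entries,
   which covers any table whose length fits in a 32-bit int.
   Returns the 1-based position of kx, or 0 if it is absent; *nprobes
   receives the number of midpoints examined. */
int Binary_search_trace(const ElemType a[], ElemType kx, int length,
                        int probes[], int *nprobes)
{
    assert(length >= 0);
    int low = 1, high = length;
    *nprobes = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        probes[(*nprobes)++] = mid;          /* remember this midpoint */
        if (kx < a[mid])      high = mid - 1;
        else if (kx > a[mid]) low = mid + 1;
        else                  return mid;    /* found */
    }
    return 0;                                /* interval became empty */
}
```

Called with the table 7, 14, ..., 52 stored in positions 1 to 13 and kx = 14, the routine fills `probes` with the positions visited on the way to the key, which makes the halving of the interval easy to inspect.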
"Performance Analysis"
Average search length ≈ $\log_2(n+1) - 1$ (for large n)
In binary search, each step takes the midpoint of the current table as the comparison object and uses it to split the table into two sub-tables, then repeats the same operation on whichever sub-table the search is directed to. The search process for every data element in the table can therefore be described by a binary tree; the binary tree that describes the search process is called a decision tree.
Figure: decision tree for binary search on (7, 14, 18, 21, 23, 29, 31, 35, 38, 42, 46, 49, 52)
As you can see, finding any element in the table follows the path from the root of the decision tree to that element's node, and the number of key comparisons equals the level of that node in the tree. For a decision tree with n nodes and height k, we have $2^{k-1} - 1 < n \le 2^k - 1$, that is, $k - 1 < \log_2(n+1) \le k$, so $k = \lceil \log_2(n+1) \rceil$. Therefore, a successful binary search performs at most $\lceil \log_2(n+1) \rceil$ key comparisons.
Next, we discuss the average search length of binary search. For ease of discussion, take a full binary tree of height k (so $n = 2^k - 1$) as the example. Assume every element in the table is equally likely to be searched for, that is, $p_i = 1/n$. Level i of the tree has $2^{i-1}$ nodes, each costing i comparisons, so the average search length of binary search is:
$$\mathrm{ASL} = \frac{1}{n}\sum_{i=1}^{k} i \cdot 2^{i-1} = \frac{n+1}{n}\log_2(n+1) - 1 \approx \log_2(n+1) - 1$$
Therefore, the time efficiency of binary search is $O(\log_2 n)$.
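The decision-tree argument can be checked numerically. The sketch below is an illustration under stated assumptions, not code from the original: the helper names `Decision_level` and `Binary_asl` are hypothetical, positions are 1-based, and the midpoint is taken with integer division as in the steps earlier.

```c
#include <assert.h>

/* Number of key comparisons binary search needs to reach position pos
   (1-based) in a sorted table of n elements, i.e. the level of that
   position's node in the decision tree. */
int Decision_level(int n, int pos)
{
    assert(n >= 1 && pos >= 1 && pos <= n);
    int low = 1, high = n, level = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        level++;                          /* one comparison per level */
        if (pos < mid)      high = mid - 1;
        else if (pos > mid) low = mid + 1;
        else                break;
    }
    return level;
}

/* Exact average search length for successful searches, assuming each of
   the n positions is searched for with equal probability 1/n. */
double Binary_asl(int n)
{
    assert(n >= 1);
    long total = 0;
    for (int pos = 1; pos <= n; pos++)
        total += Decision_level(n, pos);
    return (double)total / n;
}
```

For a full tree with n = 15 the levels hold 1, 2, 4, and 8 nodes, so the average is 49/15 ≈ 3.27, which agrees with $\frac{n+1}{n}\log_2(n+1) - 1$. For the 13-key example table the deepest level is 4, matching $\lceil \log_2 14 \rceil$, and the exact average is 41/13 ≈ 3.15.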
Note:
Although binary search is efficient, it requires the table to be sorted by key, and sorting is itself a time-consuming operation. Binary search is also only suitable for sequential storage structures, and to keep the table ordered, insertions and deletions in a sequential structure must move a large number of elements. As a result, binary search is especially useful for linear tables that, once built, are rarely modified but frequently searched.
Third, the basic idea of block search (indexed search):
Block search, also known as indexed sequential search, is an improvement on sequential search. It requires the lookup table to be divided into several sub-tables (blocks) and an index table to be built over them; each entry of the index table identifies one sub-table. An index entry consists of two fields: a key field, which holds the maximum key in the corresponding sub-table, and a pointer field, which holds a pointer to the corresponding sub-table. The index entries must be ordered by the key field, which means every key in a block is greater than all keys in the preceding blocks, even though the keys inside a block may be unordered. When searching, first use the given value kx to search the index table and determine which block of the lookup table may contain it (because the index entries are ordered by key, either sequential or binary search can be used), and then search that block sequentially.
Suppose the key set is:
(22, 12, 13, 9, 20, 33, 42, 44, 38, 24, 48, 60, 58, 74, 49, 86, 53)
Dividing the keys into three blocks bounded by the key values 31, 62, and 88 yields a lookup table and its index table.
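The two-stage search can be sketched as follows. This is an illustrative reading, not the original author's code: `IndexEntry`, `Block_search`, and the field names are hypothetical; the pointer field is represented as a 1-based start position; and since the original figure is not available, the layout of the blocks in the usage note below is an assumption (keys kept in their original relative order inside each block).

```c
#include <assert.h>

typedef int ElemType;    /* assumption: keys are plain integers */

typedef struct {
    ElemType key;        /* upper bound on the keys stored in the block */
    int start;           /* 1-based position of the block's first element */
} IndexEntry;

/* Block search. idx[0..b-1] is ordered by key; block j occupies
   tbl[idx[j].start .. idx[j+1].start - 1], and the last block runs to
   tbl[length]. tbl[0] is unused. Returns the 1-based position of kx in
   tbl, or 0 if kx is not present. */
int Block_search(const ElemType tbl[], int length,
                 const IndexEntry idx[], int b, ElemType kx)
{
    assert(length >= 0 && b >= 0);

    /* Stage 1: binary search the index for the first entry with key >= kx. */
    int low = 0, high = b - 1, blk = b;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (idx[mid].key >= kx) { blk = mid; high = mid - 1; }
        else                    { low = mid + 1; }
    }
    if (blk == b)
        return 0;        /* kx is larger than every block bound */

    /* Stage 2: sequential search inside the selected block. */
    int end = (blk + 1 < b) ? idx[blk + 1].start - 1 : length;
    for (int i = idx[blk].start; i <= end; i++) {
        if (tbl[i] == kx)
            return i;
    }
    return 0;
}
```

Under the assumed layout, the example keys fall into the blocks (22, 12, 13, 9, 20, 24), (33, 42, 44, 38, 48, 60, 58, 49, 53), and (74, 86), with index entries (31, 1), (62, 7), and (88, 16). A search for 38 selects the second block and finds the key at position 10.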
Suppose the table has n elements in total, divided into b blocks of s = n/b elements each.
(Index table searched by binary search) average search length ≈ $\log_2(n/s + 1) + s/2$
(Index table searched sequentially) average search length = $(s^2 + 2s + n)/(2s)$
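As a sketch of where the sequential-index formula comes from, and of which block size minimizes it, assume equal search probabilities and b = n/s equal-sized blocks. The cost is then the average sequential search over b index entries plus the average sequential search over s elements in a block:

```latex
\begin{aligned}
\mathrm{ASL} &= \frac{b+1}{2} + \frac{s+1}{2}
  = \frac{1}{2}\left(\frac{n}{s} + s\right) + 1
  = \frac{s^2 + 2s + n}{2s},\\
\frac{d\,\mathrm{ASL}}{ds} &= \frac{1}{2}\left(1 - \frac{n}{s^2}\right) = 0
  \;\Longrightarrow\; s = \sqrt{n}, \qquad \mathrm{ASL}_{\min} = \sqrt{n} + 1 .
\end{aligned}
```

For instance, n = 10000 with s = 100 gives about 101 comparisons on average, against 5000.5 for plain sequential search over the whole table.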
Note:
The advantage of block search is that inserting or deleting a record only requires finding the block the record belongs to and then performing the insertion or deletion within that block; because records within a block are unordered, few records need to be moved. The main cost is the extra storage for the auxiliary index array and the work of arranging the initial table into blocks.
Its performance lies between that of sequential search and binary search.