Several search algorithms: sequential search, binary search, block search, and hashing
I. Basic Idea of sequential search:
Start from one end of the table and compare the given value kx with each key in turn, moving toward the other end. If a matching key is found, the search succeeds and the position of the data element in the table is returned; if no key equal to kx is found after the entire table has been examined, the search fails and failure information is returned.
Put simply: compare one by one from start to end. If a match is found, the search succeeds; otherwise it fails. The obvious disadvantage is low search efficiency.
[Applicability]: It applies to both the sequential and the chained storage structures of linear tables.
Average search length (successful search, all elements equally likely) = (n + 1)/2.
[Advantages and disadvantages of sequential search]:
Disadvantages: When n is large, the average search length is large and the efficiency is low.
Advantages: It places no requirement on how the data elements are ordered in the table. In addition, a linear linked list can only be searched sequentially.
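The idea above can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the function name seq_search, the use of plain int keys, 0-based indexing, and the -1 failure value are all assumptions made for the example.

```c
#include <assert.h>

/* Sequential search over an int array (int keys are an assumption).
 * Returns the 0-based index of the first element equal to kx,
 * or -1 if kx does not occur in a[0..n-1]. */
int seq_search(const int a[], int n, int kx)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (a[i] == kx)
            return i;      /* success: report the position */
    }
    return -1;             /* whole table scanned: the search fails */
}
```

The table does not need to be sorted; the price is that a failed search always examines all n elements.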
II. Basic Idea of binary search on ordered tables:
In an ordered table, take the middle element as the comparison object. If the given value equals the key of the middle element, the search succeeds. If the given value is smaller than that key, the search continues in the left half of the table. If the given value is greater than that key, the search continues in the right half. Repeat this process until the search succeeds or the search interval contains no data elements.
[Steps]
① low = 1; high = length; // set the initial interval
② If low > high, return the failure information // the interval is empty, so the search fails
③ Otherwise (low ≤ high), mid = (low + high)/2; // take the midpoint of the interval
A. If kx < tbl.elem[mid].key, set high = mid - 1 and go to ② // continue in the left half
B. If kx > tbl.elem[mid].key, set low = mid + 1 and go to ② // continue in the right half
C. If kx = tbl.elem[mid].key, return the position of the data element in the table // the search succeeds
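The steps above use 1-based positions and a table accessed as tbl.elem[mid].key. A sketch that follows them literally might look like this. The type names Elem and Table, the capacity MAXLEN, the int key type, and the function name table_bin_search are hypothetical; position 0 is left unused so that 0 can serve as the failure value.

```c
#include <assert.h>

#define MAXLEN 100            /* hypothetical capacity */

typedef struct {
    int key;                  /* assumed: integer keys */
} Elem;

typedef struct {
    Elem elem[MAXLEN + 1];    /* elem[0] unused; data lives in elem[1..length] */
    int length;
} Table;

/* Follows steps 1-3: returns the 1-based position of kx, or 0 on failure. */
int table_bin_search(const Table *tbl, int kx)
{
    assert(tbl->length >= 0 && tbl->length <= MAXLEN);
    int low = 1, high = tbl->length;          /* step 1: initial interval */
    while (low <= high) {                     /* step 2: stop when the interval is empty */
        int mid = low + (high - low) / 2;     /* step 3: midpoint, written to avoid overflow */
        if (kx < tbl->elem[mid].key)
            high = mid - 1;                   /* A: continue in the left half */
        else if (kx > tbl->elem[mid].key)
            low = mid + 1;                    /* B: continue in the right half */
        else
            return mid;                       /* C: found */
    }
    return 0;                                 /* search failed */
}
```

With 1-based positions the value 0 is never a valid result, so it can signal failure without ambiguity.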
The keys of the ordered table are arranged as follows:
(…, 52)
Find the data element with key 14 in the table:
[Algorithm Implementation]
/* ElemType is assumed to be a type that supports < and >, e.g. typedef int ElemType; */
/* Returns the 0-based index of kx in a[0..length-1], or -1 if kx is not found. */
int Binary_Search(ElemType a[], ElemType kx, int length)
{
    int mid, low, high, flag = -1;
    low = 0; high = length - 1;              /* ① set the initial interval */
    while (low <= high) {                    /* ② interval not empty: compare */
        mid = low + (high - low) / 2;        /* ③ obtain the midpoint */
        if (kx < a[mid])
            high = mid - 1;                  /* adjust to the left half */
        else if (kx > a[mid])
            low = mid + 1;                   /* adjust to the right half */
        else {
            flag = mid;                      /* found: record the element position */
            break;
        }
    }
    return flag;
}
[Performance Analysis]
Average search length ≈ $\log_2(n + 1) - 1$ (for large n, assuming every element is equally likely to be searched)
In binary search, each comparison is made against the midpoint of the current table, which splits it into two sub-tables, and the same operation then continues on the selected sub-table. The search for every data element in the table can therefore be described by a binary tree. The binary tree that describes the search process is called the decision tree.
[Figure: decision tree for binary search]
As the tree shows, searching for any element of the table follows the path from the root to that element's node, and the number of key comparisons equals the level of that node in the tree. For a decision tree with n nodes and height k, we have $2^{k-1} - 1 < n \le 2^k - 1$, that is, $k - 1 < \log_2(n + 1) \le k$, so $k = \lceil \log_2(n + 1) \rceil$. Therefore, a successful binary search compares at most $\lceil \log_2(n + 1) \rceil$ keys.
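Next we discuss the average search length of binary search. For convenience, take a full binary tree of height k (so $n = 2^k - 1$) as an example. Assume that every element in the table is searched with equal probability, that is, $P_i = 1/n$. Level i of the tree has $2^{i-1}$ nodes, and each of them costs i comparisons. Therefore, the average search length of binary search is:
$$ASL = \sum_{i=1}^{k} \frac{1}{n} \cdot i \cdot 2^{i-1} = \frac{1}{n}\left[(k - 1)\,2^k + 1\right] = \frac{n + 1}{n}\log_2(n + 1) - 1 \approx \log_2(n + 1) - 1$$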
Therefore, the time complexity of binary search is O(log2 n).
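The analysis can be checked by counting comparisons directly. The sketch below searches for every position of a table of size n and sums the number of probes. For a full tree with n = 2^k - 1, level i contributes i · 2^(i-1) comparisons, so the total should be (k - 1) · 2^k + 1. The function names probes_to_find and total_probes are made up for this example, and positions stand in for keys.

```c
#include <assert.h>

/* Number of key comparisons (probes) binary search makes before it
 * reaches the element at 1-based position target in a table of n elements. */
int probes_to_find(int n, int target)
{
    assert(n >= 1 && target >= 1 && target <= n);
    int low = 1, high = n, probes = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        probes++;                          /* one node of the decision tree visited */
        if (target < mid)
            high = mid - 1;
        else if (target > mid)
            low = mid + 1;
        else
            return probes;                 /* level of the node in the decision tree */
    }
    return probes;                         /* not reached for a valid target */
}

/* Sum of probes over all n positions; the average search length
 * under equal probabilities is total_probes(n) / n. */
long total_probes(int n)
{
    long total = 0;
    for (int t = 1; t <= n; t++)
        total += probes_to_find(n, t);
    return total;
}
```

For n = 7 (k = 3) the total is 17, giving an average of 17/7 ≈ 2.43 comparisons, which matches (n + 1)/n · log2(n + 1) - 1 = 8/7 · 3 - 1. The simpler approximation log2(n + 1) - 1 only becomes tight for large n.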
Note:
Although binary search is efficient, the table must first be sorted by key, and sorting is itself a time-consuming operation. Binary search also needs direct access to the middle element, so it is better suited to sequential storage structures. To keep such a table in order, however, insertions and deletions must move a large number of nodes. Binary search is therefore especially suitable for linear tables that rarely change once created but are searched frequently.
III. Basic Idea of block search (indexed sequential search):
Block search, also known as indexed sequential search, is an improvement on sequential search. It requires the search table to be divided into several sub-tables (blocks), with an index table built over them. Each block is identified by one entry in the index table. An index entry has two fields: a key field (storing the maximum key in the corresponding block) and a pointer field (storing a pointer to the corresponding block). The index entries must be sorted by the key field. To search, first look up kx in the index table to determine which block of the search table may contain it (because the index entries are sorted by key, either sequential search or binary search can be used here), and then perform a sequential search within that block.
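A minimal sketch of this two-stage lookup is shown below. It assumes int keys, an array index in place of the pointer field, and binary search over the index table; the names IndexEntry and block_search and the sample data in the usage note are hypothetical and are not the example from the original text.

```c
#include <assert.h>

typedef struct {
    int max_key;   /* largest key stored in the block */
    int start;     /* index of the block's first element (stands in for the pointer field) */
    int len;       /* number of elements in the block */
} IndexEntry;

/* Returns the 0-based position of kx in a[], or -1 if it is absent.
 * idx[0..nblocks-1] must be sorted by max_key, and every key in a block
 * must be larger than every key in the preceding block. */
int block_search(const int a[], const IndexEntry idx[], int nblocks, int kx)
{
    assert(nblocks >= 0);

    /* Stage 1: binary search for the first block whose max_key >= kx. */
    int low = 0, high = nblocks - 1, blk = -1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (idx[mid].max_key >= kx) {
            blk = mid;             /* candidate; an earlier block may also qualify */
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }
    if (blk < 0)
        return -1;                 /* kx exceeds every block maximum */

    /* Stage 2: sequential search inside the (unordered) block. */
    for (int i = idx[blk].start; i < idx[blk].start + idx[blk].len; i++) {
        if (a[i] == kx)
            return i;
    }
    return -1;
}
```

For instance, with the made-up table {8, 3, 5, 14, 11, 20, 25, 31, 27} split into three blocks whose maxima are 8, 20 and 31, a search for 11 selects the second block in stage 1 and finds the key at position 4 in stage 2.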
For example, the key set is:
(…, 53)
The search table and its index table, built by dividing the keys into blocks at the key values … and 88, are as follows:
Suppose the table has n nodes divided into b blocks, so that each block holds s = n/b nodes.
(Binary search on the index table) average search length ≈ $\log_2(n/s + 1) + s/2$
(Sequential search on the index table) average search length = $(s^2 + 2s + n)/(2s)$
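The second formula is (b + 1)/2 comparisons in the index plus (s + 1)/2 comparisons inside the block, with b = n/s. A short sketch can evaluate it for different block sizes and show that it is smallest near s = sqrt(n). The function names asl_seq_index and best_block_size are hypothetical, and the formula is applied as an idealisation even when s does not divide n evenly.

```c
#include <assert.h>

/* Average search length of block search when the index table is searched
 * sequentially: (b + 1)/2 + (s + 1)/2 with b = n/s,
 * which simplifies to (s*s + 2*s + n) / (2*s). */
double asl_seq_index(int n, int s)
{
    assert(n > 0 && s > 0);
    return ((double)s * s + 2.0 * s + n) / (2.0 * s);
}

/* Block size in 1..n that minimises asl_seq_index (smallest s on ties). */
int best_block_size(int n)
{
    int best = 1;
    for (int s = 2; s <= n; s++) {
        if (asl_seq_index(n, s) < asl_seq_index(n, best))
            best = s;
    }
    return best;
}
```

For n = 100 the minimum is at s = 10, where the average search length is 11, compared with (100 + 1)/2 = 50.5 for plain sequential search.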
Note:
The advantage of block search is that when a record is inserted into or deleted from the table, you only need to find the block to which the record belongs and then insert or delete it within that block (because records inside a block are unordered, there is no need to move a large number of records). The main costs are the extra storage for the auxiliary index array and the work of arranging the initial table into ordered blocks.
Its performance lies between that of sequential search and binary search.