Search Algorithms (I): Sequential Search, Binary Search, Index Search

Source: Internet
Author: User

Search

This article is the first part of a series on search algorithms. It covers basic concepts, sequential search, binary search, and index search. Hash tables and B-tree search will be added in a later update.

Basic Concepts

Searching, also known as retrieval, means looking through a data table on a computer for the first record (element), or for all records, that satisfy a given condition.

If no record satisfies the condition, a special value is returned to indicate that the search failed. If the first matching record is found, the search usually returns its storage location or its value for further processing. Finding all matching records can be treated as repeatedly finding the first match: find the first matching record in the whole interval, then find the first matching record in the remaining interval, and so on until the remaining interval is empty.
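The "find all by repeatedly finding the first" idea can be sketched as follows. This is a minimal illustration, assuming the table is a Python list and the condition is equality with a value k; the function names `find_first` and `find_all` are made up for this example.

```python
def find_first(table, k, start=0):
    """Return the position of the first element equal to k at or after
    position `start`, or -1 if there is none."""
    for i in range(start, len(table)):
        if table[i] == k:
            return i
    return -1


def find_all(table, k):
    """Collect every position holding k by searching the remaining
    interval again after each hit."""
    positions = []
    start = 0
    while start < len(table):          # stop when the remaining interval is empty
        pos = find_first(table, k, start)
        if pos == -1:                  # no further match in the remaining interval
            break
        positions.append(pos)
        start = pos + 1                # the remaining interval begins after the hit
    return positions
```

For instance, `find_all([4, 1, 4, 2, 4], 4)` visits the intervals starting at 0, 1, and 3 before the remaining interval becomes empty.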

Tables with different structures generally call for different search methods.

Searching is a process of comparing keywords. The number of comparisons determines the time complexity of the algorithm and is an important measure of a search algorithm's quality.

The time complexity of a search algorithm can also be expressed by the average search length (ASL), which is the average number of comparisons in a successful search.

The average search length is calculated as: $ASL = \sum_{i=1}^{n} p_i c_i$

where n is the length of the table, that is, the number of elements it contains; $p_i$ is the probability of searching for the i-th element; and $c_i$ is the number of comparisons needed to find the i-th element.

For a sequential search on a linear table of n elements for the element whose keyword equals K, finding the i-th element takes $c_i = i$ comparisons. If every element is equally likely to be the target ($p_i = 1/n$), the average search length is: $ASL = \frac{1}{n}\sum_{i=1}^{n} i = (n+1)/2$
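As a quick check of the formula, the sum can be evaluated directly. The sketch below assumes a table of n = 9 elements with equal probabilities and uses exact fractions to avoid rounding; the function name `average_search_length` is only illustrative.

```python
from fractions import Fraction


def average_search_length(probabilities, comparisons):
    """ASL = sum of p_i * c_i over all elements of the table."""
    return sum(p * c for p, c in zip(probabilities, comparisons))


n = 9                                   # assumed table length for the example
p = [Fraction(1, n)] * n                # every element equally likely
c = list(range(1, n + 1))               # sequential search: c_i = i
asl = average_search_length(p, c)       # equals (n + 1) / 2
```

With n = 9 the sum is 45/9 = 5, which matches (n+1)/2.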

Sequential Table Search

A sequential table (sequential list) is the sequential storage structure of a set or linear table.

There are two main ways to search a sequential table: sequential search and binary search.

Sequential Search

Sequential search, also called linear search, starts at one end of the sequential table and compares the keyword of each element in turn with the given value K. If the keyword of an element equals K, the search succeeds and returns that element's subscript. If all elements have been compared without a match, the search fails and returns a special value, commonly -1.
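A minimal sketch of the procedure just described, assuming the table is a Python list whose elements are themselves the keywords:

```python
def sequential_search(table, k):
    """Scan the table from the front; return the subscript of the first
    element equal to k, or -1 if no element matches."""
    for i, key in enumerate(table):
        if key == k:
            return i       # search succeeded
    return -1              # every element was compared without a match
```

The table does not need to be sorted, which is why the same function works on any list.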

Advantages: It is the simplest method, places no requirement on the order of the elements, and makes inserting new elements easy.

Disadvantages: It is slow. The average search length is (n+1)/2, about half the length of the table.

Ways to improve efficiency: Sort the elements by search probability in descending order. When the probabilities are unknown, each time an element is found, swap it with its predecessor, so that frequently searched elements gradually move toward the front.
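The swap-with-predecessor idea for unknown probabilities might look like the sketch below. It assumes the table is a mutable Python list; the name `search_with_transpose` is invented for this example, and returning the element's new position after the swap is a design choice of the sketch, not something the text prescribes.

```python
def search_with_transpose(table, k):
    """Sequential search that swaps a found element with its predecessor,
    so frequently requested elements drift toward the front.
    Returns the element's position after the swap, or -1 if absent."""
    for i, key in enumerate(table):
        if key == k:
            if i > 0:
                # Move the hit one step closer to the front.
                table[i - 1], table[i] = table[i], table[i - 1]
                return i - 1
            return i       # already at the front, nothing to swap
    return -1
```

Searching `[5, 8, 2]` for 2 twice moves it from position 2 to position 1 and then to position 0, so a third search finds it with a single comparison.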

Binary Search

Binary search is also known as half-interval search or bisection search. The table to be searched must be an ordered table stored sequentially.

Binary search process: given an ordered table A[0]~A[n-1] (assumed here to be in ascending order), first compare the keyword of the midpoint element A[mid] with the given value K. If they are equal, the search succeeds. Otherwise, if K < A[mid].key, continue the binary search in the left sub-table; if K > A[mid].key, continue in the right sub-table. Each comparison halves the search interval. This continues until the search succeeds or the current search interval is empty (that is, the lower bound of the interval is greater than the upper bound).

The binary search process is naturally recursive, but a non-recursive algorithm is also easy to write: it only needs to update the lower and upper bounds of the search interval.
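Both forms can be sketched briefly. The code assumes an ascending Python list whose elements are the keywords, with inclusive bounds `low` and `high`; the function names are illustrative.

```python
def binary_search(a, k):
    """Non-recursive binary search on an ascending list.
    Returns the subscript of k, or -1 if k is not present."""
    low, high = 0, len(a) - 1
    while low <= high:                 # the interval [low, high] is not empty
        mid = (low + high) // 2
        if a[mid] == k:
            return mid
        if k < a[mid]:
            high = mid - 1             # continue in the left sub-table
        else:
            low = mid + 1              # continue in the right sub-table
    return -1


def binary_search_recursive(a, k, low=0, high=None):
    """Recursive form of the same search."""
    if high is None:
        high = len(a) - 1
    if low > high:                     # empty interval: search failed
        return -1
    mid = (low + high) // 2
    if a[mid] == k:
        return mid
    if k < a[mid]:
        return binary_search_recursive(a, k, low, mid - 1)
    return binary_search_recursive(a, k, mid + 1, high)
```

The loop condition `low <= high` is the non-empty-interval test; the search fails exactly when the lower bound has passed the upper bound.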

The binary search process can be described by a binary tree. The root of each subtree corresponds to the midpoint element of the current search interval, and its left and right subtrees correspond to the left and right sub-tables of that interval. This tree is usually called the decision tree of the binary search. Because binary search is performed on an ordered table, its decision tree is a binary search tree (binary sort tree).

Searching an ordered table for an element whose keyword equals K corresponds to following a path in the decision tree from the root node to the node being searched for. The number of keyword comparisons equals the number of nodes on that path, that is, the level of the target node.
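To make the link between comparisons and tree levels concrete, the sketch below counts the midpoint elements examined while searching for each element of a 7-element table. It assumes one three-way comparison per node visited; `comparison_count` is a name invented for this example.

```python
def comparison_count(a, k):
    """Number of decision-tree nodes visited while binary-searching
    the ascending list a for k (one count per midpoint examined)."""
    low, high, count = 0, len(a) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        count += 1                     # one node on the root-to-target path
        if a[mid] == k:
            return count
        if k < a[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return count                       # unsuccessful search


a = [10, 20, 30, 40, 50, 60, 70]
levels = [comparison_count(a, x) for x in a]
# levels == [3, 2, 3, 1, 3, 2, 3]
```

The middle element (40) is the root at level 1, the elements 20 and 60 sit at level 2, and the remaining four sit at level 3, so with equal probabilities the average search length for this table is (1 + 2·2 + 4·3)/7 = 17/7.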

Advantages: The time complexity is O(log n), so the search is fast.

Disadvantages: An ordered table must be built and maintained, so insertions and deletions are cumbersome. In addition, binary search applies only to ordered tables in sequential storage, not to ordered tables in linked storage.

Index Search

The Concept of an Index

Index search is also known as hierarchical search.

For example, when looking up a character in a Chinese dictionary, you first use the radical table to find the page number of the relevant part of the character index table, then use the stroke count in the character index table to find the page number in the dictionary body, and finally find the character on that page. Here the whole dictionary is the object of the index search. The dictionary body is called the main table, and the radical table and the character index table are indexes built to make the main table easier to search, so they are called index tables.

The character index table indexes the main table directly, so it is a first-level index. The radical table indexes the character index table, so it is a second-level index, that is, an index of the first-level index.

In computing, index search is based on the index storage structure of a set or linear table. The basic idea of index storage is to divide the main table into several sub-tables according to some relationship and to create an index entry for each sub-table. Together, these index entries form the index table of the main table. The index table and each sub-table can then be stored either sequentially or in linked form.

Each index entry in the index table typically contains three fields (at least the first two): the index value field (index), which stores the index value of the corresponding sub-table and plays the role of a record's keyword; the start position field (start), which stores the storage location of the first element of the corresponding sub-table; and the length field (length), which stores the number of elements in the corresponding sub-table.
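One possible in-memory layout for such an index entry is sketched below. It assumes the main table is a Python list stored sequentially, with the records of each sub-table already contiguous; the `IndexEntry` class, the `build_index` helper, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    index: str    # index value shared by every record of the sub-table
    start: int    # position of the sub-table's first element in the main table
    length: int   # number of elements in the sub-table


def build_index(main_table, index_of):
    """Build the index table for a main table whose records with the
    same index value are stored next to each other."""
    entries = []
    for pos, record in enumerate(main_table):
        value = index_of(record)
        if entries and entries[-1].index == value:
            entries[-1].length += 1            # record extends the current sub-table
        else:
            entries.append(IndexEntry(value, pos, 1))  # a new sub-table starts here
    return entries


# Hypothetical records: (department, name); the department is the index value.
main_table = [("math", "Ann"), ("math", "Bo"),
              ("physics", "Cy"), ("physics", "Di"), ("physics", "Eve")]
index_table = build_index(main_table, lambda record: record[0])
```

Here the index table ends up with two entries: one for the "math" sub-table starting at position 0 with length 2, and one for the "physics" sub-table starting at position 2 with length 3.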

In index storage, if each index entry in the index table corresponds to more than one record, the index is called a sparse index; if each index entry corresponds to exactly one record, it is called a dense index.

Index Search Algorithm

Index search is carried out on the index table and the main table.

First, using the given index value K1, search the index table for the entry whose index value equals K1 to determine the start position and length of the corresponding sub-table in the main table. Then, using the given keyword K2, search that sub-table for the element whose keyword equals K2.
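The two-stage procedure can be sketched as follows, assuming a single-level index stored as (index, start, length) tuples, a main table of (index value, keyword) records, and sequential search at both stages. The function name and the sample data are made up for illustration.

```python
def index_search(index_table, main_table, k1, k2):
    """Stage 1: find the index entry whose index value equals k1.
    Stage 2: search the sub-table it describes for keyword k2.
    Returns the position in the main table, or -1 if not found."""
    for index, start, length in index_table:
        if index == k1:
            for pos in range(start, start + length):
                if main_table[pos][1] == k2:
                    return pos
            return -1      # sub-table found, but keyword k2 is not in it
    return -1              # no index entry with index value k1


# Hypothetical data: records are (department, employee id).
main_table = [("math", 17), ("math", 4),
              ("physics", 23), ("physics", 8), ("physics", 15)]
index_table = [("math", 0, 2), ("physics", 2, 3)]
```

For example, `index_search(index_table, main_table, "physics", 8)` makes two comparisons in the index table and two in the sub-table before returning position 3.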

The number of comparisons in an index search equals the number of comparisons made in searching the index table plus the number made in searching the corresponding sub-table. (The following discussion considers only the case of a single-level index, with sequential search used on both the index table and the sub-table.)

Assuming the index table has length m and the corresponding sub-table has length s, the average search length of an index search is:

$ASL = (1 + m)/2 + (1 + s)/2 = 1 + (m + s)/2$

Because the lengths of all the sub-tables sum to the main table length n, if every sub-table has the same length, that is, s = n/m, the average search length is $ASL = 1 + (m + n/m)/2$.

By elementary calculus, the average search length is smallest when m = n/m, that is, when $m = \sqrt{n}$, which gives $ASL = 1 + \sqrt{n}$.
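The minimization step can be written out explicitly. The derivation below treats m as a continuous variable, which is an idealization since m must be an integer in practice:

```latex
\begin{align*}
ASL(m) &= 1 + \frac{1}{2}\left(m + \frac{n}{m}\right), \\
\frac{d\,ASL}{dm} &= \frac{1}{2}\left(1 - \frac{n}{m^{2}}\right) = 0
  \;\Longrightarrow\; m = \sqrt{n}, \\
\frac{d^{2}ASL}{dm^{2}} &= \frac{n}{m^{3}} > 0
  \quad\text{(so this is a minimum)}, \\
ASL(\sqrt{n}) &= 1 + \frac{1}{2}\left(\sqrt{n} + \sqrt{n}\right) = 1 + \sqrt{n}.
\end{align*}
```

For example, with n = 10000 the best split is m = 100 sub-tables of 100 elements each, giving ASL = 101, compared with (n+1)/2 = 5000.5 for a plain sequential search.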

As this shows, index search is faster than sequential search but slower than binary search. When the main table is divided into $\sqrt{n}$ sub-tables, the time complexity is $O(\sqrt{n})$.

Block Search

Block search is a kind of index search. It requires the sub-tables of the main table (also called blocks) to be ordered relative to one another, in either increasing or decreasing order.

For example, in increasing order, the largest keyword in each block must be smaller than the smallest keyword in the next block. The elements within each block, however, may be in any order.

It also requires that the index value of each index entry in the index table store the largest keyword in the corresponding block.

The index table is ordered, and the keyword field of the main table and the index value field of the index table have the same data type, namely the type of the keyword.

Because the index table is ordered, it can be searched by either sequential search or binary search. Because the records within each block are arranged arbitrarily, only sequential search can be used within a block.
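A block search along these lines might be sketched as below. It assumes blocks in increasing order, index entries stored as (largest keyword, start, length) tuples, and uses the standard-library `bisect_left` for the binary search on the index table; the function name and the sample table are illustrative.

```python
from bisect import bisect_left


def block_search(main_table, index_table, k):
    """Binary-search the ordered index table for the first block whose
    largest keyword is >= k, then scan that block sequentially.
    Returns the position of k in the main table, or -1 if absent."""
    max_keys = [entry[0] for entry in index_table]
    b = bisect_left(max_keys, k)           # first block with max keyword >= k
    if b == len(index_table):
        return -1                          # k exceeds every block's largest keyword
    _, start, length = index_table[b]
    for pos in range(start, start + length):   # blocks are unordered inside
        if main_table[pos] == k:
            return pos
    return -1


# Three blocks of five keywords each; the blocks are ordered, their contents are not.
main_table = [22, 12, 13, 8, 9,   33, 42, 44, 38, 24,   48, 60, 58, 74, 57]
index_table = [(22, 0, 5), (44, 5, 5), (74, 10, 5)]
```

Searching for 38 selects the second block (largest keyword 44) and then scans positions 5 through 9, finding the keyword at position 8. In this sketch the list of largest keywords is rebuilt on every call for brevity; a real implementation would keep it alongside the index table.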
