Hash Search Algorithm

Source: Internet
Author: User

Hash search finds a data element by computing its storage address directly from its key. A lookup takes O(1) time on average, which is why it feels instantaneous. The essence of hash search is to map each piece of data to its hash value first. The core task is to construct a hash function, which maps the original, orderly data to seemingly random integers.

A hash search involves these steps:

1) Construct a hash table using the given hash function.

2) Resolve address conflicts (collisions) using the chosen conflict-handling method.

3) Perform the hash search against the hash table.

Follow these steps to create a hash table:

1) Step 1
Calculate the hash function value (the address) of the data element's key. If the bucket at that address is not occupied, store the element there; otherwise, go to step 2 to resolve the conflict.

2) Step 2
Calculate the next storage address for the key using the chosen conflict-handling method. If that address is also occupied, repeat step 2 until a free storage address is found.

The hash search procedure is as follows:

1) Step 1
For the given key K, calculate the hash address d[0] = H(K), where HST denotes the hash table. If HST[d[0]] is empty, the search fails; if HST[d[0]] = K, the search succeeds; otherwise, go to step 2 to handle the conflict.

2) Step 2
Repeatedly calculate the next storage address d[k] = R(d[k-1]), where R is the conflict-handling function, until HST[d[k]] is empty or HST[d[k]] = K. If HST[d[k]] = K, the search succeeds. Otherwise, the search fails.

For example, suppose "5" is the number to be stored. I pass it to the hash function, and the hash function returns "2". At that point "5" and "2" are linked, and this link is the so-called "hash relationship". In practice, "2" is the key (the hash address) and "5" is the value (the stored data).

You may ask how to choose a hash function. Two principles must be followed:

①: The keys should be scattered as widely as possible. If I hand you "6" and "5" and you return "2" for both, the hash function is not ideal.

②: The hash function should be as simple as possible. If I hand you "6" and your hash function takes an hour to give me a result, that is also not good.

In fact, there are five common hash methods:

Type 1: "Direct addressing method".

This one is easy to understand: key = value + C, where C is a constant. Value + C is itself a simple hash function.

Type 2: "Division remainder method".

This one is also easy to understand: key = value % C, where C is again a constant.
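The two formulas above, key = value + C and key = value % C, can be sketched in a few lines of Java. The class and method names here are illustrative assumptions, not part of the original article, and `Math.floorMod` is used instead of the bare `%` operator only so that negative values still map to a valid table index.

```java
final class SimpleHashes {
    // key = value + C, where C is a constant offset chosen by the caller (hypothetical helper).
    static int addConstant(int value, int c) {
        return value + c;
    }

    // key = value % C; floorMod keeps the result in [0, c) even when value is negative.
    static int divisionRemainder(int value, int c) {
        return Math.floorMod(value, c);
    }

    public static void main(String[] args) {
        System.out.println(addConstant(5, -3));        // prints 2
        System.out.println(divisionRemainder(29, 13)); // prints 3
        System.out.println(divisionRemainder(26, 13)); // prints 0
    }
}
```

With C = 13, the values 13 and 26 both map to address 0, which is exactly the kind of conflict a hash table has to resolve.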

Type 3: "Digit analysis method".

This one is interesting. Take a group of values such as value1 = 112233, value2 = 112633, value3 = 119033.

In these numbers only the two middle digits vary, while the other digits stay the same. The keys can therefore be

key1 = 22, key2 = 26, key3 = 90.
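For the three sample values above, picking out the two middle digits might look like the following sketch. It assumes six-digit, non-negative values whose third and fourth digits are the ones that vary; the class and method names are hypothetical.

```java
final class DigitAnalysisHash {
    // Drop the last two digits, then keep the last two digits of what remains.
    // For a six-digit value this yields digits 3 and 4.
    static int middleTwoDigits(int value) {
        return (value / 100) % 100;
    }

    public static void main(String[] args) {
        System.out.println(middleTwoDigits(112233)); // prints 22
        System.out.println(middleTwoDigits(112633)); // prints 26
        System.out.println(middleTwoDigits(119033)); // prints 90
    }
}
```

Which digits to keep depends on the data set, so in practice the positions would be chosen after inspecting the actual values.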

Type 4: "Mid-square method". It is skipped here; the name explains the idea.

Type 5: "Folding method".

This one is interesting too. Suppose the value is 135790 and the key must be a two-digit hash value. Split the value into two-digit groups and add them: 13 + 57 + 90 = 160, then drop the leading "1", so key = 60. That is their hash relationship. This way the key depends on every digit of the value, which scatters the hash addresses as widely as possible.
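The folding arithmetic for 135790 can be sketched as below. This is one possible reading of the method, under two assumptions that the article does not state: the value is non-negative, and the digit groups are taken from the right-hand end (for 135790 this gives the same groups 13, 57, 90). The names `FoldingHash` and `fold` are illustrative.

```java
final class FoldingHash {
    // Splits value into groups of `digits` decimal digits (from the right),
    // sums the groups, and drops any carry beyond `digits` digits.
    static int fold(int value, int digits) {
        int modulus = 1;
        for (int i = 0; i < digits; i++) {
            modulus *= 10;
        }
        int sum = 0;
        for (int rest = value; rest > 0; rest /= modulus) {
            sum += rest % modulus; // add the current lowest group
        }
        return sum % modulus;      // discard the overflowing high digits
    }

    public static void main(String[] args) {
        System.out.println(fold(135790, 2)); // 13 + 57 + 90 = 160, prints 60
    }
}
```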

An important factor affecting hash search efficiency is the hash function itself. A conflict (collision) occurs when two different data elements have the same hash value. To reduce the likelihood of conflicts, the hash function should spread the data as evenly as possible across all entries of the hash table.

There are two ways to resolve the conflict:

Open addressing: if two data elements have the same hash value, choose another table entry for the element inserted later. When searching the hash table, if the element is not found in the first entry its hash value points to, the program keeps probing subsequent entries until it finds either the element or an empty entry.

Chaining: store data elements with the same hash value in a linked list. When searching the hash table, the matching list must be scanned with a linear search.
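The linked-list approach could look roughly like the sketch below. It is a minimal illustration rather than the article's own code: the class name `ChainedHashTable`, the use of `java.util.LinkedList`, and the division remainder hash are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class ChainedHashTable {
    // One linked list per table entry; colliding values share a list.
    private final List<LinkedList<Integer>> buckets = new ArrayList<>();

    ChainedHashTable(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Division remainder hash (assumed here); floorMod avoids negative indexes.
    private int address(int value) {
        return Math.floorMod(value, buckets.size());
    }

    void insert(int value) {
        buckets.get(address(value)).add(value);
    }

    // Linear search inside the single list the hash value points to.
    boolean contains(int value) {
        for (int stored : buckets.get(address(value))) {
            if (stored == value) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ChainedHashTable table = new ChainedHashTable(13);
        for (int v : new int[] {13, 29, 27, 28, 26, 30, 38}) {
            table.insert(v);
        }
        System.out.println(table.contains(26)); // prints true (shares a list with 13)
        System.out.println(table.contains(39)); // prints false (same address, not stored)
    }
}
```

Unlike open addressing, a collision here never displaces elements into other entries, at the cost of extra memory for the lists.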

The implementation below uses the division remainder method as the hash function and open addressing with linear probing to resolve conflicts. The code is as follows:

```java
public class HashSearch {

    public static void main(String[] args) {
        // Hash table length; also the divisor for the "division remainder method"
        int hashLength = 13;

        int[] array = {13, 29, 27, 28, 26, 30, 38};

        // The hash table; 0 marks an empty slot, so 0 itself cannot be stored
        int[] hash = new int[hashLength];

        // Create the hash table
        for (int i = 0; i < array.length; i++) {
            insertHash(hash, hashLength, array[i]);
        }

        int result = searchHash(hash, hashLength, 29);

        if (result != -1)
            System.out.println("found in the array, index location: " + result);
        else
            System.out.println("not found in the array");
    }

    /**
     * Hash table data retrieval.
     * Assumes non-negative keys and a table with at least one empty slot.
     *
     * @param hash       the hash table
     * @param hashLength the hash table length
     * @param key        the value to look for
     * @return the index of key in the table, or -1 if it is absent
     */
    public static int searchHash(int[] hash, int hashLength, int key) {
        // Hash function
        int hashAddress = key % hashLength;

        // The slot is occupied by a different value: probe the next slot
        while (hash[hashAddress] != 0 && hash[hashAddress] != key) {
            hashAddress = (hashAddress + 1) % hashLength;
        }

        // An empty slot was reached, so the search fails
        if (hash[hashAddress] == 0)
            return -1;
        return hashAddress;
    }

    /**
     * Insert data into the hash table.
     * Assumes non-negative data and a table with at least one empty slot.
     *
     * @param hash       the hash table
     * @param hashLength the hash table length
     * @param data       the value to store
     */
    public static void insertHash(int[] hash, int hashLength, int data) {
        // Hash function
        int hashAddress = data % hashLength;

        // The slot is already occupied, so the conflict must be resolved
        while (hash[hashAddress] != 0) {
            // Open addressing: move on to the next slot
            hashAddress = (hashAddress + 1) % hashLength;
        }

        // Store the data in the free slot
        hash[hashAddress] = data;
    }
}
```

Running result: found in the array, index location: 3
