Data Structures: Hash Table (HashTable)

Last Update:2016-06-12 Source: Internet

Author: User

A hash table is a data structure that is accessed directly by key. It maps each key to a position in an array and stores the record there, which speeds up lookup. The mapping function is called a hash function, the mapping process is called hashing, and the array that holds the records is called the hash table. For example, a key can be mapped to an array index like this: arrayIndex = hugeNumber % arraySize.
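As a small illustration of the modulo mapping above, the following sketch computes array indices for a few keys. The class name HashIndexDemo, the table size of 10 and the sample keys are assumptions made for this example only. It uses Math.floorMod rather than the bare % operator so that a negative key still yields a valid index; the formula in the text assumes non-negative keys.

```java
// Sketch: map an integer key to an array index with the modulo operation.
class HashIndexDemo {
    // Math.floorMod keeps the result in [0, arraySize) even for negative keys.
    static int indexFor(int key, int arraySize) {
        return Math.floorMod(key, arraySize);
    }

    public static void main(String[] args) {
        int arraySize = 10;            // hypothetical table size
        int[] keys = {21, 135, 7, 31}; // hypothetical keys
        for (int key : keys) {
            System.out.println(key + " -> " + indexFor(key, arraySize));
        }
        // prints:
        // 21 -> 1
        // 135 -> 5
        // 7 -> 7
        // 31 -> 1
    }
}
```

Note that 21 and 31 land on the same index, so a real table needs a rule for what to do when two keys want the same cell.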

Hashing inevitably raises a problem: different keys may produce the same hash address, that is, the same array index. This is called a collision. How do we handle collisions? One approach is open addressing: find another free slot in the array by some systematic method and store the data there, instead of at the index given by the hash function, because that slot is already occupied. The other approach is to make the array an array of linked lists, so that the data is not stored directly in the array; when a collision occurs, the new item is simply linked into the list at that array index. This is called separate chaining. Both approaches are discussed below.

1. Open addressing

Linear probing

Linear probing searches linearly for an empty cell. If the item should be inserted at index 21 but that cell is already occupied, index 22 is tried, then 23, and so on. The array index is incremented (wrapping around to 0 at the end of the array) until an empty cell is found. The following is a hash table implementation based on linear probing:

public class HashTable {
    private DataItem[] hashArray; // DataItem is a data item that encapsulates the stored data
    private int arraySize = 10;
    private int itemNum;          // how many items are currently stored in the array
    private DataItem nonItem;     // marker left in a cell when its item is deleted

    public HashTable() {
        hashArray = new DataItem[arraySize];
        nonItem = new DataItem(-1); // a deleted cell has key -1, so real keys must be non-negative
    }

    public boolean isFull() {
        return (itemNum == arraySize);
    }

    public boolean isEmpty() {
        return (itemNum == 0);
    }

    public void displayTable() {
        System.out.print("Table: ");
        for (int j = 0; j < arraySize; j++) {
            if (hashArray[j] != null) {
                System.out.print(hashArray[j].getKey() + " ");
            } else {
                System.out.print("** ");
            }
        }
        System.out.println("");
    }

    public int hashFunction(int key) {
        return key % arraySize; // hash function
    }

    public void insert(DataItem item) {
        if (isFull()) {
            // extend the hash table
            System.out.println("Hash table full, rehashing...");
            extendHashTable();
        }
        int key = item.getKey();
        int hashVal = hashFunction(key);
        // skip occupied cells; a deleted cell (key -1) can be reused
        while (hashArray[hashVal] != null && hashArray[hashVal].getKey() != -1) {
            ++hashVal;
            hashVal %= arraySize;
        }
        hashArray[hashVal] = item;
        itemNum++;
    }

    /*
     * An array has a fixed size and cannot be extended, so extending the hash table means
     * creating another, larger array and inserting the data from the old array into it.
     * The hash table computes the position of an item from the array size, so items cannot
     * keep the positions they had in the old array and cannot simply be copied across.
     * Instead, the old array is traversed in order and each item is inserted into the new
     * array with insert(). This is called rehashing. It is time-consuming, but it is
     * necessary if the array is to be extended.
     */
    public void extendHashTable() {
        int num = arraySize;
        itemNum = 0;    // re-count, because the items are re-inserted into the new array below
        arraySize *= 2; // double the array size
        DataItem[] oldHashArray = hashArray;
        hashArray = new DataItem[arraySize];
        for (int i = 0; i < num; i++) {
            // empty cells and deleted markers are not carried over
            if (oldHashArray[i] != null && oldHashArray[i].getKey() != -1) {
                insert(oldHashArray[i]);
            }
        }
    }

    public DataItem delete(int key) {
        if (isEmpty()) {
            System.out.println("Hash table is empty!");
            return null;
        }
        int hashVal = hashFunction(key);
        // probe at most arraySize cells, so a table with no empty cell cannot loop forever
        for (int probes = 0; probes < arraySize && hashArray[hashVal] != null; probes++) {
            if (hashArray[hashVal].getKey() == key) {
                DataItem temp = hashArray[hashVal];
                hashArray[hashVal] = nonItem; // nonItem marks a deleted cell; its key is -1
                itemNum--;
                return temp;
            }
            ++hashVal;
            hashVal %= arraySize;
        }
        return null;
    }

    public DataItem find(int key) {
        int hashVal = hashFunction(key);
        // probe at most arraySize cells, so a table with no empty cell cannot loop forever
        for (int probes = 0; probes < arraySize && hashArray[hashVal] != null; probes++) {
            if (hashArray[hashVal].getKey() == key) {
                return hashArray[hashVal];
            }
            ++hashVal;
            hashVal %= arraySize;
        }
        return null;
    }
}

class DataItem {
    private int iData;

    public DataItem(int data) {
        iData = data;
    }

    public int getKey() {
        return iData;
    }
}

Linear probing has a drawback: data may cluster. Once a cluster forms, it keeps growing, because any item that hashes into the cluster's range is moved step by step to the end of the cluster and inserted there, which makes the cluster longer. The larger the cluster, the faster it grows. As a result, part of the hash table is densely clustered while the rest is sparse.
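The growth of a cluster can be observed with a few lines of code. The sketch below is a stripped-down linear-probing insert, not the implementation above; the class name ClusterDemo, the table size of 20 and the keys are assumptions chosen to force collisions. It reports how many occupied cells each insert had to step over.

```java
import java.util.Arrays;

// Sketch: watch a cluster grow under linear probing.
class ClusterDemo {
    static final int EMPTY = -1; // marker for an unused cell; keys are assumed non-negative

    // Inserts key with linear probing and returns how many occupied cells were skipped.
    // The caller must make sure the table still has at least one empty cell.
    static int insert(int[] table, int key) {
        int index = key % table.length;
        int skipped = 0;
        while (table[index] != EMPTY) {
            index = (index + 1) % table.length;
            skipped++;
        }
        table[index] = key;
        return skipped;
    }

    public static void main(String[] args) {
        int[] table = new int[20]; // hypothetical table size
        Arrays.fill(table, EMPTY);
        int[] keys = {21, 41, 22, 61, 23, 2}; // hypothetical keys that hash to cells 1, 2 and 3
        for (int key : keys) {
            System.out.println("key " + key + " skipped " + insert(table, key) + " cell(s)");
        }
        System.out.println(Arrays.toString(table));
    }
}
```

With these keys the six items end up in the contiguous cells 1 to 6, and the last key, 2, has to step over four occupied cells even though only one earlier key shares its home cell.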

To solve this problem, we can use quadratic probing. Quadratic probing is a way to prevent clustering; the idea is to probe cells that are far apart rather than those adjacent to the original position. In linear probing, if the index computed by the hash function is x, the probes are x+1, x+2, x+3, and so on. In quadratic probing, the probes are x+1, x+4, x+9, x+16, and so on, so the distance from the original position is the square of the step number. Quadratic probing eliminates primary clustering but produces another, finer clustering problem called secondary clustering. For example, suppose 184, 302, 420 and 544 are inserted into the table in that order and all of them map to 7. Then 302 needs a probe with a step of 1, 420 needs a probe with a step of 4, and 544 needs a probe with a step of 9. Every further key that maps to 7 needs an even longer probe. Secondary clustering is not a serious problem, but quadratic probing is not used often because there is a better solution: double hashing.
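The quadratic probe sequence and the cause of secondary clustering can be sketched as follows. The class name QuadraticProbeDemo and the table size of 59 are assumptions; 59 is picked only because 184, 302 and 420 all leave remainder 7 when divided by it. The text does not state which table size its example uses.

```java
import java.util.Arrays;

// Sketch: the cells examined by quadratic probing for a given home cell.
class QuadraticProbeDemo {
    // Returns the first `count` cells examined for a key whose home cell is `home`:
    // home, home+1, home+4, home+9, ... (all modulo tableSize).
    static int[] probeSequence(int home, int tableSize, int count) {
        int[] cells = new int[count];
        for (int step = 0; step < count; step++) {
            cells[step] = (home + step * step) % tableSize;
        }
        return cells;
    }

    public static void main(String[] args) {
        int tableSize = 59; // hypothetical size, chosen so the keys below share home cell 7
        int[] keys = {184, 302, 420};
        for (int key : keys) {
            int home = key % tableSize;
            System.out.println(key + ": " + Arrays.toString(probeSequence(home, tableSize, 5)));
        }
        // every key prints the same sequence: [7, 8, 11, 16, 23]
    }
}
```

Because the sequence depends only on the home cell, all keys that collide at 7 compete for the same cells 8, 11, 16, 23, which is exactly the secondary clustering described above.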

Double hashing

One way to eliminate both primary and secondary clustering is to generate a probe sequence that depends on the key, rather than being the same for every key. Different keys then use different probe sequences even if they map to the same array index. Double hashing hashes the key a second time with a different hash function and uses the result as the step size. For a given key the step size stays the same throughout the probe, but different keys use different step sizes. Experience shows that the second hash function must have the following properties:

1. It must differ from the first hash function;

2. It must never output 0 (otherwise there is no step, every probe lands on the same cell, and the algorithm loops forever).

Experts have found that a hash function of the following form works well: stepSize = constant - key % constant, where constant is a prime smaller than the array capacity.
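A quick way to see why this form satisfies the second requirement is to evaluate it for a range of keys. In the sketch below, the class name StepSizeDemo and the constant 5 are assumptions for illustration, and keys are assumed to be non-negative.

```java
// Sketch: the second hash function used to compute the probe step.
class StepSizeDemo {
    // For non-negative keys, key % constant lies in [0, constant - 1],
    // so the result lies in [1, constant] and is never 0.
    static int stepSize(int key, int constant) {
        return constant - key % constant;
    }

    public static void main(String[] args) {
        int constant = 5; // hypothetical prime, smaller than the table capacity
        for (int key = 0; key < 10; key++) {
            System.out.println("key " + key + " -> step " + stepSize(key, constant));
        }
        // steps cycle through 5, 4, 3, 2, 1, 5, 4, 3, 2, 1
    }
}
```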

Double hashing requires the table capacity to be a prime number. Suppose the table length is 15 (indices 0-14), which is not prime, and a particular key maps to 0 with a step of 5. The probe sequence is then 0, 5, 10, 0, 5, 10, and so on, repeating forever. The algorithm only ever tries these three cells, so it can never find an empty cell elsewhere, and it fails. If the array size is 13, which is prime, the probe sequence eventually visits every cell: 0, 5, 10, 2, 7, 12, 4, 9, 1, 6, 11, 3, 8. As long as there is one empty cell in the table, it will be found. Here is the code for double hashing:

public class HashTableDouble {
    private DataItem[] hashArray;
    private int arraySize;
    private int itemNum;
    private DataItem nonItem; // marker left in a cell when its item is deleted

    public HashTableDouble() {
        arraySize = 13; // prime capacity
        hashArray = new DataItem[arraySize];
        nonItem = new DataItem(-1); // a deleted cell has key -1, so real keys must be non-negative
    }

    public void displayTable() {
        System.out.print("Table: ");
        for (int i = 0; i < arraySize; i++) {
            if (hashArray[i] != null) {
                System.out.print(hashArray[i].getKey() + " ");
            } else {
                System.out.print("** ");
            }
        }
        System.out.println("");
    }

    public int hashFunction1(int key) { // first hash function
        return key % arraySize;
    }

    public int hashFunction2(int key) { // second hash function
        return 5 - key % 5;
    }

    public boolean isFull() {
        return (itemNum == arraySize);
    }

    public boolean isEmpty() {
        return (itemNum == 0);
    }

    public void insert(DataItem item) {
        if (isFull()) {
            System.out.println("Hash table full, rehashing...");
            extendHashTable();
        }
        int key = item.getKey();
        int hashVal = hashFunction1(key);
        int stepSize = hashFunction2(key); // compute the probe step with hashFunction2
        while (hashArray[hashVal] != null && hashArray[hashVal].getKey() != -1) {
            hashVal += stepSize;
            hashVal %= arraySize; // probe forward by the computed step
        }
        hashArray[hashVal] = item;
        itemNum++;
    }

    public void extendHashTable() {
        int num = arraySize;
        itemNum = 0; // re-count, because the items are re-inserted into the new array below
        // roughly double the size, but keep the capacity prime so that
        // every step size still visits every cell
        arraySize = nextPrime(arraySize * 2);
        DataItem[] oldHashArray = hashArray;
        hashArray = new DataItem[arraySize];
        for (int i = 0; i < num; i++) {
            // empty cells and deleted markers are not carried over
            if (oldHashArray[i] != null && oldHashArray[i].getKey() != -1) {
                insert(oldHashArray[i]);
            }
        }
    }

    // smallest prime greater than or equal to n
    private int nextPrime(int n) {
        while (true) {
            boolean prime = n > 1;
            for (int d = 2; d * d <= n; d++) {
                if (n % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) {
                return n;
            }
            n++;
        }
    }

    public DataItem delete(int key) {
        if (isEmpty()) {
            System.out.println("Hash table is empty!");
            return null;
        }
        int hashVal = hashFunction1(key);
        int stepSize = hashFunction2(key);
        // probe at most arraySize cells, so a table with no empty cell cannot loop forever
        for (int probes = 0; probes < arraySize && hashArray[hashVal] != null; probes++) {
            if (hashArray[hashVal].getKey() == key) {
                DataItem temp = hashArray[hashVal];
                hashArray[hashVal] = nonItem;
                itemNum--;
                return temp;
            }
            hashVal += stepSize;
            hashVal %= arraySize;
        }
        return null;
    }

    public DataItem find(int key) {
        int hashVal = hashFunction1(key);
        int stepSize = hashFunction2(key);
        // probe at most arraySize cells, so a table with no empty cell cannot loop forever
        for (int probes = 0; probes < arraySize && hashArray[hashVal] != null; probes++) {
            if (hashArray[hashVal].getKey() == key) {
                return hashArray[hashVal];
            }
            hashVal += stepSize;
            hashVal %= arraySize;
        }
        return null;
    }
}

// DataItem is the same class used by the linear probing table above.
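The effect of a prime capacity on the probe sequence, discussed before the code above, can be checked directly. The following sketch follows a fixed step through a table until a cell repeats and lists the cells it visited; the class and method names are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: which cells does a fixed probe step reach in tables of different sizes?
class ProbeCoverageDemo {
    // Follows start, start+step, start+2*step, ... (mod tableSize) until a cell repeats,
    // and returns the cells in the order they were visited.
    static List<Integer> visitedCells(int start, int step, int tableSize) {
        boolean[] seen = new boolean[tableSize];
        List<Integer> order = new ArrayList<>();
        int cell = start % tableSize;
        while (!seen[cell]) {
            seen[cell] = true;
            order.add(cell);
            cell = (cell + step) % tableSize;
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(visitedCells(0, 5, 15)); // [0, 5, 10]
        System.out.println(visitedCells(0, 5, 13)); // [0, 5, 10, 2, 7, 12, 4, 9, 1, 6, 11, 3, 8]
    }
}
```

With a capacity of 15 the step of 5 shares a factor with the table size and reaches only three cells; with a capacity of 13 no step between 1 and 12 shares a factor with it, so every cell is reached.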

2. Separate chaining

In open addressing, a collision is resolved by probing for an empty cell. Another approach, separate chaining, places a linked list in each cell of the hash table. An item's key is mapped to a cell of the hash table as usual, and the item itself is inserted into that cell's list. Other items that map to the same position are simply added to the list; there is no need to search for an empty cell in the array. Here is the code for separate chaining:

public class HashChain {
    private SortedList[] hashArray; // the array holds linked lists
    private int arraySize;

    public HashChain(int size) {
        arraySize = size;
        hashArray = new SortedList[arraySize];
        // initialize the array by creating an empty list in every cell
        for (int i = 0; i < arraySize; i++) {
            hashArray[i] = new SortedList();
        }
    }

    public void displayTable() {
        for (int i = 0; i < arraySize; i++) {
            System.out.print(i + ": ");
            hashArray[i].displayList();
        }
    }

    public int hashFunction(int key) {
        return key % arraySize;
    }

    public void insert(LinkNode node) {
        int key = node.getKey();
        int hashVal = hashFunction(key);
        hashArray[hashVal].insert(node); // add directly to the linked list
    }

    public LinkNode delete(int key) {
        int hashVal = hashFunction(key);
        LinkNode temp = find(key);
        if (temp != null) {
            hashArray[hashVal].delete(key); // unlink the item from the list
        }
        return temp; // the deleted node, or null if the key was not present
    }

    public LinkNode find(int key) {
        int hashVal = hashFunction(key);
        LinkNode node = hashArray[hashVal].find(key);
        return node;
    }
}

The linked list used here is an ordered list, SortedList, whose nodes are LinkNode objects:

public class SortedList {
    private LinkNode first;

    public SortedList() {
        first = null;
    }

    public boolean isEmpty() {
        return (first == null);
    }

    public void insert(LinkNode node) {
        int key = node.getKey();
        LinkNode previous = null;
        LinkNode current = first;
        // walk to the first node whose key is not smaller than the new key
        while (current != null && current.getKey() < key) {
            previous = current;
            current = current.next;
        }
        node.next = current; // the rest of the list follows the new node
        if (previous == null) {
            first = node;
        } else {
            previous.next = node;
        }
    }

    public void delete(int key) {
        LinkNode previous = null;
        LinkNode current = first;
        if (isEmpty()) {
            System.out.println("chain is empty!");
            return;
        }
        while (current != null && current.getKey() != key) {
            previous = current;
            current = current.next;
        }
        if (current == null) {
            return; // the key is not in the list
        }
        if (previous == null) {
            first = first.next;
        } else {
            previous.next = current.next;
        }
    }

    public LinkNode find(int key) {
        LinkNode current = first;
        // the list is sorted, so the search can stop once the keys exceed the target
        while (current != null && current.getKey() <= key) {
            if (current.getKey() == key) {
                return current;
            }
            current = current.next;
        }
        return null;
    }

    public void displayList() {
        System.out.print("List(first->last): ");
        LinkNode current = first;
        while (current != null) {
            current.displayLink();
            current = current.next;
        }
        System.out.println("");
    }
}

class LinkNode {
    private int iData;
    public LinkNode next;

    public LinkNode(int data) {
        iData = data;
    }

    public int getKey() {
        return iData;
    }

    public void displayLink() {
        System.out.print(iData + " ");
    }
}

When there are no collisions, insert and delete operations on a hash table run in O(1) time.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
