Java data structures and algorithms explained (12): hash tables

Source: Internet
Author: User
Hash table overview

A hash table is a structure that stores data as key-value pairs indexed by key: given a key, we can look up the value associated with it.

The idea behind a hash table is simple. If all keys are small integers, a plain unordered array suffices: the key is used as the index and the array element at that index holds the corresponding value, so the value for any key can be accessed quickly. A hash table extends this simple case to keys of more complex types.
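As a minimal sketch of this integer-key case, assuming every key is known to lie between 0 and capacity-1 (the class name `DirectAddressTable` and its methods are illustrative and do not come from the original article):

```java
// Illustrative sketch: a table for small non-negative integer keys,
// where the key itself is used as the array index.
class DirectAddressTable {
    private final String[] values;

    DirectAddressTable(int capacity) {
        values = new String[capacity];
    }

    // Store a value in the slot named by the key; constant time.
    void put(int key, String value) {
        values[key] = value;
    }

    // Return the value for the key, or null if none was stored; constant time.
    String get(int key) {
        return values[key];
    }

    public static void main(String[] args) {
        DirectAddressTable table = new DirectAddressTable(100);
        table.put(42, "answer");
        table.put(7, "seven");
        System.out.println(table.get(42)); // prints "answer"
        System.out.println(table.get(8));  // prints "null"
    }
}
```

The array needs one slot per possible key, which is why this only works when the key range is small.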

A hash-based lookup has two steps:
1. Use a hash function to convert the lookup key into an array index. Ideally, different keys map to different indices, but in practice several keys may hash to the same index, so collisions must be handled.
2. Resolve collisions. There are many ways to do this; the two main ones are separate chaining (the "zipper method") and linear probing.
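The two steps above can be illustrated with a small sketch. The helper `indexFor` and the table size 16 are assumptions made for this example; the sign-bit mask is the same one used in the chaining implementation later in the article.

```java
class CollisionDemo {
    // Step 1: map a key to an index in [0, m). The mask clears the sign bit
    // so that a negative hashCode() cannot produce a negative index.
    static int indexFor(Object key, int m) {
        return (key.hashCode() & 0x7fffffff) % m;
    }

    public static void main(String[] args) {
        int m = 16;
        // "Aa" and "BB" have the same String.hashCode() (2112), so they
        // land at the same index for every m.
        System.out.println(indexFor("Aa", m)); // prints 0
        System.out.println(indexFor("BB", m)); // prints 0
        // Keys with different hash codes can still share an index
        // once the hash is reduced modulo m.
        System.out.println(indexFor(5, m));  // prints 5
        System.out.println(indexFor(21, m)); // prints 5
    }
}
```

Both pairs are collisions that step 2 has to resolve.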

A hash table is a classic example of a trade-off between time and space. With no limit on memory, the key could be used directly as an array index, and every lookup would take O(1) time. With no limit on time, an unordered array with sequential search would do, using very little memory. A hash table uses a moderate amount of both time and space, striking a balance between these two extremes. The balance can be shifted by adjusting the parameters of the hashing algorithm.

The hash function depends on the type of the key: each key type needs its own hash function.

Hash functions

1. Positive integers

The most common way to hash a positive integer is modular hashing (the division method): choose an array size M and, for any positive integer key k, compute the remainder of k divided by M. M is generally chosen to be a prime number, which helps spread the keys evenly over the indices 0 to M-1.

2. Strings

When a string is used as a key, it can be treated as a large integer and hashed with the same division method. Each character of the string is folded into the hash value in turn:

    char[] s = str.toCharArray();
    int hash = 0;
    for (int i = 0; i < s.length; i++)
    {
        // Multiply the running value by 31, then add the next character.
        hash = s[i] + (31 * hash);
    }
    return hash;
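The division method for integers, and its extension to strings treated as base-R numbers, might be sketched as follows. The table size M = 97 and the radix R = 31 are assumptions chosen for illustration, as are the class and method names. Reducing modulo M at every step keeps the running value small, so the result is always a valid index.

```java
class ModularHashing {
    // Division method for a non-negative integer key: index = k mod m.
    static int hash(int k, int m) {
        return k % m;
    }

    // Treat the string as a base-r number and reduce modulo m at each step,
    // so the result always lies in [0, m).
    static int hash(String s, int r, int m) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (r * h + s.charAt(i)) % m;
        }
        return h;
    }

    public static void main(String[] args) {
        int m = 97; // a prime table size (assumed for this example)
        System.out.println(hash(212, m));        // prints 18
        System.out.println(hash("call", 31, m)); // prints 85
    }
}
```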
    
    

The character-by-character loop is Horner's method for evaluating the following polynomial, where L is the length of the string:

    h = s[0]·31^(L-1) + ... + s[L-3]·31^2 + s[L-2]·31^1 + s[L-1]·31^0
    
    

For example, to hash the string "call": the character c has Unicode value 99, a has 97, and l has 108, so the hash value of "call" is 3045982 = 99·31^3 + 97·31^2 + 108·31^1 + 108·31^0 = 108 + 31·(108 + 31·(97 + 31·99)).
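The arithmetic in this example can be checked with a short program. The class `HornerCheck` and the method `hornerHash` are illustrative names; the comparison with `String.hashCode()` relies on the documented definition of that method, which uses the same polynomial with multiplier 31.

```java
class HornerCheck {
    // Horner's method with multiplier 31.
    static int hornerHash(String s) {
        int hash = 0;
        for (char c : s.toCharArray()) {
            hash = c + 31 * hash;
        }
        return hash;
    }

    public static void main(String[] args) {
        // Expanded polynomial form: 99*31^3 + 97*31^2 + 108*31^1 + 108*31^0.
        int expanded = 99 * 31 * 31 * 31 + 97 * 31 * 31 + 108 * 31 + 108;
        System.out.println(expanded);           // prints 3045982
        System.out.println(hornerHash("call")); // prints 3045982
        System.out.println("call".hashCode());  // prints 3045982
    }
}
```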

If hashing every character is too time-consuming, you can save time by sampling the string at regular intervals instead, for example by hashing only about 8 to 9 evenly spaced characters:

    char[] s = str.toCharArray();
    int hash = 0;
    // Step size chosen so that roughly eight characters are examined.
    int skip = Math.max(1, s.length / 8);
    for (int i = 0; i < s.length; i += skip)
    {
        hash = s[i] + (31 * hash);
    }
    return hash;
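Sampling saves time at the cost of ignoring some characters, so strings that differ only at skipped positions collide. The sketch below wraps the loop above in a hypothetical method `sampledHash` and uses two made-up 16-character strings to show the effect.

```java
class SampledHash {
    // Hash only about eight evenly spaced characters of the string.
    static int sampledHash(String str) {
        char[] s = str.toCharArray();
        int hash = 0;
        int skip = Math.max(1, s.length / 8);
        for (int i = 0; i < s.length; i += skip) {
            hash = s[i] + (31 * hash);
        }
        return hash;
    }

    public static void main(String[] args) {
        // The strings differ only at index 1. With length 16 the step is 2,
        // so the loop reads indices 0, 2, 4, ... and never looks at index 1.
        String a = "abcdefghijklmnop";
        String b = "aXcdefghijklmnop";
        System.out.println(sampledHash(a) == sampledHash(b)); // prints true
        System.out.println(a.hashCode() == b.hashCode());     // prints false
    }
}
```

Whether the saved time is worth the extra collisions depends on how similar the keys in a given application tend to be.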
    
    
3. Double type

    // Instance method of java.lang.Double
    @Override
    public int hashCode() {
        return Double.hashCode(value);
    }

    // Static helper: fold the 64-bit representation into 32 bits
    public static int hashCode(double value) {
        long bits = doubleToLongBits(value);
        return (int) (bits ^ (bits >>> 32));
    }
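As a quick check of the static method above, the following sketch recomputes the fold by hand, XORing the high 32 bits of the bit pattern into the low 32 bits, and compares the result with `Double.hashCode`. `DoubleHashCheck` and `foldedHash` are illustrative names, and the sample values are arbitrary.

```java
class DoubleHashCheck {
    // Recompute the fold by hand: XOR the two 32-bit halves of the bit pattern.
    static int foldedHash(double value) {
        long bits = Double.doubleToLongBits(value);
        int low = (int) bits;           // low 32 bits
        int high = (int) (bits >>> 32); // high 32 bits
        return low ^ high;
    }

    public static void main(String[] args) {
        double[] samples = {0.0, 1.0, -1.0, 3.14159, 1e300};
        for (double d : samples) {
            System.out.println(d + " -> " + foldedHash(d)
                    + " (Double.hashCode: " + Double.hashCode(d) + ")");
        }
    }
}
```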
    
    

The hashCode method of the Double class first converts the value to its 64-bit long representation, and then returns the XOR of the low 32 bits and the high 32 bits as the hash code.

4. Non-numeric objects

The data types described so far can all be regarded as numeric (a string can be viewed as an array of integers). So how is the hash code of a non-numeric object computed? We take the Date class as a brief example. Its hashCode method is as follows:

    public int hashCode() {
        long ht = this.getTime();
        // XOR the low 32 bits of the timestamp with its high 32 bits.
        return (int) ht ^ (int) (ht >> 32);
    }
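The Date method above derives the hash code from the one field that identifies a date, its millisecond timestamp. A user-defined class can follow the same pattern. The sketch below assumes a hypothetical `Point` class (not part of the original article) whose `hashCode` is built from the same fields that `equals` compares, using the standard `Objects.hash` helper.

```java
import java.util.Date;
import java.util.Objects;

class HashContractDemo {
    // Hypothetical value class used only for illustration.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Point)) return false;
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }

        // Built from the same fields that equals() compares, so equal
        // points always produce equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Date d1 = new Date(1_000_000L);
        Date d2 = new Date(1_000_000L);
        System.out.println(d1.equals(d2));                  // prints true
        System.out.println(d1.hashCode() == d2.hashCode()); // prints true

        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        System.out.println(p.equals(q));                  // prints true
        System.out.println(p.hashCode() == q.hashCode()); // prints true
    }
}
```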
    
    

The implementation is simple: it returns the XOR of the low 32 bits and the high 32 bits of the time value encapsulated by the Date object. From this we can see that, for a non-numeric type, the hash code should be computed from instance fields that distinguish one instance of the class from another. For Date, two objects that represent the same time are considered equal and therefore have the same hash code. Note that two equal objects (that is, objects for which equals returns true) must have the same hash code, but the converse does not hold: two objects with the same hash code are not necessarily equal.

Handling collisions with separate chaining (the zipper method)

The second step of hashing is collision resolution: handling the case where two or more keys have the same hash value.

With a hash function, a key can be converted into an array index (from 0 to M-1), but when two or more keys produce the same index we need a way to resolve the conflict.

A straightforward approach is to have each of the M array elements point to a linked list whose nodes store the key-value pairs that hash to that index. This is separate chaining (the zipper method).
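To make the layout concrete, the following sketch distributes a handful of keys over M chains and prints each chain. It uses `java.util.LinkedList` for the chains rather than a hand-written list, and the keys, the value M = 5, and the names `ChainingBuckets` and `distribute` are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainingBuckets {
    // Place each key in the chain selected by its hash; return all chains.
    static List<LinkedList<String>> distribute(String[] keys, int m) {
        List<LinkedList<String>> buckets = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            buckets.add(new LinkedList<>());
        }
        for (String key : keys) {
            int index = (key.hashCode() & 0x7fffffff) % m;
            buckets.get(index).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        String[] keys = {"Aa", "BB", "call", "hash", "table", "key", "value"};
        List<LinkedList<String>> buckets = distribute(keys, 5);
        // Keys that collide, such as "Aa" and "BB", appear in the same chain.
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println(i + ": " + buckets.get(i));
        }
    }
}
```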
 
The basic idea is to choose M large enough that all the linked lists are as short as possible, which keeps lookups efficient. A search with separate chaining takes two steps: first use the hash value to find the corresponding linked list, then search that list sequentially for the key.

Implementation of separate chaining

    public class SeperateChainingHashSet<K, V> {
        private int num;                 // total number of key-value pairs (not updated by put in this simplified version)
        private int capacity;            // size of the hash table (number of buckets)
        private SeqSearchST<K, V>[] st;  // array of linked-list symbol tables

        @SuppressWarnings("unchecked")
        public SeperateChainingHashSet(int initialCapacity) {
            capacity = initialCapacity;
            // Create an array of the raw type; an Object[] cannot be cast to SeqSearchST[].
            st = (SeqSearchST<K, V>[]) new SeqSearchST[capacity];
            for (int i = 0; i < capacity; i++) {
                st[i] = new SeqSearchST<>();
            }
        }

        // Mask off the sign bit, then reduce to an index in [0, capacity).
        private int hash(K key) {
            return (key.hashCode() & 0x7fffffff) % capacity;
        }

        public V get(K key) {
            return st[hash(key)].get(key);
        }

        public void put(K key, V value) {
            st[hash(key)].put(key, value);
        }
    }
    
     

The implementation above fixes the capacity of the hash table. A fixed capacity is perfectly workable when we know in advance that the number of key-value pairs to be inserted will stay within a constant multiple of the number of buckets. But if the number of pairs may grow far beyond the number of buckets, the table must be able to adjust its capacity dynamically. The ratio of the number of key-value pairs in the table to its capacity is called the load factor. In general, the smaller the load factor, the shorter the lookup time and the greater the space usage; the larger the load factor, the longer the lookup time and the smaller the space usage. For example, HashMap in the Java standard library is a hash table based on separate chaining, with a default load factor of 0.75. HashMap adjusts its capacity dynamically according to the formula loadFactor = maxSize / capacity, where maxSize is the maximum number of key-value pairs the table holds before it is resized; loadFactor and capacity are either specified by the user or given default values at initialization. When the number of key-value pairs in the HashMap grows beyond maxSize, the capacity of the hash table is increased.
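The resizing rule can be sketched with a small chained table that doubles its bucket array whenever an insertion would push the load factor past 0.75. This is only loosely modeled on the behavior described above and is not HashMap's actual code; the class `ResizingChainedTable`, its methods, and the doubling policy are assumptions made for illustration.

```java
import java.util.LinkedList;

class ResizingChainedTable<K, V> {
    private static final double MAX_LOAD_FACTOR = 0.75; // same default as HashMap

    // One key-value pair stored in a chain.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private LinkedList<Entry<K, V>>[] buckets;
    private int size; // number of key-value pairs

    ResizingChainedTable(int initialCapacity) {
        buckets = newBuckets(initialCapacity);
    }

    @SuppressWarnings("unchecked")
    private static <K, V> LinkedList<Entry<K, V>>[] newBuckets(int capacity) {
        LinkedList<Entry<K, V>>[] b = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) b[i] = new LinkedList<>();
        return b;
    }

    private int indexFor(K key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    int capacity() { return buckets.length; }
    int size() { return size; }

    V get(K key) {
        for (Entry<K, V> e : buckets[indexFor(key, buckets.length)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    void put(K key, V value) {
        for (Entry<K, V> e : buckets[indexFor(key, buckets.length)]) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        // Grow first if the new pair would push the load factor past the limit.
        if (size + 1 > MAX_LOAD_FACTOR * buckets.length) resize(2 * buckets.length);
        buckets[indexFor(key, buckets.length)].add(new Entry<>(key, value));
        size++;
    }

    // Rehash every entry into a larger bucket array.
    private void resize(int newCapacity) {
        LinkedList<Entry<K, V>>[] old = buckets;
        buckets = newBuckets(newCapacity);
        for (LinkedList<Entry<K, V>> chain : old) {
            for (Entry<K, V> e : chain) {
                buckets[indexFor(e.key, newCapacity)].add(e);
            }
        }
    }

    public static void main(String[] args) {
        ResizingChainedTable<String, Integer> table = new ResizingChainedTable<>(4);
        for (int i = 0; i < 20; i++) table.put("key" + i, i);
        System.out.println(table.size());      // prints 20
        System.out.println(table.capacity());  // prints 32
        System.out.println(table.get("key7")); // prints 7
    }
}
```

Starting from 4 buckets, the table grows to 8, 16, and then 32 as the 20 pairs are inserted, so the load factor never exceeds 0.75.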

The code above also uses SeqSearchST, a symbol table implementation based on a linked list:

    public class SeqSearchST<K, V> {
        private Node first;  // head of the linked list

        private class Node {
            K key;
            V val;
            Node next;

            public Node(K key, V val, Node next) {
                this.key = key;
                this.val = val;
                this.next = next;
            }
        }

        public V get(K key) {
            for (Node node = first; node != null; node = node.next) {
                if (key.equals(node.key)) {
                    return node.val;
                }
            }
            return null;
        }

        public void put(K key, V val) {
            // First check whether a node with this key already exists in the list.
            for (Node node = first; node != null; node = node.next) {
                if (key.equals(node.key)) {
                    node.val = val;
                    return;
                }
            }
            // The key is not in the list: insert a new node at the head.
            first = new Node(key, val, first);
        }
    }
