[Source code] Hashtable source code analysis

Source: Internet
Author: User
Tags: rehash
Note: The following source code is based on JDK 1.7.0_11
I analyzed the source code of HashMap in the previous article, so I believe everyone already has a deeper understanding of it. This article introduces another common Map implementation: Hashtable. Hashtable predates HashMap: HashMap was added in JDK 1.2, while Hashtable has existed since JDK 1.0. The implementation principles of HashMap and Hashtable are basically the same: both are built on a hash table, and both resolve collisions in the same way, by chaining entries in a linked list. Next we will introduce this class in detail. First, let's look at the class declaration:
public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable

Hashtable does not extend AbstractMap; instead it extends the Dictionary class. Dictionary is an obsolete abstract class, and its documentation makes this clear:
NOTE: This class is obsolete. New implementations should implement the Map interface, rather than extending this class.

The methods of this class are as follows (all of them abstract):
public abstract class Dictionary<K,V> {

    public Dictionary() {
    }

    abstract public int size();

    abstract public boolean isEmpty();

    abstract public Enumeration<K> keys();

    abstract public Enumeration<V> elements();

    abstract public V get(Object key);

    abstract public V put(K key, V value);

    abstract public V remove(Object key);
}
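Since Hashtable inherits this legacy API, its keys can still be traversed with an Enumeration rather than an Iterator. Here is a small hypothetical usage sketch (the class and variable names are just for illustration):

import java.util.Enumeration;
import java.util.Hashtable;

public class DictionaryStyleDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);

        // Legacy Dictionary-style traversal via Enumeration, inherited from Dictionary
        Enumeration<String> keys = table.keys();
        while (keys.hasMoreElements()) {
            String key = keys.nextElement();
            System.out.println(key + " = " + table.get(key));
        }
    }
}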

Without further ado, let's look at the Hashtable source code directly, starting with the member variables:

/**
 * The hash table data.
 * Bucket array for storing key-value pairs
 */
private transient Entry<K,V>[] table;

/**
 * The total number of entries in the hash table.
 * Total number of key-value pairs
 */
private transient int count;

/**
 * The table is rehashed when its size exceeds this threshold.
 * (The value of this field is (int)(capacity * loadFactor).)
 * Capacity threshold; exceeding it triggers a resize.
 */
private int threshold;

/**
 * The load factor for the hashtable.
 */
private float loadFactor;

/**
 * The number of times this Hashtable has been structurally modified;
 * used by the fail-fast mechanism.
 */
private transient int modCount = 0;

The member variables are similar to HashMap's, but HashMap's code is more standardized: it defines named constants such as the default load factor, default capacity, and maximum capacity. The following are the constructors:
public Hashtable(int initialCapacity, float loadFactor) {
    // You can specify the initial capacity and load factor
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal Load: " + loadFactor);

    if (initialCapacity == 0)
        initialCapacity = 1;                    // the minimum initial capacity is 1
    this.loadFactor = loadFactor;
    table = new Entry[initialCapacity];         // create the bucket array
    threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);   // initialize the capacity threshold
    useAltHashing = sun.misc.VM.isBooted() &&
            (initialCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
}

/**
 * Constructs a new, empty hashtable with the specified initial capacity
 * and default load factor (0.75).
 */
public Hashtable(int initialCapacity) {
    this(initialCapacity, 0.75f);               // the default load factor is 0.75
}

public Hashtable() {
    this(11, 0.75f);                            // the default capacity is 11, load factor 0.75
}

/**
 * Constructs a new hashtable with the same mappings as the given Map.
 * The hashtable is created with an initial capacity sufficient to
 * hold the mappings in the given Map and a default load factor (0.75).
 */
public Hashtable(Map<? extends K, ? extends V> t) {
    this(Math.max(2 * t.size(), 11), 0.75f);
    putAll(t);
}
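Here is a minimal hypothetical usage sketch of these constructors; the capacity and load-factor values are just examples:

import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class ConstructorDemo {
    public static void main(String[] args) {
        // Default constructor: capacity 11, load factor 0.75
        Hashtable<String, Integer> defaults = new Hashtable<>();

        // Specify a generous initial capacity to avoid repeated resizing
        Hashtable<String, Integer> sized = new Hashtable<>(1000);

        // Specify both the initial capacity and the load factor
        Hashtable<String, Integer> tuned = new Hashtable<>(64, 0.9f);

        // Copy an existing map
        Map<String, Integer> source = new HashMap<>();
        source.put("a", 1);
        Hashtable<String, Integer> copy = new Hashtable<>(source);

        System.out.println(defaults.size() + " " + sized.size() + " "
                + tuned.size() + " " + copy.size());
    }
}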
Notes:
1. The default Hashtable capacity is 11 and the default load factor is 0.75 (the default HashMap capacity is 16, with the same default load factor of 0.75).
2. The Hashtable capacity can be any positive integer, with a minimum of 1, while the HashMap capacity is always a power of 2.
3. To avoid the performance cost of repeated resizing, it is recommended to specify a reasonable initial capacity.
In addition, we can see that the Hashtable code is not as standardized as HashMap's: the constructors contain hard-coded numbers where HashMap defines constants. Like HashMap, Hashtable also has a static inner class named Entry, which is a key-value pair object holding references to the key and the value. It can also be understood as a singly linked list node, because it holds a reference to the next Entry object:
private static class Entry<K,V> implements Map.Entry<K,V> {   // key-value pair object
    int hash;            // hash value
    final K key;         // key
    V value;             // value
    Entry<K,V> next;     // reference to the next entry

    protected Entry(int hash, K key, V value, Entry<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    protected Object clone() {
        // clone by creating a new Entry directly
        return new Entry<>(hash, key, value,
                (next == null ? null : (Entry<K,V>) next.clone()));
    }

    // Map.Entry Ops

    public K getKey() {
        return key;
    }

    public V getValue() {
        return value;
    }

    public V setValue(V value) {
        // the value cannot be null
        if (value == null)
            throw new NullPointerException();

        V oldValue = this.value;
        this.value = value;
        return oldValue;
    }

    public boolean equals(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry<?,?> e = (Map.Entry) o;

        return key.equals(e.getKey()) && value.equals(e.getValue());
    }

    public int hashCode() {
        return hash ^ value.hashCode();
    }

    public String toString() {
        return key.toString() + "=" + value.toString();
    }
}
Again, both HashMap and Hashtable store key-value pair objects rather than separate keys or values. With the storage structure clear, let's look at the put and get methods:
public synchronized V put(K key, V value) {       // add a key-value pair to the hash table
    // Make sure the value is not null
    if (value == null) {                          // the value cannot be null
        throw new NullPointerException();
    }

    // Makes sure the key is not already in the hashtable.
    Entry tab[] = table;
    int hash = hash(key);                         // compute the hash from the key ----> if the key is null, this method throws an exception
    int index = (hash & 0x7FFFFFFF) % tab.length; // locate the bucket from the hash value
    for (Entry<K,V> e = tab[index]; e != null; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {   // if the key already exists, the new value overwrites the old one
            V old = e.value;
            e.value = value;
            return old;
        }
    }

    modCount++;
    if (count >= threshold) {                     // the current size exceeds the threshold, so we need to resize
        // Rehash the table if the threshold is exceeded
        rehash();                                 // rebuild the bucket array and rehash every key-value pair; this is expensive!

        tab = table;
        hash = hash(key);
        index = (hash & 0x7FFFFFFF) % tab.length; // the same modulo operation as above
    }

    // Creates the new entry.
    Entry<K,V> e = tab[index];
    tab[index] = new Entry<>(hash, key, value, e);   // insert the new node at the head of the list
    count++;
    return null;
}
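As a quick check of the overwrite behavior in the loop above, here is a small hypothetical usage sketch (class name is illustrative only):

import java.util.Hashtable;

public class PutDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();

        System.out.println(table.put("a", 1));   // null: the key was not present before
        System.out.println(table.put("a", 2));   // 1: the old value is returned and overwritten

        System.out.println(table.get("a"));      // 2
    }
}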
Notes:
1. Hashtable allows neither null keys nor null values; a NullPointerException is thrown for either. You may be wondering why the put method only checks the value at the beginning and does not check the key. I think this is an oversight by the designers, but it does not affect usage, because when the hash method is called with a null key, a NullPointerException is still thrown:
private int hash(Object k) {
    if (useAltHashing) {
        if (k.getClass() == String.class) {
            return sun.misc.Hashing.stringHash32((String) k);
        } else {
            int h = hashSeed ^ k.hashCode();

            h ^= (h >>> 20) ^ (h >>> 12);
            return h ^ (h >>> 7) ^ (h >>> 4);
        }
    } else {
        return k.hashCode();   // a NullPointerException may be thrown here if k is null
    }
}
2. HashMap computes the index as h & (length - 1), while Hashtable uses a modulo operation, which is less efficient than HashMap's bit masking (see the sketch below).
3. In addition, when Hashtable calculates the index, it first ANDs the hash value with 0x7FFFFFFF to ensure the result is always non-negative.
4. Note that this method, like several methods mentioned below, carries the synchronized keyword, which means Hashtable is a thread-safe class; this is also the biggest difference between it and HashMap.
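To make the difference in index calculation concrete, here is a small illustrative sketch (not JDK code; the table lengths 11 and 16 are just example values):

public class IndexCalcDemo {
    public static void main(String[] args) {
        int hash = "example-key".hashCode();   // may be negative

        // Hashtable style: mask off the sign bit, then take the remainder by the table length (e.g. 11)
        int hashtableIndex = (hash & 0x7FFFFFFF) % 11;

        // HashMap style: the table length is always a power of two (e.g. 16), so a bit mask suffices
        int hashMapIndex = hash & (16 - 1);

        System.out.println("Hashtable-style index: " + hashtableIndex);
        System.out.println("HashMap-style index:   " + hashMapIndex);
    }
}

Next we will look at the resize method, rehash: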
protected void rehash() {
    int oldCapacity = table.length;            // record the old capacity
    Entry<K,V>[] oldMap = table;               // record the old bucket array

    // overflow-conscious code
    int newCapacity = (oldCapacity << 1) + 1;  // the new capacity is twice the old capacity plus 1
    if (newCapacity - MAX_ARRAY_SIZE > 0) {
        if (oldCapacity == MAX_ARRAY_SIZE)     // the capacity cannot exceed the agreed maximum value
            // Keep running with MAX_ARRAY_SIZE buckets
            return;
        newCapacity = MAX_ARRAY_SIZE;
    }
    Entry<K,V>[] newMap = new Entry[newCapacity];   // create the new array

    modCount++;
    threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    boolean currentAltHashing = useAltHashing;
    useAltHashing = sun.misc.VM.isBooted() &&
            (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
    boolean rehash = currentAltHashing ^ useAltHashing;

    table = newMap;

    for (int i = oldCapacity; i-- > 0;) {      // transfer key-value pairs to the new array
        for (Entry<K,V> old = oldMap[i]; old != null;) {
            Entry<K,V> e = old;
            old = old.next;

            if (rehash) {
                e.hash = hash(e.key);
            }
            int index = (e.hash & 0x7FFFFFFF) % newCapacity;
            e.next = newMap[index];
            newMap[index] = e;
        }
    }
}
Each time Hashtable resizes, the new capacity is twice the old capacity plus 1, while HashMap simply doubles its capacity.
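As a quick illustration of that growth rule, here is a small hypothetical snippet (not JDK code) that prints the capacity sequence starting from the default capacity of 11:

public class CapacityGrowthDemo {
    public static void main(String[] args) {
        int capacity = 11;                    // default Hashtable capacity
        for (int i = 0; i < 5; i++) {
            System.out.println(capacity);
            capacity = (capacity << 1) + 1;   // same rule as rehash(): 2 * old + 1
        }
        // Prints: 11, 23, 47, 95, 191
    }
}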
Next, analyze the get method:
public synchronized V get(Object key) {            // look up the value for the given key
    Entry tab[] = table;
    int hash = hash(key);                          // compute the hash from the key
    int index = (hash & 0x7FFFFFFF) % tab.length;  // locate the bucket index
    for (Entry<K,V> e = tab[index]; e != null; e = e.next) {   // traverse the entry chain
        if ((e.hash == hash) && e.key.equals(key)) {           // if the key is found,
            return e.value;                                    // return the corresponding value
        }
    }
    return null;                                   // otherwise return null
}
Of course, if the key you pass in is null, a NullPointerException will be thrown. At this point, the most important parts have been covered.
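As a small illustration of the null handling described above, the following hypothetical snippet contrasts Hashtable with HashMap (class and variable names are just for the example):

import java.util.HashMap;
import java.util.Hashtable;

public class NullHandlingDemo {
    public static void main(String[] args) {
        HashMap<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "value");        // fine: HashMap allows one null key
        hashMap.put("key", null);          // fine: HashMap allows null values

        Hashtable<String, String> hashtable = new Hashtable<>();
        try {
            hashtable.put("key", null);    // throws NullPointerException (null value)
        } catch (NullPointerException e) {
            System.out.println("null value rejected");
        }
        try {
            hashtable.put(null, "value");  // throws NullPointerException (null key)
        } catch (NullPointerException e) {
            System.out.println("null key rejected");
        }
    }
}

The following are some other common methods, starting with remove and clear: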
public synchronized V remove(Object key) {         // remove the entry for the given key
    Entry tab[] = table;
    int hash = hash(key);                          // compute the hash
    int index = (hash & 0x7FFFFFFF) % tab.length;  // compute the index
    for (Entry<K,V> e = tab[index], prev = null; e != null; prev = e, e = e.next) {   // traverse the entry chain
        if ((e.hash == hash) && e.key.equals(key)) {   // found the specified key
            modCount++;
            if (prev != null) {                    // fix up the relevant pointers
                prev.next = e.next;
            } else {
                tab[index] = e.next;
            }
            count--;
            V oldValue = e.value;
            e.value = null;
            return oldValue;
        }
    }
    return null;
}

public synchronized void clear() {   // clear the bucket array
    Entry tab[] = table;
    modCount++;
    for (int index = tab.length; --index >= 0; )
        tab[index] = null;           // simply set each bucket to null
    count = 0;
}
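A short hypothetical usage sketch of the two methods above:

import java.util.Hashtable;

public class RemoveClearDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);

        System.out.println(table.remove("a"));   // 1: the removed value is returned
        System.out.println(table.remove("x"));   // null: the key was not present

        table.clear();                           // all buckets set to null
        System.out.println(table.size());        // 0
    }
}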

The following shows how to obtain its keySet and entrySet:
public Set<K> keySet() {
    if (keySet == null)
        // wrapped by Collections, so the returned key set is thread-safe
        keySet = Collections.synchronizedSet(new KeySet(), this);
    return keySet;
}

public Set<Map.Entry<K,V>> entrySet() {
    if (entrySet == null)
        // wrapped by Collections, so the returned entry set is thread-safe
        entrySet = Collections.synchronizedSet(new EntrySet(), this);
    return entrySet;
}

KeySet and EntrySet are two inner classes of Hashtable:
private class KeySet extends AbstractSet<K> {
    public Iterator<K> iterator() {
        return getIterator(KEYS);
    }
    public int size() {
        return count;
    }
    public boolean contains(Object o) {
        return containsKey(o);
    }
    public boolean remove(Object o) {
        return Hashtable.this.remove(o) != null;
    }
    public void clear() {
        Hashtable.this.clear();
    }
}
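Although each individual Hashtable method is synchronized, iterating over one of these views is not a single atomic operation, so the usual advice for synchronized collections applies: lock the backing table for the duration of the loop. A small hypothetical usage sketch:

import java.util.Hashtable;
import java.util.Map;

public class IterationDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);

        // Each call below is synchronized on its own, but the iteration as a whole is not,
        // so we hold the table's lock while looping to avoid concurrent modification.
        synchronized (table) {
            for (Map.Entry<String, Integer> entry : table.entrySet()) {
                System.out.println(entry.getKey() + " -> " + entry.getValue());
            }
        }
    }
}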


Summary:
1. Hashtable is a thread-safe class (HashMap is not thread-safe).
2. Hashtable allows neither null keys nor null values; passing null throws a NullPointerException (HashMap allows them).
3. Hashtable does not allow duplicate keys; if a key already exists, the newly inserted value overwrites the old one (same as HashMap).
4. Hashtable resolves hash collisions by chaining entries in linked lists.
5. When Hashtable computes an index from a hashCode, it first ANDs the value with 0x7FFFFFFF to ensure the result is always non-negative.
6. The Hashtable capacity can be any positive integer (the minimum is 1), while the HashMap capacity is always a power of 2. The default Hashtable capacity is 11, and the default HashMap capacity is 16.
7. Each time Hashtable resizes, the new capacity is twice the old capacity plus 1, while HashMap doubles its capacity.
8. The default load factor of both Hashtable and HashMap is 0.75.







