What's the difference between HashMap and Hashtable? I have both asked and been asked this question in interviews, and I have heard many different answers. Today I decided to write down what I consider the ideal answer.
Code version
Each version of the JDK improves on the last. The HashMap and Hashtable discussed in this article are based on JDK 1.7.0_67. Source: see here.
1. Time
Hashtable first appeared in JDK 1.1, and HashMap in JDK 1.2. Judged by the dimension of time, HashMap appeared later than Hashtable.
2. Author
The following are the authors of Hashtable.
The following code and comments are from java.util.Hashtable
* @author Arthur van Hoff
* @author Josh Bloch
* @author Neal Gafter
The following are the authors of HashMap.
The following code and comments are from java.util.HashMap
* @author Doug Lea
* @author Josh Bloch
* @author Arthur van Hoff
* @author Neal Gafter
You can see that HashMap has one extra author: the great Doug Lea. If you don't know who Doug Lea is, look here.
3. External interface (API)
Both HashMap and Hashtable are tool classes that implement key-value mappings based on a hash table. To discuss their differences, let's start by looking at how their exposed APIs are different.
3.1 Public Method
In the following two figures, I have drawn the class inheritance hierarchies of HashMap and Hashtable, and listed the public methods each class exposes for external invocation.
It can be seen that the inheritance hierarchies of the two classes differ somewhat. Although both implement the three interfaces Map, Cloneable, and Serializable, HashMap extends the abstract class AbstractMap, while Hashtable extends the abstract class Dictionary. The Dictionary class is already obsolete:
The following code and comments are from java.util.Dictionary
 * <strong>NOTE: This class is obsolete. New implementations should
 * implement the Map interface, rather than extending this class.</strong>
At the same time we see that Hashtable has two more public methods than HashMap. One is elements, which comes from the abstract class Dictionary; since that class is obsolete, the method is useless. The other is contains, which is also redundant, because it does exactly the same thing as containsValue. The code proves it:
The following code and comments are from java.util.Hashtable

public synchronized boolean contains(Object value) {
    if (value == null) {
        throw new NullPointerException();
    }

    Entry tab[] = table;
    for (int i = tab.length; i-- > 0;) {
        for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
            if (e.value.equals(value)) {
                return true;
            }
        }
    }
    return false;
}

public boolean containsValue(Object value) {
    return contains(value);
}
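A quick check confirms that the two methods are interchangeable. This is a small demo of my own (the class name `ContainsDemo` is mine, not from the JDK):

```java
import java.util.Hashtable;

public class ContainsDemo {
    public static void main(String[] args) {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("key", "value");

        // containsValue simply delegates to contains,
        // so the two always agree.
        System.out.println(table.contains("value"));      // true
        System.out.println(table.containsValue("value")); // true
        System.out.println(table.contains("missing"));    // false
    }
}
```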
So judged by their public methods, the two classes provide the same functionality: both offer a key-value mapping service; both can add, delete, look up, and update key-value pairs; both provide traversal views over the keys, the values, and the key-value pairs; and both support shallow copying (clone) and serialization.
3.2 null Key & null Value
HashMap supports null keys and null values, while Hashtable throws a NullPointerException when it encounters null. This is not because Hashtable has some special implementation-level reason that makes null keys and values impossible to support; it is simply that HashMap handles null specially at implementation time, treating the hash code of a null key as 0 and storing the pair in bucket 0 of the hash table. Let's look at the details in the put methods:
The following code and comments are from java.util.Hashtable

public synchronized V put(K key, V value) {
    // If value is null, throw NullPointerException
    if (value == null) {
        throw new NullPointerException();
    }
    // If key is null, a NullPointerException is thrown when key.hashCode() is called
    // ...
}
The following code and comments are from java.util.HashMap

public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    // When key is null, call putForNullKey for special handling
    if (key == null)
        return putForNullKey(value);
    // ...
}

private V putForNullKey(V value) {
    // When key is null, the pair is placed in bucket 0, i.e. table[0]
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(0, null, value, 0);
    return null;
}
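The difference is easy to observe from user code. Here is a small demo of my own (the class name `NullKeyDemo` is mine, not from the JDK):

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "null key is fine"); // stored in bucket 0
        hashMap.put("key", null);              // null value is fine too
        System.out.println(hashMap.get(null)); // prints "null key is fine"

        Map<String, String> hashtable = new Hashtable<>();
        try {
            hashtable.put(null, "boom");
        } catch (NullPointerException e) {
            System.out.println("Hashtable rejects null keys");
        }
        try {
            hashtable.put("key", null);
        } catch (NullPointerException e) {
            System.out.println("Hashtable rejects null values");
        }
    }
}
```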
4. Principle of implementation
This section discusses what is different between HashMap and Hashtable at the data structure and algorithm level.
4.1 Data Structures
Both HashMap and Hashtable use a hash table to store key-value pairs, and at the data-structure level they are essentially the same: each defines a private inner class Entry that implements Map.Entry, and each Entry object represents one key-value pair stored in the hash table.
An Entry object fully represents a key-value pair with four fields:
- K key: the key object
- V value: the value object
- int hash: the hash value of the key object
- Entry next: a reference to the next Entry object in the linked list; null means the current Entry is the last node of the list
It can be said that there are as many Entry objects as there are key-value pairs. So how do HashMap and Hashtable store these Entry objects so that we can find and modify them quickly? See the figure below.
The figure shows the memory layout of a HashMap/Hashtable with 8 buckets holding 5 key-value pairs. You can see that HashMap/Hashtable internally creates an array of Entry references to represent the hash table; the length of the array is the number of hash buckets. Each element of the array is an Entry reference, and as the fields of Entry show, each Entry is a node of a linked list holding a reference to the next Entry object.
From this we can conclude that HashMap/Hashtable internally implement the hash table with an Entry array, and that key-value pairs mapped to the same hash bucket (the same array position) are stored in an Entry linked list, which is how hash collisions are resolved.
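To make the layout concrete, here is a minimal, hand-rolled sketch of the same idea: an array of nodes with separate chaining. The class `SimpleChainedMap` and its members are my own names for illustration and do not appear in the JDK.

```java
// A minimal separate-chaining hash table, illustrating the Entry-array
// layout used by HashMap/Hashtable. A sketch, not production code.
public class SimpleChainedMap<K, V> {
    private static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next; // next node in the same bucket's list

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets; // one slot per hash bucket

    @SuppressWarnings("unchecked")
    public SimpleChainedMap(int capacity) {
        buckets = (Node<K, V>[]) new Node[capacity];
    }

    private int indexFor(K key) {
        // Mask off the sign bit, then take the modulus, as Hashtable does.
        return (key.hashCode() & 0x7FFFFFFF) % buckets.length;
    }

    public void put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) { n.value = value; return; } // overwrite
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // prepend to the chain
    }

    public V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) return n.value;
        }
        return null;
    }
}
```

Collisions simply grow the chain at one bucket, which is exactly why lookup degrades when many keys hash to the same slot.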
The following code and comments are from java.util.Hashtable

/**
 * The hash table data.
 */
private transient Entry<K,V>[] table;

The following code and comments are from java.util.HashMap

/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
As you can see from the code, the implementation of the two classes is consistent for the internal representation of a hash bucket.
4.2 Algorithm
The previous section covered the internal data structure used to represent the hash table. HashMap/Hashtable also need an algorithm that maps a given key to a particular hash bucket (array position), and an algorithm that expands the hash table (grows the array) once the number of key-value pairs reaches a certain threshold. This section compares how the two classes differ at the algorithm level.
The initial capacity and the growth on each expansion differ. Look at the code first:
The following code and comments are from java.util.Hashtable

// The default initial capacity of the hash table is 11
public Hashtable() {
    this(11, 0.75f);
}

protected void rehash() {
    int oldCapacity = table.length;
    Entry<K,V>[] oldMap = table;
    // Each expansion grows the capacity from n to 2n+1
    int newCapacity = (oldCapacity << 1) + 1;
    // ...
}

The following code and comments are from java.util.HashMap

// The default initial capacity of the hash table is 2^4 = 16
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

void addEntry(int hash, K key, V value, int bucketIndex) {
    // Each expansion doubles the capacity from n to 2n
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
    }
    // ...
}
You can see that the default initial capacity of Hashtable is 11, and each expansion grows it from n to 2n+1, while HashMap's default initial capacity is 16 and each expansion doubles it. Although I haven't listed the code, there is another difference: if an initial capacity is supplied at construction time, Hashtable uses the given size directly, while HashMap rounds it up to a power of two.
In other words, Hashtable tends toward prime or odd table sizes, while HashMap always uses a power of two as the size of the hash table. We know that when the size of a hash table is prime, simple modulo hashing distributes keys more uniformly (see this article for the proof), so Hashtable's choice of table size looks more sophisticated. On the other hand, when the modulus is a power of two, the modulo can be computed with a single bitwise operation, which is much faster than division. So measured by the cost of computing the bucket index, HashMap wins.
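HashMap's rounding of a requested capacity up to the next power of two can be sketched as follows. This is my own paraphrase using `Integer.highestOneBit`, similar in spirit to the JDK's internal rounding helper, not the JDK code itself:

```java
public class CapacityDemo {
    // Round up to the next power of two, as HashMap does for its table size.
    static int roundUpToPowerOfTwo(int n) {
        if (n <= 1) return 1;
        int highest = Integer.highestOneBit(n); // largest power of two <= n
        return (highest == n) ? n : highest << 1;
    }

    public static void main(String[] args) {
        // Where Hashtable would keep a requested capacity of 11 as-is,
        // HashMap would use 16.
        System.out.println(roundUpToPowerOfTwo(11)); // 16
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
    }
}
```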
So the fact is that HashMap fixes its table size to a power of two in order to speed up hashing. Of course, this worsens the problem of uneven hash distribution, and to compensate, HashMap makes some changes to its hash algorithm. Let's look at how Hashtable and HashMap, after obtaining the key object's hashCode, map it to a particular hash bucket (a position in the Entry array).
The following code and comments are from java.util.Hashtable

// The hash must not exceed Integer.MAX_VALUE, so keep only its low 31 bits
int hash = hash(key);
int index = (hash & 0x7FFFFFFF) % tab.length;

// Simply compute key.hashCode()
private int hash(Object k) {
    // hashSeed will be zero if alternative hashing is disabled.
    return hashSeed ^ k.hashCode();
}

The following code and comments are from java.util.HashMap

int hash = hash(key);
int i = indexFor(hash, table.length);

// After computing key.hashCode(), do some bit operations to reduce hash collisions
final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

// Taking the modulus no longer requires division
static int indexFor(int h, int length) {
    assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}
As we said, because HashMap uses a power of two for the table size, the modulo operation needs no division, only a bitwise AND. But since this aggravates hash collisions, HashMap performs some extra bit operations after calling the object's hashCode method to spread the bits around. Why these particular bit operations spread the data well is beyond the scope of this article; if you are interested, look here.
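The equivalence of the bitwise AND and the modulo is easy to verify for power-of-two table sizes. A small demo of my own:

```java
public class IndexDemo {
    public static void main(String[] args) {
        int length = 16; // a power of two, as in HashMap
        for (int h : new int[]{0, 1, 15, 16, 17, 12345, Integer.MAX_VALUE}) {
            // For non-negative h and power-of-two length,
            // h & (length - 1) equals h % length.
            System.out.println((h & (length - 1)) == (h % length)); // true
        }
    }
}
```

Note that this identity only holds because length is a power of two; that is exactly the constraint HashMap's table size enforces.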
If you read the code carefully, you will also notice that both HashMap and Hashtable use a variable called hashSeed when computing the hash. Because Entry objects mapped to the same hash bucket are stored as a linked list, and lookup in a linked list is slow, the performance of HashMap/Hashtable is very sensitive to hash collisions, so both support an optional alternative hashing mode (via hashSeed) to reduce collisions. Since this is a point the two classes share, this article does not expand on it; see here if interested. Incidentally, this optimization was removed in JDK 1.8, because in JDK 1.8 the Entry objects mapped to the same hash bucket (array position) can be stored in a red-black tree instead, which greatly speeds up lookup.
5. Thread Safety
We say that Hashtable is synchronized and HashMap is not; that is, Hashtable needs no extra synchronization when used from multiple threads, while HashMap does. So how does Hashtable achieve this?
The following code and comments are from java.util.Hashtable

public synchronized V get(Object key) {
    Entry tab[] = table;
    int hash = hash(key);
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<K,V> e = tab[index]; e != null; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return e.value;
        }
    }
    return null;
}

public Set<K> keySet() {
    if (keySet == null)
        keySet = Collections.synchronizedSet(new KeySet(), this);
    return keySet;
}
As you can see, it is fairly straightforward: exposed methods such as get are declared with the synchronized modifier, and traversal views such as keySet are wrapped for synchronization with Collections.synchronizedXxx.
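If you need the same single-lock behavior on a HashMap, the standard library offers the same wrapping idea directly. A small demo of my own using `Collections.synchronizedMap`:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SyncWrapDemo {
    public static void main(String[] args) {
        // Every method of the returned map is synchronized on a single lock,
        // much like Hashtable's synchronized methods.
        Map<String, Integer> syncMap =
                Collections.synchronizedMap(new HashMap<>());
        syncMap.put("a", 1);

        // Iteration is the exception: it must be manually synchronized
        // on the wrapper object.
        synchronized (syncMap) {
            for (Map.Entry<String, Integer> e : syncMap.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}
```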
6. Code Style
Judging by my own taste, HashMap's code is much cleaner than Hashtable's. The following Hashtable code, for example, strikes me as confusing; this style of code reuse is hard to accept:
The following code and comments are from java.util.Hashtable

/**
 * A hashtable enumerator class.  This class implements both the
 * Enumeration and Iterator interfaces, but individual instances
 * can be created with the Iterator methods disabled.  This is necessary
 * to avoid unintentionally increasing the capabilities granted a user
 * by passing an Enumeration.
 */
private class Enumerator<T> implements Enumeration<T>, Iterator<T> {
    Entry[] table = Hashtable.this.table;
    int index = table.length;
    Entry<K,V> entry = null;
    Entry<K,V> lastReturned = null;
    int type;

    /**
     * Indicates whether this Enumerator is serving as an Iterator
     * or an Enumeration.  (true -> Iterator).
     */
    boolean iterator;

    /**
     * The modCount value that the iterator believes that the backing
     * Hashtable should have.  If this expectation is violated, the iterator
     * has detected concurrent modification.
     */
    protected int expectedModCount = modCount;

    Enumerator(int type, boolean iterator) {
        this.type = type;
        this.iterator = iterator;
    }
    // ...
}
7. Hashtable is obsolete; do not use it in code anymore.
The following is from the class-level javadoc of Hashtable:
If a thread-safe implementation is not needed, it is recommended to use HashMap in place of Hashtable. If a thread-safe highly-concurrent implementation is desired, then it is recommended to use java.util.concurrent.ConcurrentHashMap in place of Hashtable.
Simply put: if you don't need thread safety, use HashMap; if you do need thread safety, use ConcurrentHashMap. Hashtable is obsolete, so do not use it in new code.
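A minimal example of the recommended replacement, using the standard java.util.concurrent API (note that `merge` requires Java 8 or later):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ChmDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

        // Atomic compound operations replace the "check then act" patterns
        // that would need external locking even on a Hashtable.
        counts.putIfAbsent("hits", 0);
        counts.merge("hits", 1, Integer::sum); // atomic increment

        System.out.println(counts.get("hits")); // 1
    }
}
```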
8. Continuous optimization
Although the public interfaces of HashMap and Hashtable should not change, or at least not often, every JDK version optimizes their internal implementations, such as the JDK 1.8 red-black tree optimization mentioned above. So use the newest JDK you can: besides the shiny new features, the ordinary APIs also get performance improvements.
Why optimize Hashtable at all if it is obsolete? Because old code still uses it, and after the optimization that old code gets a performance boost for free.
What's the difference between HashMap and Hashtable in Java? (Original reference from the Code Farmer network.)