Java's HashMap is very widely used. This article studies its implementation, with the eventual goal of obtaining quantitative data on its memory usage and performance, and then drawing conclusions about when HashMap is appropriate and when it should not be overused.
A HashMap is essentially an array in which each element is a linked list. When an element is added to a HashMap with the put method, the following steps are performed:
1. Compute a hash value from the hashCode provided by the key. The array index is derived from this hash value.
2. Add the new element to the linked list at that position in the array.
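The two steps above can be sketched with a deliberately simplified map. This is an illustration only, not the JDK source: the class name TinyChainedMap, the fixed table of 16 buckets, and the index calculation (masking the hashCode with length - 1) are assumptions made for the example, and the sketch never resizes.

```java
import java.util.Objects;

/** Simplified illustration of an array of linked lists; not the JDK's HashMap. */
final class TinyChainedMap<K, V> {
    /** One node of a bucket's singly linked list. */
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed number of buckets (assumption of this sketch); a power of two,
    // so that (hash & (length - 1)) is always a valid index.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Step 1: turn the key's hashCode into an array index.
    private int indexFor(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h & (table.length - 1);
    }

    /** Returns the previous value for the key, or null if there was none. */
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;   // key already present: overwrite in place
                return old;
            }
        }
        // Step 2: link the new node in at the head of the bucket's list.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    /** Walks the bucket's list and returns the value, or null if the key is absent. */
    V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

The strings "Aa" and "BB" have the same hashCode, so in this sketch they land in the same bucket and are told apart only by walking the list and comparing keys.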
Let's take a look at the definition of the array:
    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry[] table;
This is an array. The transient keyword tells us that it does not take part in default serialization. Because an array has a fixed length, storing too many elements in the HashMap means the array must eventually be resized. Let's look at the algorithms related to the array and its capacity.
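To see what transient means in practice, here is a small sketch, independent of HashMap. The classes TransientDemo and Holder are hypothetical names invented for this illustration. After a round trip through Java's default serialization, the transient field comes back as null while the ordinary field keeps its value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientDemo {
    /** Hypothetical holder class, used only for this illustration. */
    static final class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        String kept = "kept";
        transient String[] skipped = {"not", "serialized"};
    }

    /** Serializes the holder to a byte array and reads it back. */
    static Holder roundTrip(Holder original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Holder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder copy = roundTrip(new Holder());
        System.out.println(copy.kept);             // value survived serialization
        System.out.println(copy.skipped == null);  // transient field was not written
    }
}
```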
First, what is the Entry type?
    static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;

        /**
         * Creates new entry.
         */
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }
        ....
        public final boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry e = (Map.Entry)o;
            Object k1 = getKey();
            Object k2 = e.getKey();
            if (k1 == k2 || (k1 != null && k1.equals(k2))) {
                Object v1 = getValue();
                Object v2 = e.getValue();
                if (v1 == v2 || (v1 != null && v1.equals(v2)))
                    return true;
            }
            return false;
        }

        public final int hashCode() {
            return (key == null   ? 0 : key.hashCode()) ^
                   (value == null ? 0 : value.hashCode());
        }
        ....
This is a static nested class of HashMap that implements the Map.Entry interface. It takes two type parameters, K and V. The key and hash fields are final, so once they are set in the constructor they cannot be changed. The next field allows entries to form a singly linked list.
More important are the equals and hashCode methods. Their code is listed here first and explained later.
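The equals and hashCode rules in the listing match the contract documented for Map.Entry, so they can be observed through any standard implementation. The sketch below uses java.util.AbstractMap.SimpleEntry purely for convenience; it is not the class HashMap uses internally, and the helper name entryHash is invented for this example.

```java
import java.util.AbstractMap;
import java.util.Map;

final class EntryContractDemo {
    /** The hash formula from the listing, written as a standalone helper (hypothetical name). */
    static int entryHash(Object key, Object value) {
        return (key == null ? 0 : key.hashCode())
             ^ (value == null ? 0 : value.hashCode());
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> a = new AbstractMap.SimpleEntry<>("a", 1);
        Map.Entry<String, Integer> b = new AbstractMap.SimpleEntry<>("a", 1);
        Map.Entry<String, Integer> c = new AbstractMap.SimpleEntry<>("a", 2);

        System.out.println(a.equals(b));  // same key and same value
        System.out.println(a.equals(c));  // same key, different value
        System.out.println(a.hashCode() == entryHash("a", 1));
    }
}
```

Two entries are equal only when both the key and the value are equal, and the entry's hash is the XOR of the key's and the value's hash codes, with null treated as 0.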
Second, setting the initial capacity
Most of this logic is in the following constructor. The specified initialCapacity cannot be less than 0, and a value above the maximum is clamped to the maximum. The final capacity must be a power of 2. If you use the no-argument constructor, an array with 16 elements is created by default.
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }
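The rounding loop from the constructor can be tried in isolation. The following sketch copies only the capacity calculation into a hypothetical helper class, CapacityRounding, so that the effect of different requested capacities is easy to check.

```java
final class CapacityRounding {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * Smallest power of two that is >= requested, clamped to MAXIMUM_CAPACITY,
     * following the loop in the constructor above.
     */
    static int roundUp(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("Illegal initial capacity: " + requested);
        }
        if (requested > MAXIMUM_CAPACITY) {
            requested = MAXIMUM_CAPACITY;
        }
        int capacity = 1;
        while (capacity < requested) {
            capacity <<= 1;   // double until large enough
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 16, 17, 100}) {
            System.out.println(requested + " -> " + roundUp(requested));
        }
    }
}
```

A request for 17 elements therefore produces an array of 32, and a request for 100 produces an array of 128.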
Third, when should I adjust the array size?
The algorithm works like this. A variable size holds the number of key-value pairs actually stored in the map. When size reaches the value of the variable threshold, the array capacity must be expanded. threshold = capacity * loadFactor, where capacity is the length of the array. loadFactor can be specified in the constructor; otherwise the default value of 0.75f is used. The maximum capacity is 1 << 30 (2 to the power of 30, or 1073741824), so a HashMap can hold at most about 1 billion linked lists.
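As a quick check of the arithmetic, the sketch below evaluates the threshold formula for the default settings and for the maximum capacity. The class name ThresholdDemo is made up for this example; the formula itself is the one used in the constructor.

```java
final class ThresholdDemo {
    /** threshold = capacity * loadFactor, truncated to an int as in the constructor. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Default settings: 16 buckets and a load factor of 0.75f.
        System.out.println(threshold(16, 0.75f));
        // Maximum capacity, 1 << 30.
        System.out.println(1 << 30);
        System.out.println(threshold(1 << 30, 0.75f));
    }
}
```

With the defaults, the threshold is 12, so a default HashMap grows once it holds more than 12 entries, well before all 16 buckets could be occupied.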
Fourth, how to adjust the array size?
The answer is to double it, much like the growth policy that many C++ std::vector implementations use.
    void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }
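To get a feel for how often the doubling happens, the growth rule in addEntry can be simulated without storing any data. This is only a sketch: the class GrowthSimulation and its method are hypothetical, it assumes every inserted key is new, and it ignores the maximum capacity limit.

```java
final class GrowthSimulation {
    /**
     * Returns the table length after inserting n distinct keys,
     * applying the same "size++ >= threshold" check as addEntry.
     */
    static int capacityAfter(int n, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size++ >= threshold) {
                capacity *= 2;                              // resize(2 * table.length)
                threshold = (int) (capacity * loadFactor);  // recomputed for the new table
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int n : new int[] {12, 13, 24, 25, 100}) {
            System.out.println(n + " entries -> table length " + capacityAfter(n, 16, 0.75f));
        }
    }
}
```

With the default settings, the first doubling happens on the 13th insertion, because the check compares the size before the increment against the threshold of 12.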
Fifth, why must the array size be a power of 2?
This will be answered later when we introduce the hash algorithm.