"Java Source code Analysis" HashMap Source analysis

Source: Internet
Author: User
Tags: rehash, serialization

Class definition

public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {}

HashMap is a subclass of AbstractMap and also supports cloning and serialization.

    1. HashMap implements the Map interface, so it supports all Map operations, and it allows null values and a null key. HashMap and Hashtable are almost the same, with two differences: HashMap permits null keys and values, and HashMap is not thread-safe, whereas Hashtable is.
    2. Because the hash method determines the position of an object in the table, get and put run in constant time as long as the hash function spreads the keys evenly across the buckets. The cost of iteration is proportional to the capacity of the HashMap plus the number of entries it holds, so if iteration performance matters, do not set the initial capacity too high or the load factor too low.
    3. Two parameters affect HashMap performance: the load factor and the capacity. The capacity is the number of buckets in the table, and the load factor describes how full the table may get before it is enlarged. When the number of entries exceeds the product of the load factor and the capacity, the table is rehashed and its capacity is doubled.
    4. The default load factor of 0.75 usually gives a good trade-off between time and space. A higher value saves space, but lookups and insertions slow down because of more collisions. When creating a map, it is therefore worth estimating how many entries it will hold and what load factor to use. For example, if many entries will be stored, setting a large enough capacity from the start avoids repeated rehashing as the table grows.
    5. HashMap is not thread-safe, so in a multithreaded environment the caller must handle synchronization. This is usually done by locking on some object, or by wrapping the map with the Collections utility class: Map m = Collections.synchronizedMap(new HashMap(...)); It is best to do the wrapping when the map is created, so that no unsynchronized reference to it can be used by accident.
    6. HashMap iterators are fail-fast: if the map is structurally modified (entries added or removed) after an iterator has been created, other than through the iterator's own remove method, the iterator throws ConcurrentModificationException. This is the same fail-fast behavior seen in the earlier collection analyses.
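The wrapping approach from point 5 can be sketched as follows. This is only an illustration: the class name SyncMapSketch and the method fillConcurrently are made up for this example, and the thread and key counts are arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SyncMapSketch {
    // Fills one shared map from several threads and returns its final size.
    static int fillConcurrently(int threads, int perThread) {
        // Wrap at construction time so no unsynchronized reference escapes.
        Map<Integer, Integer> shared = Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread writes its own key range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 10_000));
    }
}
```

Because every key is distinct and every put goes through the synchronized wrapper, the final size equals threads * perThread. With a bare HashMap the same test may lose entries or corrupt the table.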

A few important member variables

static final int DEFAULT_INITIAL_CAPACITY = 16; // The default initial capacity - MUST be a power of two.
static final int MAXIMUM_CAPACITY = 1 << 30; // MUST be a power of two
static final float DEFAULT_LOAD_FACTOR = 0.75f;
transient Entry<K,V>[] table; // The table, resized as necessary. Length MUST Always be a power of two.
transient int size;

The default capacity is 16 and the default load factor is 0.75. As the comments state, the default initial capacity, the maximum capacity, and the capacity after every resize must all be powers of 2. In short, the capacity of a HashMap is a power of 2 at all times.

Constructors

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);

    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;

    this.loadFactor = loadFactor;
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    useAltHashing = sun.misc.VM.isBooted() &&
            (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
    init();
}

The while loop shows that even if the given initial capacity is not a power of 2, the shift operation finds the smallest power of 2 that is greater than or equal to initialCapacity and uses it as the actual capacity. The final call, init(), is made by every constructor and pseudo-constructor (clone(), deserialization, and so on). In HashMap it is an empty method; it exists as an initialization hook for subclasses.
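The rounding loop and the threshold computation can be tried in isolation. The sketch below copies only those two steps from the constructor; the class name CapacitySketch and its method names are illustrative and are not part of the JDK.

```java
class CapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= initialCapacity, clamped as in the constructor.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY; // without this the loop could overflow
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Number of entries at which the table is considered full.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(17);
        System.out.println(capacity + " buckets, threshold " + threshold(capacity, 0.75f));
    }
}
```

A request for 17 is rounded up to 32 buckets, which with the default load factor gives a threshold of 24 entries.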

public HashMap() {
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

public HashMap(Map<? extends K, ? extends V> m) {
    this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
                  DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
    putAllForCreate(m);
}

These three constructors are simple. The third uses the default load factor of 0.75 and picks a capacity large enough to hold every entry of the map passed in (but at least the default of 16), then copies the entries.

The hash() function

final int hash(Object k) {
    int h = 0;
    if (useAltHashing) {
        if (k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
        h = hashSeed;
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12); // unsigned right shift
    return h ^ (h >>> 7) ^ (h >>> 4);
}

A null key is always mapped to index 0. Note that hash() does not use the key's hashCode() result directly. It mixes the bits of that result further, mainly to guard against keys whose hashCode() implementation is of poor quality and would otherwise distribute entries unevenly.
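The effect of the extra bit mixing shows up with hash codes that differ only in their upper bits. The sketch below reuses the two shift-and-XOR lines from hash() (without the alternative-hashing branch); the class name HashSpreadSketch and the chosen sample hash codes are assumptions made for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

class HashSpreadSketch {
    // The supplemental mixing step from hash(), applied to a raw hashCode.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Counts the buckets hit by 16 hash codes that differ only in bits 16..19.
    static int distinctBuckets(boolean mix, int tableLength) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 0; i < 16; i++) {
            int rawHash = i << 16; // the low bits are all zero
            int h = mix ? spread(rawHash) : rawHash;
            buckets.add(h & (tableLength - 1));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("without mixing: " + distinctBuckets(false, 16));
        System.out.println("with mixing:    " + distinctBuckets(true, 16));
    }
}
```

Without mixing, all 16 sample values land in one bucket of a 16-slot table because only the low four bits select the bucket. With mixing, the upper bits influence the low bits and the same values occupy 16 different buckets.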

Computing the bucket index

static int indexFor(int h, int length) {
    return h & (length-1);
}

This is a trick. As noted earlier, length is always a power of 2, so its binary form is a single 1 bit followed by zeros, and ANDing a hash with length itself could only yield 0 or length. Subtracting one turns length into a mask whose low bits are all 1, so the AND keeps the low bits of the hash and the entries can spread over every bucket. This also differs from the modulo operation described in data-structure textbooks: the AND replaces the modulo, and masking with length - 1 guarantees that the result always lies between 0 and length - 1, so it never falls outside the table.
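A quick way to convince yourself that the mask behaves like a modulo is to compare it with Math.floorMod (available since Java 8). The class and helper names below are illustrative; the sample range is arbitrary.

```java
class IndexForSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Checks that masking equals a non-negative modulo for a power-of-two length.
    static boolean agreesWithFloorMod(int length) {
        for (int h = -1000; h <= 1000; h++) {
            if (indexFor(h, length) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(35, 16));  // 35 = 2 * 16 + 3
        System.out.println(indexFor(-7, 16));  // stays inside 0..15
        System.out.println(-7 % 16);           // the % operator would go negative
        System.out.println(agreesWithFloorMod(16));
    }
}
```

Unlike the % operator, the mask never produces a negative index, which matters because hash values can be negative.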

Getting a value

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    Entry<K,V> entry = getEntry(key);

    return null == entry ? null : entry.getValue();
}

final Entry<K,V> getEntry(Object key) {
    int hash = (key == null) ? 0 : hash(key);
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

private V getForNullKey() {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null)
            return e.value;
    }
    return null;
}

To look up a value by key, get() first checks whether the key is null; if so, it calls getForNullKey(). HashMap is backed by an array in which each slot is the head of a linked list. getForNullKey() is straightforward: it takes the list at table[0] (the entry for the null key is always stored in table[0], although table[0] may also hold entries for other keys), walks the list, and returns the value of the entry whose key is null.
If the key is not null, getEntry() does the lookup: it uses the hash value to pick the list, then walks that list, and returns null if no matching entry is found.

Adding a mapping

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

put() adds a key-value pair to the HashMap. If the key is not yet present, a new entry is created and added; otherwise the old value is replaced. A null return value is ambiguous: either the old value was null, or the key did not exist in the map.
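When the distinction matters, one option is to ask containsKey() before calling put(). The helper below is a hypothetical wrapper written for this article, not something HashMap provides.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnSketch {
    // Reports what a put() actually did, resolving the ambiguous null return.
    static String describePut(Map<String, String> map, String key, String value) {
        boolean existed = map.containsKey(key);
        String old = map.put(key, value);
        if (!existed) {
            return "inserted";
        }
        return old == null ? "replaced null" : "replaced " + old;
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        System.out.println(describePut(m, "a", null)); // key was absent
        System.out.println(describePut(m, "a", "x"));  // key present, old value null
        System.out.println(describePut(m, "a", "y"));  // key present, old value "x"
    }
}
```

The first two calls both make put() return null, yet only the first one inserts a new entry.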

private V putForNullKey(V value) {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(0, null, value, 0);
    return null;
}

putForNullKey() shows that an entry with a null key is always added to table[0]. If an entry with a null key already exists, its value is replaced, so a map holds at most one null key.
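This behavior is easy to observe from the outside, and it contrasts with Hashtable, which rejects null keys. The class and method names below exist only for this example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeySketch {
    // Puts two values under the null key; returns the survivor if only one entry exists.
    static Integer lastValueForNullKey() {
        Map<String, Integer> m = new HashMap<>();
        m.put(null, 1);
        m.put(null, 2); // replaces the value of the existing null-key entry
        return m.size() == 1 ? m.get(null) : null;
    }

    // Hashtable throws NullPointerException for a null key.
    static boolean hashtableRejectsNullKey() {
        Map<String, Integer> t = new Hashtable<>();
        try {
            t.put(null, 1);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lastValueForNullKey());
        System.out.println(hashtableRejectsNullKey());
    }
}
```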

Resizing

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    boolean oldAltHashing = useAltHashing;
    useAltHashing |= sun.misc.VM.isBooted() &&
            (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
    boolean rehash = oldAltHashing ^ useAltHashing;
    transfer(newTable, rehash);
    table = newTable;
    threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

resize() moves the existing entries into a new array and is called automatically when the table runs out of room. If the current capacity has already reached MAXIMUM_CAPACITY, no resize takes place. The operation has to allocate a larger array and remap every entry into it (the work done in transfer(), which recomputes each entry's index and, when rehash is set, its hash value as well), so it is relatively expensive.
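One consequence of doubling a power-of-two table is worth seeing in numbers: when the stored hash is reused, an entry either keeps its index or moves up by exactly the old capacity. The sketch below checks this with the same indexFor() mask; the class name and the sample hashes are assumptions for illustration only.

```java
class ResizeSplitSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if doubling the table keeps the index or shifts it by oldCapacity.
    static boolean staysOrMovesByOldCapacity(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, 2 * oldCapacity);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    // Applies the check to a range of sample hashes, including negative ones.
    static boolean holdsForSamples(int oldCapacity) {
        for (int hash = -5000; hash <= 5000; hash++) {
            if (!staysOrMovesByOldCapacity(hash, oldCapacity)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Four hashes that share bucket 3 in a 16-slot table.
        for (int hash : new int[] {3, 19, 35, 51}) {
            System.out.println(hash + ": " + indexFor(hash, 16) + " -> " + indexFor(hash, 32));
        }
    }
}
```

In the example, 3 and 35 stay in bucket 3 while 19 and 51 move to bucket 19, so one chain is split across two buckets of the larger table.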

Removing a mapping

public V remove(Object key) {
    Entry<K,V> e = removeEntryForKey(key);
    return (e == null ? null : e.value);
}

final Entry<K,V> removeEntryForKey(Object key) {
    int hash = (key == null) ? 0 : hash(key);
    int i = indexFor(hash, table.length);
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;

    while (e != null) {
        Entry<K,V> next = e.next;
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++;
            size--;
            if (prev == e)
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }

    return e;
}

remove() deletes the key-value pair for the given key and returns null if there is none. It first uses the hash of the key to find the slot in table, then walks the list while tracking the predecessor and successor of the current node; when it meets the entry for the given key, it unlinks it. The removal is essentially deletion from a singly linked list.
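The unlinking step can be isolated in a minimal singly linked list. This sketch follows the same prev/e/next pattern as removeEntryForKey() but is not the JDK code; Node, chain, remove, and render are names invented for the example, and keys are assumed to be non-null.

```java
class ChainRemoveSketch {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Builds a chain in the given key order.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Unlinks the first node with the given key and returns the (possibly new) head.
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e) {
                    head = next;      // removing the head: the slot points at the successor
                } else {
                    prev.next = next; // bypass the removed node
                }
                return head;
            }
            prev = e;
            e = next;
        }
        return head; // key not found, chain unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(remove(chain("a", "b", "c"), "b")));
    }
}
```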

Clearing the HashMap

public void clear() {
    modCount++;
    Entry[] tab = table;
    for (int i = 0; i < tab.length; i++)
        tab[i] = null;
    size = 0;
}

This implementation is simple: it sets every slot to null, which cuts each list off from the table. The links between the entries themselves, and the keys and values they hold, are not cleared one by one.

Checking whether a value is present

public boolean containsValue(Object value) {
    if (value == null)
        return containsNullValue();

    Entry[] tab = table;
    for (int i = 0; i < tab.length ; i++)
        for (Entry e = tab[i] ; e != null ; e = e.next)
            if (value.equals(e.value))
                return true;
    return false;
}

Returns true if one or more keys map to the given value, otherwise false. Note that this is a nested loop: the outer loop walks the slots table[0] to table[length - 1], and the inner loop walks the linked list at table[i].

The Entry class for key-value pairs

static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;
    ......
    public final boolean equals(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry e = (Map.Entry)o;
        Object k1 = getKey();
        Object k2 = e.getKey();
        if (k1 == k2 || (k1 != null && k1.equals(k2))) {
            Object v1 = getValue();
            Object v2 = e.getValue();
            if (v1 == v2 || (v1 != null && v1.equals(v2)))
                return true;
        }
        return false;
    }
    .....
}

Entry is a static nested class. Each Entry in a HashMap is one key-value pair, and its member variable next points to the next node in the list. The test for whether two entries are equal is also straightforward: compare the keys first and, if they are equal, compare the values.
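The same key-then-value rule is part of the Map.Entry contract, so it can be observed with the public AbstractMap.SimpleEntry class instead of HashMap's private Entry. The class and method names in this sketch are illustrative.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;

class EntryEqualitySketch {
    // Two entries are equal only if both their keys and their values are equal.
    static boolean sameEntry(Object k1, Object v1, Object k2, Object v2) {
        Map.Entry<Object, Object> a = new AbstractMap.SimpleEntry<>(k1, v1);
        Map.Entry<Object, Object> b = new AbstractMap.SimpleEntry<>(k2, v2);
        return a.equals(b);
    }

    // The entry set of a map containing ("k", 1) uses the same rule for contains().
    static boolean entrySetContains(String key, Integer value) {
        Map<String, Integer> m = new HashMap<>();
        m.put("k", 1);
        return m.entrySet().contains(new AbstractMap.SimpleEntry<>(key, value));
    }

    public static void main(String[] args) {
        System.out.println(sameEntry("k", 1, "k", 1));
        System.out.println(sameEntry("k", 1, "k", 2));
        System.out.println(entrySetContains("k", 1));
    }
}
```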

Adding entries to the HashMap

void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }

    createEntry(hash, key, value, bucketIndex);
}

void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}

This is where the table grows. As the resize(2 * table.length) call shows, when the size has reached the threshold and the target bucket is already occupied, the table is expanded to twice its previous capacity.
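To get a feel for how often doubling happens, the growth rule can be simulated without a real map. The sketch below is a simplified model: it resizes as soon as size reaches the threshold and ignores the extra check that the target bucket is occupied. ResizeCountSketch and resizesFor are hypothetical names.

```java
class ResizeCountSketch {
    // Counts the doublings needed to insert `entries` mappings one by one.
    // Simplification: ignores the bucket-occupancy condition in addEntry().
    static int resizesFor(int initialCapacity, float loadFactor, int entries) {
        int capacity = initialCapacity; // assumed to be a power of two
        int resizes = 0;
        for (int size = 0; size < entries; size++) {
            int threshold = (int) (capacity * loadFactor);
            if (size >= threshold) {
                capacity *= 2;
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println("default capacity: " + resizesFor(16, 0.75f, 1000));
        System.out.println("presized to 2048: " + resizesFor(2048, 0.75f, 1000));
    }
}
```

Under this model, 1000 insertions into a default-sized map go through 7 doublings (16 up to 2048), while a map created with capacity 2048 never resizes, which is the reason for presizing large maps.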

Iterators

private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;        // next entry to return
    int expectedModCount;   // For fast-fail
    int index;              // current slot
    Entry<K,V> current;     // current entry
    ......
}

The iterator implementations of the collection classes are all similar, so reading one of them is enough. This one records the expected modification count, expectedModCount. Once the iterator has been created, no operation that increments modCount may run on the map during iteration; otherwise the iterator fails fast.
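The fail-fast check and its safe alternative can both be demonstrated with the public API. The class and method names are made up for this sketch; the three sample keys are arbitrary.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastSketch {
    static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        return m;
    }

    // Adding a new key while iterating changes modCount behind the iterator's back.
    static boolean putDuringIterationThrows() {
        Map<String, Integer> m = sample();
        try {
            for (String key : m.keySet()) {
                m.put("new-" + key, 0); // structural modification
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // raised by the next call to the iterator's next()
        }
    }

    // Removing through the iterator keeps expectedModCount in sync.
    static int removeThroughIterator() {
        Map<String, Integer> m = sample();
        Iterator<String> it = m.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        return m.size();
    }

    public static void main(String[] args) {
        System.out.println(putDuringIterationThrows());
        System.out.println(removeThroughIterator());
    }
}
```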

Serialization and deserialization: saving and reading the state of a HashMap

private void writeObject(java.io.ObjectOutputStream s)
    throws IOException
{
    Iterator<Map.Entry<K,V>> i =
        (size > 0) ? entrySet0().iterator() : null;

    // Write out the threshold, loadfactor, and any hidden stuff
    s.defaultWriteObject();

    // Write out number of buckets
    s.writeInt(table.length);

    // Write out size (number of Mappings)
    s.writeInt(size);

    // Write out keys and values (alternating)
    if (size > 0) {
        for (Map.Entry<K,V> e : entrySet0()) {
            s.writeObject(e.getKey());
            s.writeObject(e.getValue());
        }
    }
}

private void readObject(java.io.ObjectInputStream s)
    throws IOException, ClassNotFoundException
{
    // Read in the threshold (ignored), loadfactor, and any hidden stuff
    s.defaultReadObject();
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new InvalidObjectException("Illegal load factor: " +
                                         loadFactor);

    // Set hashSeed (can only happen after VM boot)
    Holder.UNSAFE.putIntVolatile(this, Holder.HASHSEED_OFFSET,
            sun.misc.Hashing.randomHashSeed(this));

    // Read in number of buckets and allocate the bucket array;
    s.readInt(); // ignored

    // Read number of mappings
    int mappings = s.readInt();
    if (mappings < 0)
        throw new InvalidObjectException("Illegal mappings count: " +
                                         mappings);

    int initialCapacity = (int) Math.min(
            // capacity chosen by number of mappings
            // and desired load (if >= 0.25)
            mappings * Math.min(1 / loadFactor, 4.0f),
            // we have limits...
            HashMap.MAXIMUM_CAPACITY);
    int capacity = 1;
    // Find smallest power of two which holds all mappings
    while (capacity < initialCapacity) {
        capacity <<= 1;
    }

    table = new Entry[capacity];
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    useAltHashing = sun.misc.VM.isBooted() &&
            (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);

    init();  // Give subclass a chance to do its thing.

    // Read the keys and values, and put the mappings in the HashMap
    for (int i = 0; i < mappings; i++) {
        K key = (K) s.readObject();
        V value = (V) s.readObject();
        putForCreate(key, value);
    }
}

Deserialization is the more complex of the two. It reads the basic data (load factor, number of buckets, number of mappings) and validates it, creates a new table sized from that data, and then re-inserts the key-value pairs read from the stream into the HashMap.
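From the caller's side, all of this happens behind ObjectOutputStream and ObjectInputStream. The round trip below is a minimal sketch using in-memory streams; SerializationSketch and roundTrip are names chosen for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

class SerializationSketch {
    // Serializes the map to a byte array and reads a fresh copy back.
    static HashMap<String, Integer> roundTrip(HashMap<String, Integer> original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original); // triggers HashMap.writeObject
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                HashMap<String, Integer> copy =
                        (HashMap<String, Integer>) in.readObject(); // triggers readObject
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put(null, 2); // the null key survives the round trip as well
        HashMap<String, Integer> copy = roundTrip(m);
        System.out.println(copy.equals(m) && copy != m);
    }
}
```

The copy is a distinct object whose entries were re-inserted one by one, so it is equal to the original without sharing its table.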