Java 7 Source Code Analysis, Part 10

Last updated: 2014-01-13. Source: Internet


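In day-to-day work and in interviews we tend to focus on the characteristics of each collection: for example, a Set cannot store duplicate elements and (in the case of HashSet) preserves neither insertion order nor sorted order, while a Map stores key-value pairs. But if you want to learn Java well, or aim higher and master it, knowing only this is far from enough. There is a great deal more about the collections that we need to understand.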

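We start with the implementation of Map rather than Set, because the two are implemented very similarly and many Set methods do their work by calling methods on a Map. Why? Because a Map can be regarded as a Set: the elements of a Set are unique and unordered, and so are the keys of a Map. If we treat each value in a Map as an attachment to its key, the keys taken together form a Set.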
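To make the "a Set is a Map whose values are ignored" idea concrete, here is a minimal sketch. `MapBackedSet` and its `PRESENT` marker are hypothetical names chosen for illustration; the sketch mirrors the idea only and is not a copy of the JDK's HashSet source.

```java
import java.util.HashMap;

// Hypothetical class for illustration: a set that stores its elements as map keys.
final class MapBackedSet<E> {
    // Shared dummy value: only the keys carry information.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<E, Object>();

    // Returns true if the element was not already present.
    boolean add(E e) { return map.put(e, PRESENT) == null; }

    boolean contains(Object o) { return map.containsKey(o); }

    // Returns true if the element was present and has been removed.
    boolean remove(Object o) { return map.remove(o) == PRESENT; }

    int size() { return map.size(); }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<String>();
        System.out.println(set.add("a"));   // true: newly added
        System.out.println(set.add("a"));   // false: duplicate key, nothing added
        System.out.println(set.size());     // 1
    }
}
```

Every set operation delegates to the map, so uniqueness of elements follows directly from uniqueness of keys.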

Here are the class diagrams of the two hierarchies:

[Figure: Set collection hierarchy]

[Figure: Map collection hierarchy]

First, the Map interface. Its source code is as follows:

public interface Map<K,V> {
    // Query Operations
    int size();
    boolean isEmpty();
    boolean containsKey(Object key);
    boolean containsValue(Object value);
    V get(Object key);

    // Modification Operations
    V put(K key, V value);
    V remove(Object key);

    // Bulk Operations
    /*
     * The behavior of this operation is undefined if the
     * specified map is modified while the operation is in progress.
     */
    void putAll(Map<? extends K, ? extends V> m);
    void clear();

    // Views
    // Keys are unique and unordered, so all the keys of a Map form a Set
    Set<K> keySet();
    Collection<V> values();
    Set<Map.Entry<K, V>> entrySet();

    interface Entry<K,V> {
        K getKey();
        V getValue();
        V setValue(V value);
        boolean equals(Object o);
        int hashCode();
    }

    // Comparison and hashing
    boolean equals(Object o);
    int hashCode();
}

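The interface has a values() method, which returns all the values in the Map; a keySet() method, which returns all the keys; and an entrySet() method, which returns all the key-value pairs as a Set. To represent a key-value pair, the interface also defines a nested Entry<K,V> interface with methods for working with the key and the value.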
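A short usage sketch of the three views may help here. The class name `MapViewsDemo` and the helper `doubleAll` are made up for this example; only the `java.util.Map` methods themselves come from the interface above.

```java
import java.util.HashMap;
import java.util.Map;

final class MapViewsDemo {
    // Walks entrySet(), doubles each value in place via setValue(),
    // and returns the sum of the doubled values.
    static int doubleAll(Map<String, Integer> map) {
        int total = 0;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            e.setValue(e.getValue() * 2);   // writes through to the map
            total += e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(map.keySet().size());   // 2 keys
        System.out.println(map.values().size());   // 2 values
        System.out.println(doubleAll(map));        // 6
        System.out.println(map.get("b"));          // 4
    }
}
```

Changing a value through an Entry is not a structural modification, so it is safe to do while iterating.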

public class HashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable
{
    // The default initial capacity - MUST be a power of two.
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final float DEFAULT_LOAD_FACTOR = 0.75f; // the default load factor

    // The table, resized as necessary. Length MUST Always be a power of two.
    // An Entry array stores the key-value pairs, much as ArrayList uses an Object[] for its elements
    transient Entry[] table;
    transient int size;

    // The next size value at which to resize (capacity * load factor).
    // The maximum number of mappings the HashMap holds before it is resized
    int threshold;

    // The load factor
    final float loadFactor;
    transient int modCount;
}

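The fields above are the important ones. loadFactor is the load factor. Increasing it reduces the memory used by the hash table (that is, the Entry array) but increases lookup time, and lookups are the most frequent operation. Decreasing it speeds up lookups but increases the memory used by the hash table. The default of 0.75 is therefore generally used.
threshold is the maximum number of key-value pairs the HashMap holds before it grows: once size reaches threshold, the table must be resized.
HashMap provides several constructors: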

    // The actual capacity of the new HashMap is always >= initialCapacity,
    // and equals it exactly when initialCapacity is a power of two
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)  // smallest power of two that is >= initialCapacity
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor); // set the resize threshold
        table = new Entry[capacity];
        init();
    }

    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        table = new Entry[DEFAULT_INITIAL_CAPACITY];
        init();
    }

    public HashMap(Map<? extends K, ? extends V> m) {
        this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1, DEFAULT_INITIAL_CAPACITY),
             DEFAULT_LOAD_FACTOR);
        putAllForCreate(m);
    }
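The rounding loop and the threshold formula from the first constructor can be tried in isolation. The sketch below copies those two computations into a hypothetical `CapacityDemo` class; it assumes the argument has already been validated and clamped to MAXIMUM_CAPACITY, as the constructor does before reaching the loop.

```java
final class CapacityDemo {
    // Same loop as the constructor: smallest power of two >= initialCapacity.
    // Assumes 0 <= initialCapacity <= 1 << 30 (the constructor enforces this first).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Number of mappings at which the table is due to be resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int[] requested = {1, 10, 16, 17, 100};
        for (int n : requested) {
            int capacity = roundUpToPowerOfTwo(n);
            System.out.println(n + " -> capacity " + capacity
                    + ", threshold " + threshold(capacity, 0.75f));
        }
    }
}
```

For instance, a requested capacity of 10 is rounded up to 16, giving a threshold of 12 at the default load factor, while 17 is rounded up to 32.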

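As the first constructor shows, the actual capacity is generally larger than the initialCapacity we specify, unless initialCapacity is already a power of two. Now for how HashMap works. Near the top of the class a transient Entry[] array is declared; Entry is implemented as follows: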

    // Note: as listed, this class matches the Entry class of java.util.Hashtable
    // (protected constructor, clone(), null check in setValue). HashMap itself
    // permits null values, and its put() calls e.recordAccess(this), which this
    // listing does not define.
    private static class Entry<K,V> implements Map.Entry<K,V> {
        int hash;
        K key;
        V value;
        Entry<K,V> next;   // next Entry in the same bucket

        protected Entry(int hash, K key, V value, Entry<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        protected Object clone() {
            return new Entry<>(hash, key, value,
                               (next==null ? null : (Entry<K,V>) next.clone()));
        }

        // Map.Entry Ops
        public K getKey() {
            return key;
        }

        public V getValue() {
            return value;
        }

        public V setValue(V value) {
            if (value == null)
                throw new NullPointerException();
            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }

        public boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry e = (Map.Entry)o;
            return (key==null ? e.getKey()==null : key.equals(e.getKey())) &&
                   (value==null ? e.getValue()==null : value.equals(e.getValue()));
        }

        public int hashCode() {
            return hash ^ (value==null ? 0 : value.hashCode());
        }

        public String toString() {
            return key.toString()+"="+value.toString();
        }
    }

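With the basic structure for storing key-value pairs in mind, we can turn to how they are stored. As the name suggests, HashMap stores them in a hash table, using separate chaining: put simply, an array combined with linked lists. Each array slot holds a linked list; when an item is hashed, the hash yields an array index and the item is placed on the list at that index. Take the following snippet, which puts several key-value pairs into a HashMap: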

    HashMap<String, Double> map = new HashMap<String, Double>();
    map.put("語文", 80.0);  // Chinese
    map.put("數學", 89.0);  // Math
    map.put("英語", 78.2);  // English

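HashMap uses a so-called "hash algorithm" to decide where each element is stored.
When the program executes map.put("語文", 80.0), the system calls hashCode() on "語文" to obtain its hash code. Every Java object has a hashCode() method through which its hash code can be obtained. The system then uses that hash code to decide where the element is stored.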
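As a small illustration of what hashCode() does and does not promise, the sketch below uses a hypothetical `HashCodeDemo` class. Equal objects must return equal hash codes, but unequal objects may return the same one; the strings "Aa" and "BB" are a well-known pair that collide under String.hashCode().

```java
final class HashCodeDemo {
    static boolean sameHashCode(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Equal objects must report equal hash codes, even if they are distinct instances.
        System.out.println(sameHashCode("key", new String("key")));   // true
        // Unequal objects may still collide: both strings hash to 2112.
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        System.out.println("Aa".equals("BB"));                        // false
    }
}
```

This is why a hash table cannot rely on the hash code alone and still has to compare keys with equals().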

Here is the source code of put(K key, V value) in HashMap:

    // When a mapping is added, the hashCode of the key determines where the Entry is stored.
    // When two keys have the same hash, equals() is consulted: false extends the Entry chain,
    // true overwrites the existing value.
    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

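When deciding where to store a key-value pair, HashMap pays no attention to the value in the Entry; it computes the storage location from the key alone. This confirms the earlier point: we can treat the value in a Map as an attachment to its key, and once the system has decided where the key goes, the value is simply stored along with it.
put() passes the return value of hashCode() through hash() to compute the final hash, and then calls indexFor() to work out which index of the table array the object belongs at. The source of both methods: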
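The two return paths of put() are easy to observe from the outside. In this sketch the class `PutReturnDemo` and its helper `overwrite` are hypothetical; it only calls the public `java.util.HashMap` API.

```java
import java.util.HashMap;
import java.util.Map;

final class PutReturnDemo {
    // Puts the same key twice and returns what the second put() reports.
    static Double overwrite(Map<String, Double> map, String key, double first, double second) {
        map.put(key, first);
        return map.put(key, second);
    }

    public static void main(String[] args) {
        Map<String, Double> map = new HashMap<String, Double>();
        System.out.println(map.put("math", 89.0));   // null: no previous mapping
        System.out.println(map.put("math", 95.5));   // 89.0: the old value is returned
        System.out.println(map.size());              // 1: the key was not added twice
        System.out.println(map.put(null, 1.0));      // null: the null key takes the putForNullKey path
        System.out.println(map.get(null));           // 1.0
    }
}
```

The first put of a key reaches addEntry() and returns null; a later put of an equal key takes the overwrite branch inside the loop and returns the previous value.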

    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length-1);
    }

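Here length is table.length, which is always a power of two, so h & (length-1) is guaranteed to produce an index within the bounds of the table array. An index function does not have to take this form, but a good one should also distribute indices as evenly as possible across the array.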
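To see why the supplemental hash() matters, the following sketch copies the two functions into a hypothetical `IndexDemo` class and feeds them hash codes that differ only above the low four bits. The use of `Math.floorMod` for comparison is an addition of this example and needs Java 8 or later.

```java
final class IndexDemo {
    // Supplemental hash, copied from the listing above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // These hash codes share their low 4 bits, so masking alone puts them all in bucket 0.
        int[] hashCodes = {0x10, 0x20, 0x30, 0x40};
        for (int hc : hashCodes) {
            System.out.println("0x" + Integer.toHexString(hc)
                    + ": raw index " + indexFor(hc, length)
                    + ", index after hash() " + indexFor(hash(hc), length));
        }
        // For a power-of-two length the mask equals a non-negative modulo, even for negative h.
        System.out.println(indexFor(-7, length) == Math.floorMod(-7, length));   // true
    }
}
```

Without hash(), only the low bits of hashCode() would influence the bucket; the shifts fold higher bits down so that such keys land in different buckets.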

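When put() adds a key-value pair to a HashMap, the return value of the key's hashCode() determines where the pair (that is, the Entry object) is stored. When the keys of two Entry objects return the same hashCode(), comparing the keys with equals() decides whether the new value overwrites the old one (equals() returns true) or an Entry chain is formed (equals() returns false). A newly added Entry is placed at the head of the chain, which is done by addEntry(); its source is as follows: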

    void addEntry(int hash, K key, V value, int bucketIndex) {
        // Fetch the Entry currently at index bucketIndex
        Entry<K,V> e = table[bucketIndex];
        // Put the new Entry at index bucketIndex and make it point to the previous Entry
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        // If the number of key-value pairs in the Map has reached the threshold
        if (size++ >= threshold)
            resize(2 * table.length);  // double the length of the table
    }

The program always places a newly added Entry at index bucketIndex of the table array. If an Entry is already there, the new Entry points to it, forming an Entry chain. If there is none, e is null, so the new Entry points to null and no chain is formed.
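Head insertion can be demonstrated with a deliberately simplified table. `TinyChainedMap`, `Node` and `chainOf` are hypothetical names; the sketch assumes a fixed 16-slot table, no resizing, no null keys and no supplemental hash, so it illustrates the chaining idea rather than reproducing HashMap.

```java
// Hypothetical, simplified chained hash table: fixed size, no resizing, no null keys.
final class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        final Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexFor(Object key) {
        return key.hashCode() & (table.length - 1);
    }

    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {            // same key: overwrite and report the old value
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<K, V>(key, value, table[i]);   // new head points at the old head
        return null;
    }

    // Lists the keys of the bucket that the given key maps to, head first.
    String chainOf(K key) {
        StringBuilder sb = new StringBuilder();
        for (Node<K, V> e = table[indexFor(key)]; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> map = new TinyChainedMap<String, Integer>();
        map.put("Aa", 1);
        map.put("BB", 2);   // same hashCode() as "Aa", so it lands in the same bucket
        System.out.println(map.chainOf("Aa"));   // BB -> Aa
    }
}
```

Because "BB" was added last, it sits at the head of the chain and points to the older "Aa" node.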

Next, how HashMap reads data. The source of get() is as follows:

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        // Walk the Entry chain in the bucket. A chain with several entries must be
        // traversed in order, which slows the lookup. A very long chain means hash
        // collisions are frequent, so a better hash function or a larger table is needed.
        for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

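A HashMap performs best when every bucket holds only a single Entry. To fetch the value for a key, the program computes the key's hashCode(), uses it to find the key's index in the table array, and then walks the chain at that index looking for an Entry whose hash and key both match, returning its value.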
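The worst case for get() is a key type whose hashCode() sends everything to one bucket. The sketch below uses a hypothetical `BadKey` class with a constant hash code to show that lookups stay correct, because equals() still distinguishes the keys, even though every lookup has to search a single crowded bucket.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeyDemo {
    // Hypothetical key type: every instance reports the same hash code.
    static final class BadKey {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    // Fills a map with colliding keys and looks one of them up.
    static String lookup(int entries, int wanted) {
        Map<BadKey, String> map = new HashMap<BadKey, String>();
        for (int i = 0; i < entries; i++)
            map.put(new BadKey(i), "value-" + i);
        return map.get(new BadKey(wanted));
    }

    public static void main(String[] args) {
        System.out.println(lookup(1000, 7));      // value-7
        System.out.println(lookup(1000, 5000));   // null: no such key
    }
}
```

The results are right, but each lookup degrades from a single comparison to a search through the bucket, which is the cost the comment in get() warns about.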

Now let's see how HashMap implements the following three methods. The source is:

    public Set<K> keySet() {
        Set<K> ks = keySet;
        return (ks != null ? ks : (keySet = new KeySet()));
    }

    public Collection<V> values() {
        Collection<V> vs = values;
        return (vs != null ? vs : (values = new Values()));
    }

    public Set<Map.Entry<K,V>> entrySet() {
        return entrySet0();
    }

    private Set<Map.Entry<K,V>> entrySet0() {
        Set<Map.Entry<K,V>> es = entrySet;
        return es != null ? es : (entrySet = new EntrySet());
    }
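Two consequences of this lazy caching can be checked from outside the class: repeated calls hand back the same view object, and the view reflects the live table rather than a copy. `CachedViewDemo` and its helper methods are hypothetical; the identity check relies on the caching shown in the listing, not on anything the Map interface promises.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class CachedViewDemo {
    // True when keySet() returns the cached view rather than a new object each time.
    static boolean sameKeySetInstance(Map<String, Integer> map) {
        return map.keySet() == map.keySet();
    }

    // Obtains the view first, adds a key afterwards, then asks the view about it.
    static boolean viewSeesLaterPut() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        Set<String> keys = map.keySet();
        map.put("late", 1);
        return keys.contains("late");
    }

    // Removes a key through the view and reports whether the map lost the mapping.
    static boolean removeThroughView() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("a", 1);
        map.keySet().remove("a");
        return !map.containsKey("a");
    }

    public static void main(String[] args) {
        System.out.println(sameKeySetInstance(new HashMap<String, Integer>()));   // true
        System.out.println(viewSeesLaterPut());                                   // true
        System.out.println(removeThroughView());                                  // true
    }
}
```

The views hold no data of their own, which is why creating them is cheap and why they stay in step with the map.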

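keySet(), values() and entrySet() return instances of the private classes KeySet, Values and EntrySet respectively. So how do those classes fetch their contents from the HashMap? Quite a few classes and methods are involved; the rough structure is as follows:

[Figure: overall structure of the classes involved]

The most important of these classes is HashIterator, implemented as follows: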

    private abstract class HashIterator<E> implements Iterator<E> {
        Entry<K,V> next;        // next entry to return
        int expectedModCount;   // For fast-fail
        int index;              // current slot
        Entry<K,V> current;     // current entry

        HashIterator() {
            expectedModCount = modCount;
            if (size > 0) { // advance to first entry
                Entry[] t = table;
                // advance index to the first non-null slot of the table
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Entry<K,V> nextEntry() { // walk the Entry chains
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Entry<K,V> e = next;
            if (e == null)
                throw new NoSuchElementException();

            if ((next = e.next) == null) {
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
            current = e;
            return e;
        }

        public void remove() {
            if (current == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Object k = current.key;
            current = null;
            HashMap.this.removeEntryForKey(k);
            expectedModCount = modCount;
        }
    }

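As with ArrayList, once you have obtained an Iterator over a HashMap you must not add to or remove from the HashMap directly (only through the iterator's own remove()); otherwise a ConcurrentModificationException is thrown. The important fields are next (the next entry to return), expectedModCount (used for the fail-fast check), index (the current slot) and current (the current entry).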
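The fail-fast check is straightforward to trigger in a single thread. In the sketch below, `FailFastDemo` and its helper methods are hypothetical names; the behaviour shown comes from the modCount comparison in the listing above.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FailFastDemo {
    private static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Modifies the map directly in the middle of an iteration.
    static boolean directModificationFails() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        it.next();
        map.put("d", 4);   // structural change: modCount no longer matches expectedModCount
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Removes through the iterator, which resynchronises expectedModCount after each removal.
    static int removeAllViaIterator() {
        Map<String, Integer> map = sample();
        for (Iterator<String> it = map.keySet().iterator(); it.hasNext(); ) {
            it.next();
            it.remove();
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(directModificationFails());   // true
        System.out.println(removeAllViaIterator());      // 0
    }
}
```

Removing through the iterator is safe because remove() ends with expectedModCount = modCount, whereas a direct put() or remove() on the map leaves the iterator's copy stale.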

This article was originally written in Chinese and published on aliyun.com.
