Objective
Previously, I divided the collection framework into Collection and Map, mainly according to whether the stored content is single-column or double-column. In fact that distinction is not accurate: a Set is actually a double-column structure too.
Looking back at the collection framework now, I can see many things I couldn't see before.
As I see it now, the framework splits into List on one side and Set and Map on the other, and Set and Map are almost the same thing.
This article assumes that you already have some understanding of the collection framework; for details, see the articles on collection framework basics and Map basics.
I. Data structures
In fact, I can't go into much depth here.
A data structure is the set of relationships among a collection of data elements.
Logical structure: the logical relationship between data elements is what we usually mean by "data structure". Logical structures fall into roughly four kinds: linear structure, set structure, tree structure, and graph structure.
Physical structure: whichever of the four logical structures is used, the data must ultimately be saved in physical memory, so every logical structure is built on a physical one. There are only two physical structures: sequential storage and linked (chain) storage. The typical sequential structure is the array, which needs one contiguous block of memory. A linked structure does not need contiguous memory, but each data object must hold the memory address of the next one.
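To make the two physical structures concrete, here is a minimal sketch in Java. The class and method names (StorageDemo, ListNode, getFromArray, getFromChain, toChain) are made up for illustration and are not part of the JDK. It stores the same values once in an array and once in a hand-built chain of nodes, then reads element i from each.

```java
class StorageDemo {
    // Linked storage: each node holds a reference to the next node.
    static final class ListNode {
        final int value;
        ListNode next;

        ListNode(int value) {
            this.value = value;
        }
    }

    // Sequential storage: element i is reached directly through its index.
    static int getFromArray(int[] data, int index) {
        return data[index];
    }

    // Linked storage: element i is reached by following `index` links from the head.
    static int getFromChain(ListNode head, int index) {
        ListNode current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    // Builds a chain holding the same values as the array, in the same order.
    static ListNode toChain(int[] data) {
        ListNode head = null;
        ListNode tail = null;
        for (int v : data) {
            ListNode node = new ListNode(v);
            if (head == null) {
                head = node;
            } else {
                tail.next = node;
            }
            tail = node;
        }
        return head;
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30, 40};
        ListNode chain = toChain(array);
        System.out.println(getFromArray(array, 2)); // 30
        System.out.println(getFromChain(chain, 2)); // 30
    }
}
```

The array read is a single indexed access, while the chain read has to walk from the head, which is the trade-off the rest of this article keeps coming back to.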
II. HashSet and HashMap
Underneath, a HashSet is a HashMap. HashMap is built from an array, linked lists, and self-balancing binary trees, but mostly the array. So HashSet and HashMap are essentially the same thing (if you don't believe it, look at the source below), and both are primarily based on a sequential storage structure.
1. Relationship
HashSet uses a HashMap but only makes use of its keys: every element added to the set is stored as a key, and the value stored for every key is the same private static final Object.
    public class HashSet<E> extends AbstractSet<E>
            implements Set<E>, Cloneable, java.io.Serializable {
        static final long serialVersionUID = -5024744406713321676L;

        // HashSet uses a HashMap; every element is actually stored in the map's key position
        private transient HashMap<E,Object> map;

        // The value is always the same static final Object, PRESENT
        private static final Object PRESENT = new Object();

        public HashSet() {
            map = new HashMap<>();
        }

        // Add an element: the element goes in the key position,
        // and the value position holds the never-changing PRESENT object
        public boolean add(E e) {
            return map.put(e, PRESENT) == null;
        }
    }
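The same idea can be tried outside the JDK. The sketch below is a hypothetical KeyOnlySet class (my own name, not a JDK type) that delegates to a HashMap exactly as described: elements live in the key position and a shared dummy object fills every value. It is only an illustration of the relationship, not a replacement for HashSet.

```java
import java.util.HashMap;
import java.util.Map;

class KeyOnlySet<E> {
    // Shared dummy value, playing the role of PRESENT in the HashSet source.
    private static final Object PRESENT = new Object();

    // All real data lives in the keys of this map.
    private final Map<E, Object> map = new HashMap<>();

    // put returns null when the key was absent, so null means "newly added".
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(E e) {
        return map.containsKey(e);
    }

    // remove returns the old value (PRESENT) only if the key was there.
    boolean remove(E e) {
        return map.remove(e) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        KeyOnlySet<String> set = new KeyOnlySet<>();
        System.out.println(set.add("a")); // true
        System.out.println(set.add("a")); // false, "a" is already a key
        System.out.println(set.size());   // 1
    }
}
```

Uniqueness of set elements falls out for free here, because a map cannot hold the same key twice.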
The key-value entries in a HashMap are stored primarily in an array.
    public class HashMap<K,V> extends AbstractMap<K,V>
            implements Map<K,V>, Cloneable, Serializable {
        private static final long serialVersionUID = 362498820763181265L;

        // Each key-value pair is wrapped in a Node,
        // and all Nodes are stored in the table array
        transient Node<K,V>[] table;

        // Definition of the map's key-value structure
        static class Node<K,V> implements Map.Entry<K,V> {
            final int hash;
            final K key;
            V value;
            Node<K,V> next;

            Node(int hash, K key, V value, Node<K,V> next) {
                this.hash = hash;
                this.key = key;
                this.value = value;
                this.next = next;
            }

            // methods omitted
        }
    }
2. Hashing
Why is hashing fast? Given a large enough array, insert, delete, update, and lookup are all O(1). This comes from the hash method, which maps an object's characteristic value to an array index. The key point is that different objects should produce different mapped values; otherwise performance degenerates to that of a linked list. The principle takes only a few words, but the critical part is the implementation of the hash method, which is something for algorithm engineers to study. I don't know enough about it to say much more.
Still, worth mentioning in passing: the key methods for hashing are hashCode and equals. These are basics, so please read up on them yourself, thanks.
HashMap's own hash method only produces a hash value. The final mapping to an array index is done in the put method; for that part, see the putVal source posted in section 3 below.
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
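As a small worked example of how that hash value becomes an array index, the sketch below repeats the spreading step and then applies the (n - 1) & hash mask that appears in putVal. HashIndexDemo, spread, and indexFor are names I made up for the illustration, and the sketch assumes the table length is a power of two, as it is in HashMap.

```java
class HashIndexDemo {
    // Same spreading step as the hash method quoted above:
    // XOR the high 16 bits of hashCode into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Assumes tableLength is a power of two, so (tableLength - 1)
    // is a mask of low bits and the result is always a valid index.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself.
        System.out.println(indexFor(17, 16));      // 1
        // 0x10000 has only a high bit set; without spreading it would land in bucket 0.
        System.out.println(indexFor(0x10000, 16)); // 1
        System.out.println(indexFor(null, 16));    // 0
    }
}
```

The second call shows why the high bits are folded in: with a small table, only the low bits survive the mask, so keys that differ only in their high bits would otherwise always collide.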
3. Linked list and binary tree
Since it is based on an array, dynamic resizing cannot be avoided. The HashMap source contains the resizing logic; please read it yourself. In addition, JDK 8's HashMap no longer relies only on a linked list to resolve hash collisions (a singly linked list here, unlike LinkedList, which uses a doubly linked list). Once the collisions in one position grow to a certain scale, a self-balancing binary tree replaces the linked list for better performance, taking the time complexity from O(n) to O(log n). There is too much source to walk through here, so, as always, read the source yourself (the core part is excerpted below), haha ~
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // Assign the table reference to tab.
        // If this is the first key-value pair to be stored, initialize the table;
        // n = length of the table once initialization is complete
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // The hash passed in has already been computed once by hash().
        // Map the hash value to an array index and assign the node there to p.
        // If that position is still empty, wrap the new key-value pair in a Node and store it there
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        // If the position is already occupied, there is a hash collision to resolve
        else {
            Node<K,V> e; K k;
            // If the existing node's hash equals the hash of the incoming key
            // (note: the value from the hash() method) and the key is the same,
            // this is the same key, so no new pair is added; the old value is replaced.
            // Here the old node p is only assigned to e; the value is modified at the end of the method
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // Hash collision: enter the self-balancing binary tree (red-black tree) logic,
            // which chooses the left or right child based on the hash value
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            // Hash collision: enter the linked list logic
            else {
                for (int binCount = 0; ; ++binCount) {
                    // Reached the end of the list without finding the key:
                    // wrap the new pair in a Node and append it after the last node
                    if ((e = p.next) == null) {
                        // The linked list resolves the collision
                        p.next = newNode(hash, key, value, null);
                        // If the number of collisions at this position reaches
                        // TREEIFY_THRESHOLD - 1 (7), turn the list into a self-balancing binary tree
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // If a node with the same key is already on the list,
                    // keep it in e; its value is replaced below
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            // If an old node was found, replace its value
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
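The collision path in putVal can be exercised from the outside with a deliberately bad key. The sketch below uses a hypothetical CollidingKey class (my own, for illustration only) whose hashCode is a constant, so every entry lands in the same bucket. It only checks observable behaviour; whether the bucket is currently a list or a tree is an internal detail that the public API does not expose.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Hypothetical key type whose hashCode always collides.
    static final class CollidingKey {
        final int id;

        CollidingKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // every key maps to the same bucket
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof CollidingKey && ((CollidingKey) o).id == id;
        }
    }

    // Stores `count` distinct keys that all share one hash value.
    static Map<CollidingKey, String> fill(int count) {
        Map<CollidingKey, String> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new CollidingKey(i), "v" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<CollidingKey, String> map = fill(100);
        System.out.println(map.size());                           // 100
        System.out.println(map.get(new CollidingKey(57)));        // v57
        // Same key again: the old value is returned and replaced, size is unchanged.
        System.out.println(map.put(new CollidingKey(57), "new")); // v57
        System.out.println(map.size());                           // 100
    }
}
```

Correctness is preserved because equals decides which node matches; only the speed of finding that node depends on how well the hash values are spread.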
4. Insert, delete, update, and lookup
As for the performance of insert, delete, update, and lookup: being hash-based, and leaving dynamic resizing aside, they are all O(1). When hash collisions occur, however, performance degrades locally to that of a linked list or a self-balancing binary tree. The tree performs better than the linked list, which is the performance optimization made in the Java 8 code.
III. TreeSet and TreeMap
Underneath, a TreeSet is based on a TreeMap. TreeMap uses a linked storage structure, and its logical structure is a tree: a self-balancing binary tree, specifically a red-black tree.
1. Relationship
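TreeSet wraps TreeMap in a way that resembles the decorator pattern, although the two do not share an interface: TreeSet implements NavigableSet, while TreeMap implements NavigableMap. What TreeSet holds is a NavigableMap reference, and in practice it creates a TreeMap for that reference when it is constructed.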
    public class TreeSet<E> extends AbstractSet<E>
            implements NavigableSet<E>, Cloneable, java.io.Serializable {
        private transient NavigableMap<E,Object> m;

        // Whichever constructor is called, it ends up delegating to this one,
        // which completes the initialization of the backing TreeMap
        TreeSet(NavigableMap<E,Object> m) {
            this.m = m;
        }

        public TreeSet() {
            this(new TreeMap<E,Object>());
        }

        public TreeSet(Comparator<? super E> comparator) {
            this(new TreeMap<>(comparator));
        }

        public TreeSet(Collection<? extends E> c) {
            this();
            addAll(c);
        }

        public TreeSet(SortedSet<E> s) {
            this(s.comparator());
            addAll(s);
        }
    }
Next, look at the node structure defined in TreeMap, Entry, whose left and right fields point to the left child and the right child respectively.
    static final class Entry<K,V> implements Map.Entry<K,V> {
        K key;
        V value;
        Entry<K,V> left;
        Entry<K,V> right;
        Entry<K,V> parent;
        boolean color = BLACK;

        /**
         * Make a new cell with given key, value, and parent, and with
         * {@code null} child links, and BLACK color.
         */
        Entry(K key, V value, Entry<K,V> parent) {
            this.key = key;
            this.value = value;
            this.parent = parent;
        }

        /**
         * Returns the key.
         *
         * @return the key
         */
        public K getKey() {
            return key;
        }

        /**
         * Returns the value associated with the key.
         *
         * @return the value associated with the key
         */
        public V getValue() {
            return value;
        }

        /**
         * Replaces the value currently associated with the key with the given
         * value.
         *
         * @return the value associated with the key before this method was
         *         called
         */
        public V setValue(V value) {
            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }

        public boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            return valEquals(key,e.getKey()) && valEquals(value,e.getValue());
        }

        public int hashCode() {
            int keyHash = (key==null ? 0 : key.hashCode());
            int valueHash = (value==null ? 0 : value.hashCode());
            return keyHash ^ valueHash;
        }

        public String toString() {
            return key + "=" + value;
        }
    }
2. Sorting
Sorting applies only to the key (a Set uses only the key of the Map). There are two options: natural ordering, where the objects stored in the collection implement the Comparable interface and compare themselves, and custom ordering, where the collection is given its own comparator, an implementation of the Comparator interface. This is also basic knowledge; see the articles on collection framework basics and Map basics. It is not covered further here.
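For readers who want a quick reminder of the two ordering options, here is a minimal sketch. SortDemo, naturalOrder, and byLength are names invented for this example. The first method relies on String implementing Comparable; the second passes a custom Comparator to the TreeSet constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class SortDemo {
    // Natural ordering: String implements Comparable, so no comparator is needed.
    static List<String> naturalOrder(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    // Custom ordering: shorter strings first, ties broken alphabetically.
    static List<String> byLength(List<String> words) {
        Comparator<String> comparator = (a, b) -> {
            int byLen = Integer.compare(a.length(), b.length());
            return byLen != 0 ? byLen : a.compareTo(b);
        };
        TreeSet<String> set = new TreeSet<>(comparator);
        set.addAll(words);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana", "kiwi", "fig");
        System.out.println(naturalOrder(words)); // [banana, fig, kiwi, pear]
        System.out.println(byLength(words));     // [fig, kiwi, pear, banana]
    }
}
```

Note that the comparator also decides what counts as a duplicate: if it compared by length alone, "kiwi" and "pear" would be treated as the same element, which is why the sketch adds an alphabetical tie-break.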
IV. Performance comparison
In fact, comparing the two comes down to comparing a hash table with a red-black tree. The hash table is clearly faster (close to O(1), while the red-black tree is O(log n)), but its elements are unordered. The red-black tree is slower but keeps its elements in order. Which one to choose depends on the business scenario.
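The ordering side of this trade-off is easy to see in code. In the sketch below, OrderDemo, sortedKeys, and nextKeyAtOrAfter are hypothetical helper names; each copies a map into a TreeMap to get ordered views, which a HashMap cannot offer. Copying on every call is wasteful and is done here only to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OrderDemo {
    // Copies any map into a TreeMap and returns its keys in ascending order.
    static List<Integer> sortedKeys(Map<Integer, String> source) {
        return new ArrayList<>(new TreeMap<>(source).keySet());
    }

    // Smallest key greater than or equal to `from`, or null if there is none.
    static Integer nextKeyAtOrAfter(Map<Integer, String> source, int from) {
        return new TreeMap<>(source).ceilingKey(from);
    }

    public static void main(String[] args) {
        Map<Integer, String> hashed = new HashMap<>();
        hashed.put(30, "c");
        hashed.put(10, "a");
        hashed.put(20, "b");

        // Point lookup is what the hash table is good at.
        System.out.println(hashed.get(20));               // b
        // Ordered traversal and range-style queries need the tree.
        System.out.println(sortedKeys(hashed));           // [10, 20, 30]
        System.out.println(nextKeyAtOrAfter(hashed, 15)); // 20
    }
}
```

If the code only ever asks "is this key present?", the hash-based classes are the natural fit; once it asks "what comes next?" or "give me everything in order", the tree-based ones earn their O(log n).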
Note:
If there are mistakes in this article, please don't hesitate to correct them, thank you!
I wrote this late at night, so my thinking may not be very clear; please forgive me ~
[Java] A brief analysis of the Java collection framework through source code and data structures: Set and Map