1 Set and storage order
Any element added to a Set must define equals() to ensure the uniqueness of objects. hashCode() is required only if the class will be placed in a HashSet or LinkedHashSet. As a matter of good programming style, however, you should always override hashCode() whenever you override equals().
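As a minimal sketch of this rule, the hypothetical Point class below (not from the original notes) overrides equals() and hashCode() together, so a HashSet treats two points with the same coordinates as one element:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical element type, used only for illustration.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    // Overridden together with equals(): equal points get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class SetUniquenessDemo {
    // Adds two equal points and one different point, then returns the set size.
    static int distinctCount() {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        points.add(new Point(1, 2)); // rejected: equal to the first element
        points.add(new Point(3, 4));
        return points.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 2
    }
}
```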
If an object is used in a sorted container of any kind, for example a SortedSet (TreeSet is its only implementation), then it must implement the Comparable interface.
Note that SortedSet means "elements are sorted by their comparison function" rather than "elements are kept in the order in which they were inserted." Insertion order is preserved by LinkedHashSet.
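The difference between the two orderings can be seen in a small sketch. The class and method names are made up for this example; String is used because it already implements Comparable:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class SetOrderDemo {
    static final List<String> INPUT = Arrays.asList("pear", "apple", "fig");

    // TreeSet orders elements by their compareTo() result.
    static List<String> sortedOrder() {
        return new ArrayList<>(new TreeSet<>(INPUT));
    }

    // LinkedHashSet keeps elements in the order they were added.
    static List<String> insertionOrder() {
        return new ArrayList<>(new LinkedHashSet<>(INPUT));
    }

    public static void main(String[] args) {
        System.out.println(sortedOrder());    // [apple, fig, pear]
        System.out.println(insertionOrder()); // [pear, apple, fig]
    }
}
```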
2 Queues
Apart from concurrent applications, the only two implementations of Queue in Java SE5 are LinkedList and PriorityQueue. They differ only in ordering behavior, not in performance.
The ordering of a PriorityQueue is also controlled by the Comparable implementation of its elements.
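A short sketch of the ordering difference, using a hypothetical drain() helper: the same three values come out of a LinkedList in insertion order and out of a PriorityQueue in the order defined by Integer's Comparable implementation.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class QueueOrderDemo {
    // Offers 3, 1, 2 to the queue, then removes everything with poll().
    static List<Integer> drain(Queue<Integer> queue) {
        queue.offer(3);
        queue.offer(1);
        queue.offer(2);
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new LinkedList<Integer>()));    // [3, 1, 2]
        System.out.println(drain(new PriorityQueue<Integer>())); // [1, 2, 3]
    }
}
```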
3 Map
The mapping table (also known as an associative array).
3.1 Performance
HashMap uses a special value, called a hash code, in place of a slow search for the key. A hash code is a "relatively unique" int value that represents an object; it is generated by converting some information about the object.
hashCode() is a method of the root class Object, so every object can produce a hash code.
The requirements for the keys used in a Map are the same as for the elements of a Set:
Any key must have an equals() method;
If the key is used in a hashed Map, then it must also have an appropriate hashCode() method;
If the key is used in a TreeMap, then it must implement Comparable.
4 Hashing and hash codes
HashMap uses equals() to determine whether the current key is the same as a key that already exists in the table.
The default Object.equals() simply compares object addresses. If you want to use your own class as a key for HashMap, you must override both hashCode() and equals().
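A correct equals() method must satisfy the following 5 conditions:
Reflexive: for any x, x.equals(x) returns true.
Symmetric: for any x and y, x.equals(y) returns true if and only if y.equals(x) returns true.
Transitive: for any x, y, and z, if x.equals(y) and y.equals(z) return true, then x.equals(z) returns true.
Consistent: for any x and y, repeated calls to x.equals(y) return the same result as long as the information used in the comparison does not change.
For any non-null x, x.equals(null) returns false.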
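To illustrate why a HashMap key needs both overrides, here is a minimal sketch with two hypothetical key classes. RawKey inherits equals() and hashCode() from Object, so a second instance with the same id is never found. GoodKey overrides both based on its content.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key that inherits equals() and hashCode() from Object.
class RawKey {
    final int id;
    RawKey(int id) { this.id = id; }
}

// Hypothetical key that overrides both methods based on its content.
class GoodKey {
    final int id;
    GoodKey(int id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        // instanceof is false for null, so equals(null) returns false.
        return o instanceof GoodKey && ((GoodKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return id;
    }
}

class KeyLookupDemo {
    static boolean rawKeyFound() {
        Map<RawKey, String> map = new HashMap<>();
        map.put(new RawKey(3), "three");
        // A second instance is a different key as far as Object.equals() is concerned.
        return map.containsKey(new RawKey(3));
    }

    static boolean goodKeyFound() {
        Map<GoodKey, String> map = new HashMap<>();
        map.put(new GoodKey(3), "three");
        return map.containsKey(new GoodKey(3));
    }

    public static void main(String[] args) {
        System.out.println(rawKeyFound());  // false
        System.out.println(goodKeyFound()); // true
    }
}
```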
4.1 Hashing Concepts
The purpose of hashing is this: you want to use one object to look up another object. The Map implementation classes use hashing to increase lookup speed.
The value of hashing is speed: hashing makes lookups fast. Because the bottleneck is the speed of looking up a key, one solution is to keep the keys sorted and then use Collections.binarySearch() to perform the lookup.
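The sorted-keys approach could look roughly like the sketch below. The parallel lists and the get() helper are assumptions made for this example, not code from the original notes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedKeyLookup {
    // Parallel lists: KEYS is kept sorted, VALUES.get(i) belongs to KEYS.get(i).
    static final List<String> KEYS =
        new ArrayList<>(Arrays.asList("ant", "bee", "cat", "dog"));
    static final List<Integer> VALUES =
        new ArrayList<>(Arrays.asList(1, 2, 3, 4));

    // Returns the value stored for key, or null if the key is absent.
    static Integer get(String key) {
        // binarySearch returns a negative number when the key is not found.
        int index = Collections.binarySearch(KEYS, key);
        return index >= 0 ? VALUES.get(index) : null;
    }

    public static void main(String[] args) {
        System.out.println(get("cat")); // 3
        System.out.println(get("emu")); // null
    }
}
```

Each lookup costs O(log n) comparisons, which is the baseline that hashing tries to beat.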
Hashing goes a step further: it stores the key somewhere so that it can be found quickly. The fastest data structure for storing a group of elements is an array, so an array is used to represent the key information (note: the key information, not the key itself). But since an array cannot be resized, there is a problem: we want to store an indeterminate number of values in the Map, so what happens if the number of keys is limited by the capacity of the array?
The answer is: the array does not hold the key itself. Instead, a number is generated from the key object and used as an index into the array. This number is the hash code, generated by the hashCode() method (called a hash function in computer science terminology), which is defined in Object and may be overridden by your class.
To solve the problem of fixed array capacity, different keys are allowed to produce the same index. In other words, collisions are possible; the hash code does not have to be unique. Therefore the size of the array is not important: any key can always find its place in the array.
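A minimal sketch of turning a hash code into an array index, using a hypothetical indexFor() helper. Math.floorMod keeps the index non-negative even for negative hash codes. With 8 slots, the keys 3 and 11 collide:

```java
class BucketIndexDemo {
    // Maps any key to a slot of a table with the given number of slots.
    static int indexFor(Object key, int slots) {
        return Math.floorMod(key.hashCode(), slots);
    }

    public static void main(String[] args) {
        // Integer.hashCode() returns the int value itself.
        System.out.println(indexFor(3, 8));  // 3
        System.out.println(indexFor(11, 8)); // 3, same slot as key 3: a collision
        System.out.println(indexFor(-5, 8)); // 3
    }
}
```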
4.2 Understanding hashing
In summary, hashing means generating a number from an object when storing it (to serve as an array index), and then going straight to that number when looking the object up. The purpose of hashing is to speed up lookups. The means is to associate an object with the number generated from it and to store that association (in an array, called a hash table). The generated number is the hash code. The method that generates the hash code is called the hash function (hashCode()).
4.3 HashMap lookup process (why it is fast)
The process by which HashMap looks up a key is therefore:
1. First compute the hash code.
2. Then use the hash code to index the array (the hash code serves as the array index).
3. If there is no collision, that is, only one object produces this hash code, then the array slot at that index holds the element being sought.
4. If there is a collision, the array slot for that hash code holds a list, and the entries in that list are searched linearly using equals().
Therefore, instead of searching through all the entries, you jump quickly to one location in the array and compare only a few elements. That is why HashMap is so fast.
4.4 Implementation of a simple hash map
A slot in a hash table is often referred to as a bucket.
To make the distribution uniform, the number of buckets is traditionally chosen to be a prime number (the JDK's own HashMap uses a power of 2 instead).
It turns out that a prime number is not really the ideal capacity for hash buckets. Recent hashed implementations in Java use a power of 2 (a choice made after extensive testing). On modern processors, division and remainder are the slowest operations. With a hash table whose length is a power of 2, a mask can be used instead of division. Because get() is the most frequently used operation and the % remainder operation is its most expensive part, using a power of 2 eliminates this overhead (and may also have some effect on hashCode()).
The get() method computes the index into the buckets array in the same way as the put() method. This is important because it guarantees that both methods compute the same position.
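The mask trick can be checked with a small sketch. The method names are made up for this example, and this is not the JDK's actual HashMap code. For a power-of-2 capacity, hash & (capacity - 1) gives the same index as the remainder operation, including for negative hash codes.

```java
class MaskIndexDemo {
    // Index via remainder; works for any positive capacity.
    static int indexByRemainder(int hash, int capacity) {
        return Math.floorMod(hash, capacity);
    }

    // Index via bit mask; only valid when capacity is a power of 2.
    static int indexByMask(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // Checks that both methods agree for a range of hash codes.
    static boolean agree(int capacity) {
        for (int hash = -1000; hash <= 1000; hash++) {
            if (indexByRemainder(hash, capacity) != indexByMask(hash, capacity)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexByMask(37, 16)); // 5
        System.out.println(agree(16));           // true
        System.out.println(agree(12));           // false: 12 is not a power of 2
    }
}
```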
package net.mrliuli.containers;

import java.util.*;

// MapEntry<K,V> is assumed to be a simple Map.Entry<K,V> implementation
// defined elsewhere in this package; it is not shown in this section.
public class SimpleHashMap<K, V> extends AbstractMap<K, V> {
    // Choose a prime number for the hash table size,
    // to achieve a uniform distribution:
    static final int SIZE = 997;
    // You can't have a physical array of generics,
    // but you can upcast to one:
    @SuppressWarnings("unchecked")
    LinkedList<MapEntry<K, V>>[] buckets = new LinkedList[SIZE];

    @Override
    public V put(K key, V value) {
        // Take the remainder first so that abs() can never overflow.
        int index = Math.abs(key.hashCode() % SIZE);
        if (buckets[index] == null) {
            buckets[index] = new LinkedList<MapEntry<K, V>>();
        }
        LinkedList<MapEntry<K, V>> bucket = buckets[index];
        MapEntry<K, V> pair = new MapEntry<K, V>(key, value);
        boolean found = false;
        V oldValue = null;
        ListIterator<MapEntry<K, V>> it = bucket.listIterator();
        while (it.hasNext()) {
            MapEntry<K, V> iPair = it.next();
            if (iPair.getKey().equals(key)) {
                oldValue = iPair.getValue();
                it.set(pair); // Replace old with new
                found = true;
                break;
            }
        }
        if (!found) {
            buckets[index].add(pair);
        }
        return oldValue;
    }

    @Override
    public V get(Object key) {
        // Same index computation as in put().
        int index = Math.abs(key.hashCode() % SIZE);
        if (buckets[index] == null) return null;
        for (MapEntry<K, V> iPair : buckets[index]) {
            if (iPair.getKey().equals(key)) {
                return iPair.getValue();
            }
        }
        return null;
    }

    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        Set<Map.Entry<K, V>> set = new HashSet<Map.Entry<K, V>>();
        for (LinkedList<MapEntry<K, V>> bucket : buckets) {
            if (bucket == null) continue;
            for (MapEntry<K, V> mpair : bucket) {
                set.add(mpair);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        SimpleHashMap<String, String> m = new SimpleHashMap<String, String>();
        for (String s : "to be or not to be is a question".split(" ")) {
            m.put(s, s);
            System.out.println(m);
        }
        System.out.println(m);
        System.out.println(m.get("be"));
        System.out.println(m.entrySet());
    }
}
4.5 Overriding hashCode()
Factors to consider when designing hashCode():
The most important factor: whenever you call hashCode() on the same object, it should produce the same value.
In addition, hashCode() should not depend on unique object information, especially the value of this, because that can only produce a very bad hashCode(): you would then be unable to create a new key identical to the one used in the original put() of the key-value pair. Instead, you should use meaningful identifying information inside the object. That is, the hash code must be generated from the contents of the object.
However, hashCode() and equals() together must be able to fully determine the identity of the object.
Because the result of hashCode() is processed further before a bucket index is produced, the range of the hash code is not important, as long as it is an int.
A good hashCode() should produce hash codes that are evenly distributed.
Effective Java Programming Language Guide (Addison-Wesley, 2001) gives a basic recipe for writing a decent hashCode():
1. Assign a non-zero constant, such as 17, to an int variable named result.
2. For each significant field f in the object (that is, every field taken into account by equals()), compute an int hash code c:
Field type | Calculation |
boolean | c = (f ? 0 : 1) |
byte, char, short, or int | c = (int) f |
long | c = (int) (f ^ (f >>> 32)) |
float | c = Float.floatToIntBits(f); |
double | long l = Double.doubleToLongBits(f); c = (int) (l ^ (l >>> 32)) |
Object, where equals() calls equals() on this field | c = f.hashCode() |
Array | Apply the above rules to each element |
3. Combine the hash codes: result = 37 * result + c;
4. Return result.
5. Examine the resulting hashCode() and make sure that equal instances produce equal hash codes.
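Applied to a concrete class, the recipe might look like the sketch below. Reading and its fields are hypothetical, chosen to cover several rows of the table, and label is assumed to be non-null.

```java
// Hypothetical class, used only to illustrate the recipe.
final class Reading {
    private final boolean valid;
    private final int sensorId;
    private final long timestamp;
    private final double value;
    private final String label; // assumed non-null

    Reading(boolean valid, int sensorId, long timestamp, double value, String label) {
        this.valid = valid;
        this.sensorId = sensorId;
        this.timestamp = timestamp;
        this.value = value;
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Reading)) return false;
        Reading r = (Reading) o;
        return valid == r.valid
            && sensorId == r.sensorId
            && timestamp == r.timestamp
            && Double.compare(value, r.value) == 0
            && label.equals(r.label);
    }

    @Override
    public int hashCode() {
        int result = 17;                                // step 1
        result = 37 * result + (valid ? 0 : 1);         // boolean
        result = 37 * result + sensorId;                // int
        result = 37 * result + (int) (timestamp ^ (timestamp >>> 32)); // long
        long bits = Double.doubleToLongBits(value);     // double
        result = 37 * result + (int) (bits ^ (bits >>> 32));
        result = 37 * result + label.hashCode();        // object field
        return result;                                  // step 4
    }
}
```

Every field used in equals() also feeds hashCode(), so two equal Reading instances always produce the same hash code.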
5 Choosing between implementations of an interface
5.1 Dangers of microbenchmarking
It has been shown that 0.0 is included in the output of Math.random(); in mathematical terms, its range is [0,1).
5.2 HashMap performance factors
Some terms used with HashMap:
Capacity: the number of buckets in the table.
Initial capacity: the number of buckets the table has when it is created. HashMap and HashSet both have constructors that let you specify the initial capacity.
Size: the number of entries currently stored in the table.
Load factor: size / capacity. The load factor of an empty table is 0, that of a half-full table is 0.5, and so on. A lightly loaded table has few collisions, so it is ideal for insertions and lookups (but it slows down traversal with an iterator). HashMap and HashSet both have constructors that let you specify a load factor. When the table's load reaches that factor, the container automatically increases its capacity (the number of buckets) by roughly doubling it and redistributes the existing objects into the new set of buckets (this is called rehashing).
The default load factor used by HashMap is 0.75 (it does not rehash until the table is three-fourths full), which is a good trade-off between time and space costs. A higher load factor reduces the space required by the table but increases the lookup cost. This matters because lookup is what we do most of the time (it is part of both get() and put()).
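The terms above can be tied together in a short sketch. The load() and threshold() helpers are hypothetical and only restate the definitions. The two-argument HashMap constructor takes the initial capacity first and the load factor second.

```java
import java.util.HashMap;
import java.util.Map;

class LoadFactorDemo {
    // Load factor as defined above: size / capacity.
    static double load(int size, int capacity) {
        return (double) size / capacity;
    }

    // Approximate number of entries at which a table of the given
    // capacity reaches the given load factor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Request 64 buckets and a load factor of 0.5.
        Map<String, Integer> sparse = new HashMap<>(64, 0.5f);
        sparse.put("a", 1);
        System.out.println(load(sparse.size(), 64)); // 0.015625
        System.out.println(threshold(16, 0.75f));    // 12
    }
}
```

With 16 buckets and the default load factor of 0.75, the table is expected to rehash at around 12 entries.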
6 Synchronizing a Collection or Map
The Collections class has a way to automatically synchronize an entire container. The syntax is similar to that of the "unmodifiable" methods:
package net.mrliuli.containers;

import java.util.*;

public class Synchronization {
    public static void main(String[] args) {
        Collection<String> c = Collections.synchronizedCollection(new ArrayList<String>());
        List<String> list = Collections.synchronizedList(new ArrayList<String>());
        Set<String> s = Collections.synchronizedSet(new HashSet<String>());
        Set<String> ss = Collections.synchronizedSortedSet(new TreeSet<String>());
        Map<String, String> m = Collections.synchronizedMap(new HashMap<String, String>());
        Map<String, String> sm = Collections.synchronizedSortedMap(new TreeMap<String, String>());
    }
}
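As a hedged usage sketch (the class and method names are invented for this example), two threads can add to a synchronized list without losing elements. Iteration is not made atomic by the wrapper, so the loop locks on the list itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {
    // Two threads each add perThread elements; returns the final element count.
    static int addFromTwoThreads(int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                list.add(i); // each add() is synchronized by the wrapper
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int count = 0;
        // Traversal must be guarded manually by locking on the list.
        synchronized (list) {
            for (Integer ignored : list) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(1000)); // 2000
    }
}
```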
6.1 Fail-fast
The Java containers have a protection mechanism that prevents more than one process from modifying the same container at the same time. The Java container library uses a fail-fast mechanism: it looks for any changes to the container other than the ones your own process is making, and as soon as it detects that another process has modified the container, it immediately throws a ConcurrentModificationException. This is the meaning of "fail-fast": it does not try to detect the problem later using a more complex algorithm.
package net.mrliuli.containers;

import java.util.*;

public class FailFast {
    public static void main(String[] args) {
        Collection<String> c = new ArrayList<>();
        Iterator<String> it = c.iterator();
        c.add("An object");
        try {
            String s = it.next();
        } catch (ConcurrentModificationException e) {
            System.out.println(e);
        }
    }
}