Java Hashtable Data structure

Source: Internet
Author: User
Tags: rehash, ConcurrentModificationException

Objective:

Reading the Hashtable source today, I started out thinking that a Hashtable is just an Entry[] array.

    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<K,V> e = tab[index]; e != null; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            V old = e.value;
            e.value = value;
            return old;
        }
    }

On seeing the code above, my naive thought was that a chain relationship is simply added across the elements of Entry[], as in the figure that accompanied the original post (that picture is wrong).

What I could not work out, however long I thought about it, was the purpose of int index = (hash & 0x7FFFFFFF) % tab.length;
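One way to read that expression, offered as a minimal sketch rather than as the JDK's own code: hashCode() may return a negative int, and Java's % operator keeps the sign of its left operand, so the raw remainder could be a negative and therefore invalid array index. Masking with 0x7FFFFFFF clears the sign bit first, so the remainder always falls between 0 and tab.length - 1. The class name BucketIndexDemo and the helper indexFor are made up for this illustration.

```java
// Illustrative only: BucketIndexDemo and indexFor are hypothetical names.
class BucketIndexDemo {
    // Same expression as in the put snippet: clear the sign bit, then take the remainder.
    static int indexFor(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    public static void main(String[] args) {
        int length = 11;  // Hashtable's documented default initial capacity
        int hash = -7;    // a negative hash code

        // The raw remainder keeps the sign of the hash and cannot be used as an index.
        System.out.println("raw remainder:    " + (hash % length));

        // After masking, the result is always within 0..length-1.
        System.out.println("masked remainder: " + indexFor(hash, length));
    }
}
```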

Main text, reposted from http://www.blogjava.net/fhtdy2004/archive/2009/07/03/285330.html

Java Hashtable Analysis

Hashtable resolves hash collisions with the chain-address method (separate chaining) known from data structures.

From the structure above, you can see that a Hashtable is essentially an array plus linked lists. Each Entry in the figure is a linked-list node: the Entry structure holds a reference, next, to another instance of its own type. The numbered row in the figure is the Entry array, and the numbers are its indexes. When you add a key-value pair to a Hashtable, the index is computed from the key's hashCode and the length of the Entry array, and it determines where in the array the pair is stored. For a given key and a given array length the resulting index is therefore always the same, which suggests that insertion order should not affect output order.

That reasoning leaves out one important case: two keys that map to the same index. In the sample code, "Sichuan" and "Anhui" produce the same index, and this is where the linked list comes in: the put method follows each Entry's next reference along the chain and links the later pair into it. According to the debugger, "Sichuan" and "Anhui" both have index 2, "Hunan" has index 6, and "Beijing" has index 1. On output, the array is walked from the highest index down. The only part of the output order that can change is therefore the relative order of "Sichuan" and "Anhui", so there are only two possible outputs: "Hunan"-"Sichuan"-"Anhui"-"Beijing" and "Hunan"-"Anhui"-"Sichuan"-"Beijing".

[Figure: output of the sample code run against Hashtable]
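The array-plus-chains layout can be sketched in a few lines. This is a simplified stand-in, not the JDK class: ChainedTableSketch, Node and keysFromHighestIndex are hypothetical names, and placing each new entry at the head of its chain is an assumption of the sketch. The keys "a", "l" and "b" are chosen because a one-character string hashes to its character code (97, 108 and 98), so with 11 buckets "a" and "l" share index 9 while "b" lands on index 10.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of separate chaining; all names here are hypothetical.
class ChainedTableSketch {
    // One node of a bucket's singly linked chain.
    static final class Node {
        final int hash;
        final String key;
        String value;
        Node next;

        Node(int hash, String key, String value, Node next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] table;

    ChainedTableSketch(int capacity) {
        table = new Node[capacity];
    }

    // Returns the previous value for the key, or null if the key was absent.
    String put(String key, String value) {
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % table.length;
        // Walk the chain first: the key may already be present in this bucket.
        for (Node e = table[index]; e != null; e = e.next) {
            if (e.hash == hash && e.key.equals(key)) {
                String old = e.value;
                e.value = value;
                return old;
            }
        }
        // Assumption of this sketch: a new entry becomes the head of its chain.
        table[index] = new Node(hash, key, value, table[index]);
        return null;
    }

    // Collects keys by walking the array from the highest index down.
    List<String> keysFromHighestIndex() {
        List<String> keys = new ArrayList<>();
        for (int i = table.length - 1; i >= 0; i--) {
            for (Node e = table[i]; e != null; e = e.next) {
                keys.add(e.key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        ChainedTableSketch first = new ChainedTableSketch(11);
        first.put("a", "1");
        first.put("l", "2");
        first.put("b", "3");
        System.out.println(first.keysFromHighestIndex());  // [b, l, a]

        ChainedTableSketch second = new ChainedTableSketch(11);
        second.put("l", "2");
        second.put("a", "1");
        second.put("b", "3");
        System.out.println(second.keysFromHighestIndex()); // [b, a, l]
    }
}
```

Only the two colliding keys swap places between the two runs, which mirrors the "Sichuan"/"Anhui" observation.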


Hashtable's implementation contains a method named rehash that enlarges the table's capacity. When rehash runs, the index of every key-value pair is recomputed against the new array length, which amounts to reordering the pairs. This is also why putting the same key-value pairs into different Hashtable instances can produce different output orders. The condition that triggers rehash is simple: the number of key-value pairs in the Hashtable exceeds a threshold. By default, this threshold is the length of the Entry array multiplied by 0.75.
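The arithmetic behind the threshold and the re-indexing can be sketched as follows. RehashDemo and its helpers are hypothetical names, and the new length of 23 assumes the table grows to twice its old length plus one, which is an implementation detail rather than something stated above.

```java
// Illustrative arithmetic only; RehashDemo and its methods are hypothetical names.
class RehashDemo {
    // Threshold as described in the text: array length times load factor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    static int indexFor(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    public static void main(String[] args) {
        int oldLength = 11;                 // documented default capacity of Hashtable
        int newLength = oldLength * 2 + 1;  // assumption: growth to 2 * old + 1

        System.out.println("threshold: " + threshold(oldLength, 0.75f));

        // Two hash values that share a bucket before the resize and part ways afterwards.
        int[] hashes = {35, 46};
        for (int h : hashes) {
            System.out.println(h + ": index " + indexFor(h, oldLength)
                    + " -> " + indexFor(h, newLength));
        }
    }
}
```

Because every index is taken modulo the array length, changing the length moves entries between buckets, which is what changes the output order.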

As of the Java 2 platform v1.2, this class was retrofitted to implement the Map interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, Hashtable is synchronized.

The iterators returned by the iterator method of the collections returned by all of Hashtable's collection view methods are fail-fast: if the Hashtable is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future. The Enumerations returned by Hashtable's keys and elements methods are not fail-fast.

Note that the fail-fast behavior of an iterator cannot be guaranteed because, generally speaking, it is impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. It is therefore wrong to write a program that depends on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
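The difference between the two traversal styles can be observed with a small single-threaded experiment. This is a sketch for exploration only: FailFastDemo and its method names are hypothetical, and since fail-fast behavior is best-effort, the exception seen here should not be relied on in real code.

```java
import java.util.ConcurrentModificationException;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Iterator;

// Exploration sketch; FailFastDemo and its methods are hypothetical names.
class FailFastDemo {
    private static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        table.put("c", 3);
        return table;
    }

    // Modifies the table behind an Iterator's back and reports whether it noticed.
    static boolean iteratorDetectsModification() {
        Hashtable<String, Integer> table = sample();
        Iterator<String> it = table.keySet().iterator();
        it.next();
        table.put("d", 4); // structural modification outside the iterator
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Same experiment with the Enumeration returned by keys().
    static boolean enumerationDetectsModification() {
        Hashtable<String, Integer> table = sample();
        Enumeration<String> en = table.keys();
        en.nextElement();
        table.put("d", 4);
        try {
            while (en.hasMoreElements()) {
                en.nextElement();
            }
            return false;
        } catch (ConcurrentModificationException unexpected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Iterator detects change:    " + iteratorDetectsModification());
        System.out.println("Enumeration detects change: " + enumerationDetectsModification());
    }
}
```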

Several inner classes are defined in Hashtable (including static nested classes and ordinary inner classes).

The Entry data structure in Hashtable
put method: keys with different hash values may still land on the same index, so the entries already stored at that index must be checked before a new one is inserted.
    public synchronized Enumeration<K> keys() {
        return this.<K>getEnumeration(KEYS);
    }

    private <T> Enumeration<T> getEnumeration(int type) {
        if (count == 0) {
            return (Enumeration<T>) emptyEnumerator;
        } else {
            return new Enumerator<T>(type, false);
        }
    }

    public synchronized Enumeration<V> elements() {
        return this.<V>getEnumeration(VALUES);
    }

Enumerator is an inner class defined by Hashtable:

    private class Enumerator<T> implements Enumeration<T>, Iterator<T> {
        Entry[] table = Hashtable.this.table; // accessing member variables of the enclosing class
        int index = table.length;
        Entry<K,V> entry = null;
        Entry<K,V> lastReturned = null;
        int type;

        /**
         * Indicates whether this Enumerator is serving as an Iterator
         * or an Enumeration. (true -> Iterator).
         */
        boolean iterator;

        /**
         * The modCount value that the iterator believes that the backing
         * Hashtable should have. If this expectation is violated, the iterator
         * has detected concurrent modification.
         */
        protected int expectedModCount = modCount;
    }

The inner class gives direct access to the Entry array of Hashtable. Its implementations of the Iterator interface methods use the expectedModCount variable to detect concurrent modification and fail fast; its implementations of the Enumeration interface methods perform no such check.

    public Set<K> keySet() {
        if (keySet == null)
            keySet = Collections.synchronizedSet(new KeySet(), this);
        return keySet;
    }

    private class KeySet extends AbstractSet<K> {
        public Iterator<K> iterator() {
            return getIterator(KEYS);
        }

        public int size() {
            return count;
        }

        public boolean contains(Object o) {
            return containsKey(o);
        }

        public boolean remove(Object o) {
            return Hashtable.this.remove(o) != null;
        }

        public void clear() {
            Hashtable.this.clear();
        }
    }

The iterator() method of the inner class KeySet calls getIterator(KEYS) on the enclosing class:

    private <T> Iterator<T> getIterator(int type) {
        if (count == 0) {
            return (Iterator<T>) emptyIterator;
        } else {
            return new Enumerator<T>(type, true);
        }
    }

getIterator creates a new instance of the inner class Enumerator, and that Enumerator is what walks the Entry array of Hashtable. So one inner class ends up creating an instance of another inner class directly???

    public Collection<V> values() {
        if (values == null)
            values = Collections.synchronizedCollection(new ValueCollection(), this);
        return values;
    }

ValueCollection is also an inner class; its structure and function are much the same as KeySet's.

    public Set<Map.Entry<K,V>> entrySet() {
        if (entrySet == null)
            entrySet = Collections.synchronizedSet(new EntrySet(), this);
        return entrySet;
    }

EntrySet is also an inner class; its structure and function are much the same as KeySet's.
Posted on 2009-07-03 13:24 by Frank_fang. Category: Java programming


Comments: # Re: Java Hashtable analysis 2009-07-15 00:11 | Frank_fang
HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable

A hash table based implementation of the Map interface. This implementation provides all of the optional map operations and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees about the order of the map; in particular, it does not guarantee that the order will remain constant over time.

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iterating over the collection views takes time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance is important, do not set the initial capacity too high (or the load factor too low).

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the table is rehashed (its internal data structures are rebuilt) so that it has roughly twice as many buckets.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values reduce the space overhead but increase the lookup cost (which shows up in most operations of the HashMap class, including get and put). When setting the initial capacity, take into account the expected number of entries in the map and its load factor, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operation will ever occur.

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large initial capacity lets the mappings be stored more efficiently than letting the map rehash automatically as needed to grow the table.
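The sizing rule above (initial capacity greater than the maximum number of entries divided by the load factor) can be turned into a small helper. This is a minimal sketch; CapacitySizing and initialCapacityFor are hypothetical names, not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the sizing rule; CapacitySizing and initialCapacityFor are hypothetical names.
class CapacitySizing {
    // Smallest convenient int that is strictly greater than expectedEntries / loadFactor.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        float loadFactor = 0.75f;

        // Size the map once, up front, instead of letting it grow step by step.
        Map<Integer, String> map =
                new HashMap<>(initialCapacityFor(expected, loadFactor), loadFactor);
        for (int i = 0; i < expected; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size());
    }
}
```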

Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that the instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map:

    Map m = Collections.synchronizedMap(new HashMap(...));
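A slightly fuller sketch of the wrapper in use follows. Individual calls on the wrapped map are synchronized, but the Collections.synchronizedMap documentation requires the caller to synchronize on the map manually while iterating over any of its collection views. SyncMapDemo and sumValues are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Usage sketch; SyncMapDemo and sumValues are hypothetical names.
class SyncMapDemo {
    // Iterates over a synchronized map while holding its lock, as the wrapper requires.
    static int sumValues(Map<String, Integer> syncMap) {
        int sum = 0;
        synchronized (syncMap) {
            for (int value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Wrap at creation time so no unsynchronized reference to the HashMap escapes.
        Map<String, Integer> m = Collections.synchronizedMap(new HashMap<String, Integer>());
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        System.out.println(sumValues(m));
    }
}
```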

The iterators returned by all of this class's collection view methods are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

Note that the fail-fast behavior of an iterator cannot be guaranteed because, generally speaking, it is impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. It is therefore wrong to write a program that depends on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
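The one structural change an iterator tolerates is its own remove method. A minimal sketch of removing entries during iteration that way (SafeRemovalDemo and removeEvenValues are hypothetical names):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Sketch of removal through the iterator; names are hypothetical.
class SafeRemovalDemo {
    // Removes every mapping whose value is even, using only Iterator.remove().
    static void removeEvenValues(Map<String, Integer> map) {
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (entry.getValue() % 2 == 0) {
                it.remove(); // allowed: the iterator keeps its own bookkeeping in step
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.put("d", 4);
        removeEvenValues(map);
        System.out.println(map.keySet());
    }
}
```

Calling map.remove(key) inside the same loop instead would be a structural modification made outside the iterator.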



LinkedHashMap<K,V> extends HashMap<K,V> implements Map<K,V>

A hash table and linked list implementation of the Map interface, with predictable iteration order. This implementation differs from HashMap in that it maintains a doubly linked list running through all of its entries. This linked list defines the iteration order, which is normally the order in which keys were inserted into the map (insertion order). Note that insertion order is not affected if a key is reinserted into the map. (A key k is reinserted into a map m if m.put(k, v) is invoked when m.containsKey(k) would return true immediately before the invocation.)
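A short sketch of the insertion-order rule, including the reinsertion case (InsertionOrderDemo and keysAfterReinsert are hypothetical names):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of insertion order; names are hypothetical.
class InsertionOrderDemo {
    static List<String> keysAfterReinsert() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.put("a", 99); // reinsertion: the value changes, the position does not
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterReinsert()); // [a, b, c]
    }
}
```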

This implementation spares its clients the unspecified, generally chaotic ordering provided by HashMap (and Hashtable), without incurring the increased cost associated with TreeMap. It can be used to produce a copy of a map that has the same order as the original, regardless of the original map's implementation:

     void foo(Map m) { Map copy = new LinkedHashMap(m); ... }

This technique is particularly useful if a module takes a map as input, copies it, and later returns results whose order is determined by that of the copy. (Clients generally appreciate having things returned in the same order in which they were presented.)

A special constructor is provided to create a linked hash map whose iteration order is the order in which its entries were last accessed, from least recently accessed to most recently accessed (access order). This kind of map is well suited to building LRU caches. Invoking the put or get method results in an access to the corresponding entry (assuming it exists after the invocation completes). The putAll method generates one entry access for each mapping in the specified map, in the order in which key-value mappings are provided by the specified map's entry set iterator. No other methods generate entry accesses. In particular, operations on the collection views do not affect the iteration order of the backing map.

The removeEldestEntry(Map.Entry) method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map.
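Combining the access-order constructor with removeEldestEntry gives a small LRU cache. This is a minimal sketch under stated assumptions: LruCache and maxEntries are hypothetical names, the initial capacity of 16 and load factor of 0.75 are arbitrary choices, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch; LruCache and maxEntries are hypothetical names.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // third argument true selects access order
        this.maxEntries = maxEntries;
    }

    // Called after each insertion; returning true evicts the least recently accessed entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently accessed entry
        cache.put("c", 3); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // [a, c]
    }
}
```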

This class provides all of the optional Map operations and permits null elements. Like HashMap, it provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets. Performance is likely to be just slightly below that of HashMap because of the added expense of maintaining the linked list, with one exception: iterating over the collection views of a LinkedHashMap takes time proportional to the size of the map, regardless of its capacity. Iterating over a HashMap is likely to be more expensive, because it takes time proportional to its capacity.

A linked hash map has two parameters that affect its performance: initial capacity and load factor. They are defined exactly as for HashMap. Note, however, that the penalty for choosing an excessively high initial capacity is less severe for this class than for HashMap, because iteration times for this class are unaffected by capacity.

Note that this implementation is not synchronized. If multiple threads access a linked hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access:

    Map m = Collections.synchronizedMap(new LinkedHashMap(...));

A structural modification is any operation that adds or deletes one or more mappings or, in the case of access-ordered linked hash maps, affects iteration order. In insertion-ordered linked hash maps, merely changing the value associated with a key that is already contained in the map is not a structural modification. In access-ordered linked hash maps, merely querying the map with get is a structural modification.
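The LinkedHashMap documentation states that in an access-ordered map a plain get counts as a structural modification, because it moves the entry in the iteration order. The following exploration sketch contrasts the two modes. AccessOrderDemo and getBreaksIteration are hypothetical names, and since fail-fast detection is best-effort, the exception observed here is for illustration, not something to build on.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Exploration sketch; AccessOrderDemo and getBreaksIteration are hypothetical names.
class AccessOrderDemo {
    // Calls get() in the middle of an iteration and reports whether the iterator objected.
    static boolean getBreaksIteration(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        Iterator<String> it = map.keySet().iterator();
        String first = it.next();
        map.get(first); // in access order this moves "a" to the end of the list
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("insertion order: " + getBreaksIteration(false));
        System.out.println("access order:    " + getBreaksIteration(true));
    }
}
```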

The iterators returned by the iterator method of the collections returned by all of this class's collection view methods are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

Note that the fail-fast behavior of an iterator cannot be guaranteed because, generally speaking, it is impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. It is therefore wrong to write a program that depends on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.

This class is a member of the Java Collections Framework.
