Java Collections: HashMap In-Depth Analysis and Comparison

Source: Internet
Author: User
In the Java world, how classes and the various kinds of data are structured and processed is key to the logic and performance of the whole program. I ran into a problem where performance and logic were at odds, so I started digging into it. I searched forums large and small, and read the Java Virtual Machine Specification, Apress's Java Collections (2001), and Thinking in Java, but could not find a good answer. Out of frustration I unpacked the JDK's src archive and studied the source itself. I wrote this article to share what I learned and to check whether my understanding has any holes. HashMap is the subject of the study here.

HashMap is one of the JDK's most useful utilities. It maps objects to one another and gives fast access to a value by its key. What does it actually do under the hood?

Before that, let me introduce two attributes: load factor and capacity. The effective capacity of a HashMap (its threshold) is load factor × capacity; by default that is 0.75 × 16 = 12. This matters a great deal for efficiency: when the number of entries stored in the HashMap exceeds this threshold, the HashMap rebuilds its internal table. That is expensive, and I will come back to it later. In short, if you already know roughly how many objects you will store, choose an initial capacity whose threshold can hold them.
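The arithmetic above can be sketched as follows. This is a minimal illustration, not JDK code: `PresizeSketch`, `threshold`, and `capacityFor` are hypothetical names, and the sketch assumes the threshold is simply capacity × load factor truncated to an int.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeSketch {
    /** Resize threshold for a given table capacity and load factor. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    /** An initial capacity large enough that expectedEntries fit under the threshold. */
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // default table: 12

        // Presize a map that is expected to hold about 1000 entries.
        int capacity = capacityFor(1000, 0.75f);
        Map<String, Integer> map = new HashMap<>(capacity, 0.75f);
        for (int n = 0; n < 1000; n++) {
            map.put("key" + n, n);
        }
        System.out.println(map.size()); // 1000
    }
}
```

HashMap rounds the requested capacity up to a power of two internally, so the real table may be somewhat larger than the number passed in.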

The two key methods are put and get.
First, some background. HashMap implements the Map, Cloneable, and Serializable interfaces and extends the AbstractMap class. Its iterators are implemented mainly by the inner class HashIterator and a few other iterator classes, and there is also a very important inner class, Entry, which implements Map.Entry. Since everyone has the source code, you can read that part yourself. I mainly want to describe the Entry inner class: it holds the hash, key, value, and next fields, which are very important. The source code of put is as follows:

public Object put(Object key, Object value) {
    Object k = maskNull(key);

This checks whether the key is null; nothing esoteric. If the key is null, maskNull returns a static sentinel object to use as the key instead, which is why HashMap allows a null key.
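The sentinel idea can be sketched like this. The names `NullKeySketch`, `NULL_KEY`, and `unmaskNull` are chosen here for illustration and are assumptions, not quotations from the JDK source; the `main` method only shows that the standard `java.util.HashMap` accepts a null key.

```java
import java.util.HashMap;
import java.util.Map;

class NullKeySketch {
    // Sentinel object that stands in for a null key (illustrative name).
    static final Object NULL_KEY = new Object();

    /** Replaces a null key with the sentinel so it can be hashed like any object. */
    static Object maskNull(Object key) {
        return key == null ? NULL_KEY : key;
    }

    /** Reverses maskNull when a key is handed back to the caller. */
    static Object unmaskNull(Object key) {
        return key == NULL_KEY ? null : key;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key");
        System.out.println(map.get(null));
    }
}
```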

    int hash = hash(k);
    int i = indexFor(hash, table.length);

These two consecutive steps are the best part of HashMap! I felt humbled after studying them. hash takes the key's hashCode and hashes it again, and indexFor then turns that hash into an index into the object table.

Table? Don't be surprised. A HashMap has to keep its entries somewhere, and it keeps them in a table (an array). The trick is that hash lets it compute the right index directly. I contacted Doug, the JDK author, about the hash algorithm, and he suggested that I read The Art of Computer Programming, Volume 3. I had been looking for that book for a long time without finding it, and his mention made me want it even more. Unfortunately, my pockets are empty (sob).
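The hash-then-index steps can be sketched as below. Two assumptions are made: the table length is a power of two, so the index is a simple bit mask, and the bit-mixing step is a stand-in (`h ^ (h >>> 16)`, the spreading step used in later JDKs), since the exact mixing function varies between JDK versions.

```java
class HashIndexSketch {
    /** Stand-in spreading step: fold the high 16 bits of hashCode into the low 16 bits. */
    static int hash(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    /** Maps a hash to a bucket index; length must be a power of two. */
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int tableLength = 16;
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> bucket " + indexFor(hash(key), tableLength));
        }
    }
}
```

Because the mask keeps only the low bits, the result always lies between 0 and length - 1, even for negative hash values.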

You may have noticed that put returns a value: when a mapping with the same key already exists, put overwrites it and returns the old value. The following loop shows the structure of HashMap clearly: it is a table with a linked list of entries hanging off each slot:

    for (Entry e = table[i]; e != null; e = e.next) {
        if (e.hash == hash && eq(k, e.key)) {
            Object oldValue = e.value;
            e.value = value;      // assign the new value to the existing key
            e.recordAccess(this); // empty hook method, left for subclasses to implement
            return oldValue;      // return the old value stored under the same key
        }
    }
    modCount++;                  // count of structural modifications
    addEntry(hash, k, value, i); // add the new entry; this is the key step
    return null;                 // no previous mapping for this key
}
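To make the table-plus-chains structure concrete, here is a deliberately simplified sketch. `MiniChainedMap` is a hypothetical class, not the JDK implementation: it uses a fixed 16-slot table that never resizes, skips the extra hash-mixing step, and gives a null key the hash 0 instead of masking it with a sentinel.

```java
class MiniChainedMap {
    /** One node in a bucket's singly linked chain. */
    static final class Entry {
        final int hash;
        final Object key;
        Object value;
        Entry next;

        Entry(int hash, Object key, Object value, Entry next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry[] table = new Entry[16]; // fixed size: this sketch never resizes
    private int size;

    private static int hashOf(Object key) {
        return key == null ? 0 : key.hashCode();
    }

    private static boolean sameKey(Object a, Object b) {
        return a == b || (a != null && a.equals(b));
    }

    /** Stores the mapping and returns the previous value, or null if the key was absent. */
    Object put(Object key, Object value) {
        int hash = hashOf(key);
        int i = hash & (table.length - 1);
        for (Entry e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && sameKey(key, e.key)) {
                Object oldValue = e.value;
                e.value = value; // overwrite in place
                return oldValue;
            }
        }
        table[i] = new Entry(hash, key, value, table[i]); // new entry becomes the chain head
        size++;
        return null;
    }

    /** Walks the chain in the key's bucket and returns the value, or null if absent. */
    Object get(Object key) {
        int hash = hashOf(key);
        for (Entry e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash && sameKey(key, e.key)) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

Calling `put("a", 1)` on an empty instance returns null, and a following `put("a", 2)` returns the old value 1 while the size stays at 1.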

The key method, addEntry, is analyzed below:

void addEntry(int hash, Object key, Object value, int bucketIndex) {
    table[bucketIndex] = new Entry(hash, key, value, table[bucketIndex]);

The hash algorithm can give different keys the same hash code, and therefore the same table index. For example, suppose key = "33" and some other key object G both hash to -8901334; then indexFor returns the same index i for both. The new entry's next field then points to the entry that was previously at table[i], and the same happens for every later collision, forming a linked list. The loop in put walks this list through e.next to find an existing key and return its old value. With that, the structure of HashMap should be quite clear.
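A collision is easy to reproduce with real strings. The sketch below uses "Aa" and "BB", two distinct strings whose hashCode is the same (2112), instead of the illustrative values above; `CollisionDemo` and `collide` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    /** True when two distinct objects share a hash code. */
    static boolean collide(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(collide("Aa", "BB")); // true: both hash to 2112

        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");

        // Both entries survive: equals() tells the colliding keys apart.
        System.out.println(map.get("Aa")); // first
        System.out.println(map.get("BB")); // second
    }
}
```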

    if (size++ >= threshold)      // this threshold is the effective capacity
        resize(2 * table.length); // once the size exceeds it, the object table is rebuilt

This rebuild is nothing magical: it allocates a table twice the size (someone on another forum claimed it was twice the size plus 1, which misled me) and then runs indexFor on every entry, one by one, to place it in the new table. Note that this is where efficiency is won or lost! If your HashMap does not have to be rebuilt many times, it will be much more efficient.
}
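The cost of repeated rebuilding can be estimated with a small simulation. It is only a sketch: `ResizeSketch` and `countResizes` are hypothetical, the growth check mirrors the `size++ >= threshold` test quoted above, and the table length is assumed to be a power of two. Other JDK versions may apply the check slightly differently.

```java
class ResizeSketch {
    /** Bucket index for a hash in a table whose length is a power of two. */
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    /** Counts how many times the table doubles while inserting the given number of entries. */
    static int countResizes(int initialCapacity, float loadFactor, int entries) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        int resizes = 0;
        for (int n = 0; n < entries; n++) {
            if (size++ >= threshold) {
                capacity *= 2; // every existing entry would be re-indexed here
                threshold = (int) (capacity * loadFactor);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        // The same hash can land in a different bucket after the table doubles.
        System.out.println(indexFor(20, 16)); // 4
        System.out.println(indexFor(20, 32)); // 20

        System.out.println(countResizes(16, 0.75f, 1000));   // 7
        System.out.println(countResizes(2048, 0.75f, 1000)); // 0
    }
}
```

Under these assumptions, inserting 1000 entries into a default-sized table triggers seven rebuilds, while a table presized to 2048 slots triggers none.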

That's about it. get is much simpler than put; once you understand put, you understand get. As for the collections in general, I think they suit a wide range of scenarios but are not perfect for every specific one; if your program has special requirements, write your own, which is actually quite simple. (The author told me the same thing. He also suggested that I use LinkedHashMap. After reading its source, I found that LinkedHashMap simply extends HashMap and overrides the relevant methods; have a look if you are interested.) All you need is an object table and the corresponding algorithms.
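As a quick illustration of what LinkedHashMap adds on top of HashMap, the sketch below compares its default insertion order with the optional access order selected through the three-argument constructor. `LinkedOrderDemo` and `keysAfterTouchingA` are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LinkedOrderDemo {
    /** Inserts a, b, c, reads "a", and returns the keys in iteration order. */
    static List<String> keysAfterTouchingA(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // in access-order mode this moves "a" to the end
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterTouchingA(false)); // [a, b, c]
        System.out.println(keysAfterTouchingA(true));  // [b, c, a]
    }
}
```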

For example, things like Vector and List are really quite simple; at most they add synchronized declarations. If you want something like a Vector and there are not many insertions and deletions, you can use an object table (an array) and read and append data by index.
If there are many insertions and deletions, you can create two object tables, where each element is stored in a node that has a next field. To insert an object at position i when i is already occupied, link it in through next, increment size, and record it in the other table.
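The simpler of the two cases above, an array-backed container with few insertions and deletions, might look like the following sketch. `SimpleArrayList` is a hypothetical class written for this illustration; it is unsynchronized and only supports appending and reading by index.

```java
import java.util.Arrays;

class SimpleArrayList {
    private Object[] elements = new Object[10]; // the "object table"
    private int size;

    /** Appends an element, doubling the backing array when it is full. */
    void add(Object element) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = element;
    }

    /** Returns the element at the given index. */
    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index];
    }

    int size() {
        return size;
    }
}
```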
