HashMap, ArrayMap, SparseArray: source analysis and performance comparison

ArrayMap and SparseArray are Android system APIs tailored for mobile devices. Their purpose is to save memory by replacing HashMap in some cases.

I. Source analysis (due to space constraints, this will be placed in a separate article)
II. Implementation principles and data structure comparison
III. Performance test comparison
IV. Summary

I. Source Analysis
This will be covered in the next article (putting everything in one article would make it too long).

II. Implementation principles and data structure comparison
1. HashMap
[Figure: HashMap structure]

As the structure of HashMap shows, the key is hashed first, and the hash result determines the position in the table array. When a hash collision occurs, it is resolved by separate chaining (a linked list per bucket). The data structure of the entry (Map.Entry) is as follows:

static class HashMapEntry<K, V> implements Entry<K, V> {
    final K key;
    V value;
    final int hash;
    HashMapEntry<K, V> next;
    // constructor and methods omitted
}

The details of the HashMap source will be analyzed in other articles. What can be seen here, from a space perspective, is that HashMap keeps a table array whose occupancy never exceeds the load factor (0.75 by default), so part of the table is always empty. On top of that, there is one HashMapEntry for each mapping in the HashMap, and besides the key and value each entry also stores a hash value and a pointer to the next entry.
In terms of time efficiency, thanks to hashing, insert and lookup operations are very fast, and in general no slot of the array ends up with a long list behind it (hash collisions are, after all, relatively rare). So if space utilization is left aside, HashMap is very efficient.
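
To make the chaining idea concrete, here is a minimal sketch of a chained hash table. It is not the real HashMap source: the class name ChainedMapSketch, the fixed table size of 16, and the absence of resizing and null-key handling are all simplifying assumptions made for illustration.

```java
// A minimal chained hash table sketch (NOT the real java.util.HashMap source).
// Assumptions: fixed table of 16 buckets, no resizing, no null keys.
final class ChainedMapSketch<K, V> {
    // One entry per mapping: key, value, cached hash, and a link to the next entry in the bucket.
    static final class Node<K, V> {
        final K key;
        V value;
        final int hash;
        Node<K, V> next;

        Node(K key, V value, int hash, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexFor(int hash) {
        return hash & (table.length - 1); // valid because the length is a power of two
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int hash = key.hashCode();
        int i = indexFor(hash);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.hash == hash && n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Collision or empty bucket: the new entry becomes the head of the chain.
        table[i] = new Node<>(key, value, hash, table[i]);
        return null;
    }

    V get(K key) {
        int hash = key.hashCode();
        for (Node<K, V> n = table[indexFor(hash)]; n != null; n = n.next) {
            if (n.hash == hash && n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

With Integer keys, 1 and 17 land in the same bucket of this 16-slot table, so looking up either one walks a chain of two entries. Every mapping costs one Node object in addition to the table slot, which is the per-entry overhead discussed above.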

2. ArrayMap
[Figure: ArrayMap structure]


ArrayMap uses two arrays: mHashes stores the hash value of each key, and mArray, which is twice the size of mHashes, stores the keys and values one after another. The details of the source code will be explained in the next article. For now, let's set the details aside and look at the key statements:

mHashes[index] = hash;
mArray[index << 1] = key;
mArray[(index << 1) + 1] = value;

I believe the principle is clear to everyone. But how does it look things up? The answer is binary search. On insertion, the hash value is obtained from the key's hashCode() method, binary search over mHashes finds the index at which to insert, and the corresponding positions in mArray are computed from that index. When a hash collision occurs, the entry is inserted at a position adjacent to that index.
To sum up, from a space perspective, for each piece of data ArrayMap stores a hash value, a key, and a value. Compared with HashMap this looks roughly the same, minus the pointer to the next entry. A few other small savings aside, the memory saving does not look particularly significant. Is that so? This will be verified later.
In terms of time efficiency, both insertion and lookup use binary search, so lookup should not be as fast as a hash lookup. As for insertion, if keys are inserted in order the efficiency is certainly high, but random insertion inevitably involves a lot of array element shifting, which will not do for large data volumes. Consider the unlucky case where every inserted hash value is smaller than the previous one: then every insertion has to shift all existing elements, and the performance is painful to watch.
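
The two-array layout and the binary-search insertion described above can be sketched roughly as follows. This is a simplified illustration, not the real android.util.ArrayMap source: the class name TwoArrayMapSketch, the initial capacity, the doubling growth policy, and the lack of null-key support are assumptions.

```java
// A simplified sketch of the two-array layout (NOT the real android.util.ArrayMap source).
// Assumptions: no null keys, capacity simply doubles when full.
final class TwoArrayMapSketch {
    private int[] hashes = new int[4];       // sorted hash codes
    private Object[] array = new Object[8];  // key at 2*i, value at 2*i + 1
    private int size;

    // Binary search over the sorted hashes.
    // Returns the index of the key, or ~insertionPoint if the key is absent.
    private int indexOf(Object key, int hash) {
        int lo = 0, hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (hashes[mid] < hash) {
                lo = mid + 1;
            } else if (hashes[mid] > hash) {
                hi = mid - 1;
            } else {
                // Equal hashes sit next to each other, so scan the run in both directions.
                int i = mid;
                while (i >= 0 && hashes[i] == hash) {
                    if (key.equals(array[i << 1])) return i;
                    i--;
                }
                i = mid + 1;
                while (i < size && hashes[i] == hash) {
                    if (key.equals(array[i << 1])) return i;
                    i++;
                }
                return ~i; // same hash, different key: insert at the end of the run
            }
        }
        return ~lo;
    }

    Object get(Object key) {
        int index = indexOf(key, key.hashCode());
        return index >= 0 ? array[(index << 1) + 1] : null;
    }

    void put(Object key, Object value) {
        int hash = key.hashCode();
        int index = indexOf(key, hash);
        if (index >= 0) {                    // existing key: overwrite the value
            array[(index << 1) + 1] = value;
            return;
        }
        index = ~index;
        if (size == hashes.length) {         // grow both arrays together
            hashes = java.util.Arrays.copyOf(hashes, size * 2);
            array = java.util.Arrays.copyOf(array, size * 4);
        }
        // Shift everything after the insertion point one slot to the right.
        System.arraycopy(hashes, index, hashes, index + 1, size - index);
        System.arraycopy(array, index << 1, array, (index + 1) << 1, (size - index) << 1);
        hashes[index] = hash;
        array[index << 1] = key;
        array[(index << 1) + 1] = value;
        size++;
    }
}
```

The two System.arraycopy calls in put are where the cost of out-of-order insertion comes from: the closer the insertion point is to the front, the more elements have to move.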

3. SparseArray
[Figure: SparseArray structure]

SparseArray is relatively simple, but do not think it can replace the other two. SparseArray can only be used when the key is an int. Note that this is int rather than Integer, which is also one of the reasons SparseArray is more efficient: the boxing operation is eliminated.
Because the key is an int, no hash value is needed either: if the int values are equal, it is the same key, plain and simple. Insertion and lookup are also based on binary search, so the principle is basically the same as ArrayMap's, and there is not much more to say here.
To sum up the space comparison: relative to HashMap, the storage for the hash value is gone, there is no next pointer, and a few other small memory costs are avoided, so it looks like a considerable saving.
Time comparison: insertion and lookup behave basically the same as in ArrayMap, and a large amount of array element shifting may occur. But it avoids boxing, and the cost of boxing should not be underestimated; it is quite time-consuming. So, judging from the source, which one is faster depends on the size of the data.
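
As a rough illustration of the same idea with primitive keys, the sketch below keeps a sorted int[] of keys next to an array of values. It is not the real android.util.SparseArray source; the class name IntKeyMapSketch and the doubling growth policy are assumptions, and deletion is left out entirely.

```java
// A simplified sorted-int-key map (NOT the real android.util.SparseArray source).
// Assumptions: capacity doubles when full, no removal support.
final class IntKeyMapSketch<V> {
    private int[] keys = new int[4];          // sorted primitive keys: no Integer objects, no hashes
    private Object[] values = new Object[4];  // values[i] belongs to keys[i]
    private int size;

    // Returns the index of the key, or ~insertionPoint if the key is absent.
    private int binarySearch(int key) {
        int lo = 0, hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (keys[mid] < key) lo = mid + 1;
            else if (keys[mid] > key) hi = mid - 1;
            else return mid;
        }
        return ~lo;
    }

    @SuppressWarnings("unchecked")
    V get(int key) {
        int i = binarySearch(key);
        return i >= 0 ? (V) values[i] : null;
    }

    void put(int key, V value) {
        int i = binarySearch(key);
        if (i >= 0) {                         // existing key: overwrite the value
            values[i] = value;
            return;
        }
        i = ~i;
        if (size == keys.length) {
            keys = java.util.Arrays.copyOf(keys, size * 2);
            values = java.util.Arrays.copyOf(values, size * 2);
        }
        // Make room at the insertion point by shifting the tail right by one.
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = value;
        size++;
    }
}
```

Because put and get take an int directly, a call such as get(3) allocates nothing and never touches an Integer object, which is the boxing saving described above.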

OK, that is enough analysis. Now for something practical: let the data speak.

III. Performance test comparison
Let's compare them in two respects: insertion and lookup.

1. Insert Performance Time Comparison
Test code:

long start = System.currentTimeMillis();
Map<Integer, String> hash = new HashMap<Integer, String>();
for (int i = 0; i < MAX; i++) {
    hash.put(i, i + "");
}
long ts = System.currentTimeMillis() - start;

Only this snippet is pasted here; the other two tests simply replace HashMap. The comparison is made by changing the value of MAX.
[Figure: insertion time results, sequential insertion]


Analysis: judging from the results, when the amount of data is small the difference is not big (of course, with little data the time baseline is also small; there is too much of it to paste into the table, but there really is no difference). When the amount of data exceeds 5000, SparseArray is the fastest and HashMap the slowest. At first glance SparseArray seems to be the fastest, but note that this is sequential insertion, which is the ideal case for SparseArray and ArrayMap.

Let's try inserting in reverse order.

long start = System.currentTimeMillis();
HashMap<Integer, String> hash = new HashMap<Integer, String>();
for (int i = 0; i < MAX; i++) {
    hash.put(MAX - 1 - i, i + "");
}
long ts = System.currentTimeMillis() - start;

[Figure: insertion time results, reverse-order insertion]


Analysis: the results show that, sure enough, HashMap comes out far ahead of ArrayMap and SparseArray, which is consistent with the earlier analysis.
Of course, when the amount of data is small, for example under 1000, this time difference can be ignored.
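
The gap between sequential and reverse-order insertion can be made visible without a stopwatch by counting how many elements a sorted array has to move. The sketch below is only an illustration of that effect on a plain sorted int array; the class and method names are hypothetical and it does not measure the Android classes themselves.

```java
// Counts element moves when inserting keys into a sorted int array.
// Illustrative only: ShiftCountSketch and countShifts are hypothetical names.
final class ShiftCountSketch {
    // Inserts the keys in the given order and returns how many
    // existing elements had to be moved to make room.
    static long countShifts(int[] insertionOrder) {
        int[] sorted = new int[insertionOrder.length];
        int size = 0;
        long shifts = 0;
        for (int key : insertionOrder) {
            int pos = java.util.Arrays.binarySearch(sorted, 0, size, key);
            if (pos >= 0) continue;   // duplicate key: nothing to insert
            pos = ~pos;               // binarySearch returns -(insertionPoint) - 1
            System.arraycopy(sorted, pos, sorted, pos + 1, size - pos);
            shifts += size - pos;
            sorted[pos] = key;
            size++;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int n = 10_000;
        int[] ascending = new int[n];
        int[] descending = new int[n];
        for (int i = 0; i < n; i++) {
            ascending[i] = i;
            descending[i] = n - 1 - i;
        }
        System.out.println("ascending shifts:  " + countShifts(ascending));
        System.out.println("descending shifts: " + countShifts(descending));
    }
}
```

Ascending keys always append at the end and move nothing, while n descending keys each land at index 0 and move n(n-1)/2 elements in total, so the work grows quadratically with the amount of data.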

Now for the space comparison. First, the test method: since memory is being measured, it is essential that no GC occurs during the test; if a GC does occur, the numbers are not accurate. After some thought, a relatively simple method was used:

Runtime.getRuntime().totalMemory(); // total memory the application has obtained from the system
Runtime.getRuntime().freeMemory();  // free part of the memory the application has obtained

The difference between the two values is the memory the application is currently using.
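
Put together, the measurement might look like the sketch below. The helper name usedBytes and the figure of 100,000 entries are assumptions chosen for illustration, and the printed number is only an approximation that becomes meaningless if a GC runs between the two samples.

```java
// Approximate heap usage from the two Runtime calls.
// Illustrative only: UsedMemorySketch and usedBytes are hypothetical names.
final class UsedMemorySketch {
    // Heap currently in use = heap obtained by the VM minus the free part of it.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();
        java.util.Map<Integer, String> map = new java.util.HashMap<>();
        for (int i = 0; i < 100_000; i++) {
            map.put(i, i + "");
        }
        long after = usedBytes();
        // Only meaningful if no GC ran between the two samples.
        System.out.println("approx. bytes used by " + map.size() + " entries: " + (after - before));
    }
}
```

Sampling before and after building the container, and keeping a reference to the container until the second sample is taken, gives the difference attributed to it.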

Note that when MAX is large, a GC may occur while the code is running. The memory window of Android Monitor can be used to watch for this, and results from runs in which a GC occurred are not valid. With large data volumes, triggering a GC manually after each test is enough for nearly every run to succeed: the amount of data is not that large, so a GC occurs during the test only in a very small fraction of runs. Other approaches, such as setting virtual machine parameters to delay GC, were therefore not explored further, though you can try them. The data:
[Figure: memory usage results]

As can be seen, SparseArray's memory footprint is indeed much better than that of HashMap and ArrayMap; judging from the data it saves roughly 30%. ArrayMap, as mentioned earlier, offers only limited optimization and is almost the same as HashMap.

2. Find Performance Comparison

long start = System.currentTimeMillis();
SparseArray<String> hash = new SparseArray<String>();
for (int i = 0; i < MAX; i++) {
    hash.get(i);
}
long ts = System.currentTimeMillis() - start;

[Figure: lookup time results]


The result: SparseArray is the fastest and HashMap the slowest, which contradicts the earlier assumption; here binary search is faster than hashing.
On reflection, testing with this code is a little unfair, because SparseArray needs no boxing while HashMap boxes the key on every call. So here is a way to test again:

ArrayList<IntEntity> intEntityList = new ArrayList<IntEntity>();

// Box every key once, before the timed loop runs.
private void boxing() {
    for (int i = 0; i < MAX; i++) {
        IntEntity entity = new IntEntity();
        entity.i1 = i;
        entity.i2 = Integer.valueOf(i);
        intEntityList.add(entity);
    }
}

class IntEntity {
    int i1;
    Integer i2;
}

It seems fairer to do the boxing for HashMap and ArrayMap in advance.

long start = System.currentTimeMillis();
HashMap<Integer, String> hash = new HashMap<Integer, String>();
for (int i = 0; i < MAX; i++) {
    // hash.get(i);
    hash.get(intEntityList.get(i).i2);
}
long ts = System.currentTimeMillis() - start;

[Figure: lookup time results with pre-boxed keys]


Sure enough, the results are different: HashMap is now the fastest at lookup, which is in line with the theory. But in everyday use the keys are usually not pre-boxed, so overall SparseArray is still the most efficient choice.

With that much said, it is finally time for the conclusions.
IV. Summary
1. When the amount of data is small (generally taken to mean under 1000) and your key is an int, SparseArray is a really good choice. It can save about 30% of memory compared with HashMap, and because its keys need no boxing, its time performance is on average also better than HashMap's. Recommended.
2. Compared with SparseArray, ArrayMap's distinguishing feature is that the key type is unrestricted, so it can replace HashMap in any situation. But research and testing show that ArrayMap's memory saving is not significant, only about 10%, while its time performance is indeed the worst (though with around 1000 entries this does not matter). In addition, it is only available on API >= 19, so my personal advice is that there is no need to use it; it is safer to stick with HashMap. This is probably also why Google does not prompt us to use it when we write new HashMap.

Author: jjlanbupt
Link: http://www.jianshu.com/p/7b9a1b386265
Source: Jianshu
Copyright belongs to the author. Commercial reprint please contact the author to obtain authorization, non-commercial reprint please indicate the source.
