Preface
As a programmer who is obsessive about clean code, when writing Android applications I always pay attention to:
- Code style (Google's Android guidelines)
- If something can be done in one line of code, never writing it in two
- Never letting the IDE (IntelliJ / Android Studio) show yellow warning marks on the right-hand scroll bar
- Don't Repeat Yourself
Of course, in actual development some of the warnings the IDE reports are hard to avoid. Take certain null-pointer cases: the IDE, reasoning from the Android source, decides there can be no null pointer, but the actual situation... you know how it goes, some ROM vendors hand-modify the source and the result is a crash. So the best we can do is try to keep the warnings to a minimum.
That's enough digression, back to the topic. Sometimes on the line where a HashMap is declared, IntelliJ reports a warning saying that a SparseArray would be better. So what exactly is SparseArray, why is it better, and why is it said to save memory?
This article uses the API 21 source code as its subject.
SparseArray Source Code Analysis
```java
// E corresponds to the value type of a HashMap
public class SparseArray<E> implements Cloneable {

    // Marker for entries that have been deleted, used to optimize deletion performance
    private static final Object DELETED = new Object();
    // Also for deletion performance: flags whether a garbage-collection (compaction) pass is needed
    private boolean mGarbage = false;

    // Stores the integer keys, kept in ascending order
    private int[] mKeys;
    // Stores the values
    private Object[] mValues;
    // Actual number of mappings
    private int mSize;
```
The constructor:
```java
public SparseArray(int initialCapacity) {
    if (initialCapacity == 0) {
        // EmptyArray is a lightweight representation that requires no array allocation.
        mKeys = EmptyArray.INT;
        mValues = EmptyArray.OBJECT;
    } else {
        mValues = ArrayUtils.newUnpaddedObjectArray(initialCapacity);
        mKeys = new int[mValues.length];
    }
    mSize = 0;
}
```
Memory is squeezed even here: newUnpaddedObjectArray ultimately ends up calling a native method of VMRuntime.
```java
/**
 * Returns an array of at least minLength, but potentially larger. The extra size comes
 * from avoiding any padding after the array. The amount of padding depends on the
 * componentType and the implementation of the memory allocator.
 */
public native Object newUnpaddedArray(Class<?> componentType, int minLength);
```
The get method uses binary search:
```java
/**
 * Gets the object mapped from the specified key, or null if no such mapping exists.
 */
public E get(int key) {
    return get(key, null);
}

@SuppressWarnings("unchecked")
public E get(int key, E valueIfKeyNotFound) {
    // Binary search
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
    // Not found, or the value has been flagged as deleted
    if (i < 0 || mValues[i] == DELETED) {
        return valueIfKeyNotFound;
    } else {
        return (E) mValues[i];
    }
}
```
I won't repeat the binary search implementation here; roughly, it works on the sorted array by repeatedly comparing the middle value with the target value and halving the range in a loop.
A rather clever detail is that when the value is not found it returns ~low, which serves two purposes (a minimal sketch follows the list):
- It tells the caller the key was not found (the return value is negative)
- The caller can use ~result directly to get the position where the element should be inserted (equivalent to -(insertion point) - 1)
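To make the convention concrete, here is a minimal sketch of a binary search that encodes the insertion point this way (my own illustration of the idea described above, not a copy of the ContainerHelpers source):

```java
// Binary search over the first `size` elements of a sorted int array.
// Returns the index of `value` if found; otherwise returns ~lo, where lo is the
// position at which `value` would have to be inserted to keep the array sorted.
static int binarySearch(int[] array, int size, int value) {
    int lo = 0;
    int hi = size - 1;
    while (lo <= hi) {
        int mid = (lo + hi) >>> 1;   // unsigned shift avoids overflow for large indices
        int midVal = array[mid];
        if (midVal < value) {
            lo = mid + 1;
        } else if (midVal > value) {
            hi = mid - 1;
        } else {
            return mid;              // found
        }
    }
    return ~lo;                      // not found: ~lo is always negative, and ~result recovers lo
}
```

Because ~lo is always negative, a simple `i < 0` check tells the caller the key is missing, and `~i` gives the slot to insert into, which is exactly how the put method below uses it.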
The corresponding put method:
```java
/**
 * Adds a mapping from the specified key to the specified object, directly replacing
 * the previous mapping object if there was already one for that key.
 */
public void put(int key, E value) {
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
    if (i >= 0) {
        // The key already exists; replace the value
        mValues[i] = value;
    } else {
        // Take the bitwise complement to get the index where the key should be inserted
        i = ~i;
        // There is room, and the slot at that index has been marked as deleted
        if (i < mSize && mValues[i] == DELETED) {
            mKeys[i] = key;
            mValues[i] = value;
            return;
        }
        // Getting here means we are out of room, or the slot holds a valid value.
        // If garbage collection is pending and the SparseArray size has reached the
        // length of the keys array:
        if (mGarbage && mSize >= mKeys.length) {
            // Compact the arrays (the source here is a bit funny: there is even a commented-out
            // Log.e left in place, so apparently Android source engineers debug this way too).
            // Invalid values are removed so the valid ones stay contiguous.
            gc();
            // Search again, because the index may have changed
            i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
        }
        // Insert; if there is not enough room the arrays are reallocated
        mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
        mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
        // Actual size plus 1
        mSize++;
    }
}
```
Now look at the remove (delete) methods:
```java
/**
 * Removes the mapping from the specified key, if there was any.
 */
public void delete(int key) {
    // Binary search again
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
    // If it exists, mark the corresponding value as DELETED and set mGarbage
    if (i >= 0) {
        if (mValues[i] != DELETED) {
            mValues[i] = DELETED;
            mGarbage = true;
        }
    }
}

/**
 * Alias for {@link #delete(int)}.
 */
public void remove(int key) {
    delete(key);
}

/**
 * Removes the mapping at the specified index (a bit blunt; probably rarely used,
 * since it targets a position directly).
 */
public void removeAt(int index) {
    // Same logic as in delete(); so why doesn't delete() call removeAt() instead of
    // duplicating the code?
    if (mValues[index] != DELETED) {
        mValues[index] = DELETED;
        mGarbage = true;
    }
}
```
Having roughly gone through these CRUD methods, you should have a preliminary picture: because SparseArray's keys are a plain int array, it avoids both auto-boxing the keys and the extra entry objects needed to map the key/value relationship, which is how it saves memory.
Of course there is a trade-off: because the key array must stay sorted, every otherwise simple write takes more time, since it needs a binary search plus deleting/inserting elements within the array. To compensate, mGarbage and DELETED allow several potential compactions to be merged into one and deferred until they are actually needed.
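To make the deferred cleanup concrete, here is a simplified sketch of what such a compaction pass does (my own illustration, not necessarily identical to the gc() in the source): walk the arrays once and shift every entry that is not DELETED down over the gaps.

```java
// Simplified sketch of SparseArray-style compaction: squeeze out slots whose value is
// DELETED so that mKeys/mValues are contiguous again, then clear the mGarbage flag.
private void gc() {
    int n = mSize;            // old logical size, including DELETED slots
    int o = 0;                // write position for live entries
    int[] keys = mKeys;
    Object[] values = mValues;
    for (int i = 0; i < n; i++) {
        Object val = values[i];
        if (val != DELETED) {
            if (i != o) {
                keys[o] = keys[i];
                values[o] = val;
                values[i] = null; // release the old slot so the object can be collected
            }
            o++;
        }
    }
    mGarbage = false;
    mSize = o;                // the new size counts only live entries
}
```

Until this runs, delete() costs only a marker write, and put() can even reuse a DELETED slot in place without shifting anything.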
The source comments also mention that the class is not suited to large data sets: with hundreds of entries the difference stays within 50%, which is acceptable. After all, what we tend to lack on mobile is memory, not CPU.
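Putting it together, replacing a HashMap<Integer, V> with a SparseArray<V> is mostly mechanical. A minimal usage sketch against the public android.util.SparseArray API (the keys and values here are made up for illustration):

```java
import android.util.SparseArray;

// Instead of HashMap<Integer, String>, which boxes every int key into an Integer
// and allocates an entry object per mapping:
SparseArray<String> names = new SparseArray<>();
names.put(42, "answer");            // int key, no autoboxing
String hit = names.get(42);         // "answer"
String miss = names.get(7, "n/a");  // default value when the key is absent
names.delete(42);                   // marks the slot DELETED; compaction is deferred

// Iteration is by index rather than by Iterator:
for (int i = 0; i < names.size(); i++) {
    int key = names.keyAt(i);
    String value = names.valueAt(i);
    // ...
}
```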
HashMap Source Code Analysis
Compared with SparseArray, HashMap's implementation is a bit more complicated, because it supports keys of any type (even null) and implements iterators. Here we mainly look at how the basic operations are implemented.
```java
/**
 * The transient keyword marks fields that are not serialized.
 * The hash table itself; the null key is handled separately below.
 * HashMapEntry holds the key/value mapping, the hash value, and the next entry.
 */
transient HashMapEntry<K, V>[] table;

/**
 * The entry representing the null key, or null if there is no such mapping.
 */
transient HashMapEntry<K, V> entryForNullKey;

/**
 * The number of mappings in this hash map.
 */
transient int size;

/**
 * Incremented on structural modifications, for (best-effort) concurrent modification detection.
 */
transient int modCount;

/**
 * The table is rehashed when its size exceeds this threshold. Normally the value is
 * 0.75 * capacity, except when the capacity is zero, as in the EMPTY_TABLE declaration above.
 */
private transient int threshold;

public HashMap(int capacity) {
    if (capacity < 0) {
        throw new IllegalArgumentException("Capacity: " + capacity);
    }
    if (capacity == 0) {
        // Similar to SparseArray, all empty instances share the same empty representation
        HashMapEntry<K, V>[] tab = (HashMapEntry<K, V>[]) EMPTY_TABLE;
        table = tab;
        // Forces the first put() to replace EMPTY_TABLE
        threshold = -1;
        return;
    }
    if (capacity < MINIMUM_CAPACITY) {
        capacity = MINIMUM_CAPACITY;
    } else if (capacity > MAXIMUM_CAPACITY) {
        capacity = MAXIMUM_CAPACITY;
    } else {
        // Rounded up to a power of two (for memory padding, perhaps?)
        capacity = Collections.roundUpToPowerOfTwo(capacity);
    }
    makeTable(capacity);
}

/**
 * Allocates a hash table of the given capacity and sets the corresponding threshold.
 * @param newCapacity must be a power of two
 */
private HashMapEntry<K, V>[] makeTable(int newCapacity) {
    // Assigned to a local first (for visibility/synchronization reasons, perhaps?)
    HashMapEntry<K, V>[] newTable = (HashMapEntry<K, V>[]) new HashMapEntry[newCapacity];
    table = newTable;
    threshold = (newCapacity >> 1) + (newCapacity >> 2); // 3/4 of the capacity
    return newTable;
}
```
Then the insertion code:
```java
/**
 * Maps the specified key to the specified value. If a mapping for the key already exists,
 * the previous value is returned; otherwise null is returned.
 */
@Override
public V put(K key, V value) {
    if (key == null) {
        // A null key goes straight to putValueForNullKey
        return putValueForNullKey(value);
    }
    // Compute the key's hash: a secondary hash derived from the key's own hashCode
    int hash = Collections.secondaryHash(key);
    HashMapEntry<K, V>[] tab = table;
    int index = hash & (tab.length - 1);
    for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
        // An entry already exists for this key: replace the value and return the old one
        if (e.hash == hash && key.equals(e.key)) {
            preModify(e);
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }

    // No existing entry; create one
    modCount++;
    // If the size exceeds the threshold, double the capacity and recompute the index
    if (size++ > threshold) {
        tab = doubleCapacity();
        index = hash & (tab.length - 1);
    }
    // Insert a new HashMapEntry at the table's index position; it becomes the head of
    // the chain, with its next pointing to the previous head
    addNewEntry(key, value, hash, index);
    return null;
}

private V putValueForNullKey(V value) {
    HashMapEntry<K, V> entry = entryForNullKey;
    // Same logic as above: replace if present, otherwise create the HashMapEntry
    if (entry == null) {
        addNewEntryForNullKey(value);
        size++;
        modCount++;
        return null;
    } else {
        preModify(entry);
        V oldValue = entry.value;
        entry.value = value;
        return oldValue;
    }
}
```
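One detail worth spelling out: because the capacity is always a power of two, `hash & (tab.length - 1)` picks a bucket the same way a modulo would, but with a single AND and without ever producing a negative index. A tiny illustration (my own example, the values are not from the source):

```java
int capacity = 16;                  // the table length is always a power of two
int hash = -1839294011;             // some secondary hash value; it may be negative
int index = hash & (capacity - 1);  // same result as Math.floorMod(hash, capacity): 0..15
// capacity - 1 == 0b1111, so the AND simply keeps the low 4 bits of the hash.
```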
Broadly speaking, it is clear that HashMap is a hash-table-centered implementation. As for size, there is only doubling logic; there is no logic that shrinks the capacity after remove. The price of O(1) time complexity is the large amount of memory spent storing the data.
Comparison
- HashMap: fast, but wastes memory
- SparseArray: some performance loss, but saves memory
We run a simple performance test: HashMap and SparseArray are both created with the no-argument constructor and used to store Integer -> String mappings.
Elapsed time is in milliseconds, written as HashMap : SparseArray.
| Experiment | First time | Second time | Third time | Fourth time |
| --- | --- | --- | --- | --- |
| Put 10,000 numbers in order | 13:7 | 12:6 | 76:4 | 14:6 |
| Put 10,000 random numbers | 16:83 | 18:85 | 15:76 | 17:78 |
The four runs of each test are consistent (the outlier 76:4 would need further investigation). With keys inserted in order, HashMap takes longer; with random keys, it is SparseArray that takes longer, while HashMap's performance barely changes.
Running 10,000 gets after the random puts gives results of 7:3, 7:3 and 8:3, so for get operations SparseArray performs better, though even at 10,000 entries the difference is not really large.
As for memory, Allocation Tracker shows 32 for SparseArray, while HashMap reaches the frightening figure of 69,632...
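For reference, here is a rough sketch of the kind of harness behind such numbers (my own reconstruction, not the exact code used for the measurements above; absolute timings vary a lot with device and VM warm-up):

```java
import android.util.SparseArray;
import java.util.HashMap;
import java.util.Random;

// Fills a HashMap and a SparseArray with n Integer -> String mappings, using either
// ordered or random keys, and prints the elapsed times as "hashMapMs:sparseMs".
static void benchmarkPut(int n, boolean randomKeys) {
    Random rnd = new Random(0);
    int[] keys = new int[n];
    for (int i = 0; i < n; i++) {
        keys[i] = randomKeys ? rnd.nextInt() : i;   // ordered vs. random keys
    }

    long t0 = System.nanoTime();
    HashMap<Integer, String> map = new HashMap<>();
    for (int key : keys) {
        map.put(key, "value");                      // the int key gets autoboxed to Integer
    }
    long hashMapMs = (System.nanoTime() - t0) / 1_000_000;

    t0 = System.nanoTime();
    SparseArray<String> sparse = new SparseArray<>();
    for (int key : keys) {
        sparse.put(key, "value");                   // int key; kept sorted internally
    }
    long sparseMs = (System.nanoTime() - t0) / 1_000_000;

    System.out.println(hashMapMs + ":" + sparseMs);
}
```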
Conclusion
In cases where the key is an integer, and considering that key/value collections on mobile are usually small, SparseArray can save memory while keeping the performance loss within an acceptable range.
Judging from the test results, SparseArray saves a great deal of memory compared with HashMap, making it the obvious choice on mobile.
Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.