Original: Memory optimization for feeds on Android, by [Udi Cohen](https://www.facebook.com/udinic).
Millions of people use Facebook on Android devices to browse news stories, events, pages, friend updates, and other information they care about. A team called Android Feed Platform built the platform that delivers this content, so any optimization to that platform benefits the whole application. We focus on scroll performance, aiming to keep the feed silky smooth while people catch up on the information they want.
To reach this goal, we built several automated tools that exercise the platform across different scenarios and devices, measuring memory usage, frame rate, and other runtime metrics. While profiling with one of these tools, TraceView, we noticed frequent calls to Long.valueOf(), which caused large numbers of objects to accumulate in memory and hurt performance. This article describes the problem, how we weighed potential solutions, and the optimizations we ultimately made to the platform.
The downside of convenience
After TraceView flagged the high call frequency of Long.valueOf(), we ran further tests and found that the method was called unexpectedly often while scrolling the news feed.
When we looked at the call stack, we saw that the caller was not Facebook's code but code inserted implicitly by the compiler. Long.valueOf() is called to box a primitive long into a Long object. Java has both object types and primitive types (for example Long versus long, Integer versus int) and converts between them seamlessly. This feature is called auto-boxing, because it automatically "boxes" a primitive value into the corresponding object type. While convenient, it creates objects that are invisible to the developer, and these hidden objects add up in memory.
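To make the hidden cost concrete, here is a minimal, self-contained example of where the compiler inserts boxing (the variable names are ours, for illustration only):

    public class AutoboxingExample {
        public static void main(String[] args) {
            long primitive = 1234567890L; // a primitive long: no object allocated

            // Auto-boxing: the compiler rewrites this as Long.valueOf(primitive),
            // allocating a new Long object for values outside the small -128..127 cache.
            Long boxed = primitive;

            // Auto-unboxing: the compiler inserts boxed.longValue().
            long back = boxed;
            System.out.println(back);
        }
    }

Each boxing conversion like this is a real allocation, even though the keyword new never appears in the source.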
A heap dump of the app showed that Long objects made up a noticeable share of memory: each object is small, but there were a huge number of them. This is a particular problem on the Dalvik runtime, which, unlike the newer Android runtime (ART), lacks a generational garbage collection mechanism that can efficiently reclaim large numbers of small, short-lived objects. As we scroll the news feed up and down, many of these objects are created, and the garbage collector pauses the app to reclaim memory. The more objects accumulate, the more frequently the collector runs, causing the app to stutter or even freeze and leaving users with a poor experience.
Fortunately, good tools such as TraceView and Allocation Tracker helped us trace the jank back to auto-boxing, and we found that most of the boxing happened when inserting long values into a HashSet<Long>. (We use a HashSet<Long> to store the hash values of news stories and check whether a story is unique.) HashSet gives us fast lookups, and the hash value of each story is computed as a primitive long, but HashSet works with objects, so when we call setStories.put(lStoryHash), boxing is unavoidable.
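As a concrete illustration of that pattern, here is a minimal sketch of story deduplication with a HashSet<Long>; the class and method names are stand-ins, not the actual feed code:

    import java.util.HashSet;
    import java.util.Set;

    public class StoryDeduplicator {
        // Hash values of the stories we have already seen.
        private final Set<Long> setStories = new HashSet<>();

        public boolean isNewStory(long lStoryHash) {
            // Both calls below box the primitive: the compiler inserts
            // Long.valueOf(lStoryHash), so a Long is allocated on almost every call.
            if (setStories.contains(lStoryHash)) {
                return false;
            }
            setStories.add(lStoryHash);
            return true;
        }
    }

While scrolling, this runs for every story that comes into view, so the allocations add up quickly.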
The obvious solution is a Set implementation that works directly with the primitive type, but finding one turned out to be less straightforward than we expected.
Possible solutions
There are existing Set implementations that work with primitive types, but almost all of them are around ten years old, written in the days of server-side Java or J2ME. To see whether they were still viable today, we decided to test them on Dalvik/ART and make sure they hold up under more demanding conditions. We wrote a small testing framework to compare these old libraries against HashSet. The results showed that they are faster than HashSet and use less memory, but they still create objects internally; for example, TLongHashSet, a class from the Trove library, allocated roughly 2 MB of memory when tested with 1,000 items. Other libraries we tested, such as PCJ and Colt, showed almost identical results.
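For reference, this is roughly what using one of those primitive-specialized libraries looks like; the package path assumes Trove 3 and is shown only as an example:

    import gnu.trove.set.hash.TLongHashSet;

    public class TroveExample {
        public static void main(String[] args) {
            TLongHashSet hashes = new TLongHashSet();
            hashes.add(1234567890L);                  // primitive API: no boxing at the call site
            boolean seen = hashes.contains(1234567890L);
            System.out.println(seen);
        }
    }

The API avoids boxing at the call site, but as noted above, these implementations still allocated a noticeable amount of memory internally when running on Dalvik/ART.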
Since the existing libraries did not meet our needs, we decided to build our own Set implementation, optimized for Android. Looking at the source of HashSet, it has a fairly simple implementation that delegates all the work to an internal HashMap:
    public class HashSet<E> extends AbstractSet<E> implements Set<E>, ... {
        transient HashMap<E, HashSet<E>> backingMap;
        ...
        @Override
        public boolean add(E object) {
            return backingMap.put(object, this) == null;
        }

        @Override
        public boolean contains(Object object) {
            return backingMap.containsKey(object);
        }
        ...
    }
Adding an object to a HashSet means inserting it as a key in the internal HashMap, with the HashSet instance itself as the value; checking whether an object is contained means checking whether the internal HashMap contains that key. This suggested an option: build a HashSet-like class backed by an Android-optimized map.
Introducing LongArraySet
You may already be familiar with LongSparseArray, a class provided by Android that acts as a map using a primitive long as the key. It can be used like this:
    LongSparseArray<String> longSparseArray = new LongSparseArray<>();
    longSparseArray.put(3L, "Data");
    String data = longSparseArray.get(3L); // the value of data is "Data"
LongSparseArray behaves differently from HashMap when looking up a value. When we call mapHashmap.get(KEY5), a HashMap computes the hash of the key and uses it as an index into its internal array to reach the value directly, giving an average lookup time of O(1).
A LongSparseArray, on the other hand, looks up values differently: it performs a binary search for the key in a sorted array of keys, which takes O(log N) time, and then uses the key's index to read the corresponding entry from a parallel array of values.
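A simplified sketch of that lookup strategy follows; this is not the Android source, just an illustration of the two parallel arrays and the binary search (insertion and growth are omitted):

    // Illustrative only: keys are kept sorted, values sit at the same index.
    public class SimpleLongMap<E> {
        private long[] keys = new long[8];        // sorted array of keys
        private Object[] values = new Object[8];  // parallel array of values
        private int size = 0;

        @SuppressWarnings("unchecked")
        public E get(long key) {
            int lo = 0;
            int hi = size - 1;
            while (lo <= hi) {                     // binary search: O(log N)
                int mid = (lo + hi) >>> 1;
                if (keys[mid] < key) {
                    lo = mid + 1;
                } else if (keys[mid] > key) {
                    hi = mid - 1;
                } else {
                    return (E) values[mid];        // same index in the value array
                }
            }
            return null;                           // key not present
        }
    }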
A HashMap needs to allocate extra space to keep collisions rare, while LongSparseArray's two parallel arrays give it a smaller memory footprint. The trade-off is that, to support binary search, the key array must stay sorted and contiguous, so adding items beyond the current capacity means allocating new arrays and copying the data over. Once it holds more than roughly 1,000 items, LongSparseArray's performance degrades significantly (see the official documentation or Google's short video on the topic).
Because LongSparseArray uses a primitive long as its key, we can build a new data structure that uses a LongSparseArray internally, the same way HashSet uses a HashMap.
We call it LongArraySet!
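The published implementation is linked at the end of this article; as a rough idea of its shape, a minimal sketch could look like the following (the method names and marker value are ours, for illustration only):

    import android.util.LongSparseArray;

    // Illustrative sketch: a set of primitive longs backed by a LongSparseArray,
    // mirroring the way HashSet wraps a HashMap. Not the published implementation.
    public class LongArraySet {
        private static final Object PRESENT = new Object(); // shared marker value
        private final LongSparseArray<Object> backingArray = new LongSparseArray<>();

        public void add(long value) {
            backingArray.put(value, PRESENT);    // the key stays a primitive: no boxing
        }

        public boolean contains(long value) {
            return backingArray.get(value) != null;
        }

        public void remove(long value) {
            backingArray.remove(value);
        }

        public int size() {
            return backingArray.size();
        }
    }

Using a single shared marker object as the value, rather than a new object per entry, keeps the memory cost of the backing map down to its two internal arrays.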
The new data structure looked promising, but the first rule of optimization is: measure. Using the earlier test framework, we compared the new data structure against HashSet by adding X items, checking that the structure contains each of them, and then removing them all, repeating this for different sizes (X = 10, X = 100, X = 1,000, ...) and averaging the time of each operation. A rough sketch of this kind of test loop is shown below, followed by the results:
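This is a minimal sketch of the test loop described above; our real framework also tracked memory, and the numbers this prints are illustrative only:

    import java.util.HashSet;
    import java.util.Set;

    public class SetBenchmark {
        public static void main(String[] args) {
            for (int x : new int[] {10, 100, 1000}) {
                long start = System.nanoTime();
                Set<Long> set = new HashSet<>();
                for (long i = 0; i < x; i++) set.add(i);      // add X items
                for (long i = 0; i < x; i++) set.contains(i); // check each one
                for (long i = 0; i < x; i++) set.remove(i);   // remove them all
                long elapsed = System.nanoTime() - start;
                System.out.println("HashSet, x=" + x + ": "
                        + elapsed / (3.0 * x) + " ns per operation");
            }
            // The same loops are then repeated with the primitive-long set
            // in place of HashSet<Long>, and the averages are compared.
        }
    }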
We found performance improvements for the contains and remove operations across the board. For the add operation, however, the cost grows as the item count grows; this comes from LongSparseArray's internal structure, and once the count exceeds about 1,000 items its performance falls behind HashMap's. In our own use case in the feed we only deal with hundreds of items, so the replacement was worthwhile.
We also saw improvements in memory usage: looking at heap dumps and the Allocation Tracker, we found that far fewer objects were being created. Here is a comparison between HashSet and LongArraySet when adding 1,000 items for 20 iterations.
LongArraySet avoids creating Long objects entirely, which reduced memory allocations by 30% in this scenario.
Conclusion
By digging deeper into the data structures we use, we were able to build one that better fits our needs. The less often the garbage collector has to run, the fewer frames we are likely to drop. Using the new LongArraySet class, along with a similar IntArraySet for primitive int keys, we improved memory usage across the entire app.
This example shows that every candidate solution needs to be measured before you can find the optimal one. We also accept that this approach is not universal: for very large data sets the gains are limited, but for our use case it was the better choice.
You can find the source code here. We are excited about the challenges ahead and hope to share more practical findings in the future.
Translated from the Facebook Engineering post on Android memory optimization.