How LinkedIn engineers optimize their Java code (Repost)

Source: Internet
Author: User
Tags: rehash

English original: LinkedIn Feed: Faster with Less JVM Garbage

While browsing the technical blogs of major companies recently, I found a very good post on LinkedIn's engineering blog. It describes LinkedIn's feed middle tier, the feed-mixer, which powers multiple distribution channels such as LinkedIn's web homepage, the university pages, the company pages, and the mobile clients (illustrated in the original post).

The feed-mixer uses a library called SPR (pronounced "super"). The post is about optimizing SPR's Java code; here is how they summarize their optimization experience.

1. Be careful with Java list traversal

List traversal in Java can be more subtle than it looks. Take the following two snippets as an example:

    • A:

          private final List<Bar> _bars;

          for (Bar bar : _bars) {
              // do important stuff
          }

    • B:

          private final List<Bar> _bars;

          for (int i = 0; i < _bars.size(); i++) {
              Bar bar = _bars.get(i);
              // do important stuff
          }

When code A runs, an iterator is created for the abstract list; code B calls get(i) directly, eliminating the iterator overhead that code A incurs.

In fact, there is a trade-off here. Code A uses an iterator, which guarantees O(1) access to each element (via the next() and hasNext() methods), so the whole loop is O(n). For code B, each call to _bars.get(i) costs O(n) if the list is a LinkedList, making the whole loop O(n^2); but if the list is an ArrayList, get(i) is O(1) and the loop stays O(n).

So when deciding which traversal to use, consider the underlying implementation of the list, its average length, and the memory involved. Because LinkedIn needed to optimize memory, and ArrayList provides O(1) lookup in most of their cases, they ultimately chose the style of code B. A timing sketch of the trade-off follows.
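
To make the trade-off concrete, here is a minimal timing sketch (not from the LinkedIn post; the element type and list size are arbitrary) that runs code B's indexed loop over an ArrayList and over a LinkedList:

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    public class TraversalDemo {
        // Indexed loop as in code B: O(n) over an ArrayList,
        // O(n^2) over a LinkedList, whose get(i) walks node by node.
        static long sumIndexed(List<Integer> list) {
            long sum = 0;
            for (int i = 0; i < list.size(); i++) {
                sum += list.get(i);
            }
            return sum;
        }

        public static void main(String[] args) {
            int n = 50_000; // arbitrary size, large enough to show the gap
            List<Integer> arrayList = new ArrayList<>();
            List<Integer> linkedList = new LinkedList<>();
            for (int i = 0; i < n; i++) {
                arrayList.add(i);
                linkedList.add(i);
            }

            long t0 = System.nanoTime();
            sumIndexed(arrayList);
            long t1 = System.nanoTime();
            sumIndexed(linkedList);
            long t2 = System.nanoTime();

            System.out.printf("ArrayList:  %d ms%n", (t1 - t0) / 1_000_000);
            System.out.printf("LinkedList: %d ms%n", (t2 - t1) / 1_000_000);
        }
    }

On a typical JVM the LinkedList run should be orders of magnitude slower, which is exactly the O(n^2) behavior described above.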

2. Estimate the size of the collection at initialization time

From the Java documentation we can see: "An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. [...] When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed [...] If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table."

In LinkedIn's practice, a common pattern is traversing an ArrayList and storing its elements in a HashMap. Initializing the HashMap to its expected size avoids the overhead of rehashing. The initial capacity can be set to the input size divided by the default load factor, taken as 0.7 here (the JDK default is 0.75); a worked example and a Guava-based alternative follow the snippets below:

    • Before optimization:

          HashMap<String, Foo> _map;

          void addObjects(List<Foo> input) {
              _map = new HashMap<String, Foo>();
              for (Foo f : input) {
                  _map.put(f.getId(), f);
              }
          }

    • After optimization:

          HashMap<String, Foo> _map;

          void addObjects(List<Foo> input) {
              _map = new HashMap<String, Foo>((int) Math.ceil(input.size() / 0.7));
              for (Foo f : input) {
                  _map.put(f.getId(), f);
              }
          }
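
As a worked example of the formula: for 1,000 input elements the initial capacity becomes (int) Math.ceil(1000 / 0.7) = 1429, so the map can absorb all 1,000 entries without growing. If Guava is already on the classpath (the post uses it later for ComparisonChain and caching), the same idea can be expressed with a Guava helper; this is a sketch under that assumption, not code from the post:

    import com.google.common.collect.Maps;
    import java.util.HashMap;
    import java.util.List;

    public class FooIndexer {
        // Sketch only: Foo and its getId() accessor are assumed from the snippets above.
        static HashMap<String, Foo> index(List<Foo> input) {
            // Guava sizes the backing table so input.size() entries fit without
            // rehashing, given HashMap's default 0.75 load factor.
            HashMap<String, Foo> map = Maps.newHashMapWithExpectedSize(input.size());
            for (Foo f : input) {
                map.put(f.getId(), f);
            }
            return map;
        }
    }
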
3. Defer expression evaluation

In Java, method arguments are fully evaluated (left to right) before the method is invoked, even when an argument is a compound expression. This rule can trigger unnecessary work. Consider the following scenario: comparing two Foo objects with a ComparisonChain. One advantage of such a comparison chain is that the comparison ends as soon as one compareTo in the chain returns a non-zero value, avoiding many meaningless comparisons. For example, the objects in this scenario are compared first by their score, then by position, and finally by the _bar attribute:

    import com.google.common.collect.ComparisonChain;

    public class Foo {
        private float _score;
        private int _position;
        private Bar _bar;

        public int compareTo(Foo other) {
            return ComparisonChain.start()
                    .compare(_score, other.getScore())
                    .compare(_position, other.getPosition())
                    .compare(_bar.toString(), other.getBar().toString())
                    .result();
        }
    }

However, this implementation always creates two String objects to hold the values of _bar.toString() and other.getBar().toString(), even when the comparison of those strings is never needed. To avoid this overhead, implement a Comparator for Bar objects:

    import com.google.common.collect.ComparisonChain;
    import java.util.Comparator;

    public class Foo {
        private float _score;
        private int _position;
        private Bar _bar;
        private static final BarComparator BAR_COMPARATOR = new BarComparator();

        public int compareTo(Foo other) {
            return ComparisonChain.start()
                    .compare(_score, other.getScore())
                    .compare(_position, other.getPosition())
                    .compare(_bar, other.getBar(), BAR_COMPARATOR)
                    .result();
        }

        private static class BarComparator implements Comparator<Bar> {
            @Override
            public int compare(Bar a, Bar b) {
                return a.toString().compareTo(b.toString());
            }
        }
    }
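
As an aside, the same lazy behavior is available from the JDK since Java 8 via Comparator combinators: a key comparator passed to thenComparing(...) is only invoked when all earlier comparisons tie, so Bar.toString() is deferred exactly as with the hand-written BarComparator. A sketch, assuming the getter names used above:

    import java.util.Comparator;

    public class FooOrdering {
        // Assumed accessors: getScore(), getPosition(), getBar(), matching the snippets above.
        static final Comparator<Foo> BY_SCORE_THEN_POSITION_THEN_BAR =
                Comparator.comparingDouble(Foo::getScore)
                        .thenComparingInt(Foo::getPosition)
                        .thenComparing(Foo::getBar, Comparator.comparing(Bar::toString));
    }
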
4. Pre-compiling regular expressions

String operations are considered costly in Java. Fortunately, Java provides tools to make regular expression handling as efficient as possible. Dynamic regular expressions are relatively rare in practice. In the following example, every invocation of String.replaceAll() applies the same constant pattern to the input value, so the pattern can be precompiled to save CPU and memory overhead. A complete, compilable version appears after the snippets:

    • Before optimization:

          private String transform(String term) {
              return term.replaceAll(_regex, _replacement);
          }

    • After optimization:

          private final Pattern _pattern = Pattern.compile(_regex);

          private String transform(String term) {
              return _pattern.matcher(term).replaceAll(_replacement);
          }
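
Putting the optimized version into a complete, compilable shape (the pattern and replacement here are made-up placeholders; the post keeps them in fields):

    import java.util.regex.Pattern;

    public class TermTransformer {
        // Compiled once per class load instead of on every call.
        // "\\s+" is a placeholder pattern, not the one from the post.
        private static final Pattern WHITESPACE = Pattern.compile("\\s+");

        public String transform(String term) {
            // Only a Matcher is allocated per call; the Pattern is reused.
            return WHITESPACE.matcher(term).replaceAll(" ");
        }
    }
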
5. Cache as much as you can

Storing results in a cache is another way to avoid unnecessary work. But a cache only helps when the same operation is applied repeatedly to the same data (for example, preprocessing of configuration, or some string processing). There are multiple LRU (least recently used) cache implementations available; LinkedIn chose Guava's cache (see the original post for the specific reasons), with code roughly as follows (a JDK-only LRU sketch follows the Guava snippet):

    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheLoader;
    import com.google.common.cache.LoadingCache;

    private static final int MAX_ENTRIES = 1000;

    // Initializing the cache
    private final LoadingCache<String, String> _cache =
            CacheBuilder.newBuilder()
                    .maximumSize(MAX_ENTRIES)
                    .build(new CacheLoader<String, String>() {
                        @Override
                        public String load(String key) throws Exception {
                            return expensiveOperationOn(key);
                        }
                    });

    // Using the cache
    String output = _cache.getUnchecked(input);
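
For reference, one of the "multiple LRU implementations" you can reach for without Guava is the JDK's LinkedHashMap in access order with removeEldestEntry overridden. A minimal sketch (not what LinkedIn shipped, and unlike Guava's cache it is not thread-safe):

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        public LruCache(int maxEntries) {
            // accessOrder = true orders entries from least to most recently used.
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the least recently used entry once the cap is exceeded.
            return size() > maxEntries;
        }
    }
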
6. String's intern method is useful, but also dangerous

The intern feature of String can sometimes be used instead of caching.

From the Java documentation we can learn:

"A Pool of strings, initially empty, is maintained privately by the class String. When the Intern method was invoked, if the pool already contains a string equal to this string object as determined by the Equals (Object) method, then the string from the pool is returned. Otherwise, this string object was added to the pool and a reference to this string object is returned ".

This feature is similar to a cache, but with one limitation: you cannot cap the number of entries in the pool. So if the interned strings are unbounded (for example, strings representing unique IDs), they make memory consumption grow quickly. LinkedIn once stumbled on exactly this: they interned some key values, and everything was fine in offline simulation, but once deployed, the system's memory usage shot up (because large numbers of unique strings were being interned). So LinkedIn ultimately chose an LRU cache, which bounds the maximum number of entries. A short demonstration of intern's pooling behavior follows.
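
A quick demonstration of the pooling behavior described in the quote (the literal used is arbitrary):

    public class InternDemo {
        public static void main(String[] args) {
            String a = new String("member-42"); // new heap object, not from the pool
            String b = "member-42";             // literal, already interned

            System.out.println(a == b);          // false: distinct objects
            System.out.println(a.intern() == b); // true: intern() returns the pooled reference
            // Unlike an LRU cache, nothing ever evicts interned strings,
            // which is exactly the unbounded-growth risk described above.
        }
    }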

Final result

These changes reduced SPR's memory footprint by 75%, which in turn reduced the feed-mixer's memory footprint by 50% (charted in the original post). The optimizations cut object allocation, and therefore GC frequency, and overall service latency dropped by 25%.

Chinese source: http://kb.cnblogs.com/page/510538/
