10 Simple Java Performance Optimizations

Source: Internet
Author: User
Tags: iterable, string methods, java 8, stream

Are you planning to optimize your hashCode() method? Would you like to work around that regular expression? Lukas Eder introduces a number of simple, convenient performance optimization tips, as well as techniques for scaling application performance.

Recently the term "web scale" has been hyped everywhere, and people are expanding their application architectures to make their systems more "web scale." But what exactly is web scale, and how do we achieve it?

Different aspects of scaling

The most hyped aspect of web scale is scaling load: making sure a system that supports a single user can also support 10, 100, or even a million users. Ideally, our system should stay as stateless as possible. Even when state is required, it can be transformed and transmitted across different processing nodes of the network. When load is the bottleneck, latency usually is not; therefore, spending 50 to 100 milliseconds on a single request is acceptable. This is known as horizontal scaling (scaling out).

Scaling performance is completely different from scaling load: it is about making sure that an algorithm that successfully processes a single piece of data can also process 10, 100, or even a million pieces of data. Whether this kind of growth is feasible is best described by time complexity (big O notation). Latency is the killer of performance scaling; you do everything you can to keep all the processing on the same machine. This is known as vertical scaling (scaling up).

If we could have our cake and eat it too (which is of course impossible), we might combine horizontal and vertical scaling. Today, however, we are only going to introduce a few simple ways to improve efficiency.

Big O notation

Java 7's ForkJoinPool and Java 8's parallel streams are useful for parallel processing. This is especially true when deploying Java programs on multicore processors, since all processors can access the same memory.

As a result, this kind of parallelism is far more scalable than distributing work across machines on a network, and its fundamental benefit is that latency can be eliminated almost entirely.

But don't be fooled by the effects of parallel processing! Keep in mind the following two points:

    • Parallelism eats up processor resources. It brings great benefits to batch processing, but it is also a nightmare for asynchronous servers such as HTTP. There are good reasons why we have stuck with the single-threaded servlet model for the last few decades. Parallelism brings real benefits only when scaling vertically.

    • Parallelism has no effect on the complexity of the algorithm. If your algorithm has a time complexity of O(n log n), running it on c processors still gives O(n log n / c), because c is just an insignificant constant in the algorithm. You only save wall-clock time; the actual algorithmic complexity does not decrease.
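To make the second point concrete, here is a minimal sketch (class and method names are mine, not from the article): summing n numbers is O(n) work no matter how many cores participate; a parallel stream only divides the wall-clock time.

```java
import java.util.stream.LongStream;

class ParallelDemo {

    // O(n) work, executed on a single core
    static long sequentialSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Still O(n) work in total; c cores only divide
    // the elapsed wall-clock time, not the complexity
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }
}
```

Both methods produce the same result; only the elapsed time differs.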

Reducing algorithmic complexity is undoubtedly the most effective way to improve performance. For example, an O(1) time complexity lookup on a HashMap instance, with O(1) space complexity, is about as fast as it gets. But that is often impossible, let alone easy, to achieve.

If you cannot reduce the complexity of the algorithm, you can still find the hot spot in the algorithm and improve it there to gain significant performance. Suppose we have an algorithm shaped like this:
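A minimal sketch of such a shape (all names, sizes, and the branch condition are hypothetical): an outer loop over N elements, a heavy left branch over M, and a cheap right branch over O x P.

```java
class BranchingAlgorithm {

    static long heavyCalls = 0;
    static long easyCalls = 0;

    static void run(int n, int m, int o, int p) {
        for (int i = 0; i < n; i++) {
            if (i % 2 == 0) {
                // Left branch: N -> M -> heavy operation
                for (int j = 0; j < m; j++)
                    heavyCalls++;          // stands in for heavyOperation()
            } else {
                // Right branch: N -> O -> P -> easy operation
                // (the "N.O.P.E." branch)
                for (int j = 0; j < o; j++)
                    for (int k = 0; k < p; k++)
                        easyCalls++;       // stands in for easyOperation()
            }
        }
    }
}
```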

The overall time complexity of the algorithm is O(N³), or O(N x O x P) if the branches are counted individually. Either way, when we analyze this code we find some strange scenarios:

    • In the development environment, the test data makes the left branch (N -> M -> heavy operation) dominate, because its M value is much larger than O and P, so only the left branch shows up in the profiler.

    • In the production environment, your maintenance team may discover through AppDynamics, DynaTrace, or similar tools that the real culprit is the right branch (N -> O -> P -> easy operation, the "N.O.P.E." branch).

Without production data as a reference, we might easily jump to conclusions and optimize the "heavy operation." But that optimization would have no effect on the delivered product.

The golden rules of optimization are none other than:

    • A good design will make optimization easier.

    • Premature optimization will not solve many performance problems, but poor design will make optimization harder.

So much for the theory. Suppose we have discovered that the problem lies in the right branch. Very likely the cheap operation, repeated an enormous number of times in production (with huge values of N, O, and P), is eating the time and making the product unresponsive, even though the left branch is what keeps the overall complexity at O(N³). The effort spent here does not scale, but it saves our users time now and postpones the harder performance improvements until later.

Here are 10 tips to improve Java performance:

1. Use StringBuilder

StringBuilder should be our default in Java code; avoid the + operator. You may have different opinions about StringBuilder's syntactic sugar. For example:

"B";

will be compiled as:

 0  new java.lang.StringBuilder [16]
 3  dup
 4  ldc <String "a"> [18]
 6  invokespecial java.lang.StringBuilder(java.lang.String) [20]
 9  aload_0 [args]
10  arraylength
11  invokevirtual java.lang.StringBuilder.append(int) : java.lang.StringBuilder [23]
14  ldc <String "b"> [27]
16  invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [29]
19  invokevirtual java.lang.StringBuilder.toString() : java.lang.String [32]
22  astore_1 [x]

But what the hell just happened? And what if you later need to amend the String with an optional part?


"B";  1)    x = x + args[0]; 

You now have a second StringBuilder, one that needlessly consumes extra memory on the heap and puts pressure on the GC. Write this instead:

StringBuilder x = new StringBuilder("a");
x.append(args.length);
x.append("b");

if (args.length == 1)
    x.append(args[0]);

Summary

In the example above, if implicit StringBuilder instances are generated by the Java compiler anyway, it hardly matters whether you use an explicit StringBuilder instance or not. But remember: in the N.O.P.E. branch, every CPU cycle wasted on GC or on allocating a StringBuilder's default capacity is wasted N x O x P times.

In general, using StringBuilder beats the + operator. Choose StringBuilder wherever possible when you need to pass a reference across multiple methods, because intermediate Strings consume extra resources. jOOQ takes this approach when generating complex SQL statements: only one StringBuilder is used while rendering the entire abstract syntax tree (AST) to SQL.
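A sketch of that pattern (the method names are hypothetical, not jOOQ's actual API): a single StringBuilder is created at the top and passed by reference through every rendering method, so no intermediate Strings are built.

```java
class SqlRenderer {

    static String render() {
        // One StringBuilder for the whole rendering pass
        StringBuilder sb = new StringBuilder();
        renderSelect(sb);
        renderFrom(sb);
        return sb.toString();
    }

    // Each method appends to the shared builder instead of
    // concatenating and returning intermediate Strings
    static void renderSelect(StringBuilder sb) {
        sb.append("SELECT 1 ");
    }

    static void renderFrom(StringBuilder sb) {
        sb.append("FROM dual");
    }
}
```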

Even more tragic: if you are still using StringBuffer, replace it with StringBuilder — you rarely really need to synchronize on a string being built.

2. Avoid using regular expressions

Regular expressions give the impression of being quick and convenient. But using them in the N.O.P.E. branch can be the worst possible decision. If you absolutely must use regular expressions in computation-intensive code, at least cache the Pattern to avoid compiling it repeatedly:

static final Pattern HEAVY_REGEX =
    Pattern.compile("(((X)*Y)*Z)*");

But if you only use a simple regular expression such as this:

String[] parts = ipAddress.split("\\.");

then it is better to use an ordinary char[] array or index-based operations. For example, this much less readable code does the same job:

int length = ipAddress.length();
int offset = 0;
int part = 0;
String[] parts = new String[4]; // enough for an IPv4 address
for (int i = 0; i < length; i++) {
    if (i == length - 1 ||
            ipAddress.charAt(i + 1) == '.') {
        parts[part] =
            ipAddress.substring(offset, i + 1);
        part++;
        offset = i + 2;
    }
}

This also shows that premature optimization is meaningless: compared with the split() method, this code is much harder to maintain.

Challenge: can a clever reader come up with an even faster algorithm?

Summary

Regular expressions are useful, but they come at a price. Deep in the N.O.P.E. branch, avoid them at all costs. Also beware of the various JDK string methods that use regular expressions internally, such as String.replaceAll() or String.split(). You can instead use a popular library such as Apache Commons Lang for string manipulation.

3. Do not use the iterator() method

This recommendation does not apply in general; it only applies deep in the N.O.P.E. branch. Even so, it is worth understanding. The Java 5 for-each syntax is so convenient that we forget about the looping internals. For example:

for (String value : strings) {
    // Do something useful here
}

Every time this loop runs, if strings is an Iterable, the code automatically creates a new Iterator instance. If you are using an ArrayList, the virtual machine allocates an object on the heap holding three ints:

private class Itr implements Iterator<E> {
    int cursor;
    int lastRet = -1;
    int expectedModCount = modCount;
    // ...

Instead, you can replace the for-each loop with the following equivalent loop, which merely "wastes" a single int on the stack — a bargain:

int size = strings.size();
for (int i = 0; i < size; i++) {
    String value = strings.get(i);
    // Do something useful here
}

If the collection does not change during the loop, an array can also be used:

for (String value : stringArray) {
    // Do something useful here
}

Summary

Iterators, the Iterable interface, and for-each loops are very useful, both from a readability perspective and from an API design standpoint. The price, however, is that an extra object is created on the heap for every single loop. If the loop is executed many, many times, be careful to avoid creating these meaningless instances: prefer a basic index-based loop over iterators, Iterable, and for-each.

Discussion

Some objections to the above (especially to replacing iterators with index-based access) are described in the discussion on Reddit.

4. Do not call high-overhead methods

Some methods are simply expensive. Our N.O.P.E. branch example has no such method at the leaf, but yours may. Suppose our JDBC driver has to go through great pains to calculate the return value of ResultSet.wasNull(). Our homegrown SQL framework might look like this:

if (type == Integer.class) {
    result = (T) wasNull(rs,
        Integer.valueOf(rs.getInt(index)));
}

// And then...
static final <T> T wasNull(ResultSet rs, T value)
throws SQLException {
    return rs.wasNull() ? null : value;
}

In the logic above, ResultSet.wasNull() is called every time an int value is read from the result set. But getInt() is documented as:

Returns: the column value; if the value is SQL NULL, the value returned is 0.

So a simple but effective improvement is this:

static final <T extends Number> T wasNull(
    ResultSet rs, T value)
throws SQLException {
    return (value == null ||
           (value.intValue() == 0 && rs.wasNull()))
        ? null : value;
}

It's a breeze.

Summary

Cache the results of calls to high-overhead methods at the leaf nodes, or avoid calling them altogether if the method contract allows it.

5. Use primitive types and the stack

This example goes deep into jOOQ, which uses a lot of generics and is therefore forced to use the wrapper types for byte, short, int, and long — at least until generics are specialized in Java 10 or the Valhalla project. In your own code, though, the substitution is often straightforward. Avoid this:

// Goes to the heap
Integer i = 817598;

... when it could simply be this:

// Stays on the stack
int i = 817598;

Things get even worse when working with arrays. Avoid this:

// Three heap objects!
Integer[] i = { 1337, 424242 };

... when it could simply be this:

// Only one heap object
int[] i = { 1337, 424242 };

Summary

When we are deep in the N.O.P.E. branch, we should avoid wrapper classes. The downside is that they put great pressure on the GC, which is kept busy cleaning up the objects the wrapper classes generate.

An effective optimization is therefore to use primitive data types, fixed-length arrays, and a series of separate variables to track where things are located within those arrays.
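For example, here is a minimal sketch (a hypothetical data structure, not from the article): instead of a List of boxed point objects, two primitive arrays plus a separate size variable track where the data lives.

```java
class PointBuffer {

    // Primitive, fixed-length arrays instead of List<Point>
    final int[] xs;
    final int[] ys;

    // A separate variable tracks the logical size
    int size = 0;

    PointBuffer(int capacity) {
        xs = new int[capacity];
        ys = new int[capacity];
    }

    void add(int x, int y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    long sumX() {
        long sum = 0;
        for (int i = 0; i < size; i++)
            sum += xs[i];   // no unboxing, no iterator allocation
        return sum;
    }
}
```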

The LGPL-licensed trove4j is a Java collection library that gives us better-performing collections built directly on primitives such as int[].

Exception

There is an exception to this rule: boolean and byte have few enough values that the JDK caches them all. We can write:

Boolean a1 = true; // ... syntactic sugar for: Boolean.valueOf(true);
Byte b1 = (byte) 123; // ... syntactic sugar for: Byte.valueOf((byte) 123);

The same is true for the low values of the other integer primitive types, including char, short, int, and long.

These values are cached when you autobox them or call TheType.valueOf().

Do not call the constructor on wrapper classes unless you really want a new, uncached instance on the heap — perhaps to prepare a giant April Fool's pit for a colleague.
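A small demonstration of the cache guarantees (the JDK guarantees caching for all boolean and byte values, and for int values at least in the range -128..127):

```java
class BoxingCacheDemo {

    static boolean integerCached() {
        Integer a = Integer.valueOf(127); // within the guaranteed cache range
        Integer b = Integer.valueOf(127);
        return a == b;                    // same cached instance
    }

    static boolean byteCached() {
        Byte a = Byte.valueOf((byte) 123);
        Byte b = (byte) 123;              // autoboxing also goes through valueOf()
        return a == b;
    }
}
```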

Off-heap storage

Of course, you may also want to experiment with off-heap libraries, though that is more of a strategic decision than a local optimization in the most optimistic scenario. An interesting article on off-heap storage by Peter Lawrey and Ben Cotton is "OpenJDK and HashMap — Safely Teaching an Old Dog New (Off-Heap!) Tricks".

6. Avoid recursion

Functional programming languages like Scala encourage the use of recursion, because recursion can often be decomposed into individually optimizable tail calls (tail recursion). It is great if your programming language supports that. Even so, beware that a minor adjustment to an algorithm can turn a tail recursion into ordinary recursion.

Hopefully the compiler detects this automatically; otherwise we waste a great many stack frames on something that could have been done with just a few local variables.

Summary

There is not much to say here, except: deep in the N.O.P.E. branch, prefer iteration over recursion whenever possible.
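A minimal sketch: the tail-recursive sum below burns one stack frame per step (the JVM performs no tail-call elimination), while the iterative version needs only two local variables.

```java
class SumDemo {

    // Tail-recursive: one stack frame per element,
    // and the JVM does not eliminate tail calls
    static long sumRecursive(int n, long acc) {
        if (n == 0)
            return acc;
        return sumRecursive(n - 1, acc + n);
    }

    // Iterative: constant stack usage
    static long sumIterative(int n) {
        long acc = 0;
        for (int i = 1; i <= n; i++)
            acc += i;
        return acc;
    }
}
```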

7. Use entrySet()

When we want to traverse a Map of key-value pairs, there is rarely a good reason to write:

for (K key : map.keySet()) {
    V value = map.get(key);
}

rather than:

for (Entry<K, V> entry : map.entrySet()) {
    K key = entry.getKey();
    V value = entry.getValue();
}

Maps should be used with caution in the N.O.P.E. branch anyway, because many seemingly O(1) access operations are really a series of operations, and the access itself is not free. At a minimum, if you must use a map, iterate with the entrySet() method! That way, we only ever access the Map.Entry instances.

Summary

Always use the entrySet() method when you need to iterate over the key-value pairs of a Map.

8. Use EnumSet or EnumMap

In some cases, such as a configuration map, we know in advance which keys will be stored in the map. If that set of keys is very small, we should consider using EnumSet or EnumMap instead of the usual HashSet or HashMap. The following code (from EnumMap) explains it clearly:

private transient Object[] vals;

public V put(K key, V value) {
    // ...
    int index = key.ordinal();
    vals[index] = maskNull(value);
    // ...
}

The key insight here is that an array is used instead of a hash table. When inserting a new value into the map, all that has to be looked up is the constant ordinal generated by the compiler for each enum value. For a global configuration map (i.e., only one instance), EnumMap will outperform HashMap thanks to its faster access: EnumMap uses a little less heap memory than HashMap, and HashMap has to call hashCode() and equals() on every key.

Summary

Enum and EnumMap are close companions. Whenever we use enum-like structures as keys, we should consider declaring them as actual enum types and using them as EnumMap keys.
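A usage sketch (the enum and its values are hypothetical): declare the configuration keys as an enum and store the settings in an EnumMap, which indexes by ordinal instead of hashing.

```java
import java.util.EnumMap;
import java.util.Map;

class ConfigDemo {

    // Keys are known in advance, so an enum fits
    enum Key { HOST, PORT, TIMEOUT }

    // EnumMap stores values in an array indexed by Key.ordinal();
    // no hashCode()/equals() calls per access
    static final Map<Key, String> SETTINGS = new EnumMap<>(Key.class);

    static {
        SETTINGS.put(Key.HOST, "localhost");
        SETTINGS.put(Key.PORT, "8080");
    }
}
```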

9. Optimize your hashCode() and equals() methods

If you cannot use EnumMap, at least optimize your hashCode() and equals() methods. A good hashCode() method is essential because it prevents redundant calls to the high-overhead equals() method.

In every class hierarchy, you may need simple objects that are cheap to compare. Let's look at how jOOQ's org.jooq.Table implementations do it.

The simplest and fastest possible hashCode() implementation is this:

// AbstractTable, a generic table base implementation:

@Override
public int hashCode() {

    // [#1938] This is a much more efficient hashCode()
    // implementation compared to that of standard
    // QueryParts
    return name.hashCode();
}

where name is simply the table name. We don't even consider the schema or other table properties, because table names are usually unique enough across a database. Also, name is a String, which itself already caches its hashCode() value.

The comment in this code is important, because AbstractTable extends AbstractQueryPart, the base implementation of every abstract syntax tree element. Ordinary abstract syntax tree elements have no properties at all, so they can have no illusions about an optimized hashCode() implementation. Their overridden hashCode() method looks like this:

// AbstractQueryPart, a generic abstract syntax tree base implementation:

@Override
public int hashCode() {

    // This is a working default implementation.
    // It should be overridden by concrete subclasses,
    // to improve performance
    return create().renderInlined(this).hashCode();
}

In other words, the entire SQL rendering workflow has to be triggered just to compute the hash code of an ordinary abstract syntax tree element.

The equals() method is even more interesting:

// AbstractTable, a generic table base implementation:

@Override
public boolean equals(Object that) {
    if (this == that) {
        return true;
    }

    // [#2144] Non-equality can be decided early,
    // without executing the rather expensive
    // implementation of AbstractQueryPart.equals()
    if (that instanceof AbstractTable) {
        if (StringUtils.equals(name,
            ((AbstractTable<?>) that).name)) {
            return super.equals(that);
        }

        return false;
    }

    return false;
}

First, abort the equals() method early (not only in the N.O.P.E. branch) if:

    • this == argument

    • This "incompatible: parameter

Note: if we check type compatibility early with instanceof, the latter condition also covers argument == null.

After aborting on the trivial cases, we can often draw further conclusions cheaply. For example, jOOQ's Table.equals() method compares two tables, which must have the same name regardless of their concrete implementation type. The following two elements cannot possibly be equal:

    • com.example.generated.Tables.MY_TABLE

    • DSL.tableByName("my_other_table")

If we can cheaply determine that the incoming argument cannot equal this, we abort and return false. If it might be equal, we fall through to the more expensive super.equals() implementation. Since most objects being compared are not equal, ending the method as early as possible saves a lot of CPU time.

Some objects are more similar than others.

In jOOQ, most table instances are generated by the jOOQ code generator, and the equals() methods of those instances are deeply optimized. The dozens of other table types — derived tables, table-valued functions, array tables, joined tables, pivot tables, common table expressions, and so on — keep the basic equals() implementation.

10. Think in sets, not individual elements

Finally, here is a point that applies to all languages, not just Java. And although it is not specific to the N.O.P.E. branch, it also helps move an algorithm from O(N³) toward O(n log n).

Unfortunately, many programmers think in simple, local algorithms. They are used to solving problems step by step, branch by branch, loop by loop, in an imperative "if this, then that" style. This style makes it hard to model the "bigger picture" when moving from pure imperative to object-oriented to functional programming, because all of these styles lack something that only SQL and R have:

Declarative programming.

In SQL, we declare the result we want from the database without considering the algorithm at all. The database then picks the best algorithm based on the data — its constraints, keys, indexes, and so on.

In theory, this idea was behind SQL and relational calculus from the beginning. In practice, SQL vendors have implemented cost-based optimizers (CBOs) over the past decades, and by around 2010, SQL's full potential had finally been tapped.

But we do not need SQL to think in sets. All languages and libraries support sets, collections, bags, and lists. The main advantage of thinking in sets is that our code becomes simple and concise. For example, write:

SomeSet INTERSECT SomeOtherSet

Instead of

// Pre-Java 8 notation
Set result = new HashSet();
for (Object candidate : someSet)
    if (someOtherSet.contains(candidate))
        result.add(candidate);

// Even Java 8 doesn't help much
someSet.stream()
       .filter(someOtherSet::contains)
       .collect(Collectors.toSet());

Some may argue that functional programming and Java 8 already help us write simpler, more concise algorithms. That is not necessarily true. We can translate an imperative Java 7 loop into a Java 8 Stream collector, but we are still running the same algorithm. SQL-style expressions are different:

SomeSet INTERSECT SomeOtherSet

This can have 1000 different implementations on different engines. For instance, a smarter implementation might automatically convert the two sets to EnumSet before running the INTERSECT operation, or even perform the INTERSECT in parallel without us ever calling the underlying Stream.parallel() method.
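One way such an engine could do it, sketched with the JDK's own EnumSet (the enum and names are hypothetical): retainAll() on two EnumSets boils down to a bitwise AND on a long bit mask.

```java
import java.util.EnumSet;

class IntersectDemo {

    enum Color { RED, GREEN, BLUE, YELLOW }

    // "SomeSet INTERSECT SomeOtherSet", set-style:
    static EnumSet<Color> intersect(EnumSet<Color> a, EnumSet<Color> b) {
        EnumSet<Color> result = EnumSet.copyOf(a); // copy, don't mutate the input
        result.retainAll(b);                       // a bitwise AND for EnumSet
        return result;
    }
}
```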

Summary

In this article, we discussed optimizations for the N.O.P.E. branch, i.e., deep inside high-complexity algorithms. As developers of jOOQ, we are happy to optimize SQL generation:

    • Every query is generated with a single, unique StringBuilder.

    • The template engine actually parses characters rather than using regular expressions.

    • Arrays are used wherever possible, especially when iterating over listeners.

    • JDBC methods that don't have to be called are left alone.

    • And so on.

jOOQ sits at the "bottom of the food chain" because it is the last API our programs call before leaving the JVM and entering the DBMS. Being at the bottom of the food chain means every line executed inside jOOQ may run N x O x P times, so it must be optimized eagerly.

Your business logic may not be as deep in the N.O.P.E. branch, but the underlying frameworks can be (homegrown SQL frameworks, local libraries, and so on). So check with Java Mission Control or another profiler whether there is anything to optimize along the lines we discussed today.
