Translated official Android documents

Source: Internet
Author: User
Tags integer division
Performance Design

Android applications run on mobile devices that are limited in computing power, storage space, and battery life, so they must be efficient. Battery life alone may be a reason to optimize, even if your application already seems fast enough. Because battery life matters so much to users, a sharp increase in power consumption caused by your program is something they will notice sooner or later.

Although this document mainly covers micro-optimizations, these are rarely what makes or breaks your software. Choosing the right algorithms and data structures should always be your first consideration, but that is beyond the scope of this document.

1. Introduction

There are two basic principles for writing efficient code:
Don't do work that you don't need to do.
Avoid allocating memory whenever possible.

2. Smart Optimization

This document is about Android-specific micro-optimizations. Before you start, make sure that you understand the code you are optimizing and know how to measure the effect (good or bad) of your changes. Development time is limited, so it is important to spend it wisely.
This document also assumes that you have already made the best choices of algorithms and data structures, and that you have considered the potential performance impact of your API decisions. Using the right data structures and algorithms is worth more than any of the advice here, and thinking about the performance consequences of your API decisions makes it easier to switch to a better implementation later.
One tricky aspect of optimizing Android programs is that your program has to run on different hardware platforms. Different versions of the virtual machine run at different speeds on different processors. It is not even the case that device A is simply faster or slower than device B by some factor, so that results from one device scale to another. In particular, measurements on the emulator tell you very little about performance on any real device. There are also huge differences between devices with and without a JIT: the best code for a device with a JIT is not always the best code for a device without one.
If you want to know how your program performs on a device, you must test it on that device.

3. Avoid creating unnecessary objects.

Object creation is never free. A generational GC with per-thread allocation pools for temporary objects can make allocation cheaper, but allocating memory is always more expensive than not allocating memory.
If you allocate objects in a user interface loop, you will force a periodic garbage collection, creating small pauses in the user experience. The concurrent collector introduced in Gingerbread helps, but unnecessary work should still be avoided.
Therefore, avoid creating object instances that you don't need. Some examples:
1. If a method returns a string, and you know its result will always be appended to a StringBuffer anyway, change its signature and implementation so that the method appends directly, instead of creating a short-lived temporary object.
2. When extracting strings from a set of input data, consider returning a substring of the original data instead of creating a copy. You will create a new String object, but it will share the char[] with the data. The trade-off is that even if you use only a small part of the original input, all of it stays in memory.
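The first example in the list can be sketched as follows. The class and method names (`LabelFormatter`, `formatLabel`, `appendLabel`) are hypothetical and exist only for illustration; `StringBuilder` is used here as the unsynchronized counterpart of `StringBuffer`.

```java
// Hypothetical helper, for illustration only.
final class LabelFormatter {
    private LabelFormatter() {}

    // Allocating version: builds and returns a temporary String
    // that the caller would then append to its own buffer.
    static String formatLabel(String name, int count) {
        return name + "=" + count;
    }

    // Appending version: writes straight into the caller's buffer,
    // so no intermediate String is created for the result.
    static void appendLabel(StringBuilder out, String name, int count) {
        out.append(name).append('=').append(count);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        appendLabel(sb, "apples", 3);
        sb.append(", ");
        appendLabel(sb, "pears", 5);
        System.out.println(sb); // prints "apples=3, pears=5"
    }
}
```

Both versions produce the same text; the difference is only in how many temporary objects are created along the way.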
A somewhat more radical idea is to slice up multidimensional arrays into parallel one-dimensional arrays:
1. An array of ints is much better than an array of Integer objects. By extension, two parallel arrays of ints are far more efficient than an array of (int, int) objects. The same rule applies to any combination of primitive types.
2. If you need to implement a container that stores tuples of (Foo, Bar) objects, remember that two parallel Foo[] and Bar[] arrays are generally much better than a single array of custom (Foo, Bar) objects. (The exception is when you are designing an API for other code to call; there it is usually better to trade a little speed for good API design. In your own internal code, however, try to be as efficient as possible.)
Generally speaking, avoid creating short-lived temporary objects if you can. Fewer objects created means less frequent garbage collection, which has a direct impact on the user experience.
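As a minimal sketch of the parallel-array idea, the hypothetical `IntPairList` below stores (x, y) pairs in two int arrays rather than in an array of pair objects. The class name, its methods, and the initial capacity of 8 are assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical container, for illustration only: stores (x, y) pairs
// in two parallel int arrays instead of an array of pair objects.
final class IntPairList {
    private int[] xs = new int[8];
    private int[] ys = new int[8];
    private int size;

    void add(int x, int y) {
        if (size == xs.length) {
            // Grow both arrays together so the indices stay in step.
            xs = Arrays.copyOf(xs, size * 2);
            ys = Arrays.copyOf(ys, size * 2);
        }
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    int size() { return size; }

    // No range check against size here, to keep the sketch short.
    int xAt(int i) { return xs[i]; }
    int yAt(int i) { return ys[i]; }

    public static void main(String[] args) {
        IntPairList pairs = new IntPairList();
        for (int i = 0; i < 10; i++) {
            pairs.add(i, i * i);
        }
        System.out.println(pairs.xAt(9) + " -> " + pairs.yAt(9)); // prints "9 -> 81"
    }
}
```

Adding a pair allocates nothing unless the arrays need to grow, whereas an array of pair objects would allocate one object per element.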

4. Performance myths

Previous versions of this document made various misleading claims. Here we address some of them:
1. On devices without a JIT, invoking methods through a variable of an exact type rather than an interface type is slightly more efficient (for example, calling methods on a HashMap map is cheaper than on a Map map, even though in both cases the map is a HashMap). However, it is not twice as slow: the actual difference is only about 6%, and the JIT makes the two effectively indistinguishable.
2. On devices without a JIT, caching a field in a local variable is about 20% faster than accessing the field repeatedly. With a JIT, field access costs the same as local access, so this is not a worthwhile optimization unless you feel it makes your code easier to read. (The same is true of final, static, and static final fields.)

5. Use static instead of virtual

If you don't need to access an object's fields, make your method static. Invocations will be about 15% to 20% faster. It is also good practice, because the method signature tells you that calling the method cannot alter the object's state.
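A minimal sketch of this advice, using a hypothetical `Circle` class: the helper that touches no fields is declared static, while the method that reads instance state stays an instance method.

```java
// Hypothetical class, for illustration only.
final class Circle {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    // Reads instance state, so it has to be an instance method.
    double area() {
        return areaOf(radius);
    }

    // Touches no fields: declared static, which also documents
    // that calling it cannot change any Circle's state.
    static double areaOf(double r) {
        return Math.PI * r * r;
    }

    public static void main(String[] args) {
        System.out.println(new Circle(2.0).area());
        System.out.println(Circle.areaOf(2.0));
    }
}
```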

6. Avoid internal getters/setters

In native languages like C++, it is common practice to use getters (i = getCount()) instead of accessing the field directly (i = mCount). This is an excellent habit in C++, because the compiler can usually inline the access, and if you need to restrict or debug field access you can add the code at any time.
On Android, this is a bad idea. Virtual method calls are much more expensive than instance field lookups. It is reasonable to follow common object-oriented practice and have getters and setters in the public interface, but within a class you should access its fields directly.
Without a JIT, direct field access is about 3x faster than invoking a trivial getter. With the JIT (where direct field access is as cheap as accessing a local variable), direct field access is about 7x faster than invoking a trivial getter. This is true in Froyo, but later releases will improve the JIT's inlining of getter methods.
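The following sketch shows the pattern with a hypothetical `Counter` class: the getter stays in the public interface, but code inside the class reads and writes `mCount` directly.

```java
// Hypothetical class, for illustration only.
final class Counter {
    private int mCount;

    // Getter kept for callers outside the class.
    public int getCount() {
        return mCount;
    }

    // Inside the class, touch the field directly
    // rather than going through getCount().
    void increment() {
        mCount++;
    }

    boolean isEmpty() {
        return mCount == 0;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.increment();
        c.increment();
        System.out.println(c.getCount()); // prints "2"
    }
}
```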

7. Use the static final modifier for Constants

Consider the following statement:

Java code
    static int intVal = 42;
    static String strVal = "Hello, world!";

 


The compiler generates a class initializer method, called <clinit>, that is executed when the class is first used. The method stores the value 42 into intVal and extracts a reference from the class file's string constant table for strVal. When these values are referenced later, they are accessed with field lookups.
We can improve matters with the final keyword:

Java code
    static final int intVal = 42;
    static final String strVal = "Hello, world!";

 
The class no longer requires a <clinit> method, because the constants go into static field initializers in the dex file. Code that refers to intVal uses the integer value 42 directly, and accesses to strVal use a relatively inexpensive "string constant" instruction instead of a field lookup. (This optimization applies only to primitive types and String constants, not to arbitrary reference types. Still, it is good practice to declare constants static final whenever possible.)
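To make the distinction concrete, the sketch below contrasts constants that qualify (a primitive or a String initialized with a constant expression) with a static final array, which is an ordinary reference type. The `Config` class and its field names are hypothetical.

```java
// Hypothetical class, for illustration only.
final class Config {
    // Compile-time constants: a primitive and a String, each
    // initialized with a constant expression.
    static final int MAX_RETRIES = 3;
    static final String GREETING = "Hello, world!";

    // Still worth declaring static final, but an array is an ordinary
    // reference type: it is created in the class initializer and is
    // read through a field lookup.
    static final int[] BACKOFF_MS = {100, 200, 400};

    public static void main(String[] args) {
        System.out.println(GREETING + " retries=" + MAX_RETRIES
                + " firstBackoff=" + BACKOFF_MS[0]);
    }
}
```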

8. Use the enhanced for loop syntax

The enhanced for loop (sometimes called the "for-each" loop) can be used with collections that implement the Iterable interface and with arrays. With collections, an iterator is allocated to make interface calls to hasNext() and next(). With an ArrayList, a hand-written counted loop is about 3x faster (with or without a JIT), but for other collections the enhanced for loop syntax is exactly as efficient as using the iterator explicitly.

There are several ways to iterate through an array:

Java code
    static class Foo {
        int mSplat;
    }
    Foo[] mArray = ...

    public void zero() {
        int sum = 0;
        for (int i = 0; i < mArray.length; ++i) {
            sum += mArray[i].mSplat;
        }
    }

    public void one() {
        int sum = 0;
        Foo[] localArray = mArray;
        int len = localArray.length;
        for (int i = 0; i < len; ++i) {
            sum += localArray[i].mSplat;
        }
    }

    public void two() {
        int sum = 0;
        for (Foo a : mArray) {
            sum += a.mSplat;
        }
    }

 

zero() is the slowest, because the JIT cannot yet optimize away the cost of fetching the array length on every iteration of the loop.
one() is faster. It pulls everything out into local variables, avoiding the lookups. Only the array length actually offers a performance benefit.
two() is the fastest on devices without a JIT, and indistinguishable from one() on devices with a JIT. It uses the enhanced for loop syntax introduced in JDK 1.5.
To summarize: use the enhanced for loop by default, but consider a hand-written counted loop for performance-critical ArrayList iteration.
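For the ArrayList case mentioned in the summary, the two styles might look like the sketch below. The `SumLoops` class and its method names are hypothetical; which version is faster on a given device is something to measure rather than assume.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
final class SumLoops {
    private SumLoops() {}

    // Enhanced for loop: the default choice. For a collection it
    // iterates through an Iterator obtained from the list.
    static int sumForEach(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Hand-written counted loop: an option for performance-critical
    // ArrayList iteration, since get(i) needs no Iterator.
    static int sumCounted(ArrayList<Integer> values) {
        int sum = 0;
        final int size = values.size();
        for (int i = 0; i < size; i++) {
            sum += values.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> values = new ArrayList<>();
        values.add(1);
        values.add(2);
        values.add(3);
        System.out.println(sumForEach(values)); // prints "6"
        System.out.println(sumCounted(values)); // prints "6"
    }
}
```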

9. Consider package access instead of private access with private inner classes

Consider the following definition:

Java code
    public class Foo {
        private class Inner {
            void stuff() {
                Foo.this.doStuff(Foo.this.mValue);
            }
        }

        private int mValue;

        public void run() {
            Inner in = new Inner();
            mValue = 27;
            in.stuff();
        }

        private void doStuff(int value) {
            System.out.println("Value is " + value);
        }
    }

 

The key thing to note is that we define a private inner class (Foo$Inner) that directly accesses a private method and a private instance field in the outer class. This is legal, and the code prints "Value is 27" as expected.

The problem is that the virtual machine considers direct access to Foo's private members from Foo$Inner to be illegal, because Foo and Foo$Inner are different classes, even though the Java language allows an inner class to access the private members of its outer class. To bridge the gap, the compiler generates a couple of synthetic methods:

Java code
    /*package*/ static int Foo.access$100(Foo foo) {
        return foo.mValue;
    }
    /*package*/ static void Foo.access$200(Foo foo, int value) {
        foo.doStuff(value);
    }

 

The inner class calls these static methods whenever it needs to access the mValue field or invoke the doStuff method in the outer class. This means that the code above really boils down to accessing member fields through accessor methods. As noted earlier, accessors are slower than direct field access, so this is an example of a language idiom causing an "invisible" performance hit.

If you use code like this in a performance hotspot, you can avoid the overhead by declaring the fields and methods accessed by inner classes with package access rather than private access. Unfortunately, this means the fields can also be accessed directly by other classes in the same package, so this approach is not suitable for a public API.
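A minimal sketch of the package-access variant follows. It reuses the names from the listing above, with two changes made only for this example: the class is not public, and doStuff returns its message instead of printing it so that the result is easy to check.

```java
// Variant of the Foo example, for illustration only.
class Foo {
    private class Inner {
        String stuff() {
            // Both members have package access, so the inner class
            // can reach them without private access.
            return Foo.this.doStuff(Foo.this.mValue);
        }
    }

    /* package */ int mValue;

    public String run() {
        Inner in = new Inner();
        mValue = 27;
        return in.stuff();
    }

    /* package */ String doStuff(int value) {
        return "Value is " + value;
    }

    public static void main(String[] args) {
        System.out.println(new Foo().run()); // prints "Value is 27"
    }
}
```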

10. Use floating point judiciously

As a rule of thumb, floating point is about 2x slower than integer arithmetic on Android devices. This holds both on a G1, which has no FPU and no JIT, and on a Nexus One, which has an FPU and a JIT (although the absolute speed difference between those two devices is about 10x).
In terms of speed, there is no difference between float and double on more modern hardware. In terms of space, double is twice as large. As on desktop machines, if space is not an issue, you should prefer double to float.
Also, even for integers, some chips have hardware multiply but lack hardware divide. In such cases, integer division and modulus operations are performed in software. Keep this in mind if you are designing a hash table or doing a lot of arithmetic.
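One common way to act on the hash table remark is to keep the table size a power of two and replace the modulus with a bit mask. This technique is not described in the text above and is shown here only as an assumption-laden sketch; the `Buckets` class and its method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class Buckets {
    private Buckets() {}

    // Index by modulus: on a chip without hardware divide, the %
    // may be carried out in software. The mask clears the sign bit
    // so the result is never negative.
    static int indexByModulus(int hash, int tableSize) {
        return (hash & 0x7fffffff) % tableSize;
    }

    // Index by mask: needs no division, but is only valid when
    // tableSize is a power of two (an assumption of this sketch).
    static int indexByMask(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexByModulus(37, 16)); // prints "5"
        System.out.println(indexByMask(37, 16));    // prints "5"
    }
}
```

For a power-of-two table size both methods select the same bucket, so the mask version can stand in for the modulus version without changing where entries land.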

11. Understand and use the class library

In addition to all the usual reasons to prefer library code over writing your own, bear in mind that the system is free to replace calls to library methods with hand-coded assembler, which may be better than the best code the JIT can produce for the equivalent Java. A typical example is String.indexOf, which Dalvik replaces with an inlined intrinsic. Similarly, System.arraycopy is about 9x faster than a hand-coded loop on a Nexus One with the JIT.
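The System.arraycopy comparison can be sketched as below. The `CopyDemo` class and its method names are hypothetical; both methods return an equal copy, and only the library version gives the system the chance to substitute an optimized implementation.

```java
import java.util.Arrays;

// Hypothetical class, for illustration only.
final class CopyDemo {
    private CopyDemo() {}

    // Hand-coded element-by-element copy.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library copy: arguments are source, source offset,
    // destination, destination offset, and element count.
    static int[] copyWithLibrary(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5};
        System.out.println(Arrays.toString(copyWithLibrary(data))); // prints "[3, 1, 4, 1, 5]"
    }
}
```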

12. Use native methods judiciously

Native code is not necessarily more efficient than Java. For one thing, there is a cost associated with each transition between Java and native code, and the JIT cannot optimize across that boundary. If you allocate native resources (memory on the native heap, file descriptors, and so on), it can be significantly harder to arrange for them to be reclaimed in a timely way. You also need to compile your code for each architecture you want to run on, rather than relying on the JIT. You may even have to compile multiple versions for what you consider the same architecture: native code compiled for the ARM processor in the G1 cannot take full advantage of the ARM in the Nexus One, and code compiled for the ARM in the Nexus One will not run on the ARM in the G1.
Native code is primarily useful when you have an existing native codebase that you want to port to Android, not for speeding up parts of a Java application.

Conclusion

Finally, always measure. Before you start optimizing, make sure you actually have a problem. Make sure you can accurately measure your current performance; otherwise you won't be able to measure the benefit of the alternatives you try.
Every claim made in this document is backed by a benchmark. You can find the benchmark code in the Dalvik project on code.google.com.

The benchmarks are built with the Caliper microbenchmarking framework for Java. Microbenchmarks are hard to get right, so Caliper does the hard work for you, and it even detects some cases where you are not measuring what you think you are measuring (for example, because the virtual machine has optimized your code away). We strongly recommend that you use Caliper to run your own microbenchmarks.

You may also find Traceview useful for profiling, but be aware that it currently disables the JIT, which may cause it to attribute time to code that the JIT would otherwise win back. After making changes suggested by Traceview data, it is especially important to confirm that the resulting code actually runs faster when run without Traceview.

From: http://www.iteye.com/topic/994618
