A Bug-Filled Worldview: A Java Memory Leak
There is no silver bullet for hunting memory leaks beyond careful code review, but you can narrow things down first. Start with the GC log to confirm that you actually have a leak, rather than simply too little memory. The signature of a leak: fit a line through the minimum heap usage remaining after each Full GC. If that line climbs steadily over time, a leak is very likely.
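The "fit a line through the post-Full-GC minima" check can be made concrete with a least-squares slope over those minima. A minimal sketch (the class name and the sample heap numbers are made up for illustration; a real run would parse them out of the GC log):

```java
import java.util.List;

public class GcTrend {
    /** Least-squares slope of post-Full-GC heap minima (MB) vs. Full GC index. */
    static double slope(List<Double> minima) {
        int n = minima.size();
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            sumX += i;
            sumY += minima.get(i);
            sumXY += i * minima.get(i);
            sumXX += (double) i * i;
        }
        return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    }

    public static void main(String[] args) {
        // Hypothetical heap minima after each Full GC, in MB.
        List<Double> leaking = List.of(120.0, 150.0, 185.0, 220.0, 260.0);
        List<Double> healthy = List.of(120.0, 118.0, 122.0, 119.0, 121.0);
        System.out.printf("leaking slope: %.1f MB per Full GC%n", slope(leaking)); // clearly positive
        System.out.printf("healthy slope: %.1f MB per Full GC%n", slope(healthy)); // near zero
    }
}
```

A persistently positive slope is the leak signature; a slope hovering around zero means the heap floor is stable.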
Then run jmap -histo <pid> to see which objects dominate the heap. You may be unlucky and find that the byte array class [B accounts for the vast majority, which by itself tells you little. At that point there is no shortcut: read the code bit by bit, check loops that allocate, check whether allocated memory is ever released, check global variables and maps held in singletons, and finally go through all the code with a suspicious eye.
OK. With that out of the way, let me tell the story of this memory leak in a more gossipy way.
I have spent the past few days writing a data-ingestion component. Its job is to parse incoming data and write it into the database. Sounds simple, right? Except the format is messy to parse, deduplication relies on a specific computed formula, and the parsed data has to go into the database in batches of thousands of records. Why is it designed like this? Historical reasons... Historical reasons are wonderfully useful: in any context, whenever something is hard to justify, you can call it a historical issue. In practice it means taking a small quick step to buy time. Nobody upstairs wants to hear that this module has dragged on for more than half a year, and anyone who hoped it would is probably your enemy.
So much for the historical background. Against it, my ingestion component was written as distributed, multi-threaded parsing plus multi-threaded batch inserts. I thought the design was rather nice. However, between gap locks, multithreading, and the distributed setup, the batch writes kept deadlocking in ways I could never quite pin down. To simplify the code and avoid introducing more problems, I simply changed the batch writes to single-row writes.
And that is when the memory leak problem appeared. Had it stayed hidden all along because the deadlocks struck first? Quite possibly~
Take a look at the GC log chart: the blue line shows memory usage, and its low points are what remains after each Full GC releases memory. Those low points keep climbing. So a memory leak is the prime suspect.
Why only "likely"? Because different programs have different memory-usage patterns. For example, my program holds a large map that is continuously filled with data. The dataset is finite, but until the map is fully populated, memory usage keeps rising. If the heap cannot hold it all, you still get an OutOfMemoryError, and the GC chart looks exactly like the one I showed. That is not a leak; that is simply not enough memory. To rule this case out, you must first establish that the dataset really is finite and that the available heap can hold all of it. Otherwise the remedy is different: you need measures against insufficient memory, not leak hunting.
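The distinction between a bounded dataset still being loaded and a genuine leak comes down to whether the key space is finite. A toy sketch (names and numbers are illustrative) of the two growth shapes:

```java
import java.util.HashMap;
import java.util.Map;

public class BoundedVsUnbounded {
    /** Inserts `iterations` records whose keys cycle through `keySpace` distinct values. */
    static int mapSizeAfter(int iterations, int keySpace) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < iterations; i++) {
            map.put(i % keySpace, i); // bounded keys overwrite; unbounded keys accumulate
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Finite dataset: memory plateaus once all 1000 distinct keys have been seen.
        System.out.println(mapSizeAfter(100_000, 1000));    // 1000
        // Every record is new: the map grows for as long as the program runs -- the leak shape.
        System.out.println(mapSizeAfter(100_000, 100_000)); // 100000
    }
}
```

Both look identical on a GC chart while the map is filling; only knowing the dataset is bounded (and fits in the heap) lets you call the first one healthy.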
Okay. In my case, the chart does mean a leak. Why? Because my dataset is finite, and a back-of-the-envelope calculation says it is not large; even with the cache-computation strategy in place, that data could never cause an OutOfMemoryError. So what is eating the heap? Time for jmap -histo <pid> to check the breakdown.
 num     #instances         #bytes  class name
----------------------------------------------
   1:      21571308     1163654064  [B
   2:       1770275      125384008  [I
   3:       1715985      120562976  [[B
   4:       1715382      120535928  [Ljava.io.InputStream;
   5:       3430930      109789592  [Z
   6:       1715198       68607920  com.mysql.jdbc.PreparedStatement$BatchParams
   7:        621372       44778960  [C
   8:         59015       11608344  [Ljava.lang.Object;
   9:        469551       11269224  java.lang.String
  10:        335730        8057520  org.dom4j.tree.DefaultAttribute
  11:         76733        2455456  org.dom4j.tree.DefaultElement
  12:         49621        2376880  [Ljava.lang.String;
  13:         47685        1525920  java.util.HashMap$Node
  14:         41482        1327424  com.paratera.importdata.CacheKeys
  15:         46753        1122072  java.lang.StringBuilder
  16:         44715        1073160  java.util.ArrayList
  17:         32577        1042464  java.util.concurrent.ConcurrentHashMap$Node
  18:         40990         983760  java.lang.Long
  19:         13911         667728  java.nio.HeapCharBuffer
  20:         13832         663936  java.nio.HeapByteBuffer
  21:         24729         593496  java.lang.StringBuffer
  22:         13476         539040  [Ljava.util.Formatter$Flags;
Oh~ byte[] ([B) tops the list. Which, by itself, tells us nothing: unless your own code allocates lots of byte[] arrays with new, those instances mostly belong to the libraries you call. Great. Now the list of suspects extends to every component in use...
Nothing in the top five looked like mine. But then, in sixth place...
There it is: com.mysql.jdbc.PreparedStatement$BatchParams. And suddenly the byte[], byte[][], int[], and InputStream[] entries at the top make sense, because that is exactly what MySQL JDBC stores when you call addBatch(). A closer look at the code confirmed it: the BatchParams accumulated by addBatch() are cleared only when executeBatch() is called. If rows are written with execute() instead, the BatchParams are never cleared. So the leak really was introduced by the switch from batch inserts to single-row inserts: swapping executeBatch() for execute() did turn the writes into single inserts, but it also turned the batch buffer into a leak.
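The retention behavior follows the JDBC contract: the batch parameter list is emptied only by executeBatch() or clearBatch(), never by execute(). Here is a simplified stand-in class that mimics that observable behavior (this is NOT the real Connector/J source, just a sketch of the leak pattern):

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a PreparedStatement's batch buffer, per the JDBC batching contract. */
public class FakePreparedStatement {
    private final List<Object[]> batchedArgs = new ArrayList<>(); // grows with every addBatch()

    public void addBatch(Object... params) {
        batchedArgs.add(params);  // parameter set retained until the batch is flushed
    }

    public void execute(Object... params) {
        // Writes one row; deliberately leaves batchedArgs untouched, as execute() does.
    }

    public int executeBatch() {
        int n = batchedArgs.size();
        batchedArgs.clear();      // the only path (besides clearBatch) that releases the params
        return n;
    }

    public int pendingBatchSize() {
        return batchedArgs.size();
    }

    public static void main(String[] args) {
        FakePreparedStatement ps = new FakePreparedStatement();
        // The buggy rewrite: addBatch() kept, but executeBatch() swapped for execute().
        for (int i = 0; i < 10_000; i++) {
            ps.addBatch("row-" + i);
            ps.execute("row-" + i); // the row is written, yet the batch keeps growing
        }
        System.out.println(ps.pendingBatchSize()); // 10000 parameter sets still held in memory
    }
}
```

Every retained parameter set drags its byte[]/int[]/InputStream[] payload along with it, which is exactly the shape of the histogram above.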
Of course, thanks to how the code was encapsulated into functions, addBatch() and executeBatch() lived in different places... so this was a pit I dug myself and then fell into. The pit you fall into is usually one you dug. The final fix was equally simple: remove the addBatch() call altogether.
Okay... I have written more than I meant to... It's late... Time to wash up and sleep...
Oh, right, a note about the company. If you would like to join the [Parallel Technology] Java team, send your resume to [email protected] (please mention where you saw this). The company does toB/toG supercomputing and high-performance-computing services only, not toC. Compensation is not the problem; the problem is whether you are worth the money...