Java Performance Tuning

Source: Internet
Author: User

  

This is about a GUI program written in Java whose job is to analyze logs: it reads a number of text log files of the same format into memory, analyzes and processes them, and then merges the results for output.

With a few dozen files of a few KB each and a few thousand log records in total, the tool ran smoothly and easily met the need.

However, the logging scheme was then adjusted: the recorded levels were extended from just warn and error to include info and debug as well, and log volume spiked. For a fixed time range, the number of log files grew to several hundred, individual files grew to several MB, and the number of log records reached the hundreds of thousands. The tool then ran as slow as a snail, taking tens of seconds to produce results.

To keep using the tool I had to optimize it, and to optimize it I first had to locate the performance bottleneck. The tool is not complex: it does no heavy computation and has no obviously wasteful work. The most likely culprit was the I/O overhead of reading file contents, yet reading a few hundred MB from a few hundred files should not take Java tens of seconds. To speed things up I had already used multithreading, with a thread pool assigning one thread to the reading and processing of each log file. So that is where I started my analysis; intuitively, threading is always easy to associate with performance problems.

I had created the thread pool with

Executors.newCachedThreadPool()

and I had used it carelessly. This pool has no size limit, and I allocated one thread to the processing of each log file, so with 100 files to process the pool can create up to 100 threads. My machine's CPU has only 4 cores, so these 100 threads obviously cannot all run at the same time, and switching between them adds overhead. I therefore suspected this was one of the causes of the performance problem.
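The unbounded growth of a cached thread pool can be observed directly. The following is a minimal sketch, not the tool's actual code: the class and method names are hypothetical, the tasks simply block on a latch to stand in for slow file reads, and the cast to ThreadPoolExecutor assumes the standard OpenJDK implementation of Executors.newCachedThreadPool().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class CachedPoolGrowth {
    // Submits `tasks` jobs that all block until released, then reports the
    // largest number of threads the pool ever held at once.
    static int peakThreads(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // stands in for a slow file read
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown(); // let every blocked task finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak;
    }

    public static void main(String[] args) {
        // Assumption: on OpenJDK the returned pool is a ThreadPoolExecutor.
        ThreadPoolExecutor cached = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        System.out.println("peak threads for 100 tasks: " + peakThreads(cached, 100));
    }
}
```

Because no worker is idle while the tasks are blocked, each submission should start a new thread, so the peak is expected to track the number of tasks rather than the number of CPU cores.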

According to material I found online, the optimal number of threads can be estimated with the following formula

Optimal number of threads = ((thread wait time + thread CPU time) / thread CPU time) * number of CPUs

Obviously, none of the variables in the formula was known to me except the number of CPUs, so I used a simpler rule of thumb instead.

Number of threads = 2 * CPU cores + 1

This is the rule of thumb for I/O-intensive programs. Because this tool reads disk files frequently, I classified it as I/O-intensive.

ExecutorService executorService = Executors.newFixedThreadPool(9);
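Both sizing formulas can be written as a short sketch. The class and method names here are hypothetical, and the wait and CPU times in the example are made-up illustrative values, not measurements from the tool; the core count is read from the JVM instead of being hard-coded.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolSizing {
    // Rule of thumb from the text for I/O-intensive work: 2 * cores + 1.
    static int ioBoundThreads(int cpuCores) {
        return 2 * cpuCores + 1;
    }

    // General formula from the text: ((wait + cpu) / cpu) * cores,
    // rounded to the nearest whole thread and never less than 1.
    static int optimalThreads(double waitTimeMs, double cpuTimeMs, int cpuCores) {
        return (int) Math.max(1, Math.round((waitTimeMs + cpuTimeMs) / cpuTimeMs * cpuCores));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int size = ioBoundThreads(cores);
        ExecutorService pool = Executors.newFixedThreadPool(size);
        try {
            System.out.println("cores=" + cores + ", pool size=" + size);
            // Hypothetical timings: 30 ms waiting on I/O per 10 ms of CPU work.
            System.out.println("formula estimate=" + optimalThreads(30, 10, cores));
        } finally {
            pool.shutdown();
        }
    }
}
```

On a 4-core machine the rule of thumb gives 2 * 4 + 1 = 9, which matches the pool size used above.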

Switching to a thread pool of limited size seemed to make the program a little faster, but it was a drop in the bucket compared with what was needed to reach an acceptable speed. I had not expected much from this change anyway, because switching among a hundred or so threads could not possibly cost tens of seconds.

I knew something was wrong with the program, but I had no clear idea how long a healthy program should take to process these log files under normal conditions. So I lacked the confidence to say which module was at fault, my guesses had no evidence behind them, and I could not locate the problem quickly.

Perhaps because it is so intuitive to link performance with concurrency and threads, my next optimization stayed on this topic.

I decided to separate reading the log files from processing them: one thread reads the files and puts their contents into a blocking queue, while a data-processing thread takes the data from the blocking queue and processes it. This is the typical producer-consumer pattern.
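The producer-consumer arrangement described here might look roughly like the sketch below. It is an illustration under assumptions, not the tool's code: the class name, the queue capacity, and the "poison pill" end marker are all hypothetical, and the "processing" is reduced to counting lines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class ReadProcessPipeline {
    // Unique end marker, compared by identity, telling the consumer to stop.
    private static final String END = new String("END");

    // Pushes the lines through a bounded queue and returns how many were processed.
    static int run(List<String> lines) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicInteger processed = new AtomicInteger();

        Thread producer = new Thread(() -> {
            try {
                for (String line : lines) {
                    queue.put(line); // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty
                for (String line = queue.take(); line != END; line = queue.take()) {
                    processed.incrementAndGet(); // placeholder for real analysis
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("warn a", "error b", "info c")));
    }
}
```

Without an explicit end marker (or some other shutdown signal), the consumer would block in take() forever once the producer finishes, which is one common way such a pipeline appears to hang.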

I put a lot of effort into implementing this model, confident that the program would get faster. Unexpectedly, it did not get faster at all: it ran even slower than before and sometimes appeared to hang. I was frustrated and had to admit that the bottleneck was not in the concurrent processing, that optimizing in this direction would never solve the problem, and that fancy concurrency techniques were useless here.

I realized I needed to narrow down the problem area and pinpoint the offending code in order to root out the problem.

Although I suspected that the bottleneck was in file reading, I had no clear evidence; I needed hard measurements.

I added debug code to the key parts of the program to measure the time spent in each stage, and the data made it clear that most of the time was indeed taken up by reading files.
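Timing instrumentation of this kind can be very small. The following is one possible sketch, assuming wall-clock time per named stage is enough; the StageTimer class and the stage names are hypothetical, and the sleeps merely stand in for real work.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    // Runs the stage and adds its elapsed wall-clock time to that stage's total.
    void time(String stage, Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    // Total time recorded for a stage, in milliseconds (0 if never timed).
    long millis(String stage) {
        return nanosByStage.getOrDefault(stage, 0L) / 1_000_000;
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        timer.time("read", () -> sleep(30));   // placeholder for file reading
        timer.time("process", () -> sleep(5)); // placeholder for analysis
        System.out.println("read ms: " + timer.millis("read"));
        System.out.println("process ms: " + timer.millis("process"));
    }
}
```

Calling time() repeatedly with the same stage name accumulates the totals, so the stage that dominates the run stands out even when it is invoked once per file.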

I tracked the problem down to the code closest to the file, a method that had been copied from another project.

// Requires java.io.BufferedReader, File, FileInputStream, IOException, InputStreamReader.
public static String readStringFromFile(File file) {
    try {
        if (file.isFile() && file.exists()) {
            InputStreamReader read = new InputStreamReader(new FileInputStream(file), "UTF-8");
            BufferedReader bufferedReader = new BufferedReader(read);
            String lineTxt = bufferedReader.readLine();
            String result = "";
            while (lineTxt != null) {
                result += lineTxt;
                lineTxt = bufferedReader.readLine();
            }
            return result;
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}

There seemed to be nothing wrong with this method; it even uses BufferedReader to improve performance. I was baffled and started guessing at the problem.

The program reads the content line by line. Could that be the problem?

Would it be faster to read the whole file at once?

While I was thinking, my eyes were drawn to one line of code

result += lineTxt;

Then a basic programming fact flashed through my mind: String is immutable (and therefore thread-safe), so every modifying operation creates a new copy, which makes it slow. When concatenating strings frequently, you should use StringBuffer or StringBuilder instead of String.

With hundreds of log files and hundreds of thousands of log records, even at one line per record the Java virtual machine performed hundreds of thousands of string copies, each one copying everything accumulated so far. Slowness was inevitable.
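A rough count shows why the copying adds up so quickly. The sketch below is a simplified model, assuming every line has the same length and every `result += line` copies the whole accumulated string into a new one; the class and method names are hypothetical.

```java
final class ConcatCost {
    // Characters copied when `lines` lines of `lineLength` chars are appended
    // one by one with `result += line`: the k-th append builds a new string of
    // k * lineLength chars, so the total is lineLength * (1 + 2 + ... + lines).
    static long charsCopiedByConcat(long lines, long lineLength) {
        return lineLength * lines * (lines + 1) / 2;
    }

    // Characters in the final result, the minimum any approach must write.
    static long charsInResult(long lines, long lineLength) {
        return lines * lineLength;
    }

    public static void main(String[] args) {
        long lines = 5_000;     // illustrative file: 5,000 lines
        long lineLength = 100;  // of 100 characters each
        System.out.println("final size:      " + charsInResult(lines, lineLength));
        System.out.println("copied with +=:  " + charsCopiedByConcat(lines, lineLength));
    }
}
```

Under this model the work grows with the square of the number of lines, so a file with ten times as many lines costs about a hundred times as much to assemble.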

Once the cause was identified, the fix was easy: I used the StringBuilder class instead of String to build the result

// Requires java.io.BufferedReader, File, FileInputStream, IOException, InputStreamReader.
public static String readStringFromFile(File file) {
    try {
        if (file.isFile() && file.exists()) {
            InputStreamReader read = new InputStreamReader(new FileInputStream(file), "UTF-8");
            BufferedReader bufferedReader = new BufferedReader(read);
            String lineTxt = bufferedReader.readLine();
            StringBuilder result = new StringBuilder();
            while (lineTxt != null) {
                result.append(lineTxt);
                lineTxt = bufferedReader.readLine();
            }
            return result.toString();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}

The program now runs astonishingly fast: what used to take tens of seconds now finishes in under a second. Problem solved. As for why I did not use StringBuffer: StringBuffer is the thread-safe version of StringBuilder and is slightly slower, and there is no concurrent access to the buffer here, so StringBuilder, with its better performance, was the natural choice.
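The earlier question of whether reading a whole file at once would work can also be tried with the standard java.nio.file API. This is a sketch of an alternative, not what the tool does: the class and method names are hypothetical, and unlike the readLine() loop it keeps the line terminators and throws on I/O errors instead of returning null.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class WholeFileRead {
    // Reads the entire file into memory in one call and decodes it as UTF-8.
    static String readAll(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".log");
        try {
            Files.write(tmp, "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(readAll(tmp).length());
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

This approach suits files of a few MB like the ones described; for much larger files, streaming line by line into a StringBuilder or processing each line as it arrives keeps memory use lower.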

Changing just three lines of code solved a problem I had been struggling with for quite a while. It took so long for two reasons

①: Preconceptions. I always tied performance problems to concurrency and reached for multithreading as the optimization.

②: Judging the cause by intuition, with no solid evidence, which sent the optimization in the wrong direction.

In many cases, performance problems have little to do with concurrency. A slow program will not suddenly become fast just because it is given a few more threads. The real cause is often unexpected and quite possibly a very basic mistake. Eliminate such mistakes first and make sure the code is healthy; if the performance problem persists, then consider concurrency.

Furthermore, when optimizing performance you must first identify the code block that causes the problem by measurement rather than by intuition. Otherwise you not only fail to solve the problem but also waste time.
