Performance comparison of different compression algorithms in Java


This article compares the performance of several commonly used compression algorithms in Java. The results should help you judge which algorithms remain usable even when CPU resources are tightly constrained.

The following implementations are compared in this article:

    • JDK gzip--This is a slow algorithm with a high compression ratio, whose output is suitable for long-term storage. java.util.zip.GZIPInputStream/GZIPOutputStream in the JDK implement this algorithm.
    • JDK deflate--This is the other algorithm available in the JDK (zip files use this algorithm). It differs from gzip in that you can specify a compression level, which lets you balance compression time against output file size. The available levels are 0 (no compression), and 1 (fast compression) through 9 (slow compression). Its implementation is java.util.zip.DeflaterOutputStream/InflaterInputStream. A minimal usage sketch of these two JDK options follows this list.
    • Java implementation of the LZ4 compression algorithm--This is the fastest of the algorithms described in this article, and its compression ratio is slightly worse than that of the fastest deflate level. If you want to figure out how it works, I suggest you read this article. It is released under the friendly Apache 2.0 license.
    • snappy--This is a very popular compression algorithm developed by Google, designed to offer a reasonable trade-off between speed and compression ratio. The implementation tested here is also released under the Apache 2.0 license.
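
To make the two JDK options concrete, here is a minimal sketch (my own illustration, not part of the original benchmark harness) of how data is compressed with gzip and with deflate at a chosen level; both APIs live in java.util.zip:

import java.io.*;
import java.util.zip.*;

public class JdkCompressionSketch {
    // Compress a byte array with JDK gzip (no level parameter is exposed).
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (OutputStream os = new GZIPOutputStream(bos)) {
            os.write(input);
        }
        return bos.toByteArray();
    }

    // Compress a byte array with JDK deflate at a chosen level (0-9).
    static byte[] deflate(byte[] input, int level) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        Deflater deflater = new Deflater(level);
        try (OutputStream os = new DeflaterOutputStream(bos, deflater)) {
            os.write(input);
        } finally {
            deflater.end(); // release the native memory held by the Deflater
        }
        return bos.toByteArray();
    }
}

Note that GZIPOutputStream offers no level parameter, which is why only deflate can be benchmarked across compression levels below.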
Compression test

It took some effort to find input data that is both suitable for compression testing and available to most Java developers (I don't want you to download hundreds of megabytes of files just to run this test). Eventually I realized that most people have the JDK documentation installed locally, so I decided to merge the whole Javadoc directory into one file by concatenating all its files. This can be done easily with the tar command, but not everyone is a Linux user, so I wrote a program to generate this file:

import java.io.*;
import java.nio.file.Files;

public class InputGenerator {
    private static final String JAVADOC_PATH = "your_path_to_jdk/docs";
    public static final File FILE_PATH = new File("your_output_file_path");

    static {
        try {
            if (!FILE_PATH.exists())
                makeJavadocFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void makeJavadocFile() throws IOException {
        try (OutputStream os = new BufferedOutputStream(new FileOutputStream(FILE_PATH), 65536)) {
            appendDir(os, new File(JAVADOC_PATH));
        }
        System.out.println("Javadoc file created");
    }

    // Recursively append every file under root to the output stream.
    private static void appendDir(final OutputStream os, final File root) throws IOException {
        for (File f : root.listFiles()) {
            if (f.isDirectory())
                appendDir(os, f);
            else
                Files.copy(f.toPath(), os);
        }
    }
}

The size of the entire file on my machine is 354,509,602 bytes (338 MB).

Test

At first I wanted to read the entire file into memory and then compress it, but it turns out that doing so easily exhausts the heap even on a machine with 4 GB of RAM.

So I decided to rely on the operating system's file cache instead. The test framework used here is JMH. The file is loaded into the OS cache during the warm-up phase (it is compressed twice during warm-up). I compress the content into a ByteArrayOutputStream (I know it's not the fastest method, but it is fairly stable across tests and doesn't spend time writing compressed data to disk), so some memory is also needed to store the output.

The following is the base class for all the tests. The tests differ only in the implementation of the compressing output stream, so this base class can be reused by simply supplying a stream from a StreamFactory implementation:

import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 2)
@Measurement(iterations = 3)
@BenchmarkMode(Mode.SingleShotTime)
public class TestParent {
    protected Path m_inputFile;

    @Setup
    public void setup() {
        m_inputFile = InputGenerator.FILE_PATH.toPath();
    }

    interface StreamFactory {
        OutputStream getStream(final OutputStream underlyingStream) throws IOException;
    }

    public int baseBenchmark(final StreamFactory factory) throws IOException {
        // Pre-sizing the buffer to the input length avoids reallocations
        // (compressed output is always smaller than the input here).
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream((int) m_inputFile.toFile().length())) {
            try (OutputStream os = factory.getStream(bos)) {
                Files.copy(m_inputFile, os);
            }
            // The compressing stream must be closed (finished) before the
            // output size is read, otherwise buffered data would be missed.
            return bos.size();
        }
    }
}

The test cases are all very similar (their full sources are bundled at the end of the original article), so here is just one example: the JDK deflate test class.

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;

public class JdkDeflateTest extends TestParent {
    // Compression level 1 (fastest) through 9 (best compression).
    @Param({"1", "2", "3", "4", "5", "6", "7", "8", "9"})
    public int m_lvl;

    @Benchmark
    public int deflate() throws IOException {
        return baseBenchmark(new StreamFactory() {
            @Override
            public OutputStream getStream(OutputStream underlyingStream) throws IOException {
                final Deflater deflater = new Deflater(m_lvl);
                return new DeflaterOutputStream(underlyingStream, deflater, 512);
            }
        });
    }
}
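
The other test classes follow the same pattern. For instance, a gzip test plausibly looks like the sketch below (my reconstruction, since the bundled sources are not reproduced on this page; the class name and the 65536-byte buffer size are assumptions):

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;
import org.openjdk.jmh.annotations.Benchmark;

public class JdkGzipTest extends TestParent {
    @Benchmark
    public int gzip() throws IOException {
        return baseBenchmark(new StreamFactory() {
            @Override
            public OutputStream getStream(OutputStream underlyingStream) throws IOException {
                // Buffer size is a guess; GZIPOutputStream exposes no compression level.
                return new GZIPOutputStream(underlyingStream, 65536);
            }
        });
    }
}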
Output file sizes

Let's start by looking at the size of the output file:

| Implementation | File size (bytes) |
| --- | --- |
| GZIP | 64,200,201 |
| Snappy (normal) | 138,250,196 |
| Snappy (framed) | 101,470,113 |
| LZ4 (fast) | 98,316,501 |
| LZ4 (high) | 82,076,909 |
| Deflate (lvl=1) | 78,369,711 |
| Deflate (lvl=2) | 75,261,711 |
| Deflate (lvl=3) | 73,240,781 |
| Deflate (lvl=4) | 68,090,059 |
| Deflate (lvl=5) | 65,699,810 |
| Deflate (lvl=6) | 64,200,191 |
| Deflate (lvl=7) | 64,013,638 |
| Deflate (lvl=8) | 63,845,758 |
| Deflate (lvl=9) | 63,839,200 |

You can see that the output sizes vary widely (from roughly 61 MB to 132 MB). Let's look at how much time each compression method requires.

Compression time

| Implementation | Compression time (ms) |
| --- | --- |
| Snappy.framedOutput | 2264.700 |
| Snappy.normalOutput | 2201.120 |
| LZ4.testFastNative | 1056.326 |
| LZ4.testFastUnsafe | 1346.835 |
| LZ4.testFastSafe | 1917.929 |
| LZ4.testHighNative | 7489.958 |
| LZ4.testHighUnsafe | 10306.973 |
| LZ4.testHighSafe | 14413.622 |
| Deflate (lvl=1) | 4522.644 |
| Deflate (lvl=2) | 4726.477 |
| Deflate (lvl=3) | 5081.934 |
| Deflate (lvl=4) | 6739.450 |
| Deflate (lvl=5) | 7896.572 |
| Deflate (lvl=6) | 9783.701 |
| Deflate (lvl=7) | 10731.761 |
| Deflate (lvl=8) | 14760.361 |
| Deflate (lvl=9) | 14878.364 |
| GZIP | 10351.887 |

Next we combine compression time and file size into a single table to compute each algorithm's throughput and see what conclusions can be drawn.

Throughput and efficiency

| Implementation | Time (ms) | Uncompressed size (MB) | Throughput (MB/s) | Compressed size (MB) |
| --- | --- | --- | --- | --- |
| Snappy.normalOutput | 2201.120 | 338 | 153.56 | 131.85 |
| Snappy.framedOutput | 2264.700 | 338 | 149.25 | 96.77 |
| LZ4.testFastNative | 1056.326 | 338 | 319.98 | 93.76 |
| LZ4.testFastSafe | 1917.929 | 338 | 176.23 | 93.76 |
| LZ4.testFastUnsafe | 1346.835 | 338 | 250.96 | 93.76 |
| LZ4.testHighNative | 7489.958 | 338 | 45.13 | 78.27 |
| LZ4.testHighSafe | 14413.622 | 338 | 23.45 | 78.27 |
| LZ4.testHighUnsafe | 10306.973 | 338 | 32.79 | 78.27 |
| Deflate (lvl=1) | 4522.644 | 338 | 74.74 | 74.74 |
| Deflate (lvl=2) | 4726.477 | 338 | 71.51 | 71.77 |
| Deflate (lvl=3) | 5081.934 | 338 | 66.51 | 69.85 |
| Deflate (lvl=4) | 6739.450 | 338 | 50.15 | 64.95 |
| Deflate (lvl=5) | 7896.572 | 338 | 42.80 | 62.66 |
| Deflate (lvl=6) | 9783.701 | 338 | 34.55 | 61.23 |
| Deflate (lvl=7) | 10731.761 | 338 | 31.50 | 61.04 |
| Deflate (lvl=8) | 14760.361 | 338 | 22.90 | 60.88 |
| Deflate (lvl=9) | 14878.364 | 338 | 22.72 | 60.87 |
| GZIP | 10351.887 | 338 | 32.65 | 61.23 |
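
To make the arithmetic explicit: throughput here is simply the uncompressed size divided by the compression time. For Snappy.normalOutput, for example, 338 MB / 2.20112 s ≈ 153.6 MB/s, which matches the table.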

As you can see, most of these implementations are quite slow: on a Xeon E5-2650 processor, high-level deflate manages only about 23 MB/s, and even gzip reaches just 33 MB/s, which is unlikely to satisfy anyone. Meanwhile, the fastest deflate level runs at about 75 MB/s, snappy at 150 MB/s, and LZ4 (fast, JNI implementation) achieves an incredible 320 MB/s!

The table also makes it clear that two implementations are at a disadvantage: snappy is slower than LZ4 (fast compression) and produces larger files, while LZ4 (high compression ratio) is slower than deflate at levels 1 through 4 and its output is larger than level 1 deflate's.

So if you need "real-time compression," I would definitely choose between the LZ4 (fast) JNI implementation and level 1 deflate. Of course, if your company does not allow third-party libraries, deflate is your only option. You should also consider how much spare CPU you have and where the compressed data will be stored. For example, if you are writing compressed data to an HDD, then compressing at 100 MB/s gains you nothing (assuming your files are large enough), because the HDD becomes the bottleneck. If the same file goes to an SSD, even LZ4 looks slow by comparison. If you compress data before sending it over the network, LZ4 is the best choice, because deflate's 75 MB/s of compression throughput is dwarfed by a 125 MB/s network link (yes, I'm ignoring packet headers here, but even accounting for them the gap would remain considerable).
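
A back-of-envelope example using the numbers above (my own illustration, assuming compression and transmission are pipelined): sending the 338 MB test file raw over a 125 MB/s link takes about 2.7 s. With LZ4 (fast, native), the file compresses to about 94 MB in about 1.06 s, and sending 94 MB takes about 0.75 s, so the pipelined transfer finishes in roughly max(1.06, 0.75) ≈ 1.1 s. With deflate (lvl=1), the compression step alone takes about 4.5 s, slower than sending the file uncompressed, which is why compression speed matters so much here.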

Summary
    • If you think data compression is painfully slow, check out the LZ4 (fast) implementation, which compresses text at roughly 320 MB/s--at that speed, compression should be barely noticeable to most applications.

    • If you are restricted from using third-party libraries, or just want a slightly better compression ratio, consider JDK deflate (lvl=1) for encoding and decoding; it compresses the same file at about 75 MB/s.
