Performance Comparison of Java Compression Algorithms


This article compares the performance of several common compression algorithms available to Java code. As the results show, some of them remain usable even when CPU resources are severely constrained.

The following implementations are compared:

  • JDK gzip: a slow algorithm with a high compression ratio, whose output is well suited for long-term storage. The JDK implementation is java.util.zip.GZIPInputStream/GZIPOutputStream.
  • JDK deflate: the other algorithm in the JDK (it is the one used for zip files). Unlike gzip, it lets you specify a compression level, so you can trade compression time against output size. Levels range from 0 (no compression) through 1 (fastest) to 9 (slowest, highest compression). The implementation is java.util.zip.DeflaterOutputStream/InflaterInputStream. A minimal sketch of constructing these streams follows this list.
  • LZ4: the Java implementation of the LZ4 compression algorithm, the fastest of the algorithms described in this article, with a compression ratio slightly worse than deflate's. If you want to understand how it works, I suggest you read this article. It is released under the business-friendly Apache 2.0 license.
  • Snappy: a very popular compressor developed by Google, which aims to offer a good balance of speed and compression ratio. This implementation is the one used for testing. It is likewise released under the Apache 2.0 license.
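
To make the list concrete, here is a minimal sketch of wrapping an arbitrary OutputStream with the two JDK codecs. This is my illustration rather than code from the article; the LZ4 and Snappy stream classes mentioned in the closing comment (LZ4BlockOutputStream from lz4-java, SnappyOutputStream from snappy-java) are assumptions on my part, since the article's exact dependencies ship with its source code, available at the end.

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

public class CodecStreams {
    // JDK gzip: no tuning knobs; always compresses at the default deflate level.
    public static OutputStream gzip( final OutputStream out ) throws IOException {
        return new GZIPOutputStream( out );
    }

    // JDK deflate: level ranges from 0 (store) through 1 (fastest) to 9 (best compression).
    // The 'true' flag requests raw deflate output without the zlib header.
    public static OutputStream deflate( final OutputStream out, final int level ) {
        return new DeflaterOutputStream( out, new Deflater( level, true ) );
    }

    // The LZ4 and Snappy streams wrap an OutputStream the same way; with the lz4-java
    // and snappy-java libraries (an assumption on my part) this would be roughly:
    // new LZ4BlockOutputStream( out ) and new SnappyOutputStream( out ).
}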
Compression Test

It took me quite some time to find test data that would suit most Java developers (I don't want you to download hundreds of megabytes of files just to run this test). Eventually it occurred to me that most people have the JDK documentation installed locally, so I decided to merge the entire javadoc directory into a single file by concatenating all the files. This could easily be done with the tar command, but not everyone is a Linux user, so I wrote a program to generate the file:

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;

public class InputGenerator {
    private static final String JAVADOC_PATH = "your_path_to_JDK/docs";
    public static final File FILE_PATH = new File( "your_output_file_path" );

    static
    {
        try {
            if ( !FILE_PATH.exists() )
                makeJavadocFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void makeJavadocFile() throws IOException {
        // Write through a 64 KB buffer to avoid a syscall per file
        try( OutputStream os = new BufferedOutputStream( new FileOutputStream( FILE_PATH ), 65536 ) )
        {
            appendDir(os, new File( JAVADOC_PATH ));
        }
        System.out.println( "Javadoc file created" );
    }

    private static void appendDir( final OutputStream os, final File root ) throws IOException {
        // Recursively concatenate every file under the root directory
        for ( File f : root.listFiles() )
        {
            if ( f.isDirectory() )
                appendDir( os, f );
            else
                Files.copy(f.toPath(), os);
        }
    }
}

The size of the entire file on my machine is 354,509,602 bytes (338 MB).

Test

At first, I wanted to read the entire file into memory and compress it from there. However, it turned out that doing so easily exhausts the heap, even on a machine with 4 GB of RAM.

So I decided to rely on the operating system's file cache instead. The test harness here is JMH. The file is loaded into the OS cache during the warmup phase (it is compressed twice during warmup). I compress the content into a ByteArrayOutputStream (I know this is not the fastest way, but its performance is stable across tests and no time is spent writing compressed data to disk), so some memory is needed to hold the output.
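
The article does not show how the benchmarks are launched; a typical way, and a reasonable assumption here, is a small main class using the standard JMH runner API:

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {
    public static void main( final String[] args ) throws RunnerException {
        // Include every benchmark class matching the pattern; fork, warmup and
        // measurement settings come from the annotations on TestParent below.
        final Options opt = new OptionsBuilder()
                .include( ".*Test.*" )
                .build();
        new Runner( opt ).run();
    }
}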

Below is the base class for the test classes. The tests differ only in the implementation of the compressing output stream, so they can reuse this base class and simply construct their stream via a StreamFactory implementation:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 2)
@Measurement(iterations = 3)
@BenchmarkMode(Mode.SingleShotTime)
public class TestParent {
    protected Path m_inputFile;

    @Setup
    public void setup()
    {
        m_inputFile = InputGenerator.FILE_PATH.toPath();
    }

    interface StreamFactory
    {
        public OutputStream getStream( final OutputStream underlyingStream ) throws IOException;
    }

    public int baseBenchmark( final StreamFactory factory ) throws IOException
    {
        // Size the buffer for the uncompressed length so it never has to grow
        try ( ByteArrayOutputStream bos = new ByteArrayOutputStream((int) m_inputFile.toFile().length());
              OutputStream os = factory.getStream( bos ) )
        {
            Files.copy(m_inputFile, os);
            os.flush();
            return bos.size();
        }
    }
}

The test cases are all very similar (their full source code is available at the end of the article), so only one is shown here: the JDK deflate test class:

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;

public class JdkDeflateTest extends TestParent {
    // JMH runs the benchmark once for each compression level
    @Param({"1", "2", "3", "4", "5", "6", "7", "8", "9"})
    public int m_lvl;

    @Benchmark
    public int deflate() throws IOException
    {
        return baseBenchmark(new StreamFactory() {
            @Override
            public OutputStream getStream(OutputStream underlyingStream) throws IOException {
                // 'true' requests raw deflate (no zlib header) at the given level
                final Deflater deflater = new Deflater( m_lvl, true );
                return new DeflaterOutputStream( underlyingStream, deflater, 512 );
            }
        });
    }
}
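
For comparison, the GZIP test would plausibly look like the sketch below. This is my reconstruction following the same pattern, not the article's actual source (which is linked at the end); in particular, the 64 KB buffer size is an assumption, chosen to match InputGenerator's buffer.

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

import org.openjdk.jmh.annotations.Benchmark;

public class JdkGzipTest extends TestParent {
    @Benchmark
    public int gzip() throws IOException
    {
        return baseBenchmark(new StreamFactory() {
            @Override
            public OutputStream getStream(OutputStream underlyingStream) throws IOException {
                // GZIP exposes no compression level through this API; only the
                // internal buffer size (64 KB here, an assumption) can be tuned
                return new GZIPOutputStream( underlyingStream, 65536 );
            }
        });
    }
}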
Test Results: Output File Size

First, let's look at the size of the output file:

| Implementation | File size (bytes) |
| GZIP | 64,200,201 |
| Snappy (normal) | 138,250,196 |
| Snappy (framed) | 101,470,113 |
| LZ4 (fast) | 98,316,501 |
| LZ4 (high) | 82,076,909 |
| Deflate (lvl = 1) | 78,369,711 |
| Deflate (lvl = 2) | 75,261,711 |
| Deflate (lvl = 3) | 73,240,781 |
| Deflate (lvl = 4) | 68,090,059 |
| Deflate (lvl = 5) | 65,699,810 |
| Deflate (lvl = 6) | 64,200,191 |
| Deflate (lvl = 7) | 64,013,638 |
| Deflate (lvl = 8) | 63,845,758 |
| Deflate (lvl = 9) | 63,839,200 |

As you can see, the output sizes differ dramatically (from roughly 61 MB to 132 MB). Now let's look at how long the different compression methods take.

Compression Time

| Implementation | Compression time (ms) |
| Snappy.framedOutput | 2264.700 |
| Snappy.normalOutput | 2201.120 |
| Lz4.testFastNative | 1056.326 |
| Lz4.testFastUnsafe | 1346.835 |
| Lz4.testFastSafe | 1917.929 |
| Lz4.testHighNative | 7489.958 |
| Lz4.testHighUnsafe | 10306.973 |
| Lz4.testHighSafe | 14413.622 |
| deflate (lvl = 1) | 4522.644 |
| deflate (lvl = 2) | 4726.477 |
| deflate (lvl = 3) | 5081.934 |
| deflate (lvl = 4) | 6739.450 |
| deflate (lvl = 5) | 7896.572 |
| deflate (lvl = 6) | 9783.701 |
| deflate (lvl = 7) | 10731.761 |
| deflate (lvl = 8) | 14760.361 |
| deflate (lvl = 9) | 14878.364 |
| GZIP | 10351.887 |

Now let's merge the compression times and file sizes into one table, compute each algorithm's throughput, and see what conclusions can be drawn.
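Throughput is simply the uncompressed size divided by the compression time. For example, for the fastest LZ4 run: 338 MB / (1056.326 ms / 1000) ≈ 320 MB/s.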

Throughput and Efficiency

| Implementation | Time (ms) | Uncompressed size (MB) | Throughput (MB/s) | Compressed size (MB) |
| Snappy.normalOutput | 2201.12 | 338 | 153.56 | 131.85 |
| Snappy.framedOutput | 2264.70 | 338 | 149.25 | 96.77 |
| Lz4.testFastNative | 1056.33 | 338 | 319.98 | 93.76 |
| Lz4.testFastSafe | 1917.93 | 338 | 176.23 | 93.76 |
| Lz4.testFastUnsafe | 1346.84 | 338 | 250.96 | 93.76 |
| Lz4.testHighNative | 7489.96 | 338 | 45.12 | 78.27 |
| Lz4.testHighSafe | 14413.62 | 338 | 23.45 | 78.27 |
| Lz4.testHighUnsafe | 10306.97 | 338 | 32.79 | 78.27 |
| deflate (lvl = 1) | 4522.64 | 338 | 74.74 | 74.74 |
| deflate (lvl = 2) | 4726.48 | 338 | 71.51 | 71.77 |
| deflate (lvl = 3) | 5081.93 | 338 | 66.51 | 69.85 |
| deflate (lvl = 4) | 6739.45 | 338 | 50.15 | 64.95 |
| deflate (lvl = 5) | 7896.57 | 338 | 42.80 | 62.66 |
| deflate (lvl = 6) | 9783.70 | 338 | 34.55 | 61.23 |
| deflate (lvl = 7) | 10731.76 | 338 | 31.50 | 61.04 |
| deflate (lvl = 8) | 14760.36 | 338 | 22.90 | 60.88 |
| deflate (lvl = 9) | 14878.36 | 338 | 22.72 | 60.87 |
| GZIP | 10351.89 | 338 | 32.65 | 61.23 |

As you can see, the throughput of most of the implementations is quite low: on a Xeon E5-2650 processor, high-level deflate runs at about 23 MB/s and even GZIP only manages about 33 MB/s, which is hardly satisfying. At the same time, the fastest deflate level reaches about 75 MB/s, Snappy about 150 MB/s, and LZ4 (fast, JNI implementation) an incredible 320 MB/s!

The table also makes it clear that two implementations are currently at a disadvantage: Snappy is slower than LZ4 (fast compression) and produces larger files, while LZ4 (high compression) is slower than deflate at levels 1 through 4 and its output is larger than even level 1 deflate's.

Therefore, if "real-time compression" is required, I will definitely choose from LZ4 (FAST) JNI implementation or Level 1 deflate. Of course, if your company does not allow third-party libraries, you can only use deflate. You also need to consider how many idle CPU resources are available and where the compressed data is stored. For example, if you want to store the compressed data to HDD, the above 100 Mb/Second performance is useless for you (assuming your file is large enough) -- HDD speed becomes a bottleneck. If the same file is output to an SSD hard disk, even LZ4 is too slow. If you want to compress the data before sending it to the network, you 'd better choose LZ4, because the compression performance of deflate75Mb/second is really dumb compared with the throughput of 125 Mb/second (of course, I know that there are still headers in the network traffic, but even if we calculate it, the gap is quite impressive ).

Summary
  • If you think data compression is painfully slow, consider the LZ4 (fast) implementation, which can compress text at roughly 320 MB/s; compression at that speed should go unnoticed by most applications.

  • If you cannot use third-party libraries, or want a slightly better compression ratio, consider the JDK deflate (lvl = 1) codec instead; it compressed the same file at roughly 75 MB/s.
