In this article we compare the performance of several commonly used compression algorithms for Java. The results show that some of them remain usable even under tight CPU constraints.
The following implementations are compared:
JDK GZIP -- a slow algorithm with a high compression ratio, suitable for data that will be stored for a long time. The JDK implementation is java.util.zip.GZIPInputStream / GZIPOutputStream.
JDK deflate -- the other algorithm available in the JDK (it is also used for zip files). Unlike GZIP, you can specify a compression level, which lets you trade compression time against output size. The available levels are 0 (no compression) and 1 (fastest) to 9 (slowest, best compression). It is implemented by java.util.zip.DeflaterOutputStream / InflaterInputStream (a short usage sketch follows after this list).
LZ4 compression algorithm (Java implementation) -- the algorithm this article focuses on. It compresses the fastest of the algorithms tested here, and its compression ratio is only slightly worse than that of deflate at its fastest setting.
Snappy -- a very popular compression algorithm developed at Google, which aims at a reasonable trade-off between speed and compression ratio.
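As a quick usage illustration (a minimal sketch, not taken from the benchmark code; the file names, the sample data, and the 64 KB LZ4 block size are placeholder choices), this is how a deflate stream with an explicit level and an LZ4 block stream from the lz4-java library (net.jpountz) are typically constructed:
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import net.jpountz.lz4.LZ4BlockOutputStream;
import net.jpountz.lz4.LZ4Factory;

public class CompressionStreamsExample {
    public static void main( final String[] args ) throws IOException {
        final byte[] data = "some data to compress".getBytes( "UTF-8" );

        // JDK deflate with an explicit level: 1 = fastest, 9 = best compression.
        // 'true' produces a raw deflate stream without the zlib header.
        try ( OutputStream os = new DeflaterOutputStream(
                new FileOutputStream( "out.deflate" ), new Deflater( 1, true ) ) ) {
            os.write( data );
        }

        // LZ4 block stream: fastCompressor() selects the fast codec,
        // highCompressor() would select the high-compression codec instead.
        try ( OutputStream os = new LZ4BlockOutputStream(
                new FileOutputStream( "out.lz4" ), 64 * 1024,
                LZ4Factory.fastestInstance().fastCompressor() ) ) {
            os.write( data );
        }
    }
}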
Compression test
It took some thought to find input data that is suitable for a compression test and that most Java developers already have (I don't want you to download a several-hundred-megabyte file just to run this test). Eventually I realized that most people have the JDK documentation installed locally, so I decided to merge the whole Javadoc directory into a single file by concatenating all the files in it. This is easy to do with the tar command, but not everyone is a Linux user, so I wrote a small program to generate the file:
import java.io.*;
import java.nio.file.Files;

public class InputGenerator {
    private static final String JAVADOC_PATH = "Your_path_to_jdk/docs";
    public static final File FILE_PATH = new File( "Your_output_file_path" );

    static
    {
        try {
            if ( !FILE_PATH.exists() )
                makeJavadocFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void makeJavadocFile() throws IOException {
        try ( OutputStream os = new BufferedOutputStream( new FileOutputStream( FILE_PATH ), 65536 ) )
        {
            appendDir( os, new File( JAVADOC_PATH ) );
        }
        System.out.println( "Javadoc file created" );
    }

    private static void appendDir( final OutputStream os, final File root ) throws IOException {
        for ( File f : root.listFiles() )
        {
            if ( f.isDirectory() )
                appendDir( os, f );
            else
                Files.copy( f.toPath(), os );
        }
    }
}
The size of the entire file on my machine is 354,509,602 bytes (338MB).
Test
At first I wanted to read the whole file into memory and compress it from there, but it turned out that doing so makes it easy to run out of heap even on a machine with 4 GB of RAM.
So I decided to rely on the OS file cache instead. The benchmark framework used here is JMH (the Java Microbenchmark Harness). The file is loaded into the OS cache during the warm-up phase (it is compressed twice during warm-up). I compress the content into a ByteArrayOutputStream (I know this is not the fastest way, but its performance is fairly stable across the tests and it avoids spending time writing the compressed data to disk), so some memory is needed to hold the output.
Below is the base class for all tests. The tests differ only in the compression output stream they create, so they can reuse this base class and simply construct a stream from a StreamFactory implementation:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 2)
@Measurement(iterations = 3)
@BenchmarkMode(Mode.SingleShotTime)
public class TestParent {
    protected Path m_inputFile;

    @Setup
    public void setup()
    {
        m_inputFile = InputGenerator.FILE_PATH.toPath();
    }

    interface StreamFactory
    {
        OutputStream getStream( final OutputStream underlyingStream ) throws IOException;
    }

    public int baseBenchmark( final StreamFactory factory ) throws IOException
    {
        try ( ByteArrayOutputStream bos = new ByteArrayOutputStream( (int) m_inputFile.toFile().length() );
              OutputStream os = factory.getStream( bos ) )
        {
            Files.copy( m_inputFile, os );
            os.flush();
            return bos.size();
        }
    }
}
The test cases are all very similar (their full source code is linked at the end of the article), so only one is shown here -- the JDK deflate test class:
public class JdkDeflateTest extends TestParent {
    @Param({"1", "2", "3", "4", "5", "6", "7", "8", "9"})
    public int m_lvl;

    @Benchmark
    public int deflate() throws IOException
    {
        return baseBenchmark( new StreamFactory() {
            @Override
            public OutputStream getStream( final OutputStream underlyingStream ) throws IOException {
                final Deflater deflater = new Deflater( m_lvl, true );
                return new DeflaterOutputStream( underlyingStream, deflater );
            }
        });
    }
}
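The GZIP test class is not listed in the article body; following the same pattern, it would presumably look something like the sketch below (GZIPOutputStream always uses the default deflate level, so there is no level parameter; the 64 KB buffer size here is an arbitrary choice for the example):
public class JdkGzipTest extends TestParent {
    @Benchmark
    public int gzip() throws IOException
    {
        return baseBenchmark( new StreamFactory() {
            @Override
            public OutputStream getStream( final OutputStream underlyingStream ) throws IOException {
                return new GZIPOutputStream( underlyingStream, 65536 );
            }
        });
    }
}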
Test results
Size of output file
First, let's look at the sizes of the output files:
| Implementation | File size (bytes) |
| --- | --- |
| GZIP | 64,200,201 |
| Snappy (normal) | 138,250,196 |
| Snappy (framed) | 101,470,113 |
| LZ4 (fast) | 98,316,501 |
| LZ4 (high) | 82,076,909 |
| Deflate (lvl=1) | 78,369,711 |
| Deflate (lvl=2) | 75,261,711 |
| Deflate (lvl=3) | 73,240,781 |
| Deflate (lvl=4) | 68,090,059 |
| Deflate (lvl=5) | 65,699,810 |
| Deflate (lvl=6) | 64,200,191 |
| Deflate (lvl=7) | 64,013,638 |
| Deflate (lvl=8) | 63,845,758 |
| Deflate (lvl=9) | 63,839,200 |
You can see that the output sizes vary widely (from roughly 60 MB to 131 MB). Now let's look at how long each compression method takes.
Compression time
| Implementation | Compression time (ms) |
| --- | --- |
| Snappy.framedOutput | 2264.700 |
| Snappy.normalOutput | 2201.120 |
| LZ4.testFastNative | 1056.326 |
| LZ4.testFastUnsafe | 1346.835 |
| LZ4.testFastSafe | 1917.929 |
| LZ4.testHighNative | 7489.958 |
| LZ4.testHighUnsafe | 10306.973 |
| LZ4.testHighSafe | 14413.622 |
| Deflate (lvl=1) | 4522.644 |
| Deflate (lvl=2) | 4726.477 |
| Deflate (lvl=3) | 5081.934 |
| Deflate (lvl=4) | 6739.450 |
| Deflate (lvl=5) | 7896.572 |
| Deflate (lvl=6) | 9783.701 |
| Deflate (lvl=7) | 10731.761 |
| Deflate (lvl=8) | 14760.361 |
| Deflate (lvl=9) | 14878.364 |
| GZIP | 10351.887 |
Now let's combine the compression times and output sizes into one table and compute each algorithm's throughput, to see what conclusions can be drawn.
Throughput and efficiency
| Implementation | Time (ms) | Uncompressed file size (MB) | Throughput (MB/sec) | Compressed file size (MB) |
| --- | --- | --- | --- | --- |
| Snappy.normalOutput | 2201.12 | 338 | 153.5581885586 | 131.8454742432 |
| Snappy.framedOutput | 2264.7 | 338 | 149.2471409017 | 96.7693328857 |
| LZ4.testFastNative | 1056.326 | 338 | 319.9769768045 | 93.7557220459 |
| LZ4.testFastSafe | 1917.929 | 338 | 176.2317583185 | 93.7557220459 |
| LZ4.testFastUnsafe | 1346.835 | 338 | 250.9587291688 | 93.7557220459 |
| LZ4.testHighNative | 7489.958 | 338 | 45.1270888301 | 78.2680511475 |
| LZ4.testHighSafe | 14413.622 | 338 | 23.4500391366 | 78.2680511475 |
| LZ4.testHighUnsafe | 10306.973 | 338 | 32.7933332124 | 78.2680511475 |
| Deflate (lvl=1) | 4522.644 | 338 | 74.7350443679 | 74.7394561768 |
| Deflate (lvl=2) | 4726.477 | 338 | 71.5120374012 | 71.7735290527 |
| Deflate (lvl=3) | 5081.934 | 338 | 66.5101120951 | 69.8471069336 |
| Deflate (lvl=4) | 6739.45 | 338 | 50.1524605124 | 64.9452209473 |
| Deflate (lvl=5) | 7896.572 | 338 | 42.8033835442 | 62.6564025879 |
| Deflate (lvl=6) | 9783.701 | 338 | 34.5472536415 | 61.2258911133 |
| Deflate (lvl=7) | 10731.761 | 338 | 31.4952969974 | 61.0446929932 |
| Deflate (lvl=8) | 14760.361 | 338 | 22.8991689295 | 60.8825683594 |
| Deflate (lvl=9) | 14878.364 | 338 | 22.7175514727 | 60.8730316162 |
| GZIP | 10351.887 | 338 | 32.651051929 | 61.2258911133 |
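The derived columns are plain arithmetic: throughput is the uncompressed size divided by the compression time, and the compressed size in MB is the byte count divided by 2^20. A quick sanity check for the Snappy.normalOutput row (values taken from the tables above; the class is just a throwaway illustration):
public class ThroughputCheck {
    public static void main( final String[] args ) {
        final double timeMs = 2201.12;                // compression time from the benchmark
        final long uncompressedBytes = 354_509_602L;  // the ~338 MB input file
        final long compressedBytes = 138_250_196L;    // measured Snappy (normal) output size

        final double throughputMBs = ( uncompressedBytes / 1_048_576.0 ) / ( timeMs / 1000.0 );
        final double compressedMB = compressedBytes / 1_048_576.0;
        // prints roughly 153.6 MB/sec and 131.8 MB, in line with the table
        System.out.printf( "throughput = %.1f MB/sec, compressed = %.1f MB%n", throughputMBs, compressedMB );
    }
}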
As you can see, most of these implementations are quite slow: on a Xeon E5-2650 processor, the high deflate levels manage only about 23 MB/sec, and even GZIP reaches just 33 MB/sec, which is probably not satisfactory. At the same time, the fastest deflate setting runs at about 75 MB/sec, Snappy at about 150 MB/sec, and LZ4 (fast, JNI implementation) achieves an incredible 320 MB/sec!
The table also makes it clear that two implementations are at a disadvantage: Snappy is slower than LZ4 (fast compression) and produces larger output, while LZ4 (high compression ratio) is slower than deflate levels 1 to 4 and its output is larger than even that of level 1 deflate.
So if "real time compression" is required, I would definitely choose between LZ4 (FAST) JNI implementations or level 1 deflate. Of course, if your company is not allowed to use the third Third-party, you can only use deflate. You should also consider how much free CPU resources are available and where the data will be stored. For example, if you want to store the compressed data into the HDD, the performance of the above 100mb/seconds is no help to you (assuming your file is large enough)--HDD speed will become a bottleneck. If the same file is exported to SSD drives-even LZ4 is too slow in front of it. If you want to compress the data and then send it to the network, it is best to choose LZ4, because the deflate75mb/second compression performance compared with the network 125mb/second throughput is really a trivial one (of course, I know the network traffic and Baotou, but even if the gap is quite considerable).
Summary
If you think data compression is painfully slow, check out the LZ4 (fast) implementation, which can compress text at about 320 MB/sec -- compression at that speed should be imperceptible to most applications.
If you are not allowed to use third-party libraries, or only need a slightly better compression ratio, consider JDK deflate (lvl=1) -- it compressed the same file at about 75 MB/sec.
Source Code
Java Compression test source code