How to use bzip2 to compress files on Linux systems

Source: Internet
Author: User

Install bzip2
The commands are as follows:

The code is as follows:
make -f Makefile-libbz2_so &&
make &&
make install &&
cp bzip2-shared /bin/bzip2 &&
ln -s libbz2.so.1.0 libbz2.so &&
cp -a libbz2.so* /lib &&
rm /lib/libbz2.so &&
ln -s ../../lib/libbz2.so.1.0 /usr/lib/libbz2.so &&
rm /usr/bin/{bunzip2,bzcat,bzip2} &&
mv /usr/bin/{bzip2recover,bzless,bzmore} /bin &&
ln -s bzip2 /bin/bunzip2 &&
ln -s bzip2 /bin/bzcat

Although it is not strictly required, it is worth mentioning that a patch for the tar package makes it easier for tar to compress and decompress with bzip2/bunzip2. With plain tar, you have to use a command such as bzcat file.tar.bz2 | tar -xv or tar --use-compress-prog=bunzip2 -xvf file.tar.bz2 to use bzip2 and bunzip2. The patch provides the -j option, so you can extract a bzip2-compressed archive with tar -xvjf file.tar.bz2 (the f must come last in the option cluster because it takes the archive name as its argument). The patch is applied later, when the tar package is installed.
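The three ways of unpacking described above can be tried side by side. The following is a minimal sketch, assuming GNU tar and bzip2 are installed; the directory and file names (project, readme.txt, out1 to out3) are made up for the demonstration.

```shell
# Scratch directory with a small hypothetical project to archive
workdir=$(mktemp -d)
mkdir "$workdir/project" "$workdir/out1" "$workdir/out2" "$workdir/out3"
echo 'hello from bzip2' > "$workdir/project/readme.txt"
archive="$workdir/project.tar.bz2"

# Create the archive by piping tar's output through bzip2
tar -cf - -C "$workdir" project | bzip2 -c > "$archive"

# 1. Plain tar: bzcat decompresses, tar reads the stream from standard input
bzcat "$archive" | tar -xf - -C "$workdir/out1"

# 2. Ask tar to run an external program for decompression
tar --use-compress-program=bunzip2 -xf "$archive" -C "$workdir/out2"

# 3. The -j option; -f is last so that it takes the archive name
tar -xjf "$archive" -C "$workdir/out3"

# All three extractions should contain the same file
extracted=$(cat "$workdir"/out?/project/readme.txt)
echo "$extracted"
rm -rf "$workdir"
```

The -C option only tells tar which directory to work in, so the sketch never has to change the current directory.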

Brief introduction

bzip2, bunzip2 - a block-sorting file compressor, v0.9.5
bzcat - decompresses files to standard output
bzip2recover - recovers data from damaged bzip2 files

bzip2 compresses files using the Burrows-Wheeler block-sorting text compression algorithm and Huffman coding. Its compression ratio is generally considerably better than that of LZ77/LZ78-based compressors, and it approaches the performance of the PPM family of statistical compressors.
The command-line options are deliberately very similar to those of GNU gzip, but they are not identical.
bzip2 reads file names and options from the command line. Each file is replaced by a compressed version of itself named "original_name.bz2". Each compressed file has the same modification time and permissions as the original and, where possible, the same owner, so that these properties can be restored correctly on decompression. File name handling is naive in one respect: on file systems that have no concept of permissions, ownership, or dates, or that severely restrict file name length (MS-DOS, for example), bzip2 has no mechanism for preserving the original file name, permissions, owner, or dates.
bzip2 and bunzip2 do not overwrite existing files by default. To overwrite an existing file, specify the -f option.
If no file names are specified, bzip2 compresses from standard input to standard output. In this case, bzip2 refuses to write compressed output to a terminal, because the output would be incomprehensible and therefore pointless.
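The behaviour described so far can be checked with a short experiment. This is a sketch under a few assumptions: GNU stat (for the -c format option) is available, and the file name notes.txt and the timestamp are arbitrary choices for the demonstration.

```shell
workdir=$(mktemp -d)
file="$workdir/notes.txt"
printf 'some text to compress\n' > "$file"
chmod 640 "$file"
touch -t 200102030405.06 "$file"     # give the file an old, recognizable mtime

# Record permissions (%a) and modification time in seconds (%Y)
before=$(stat -c '%a %Y' "$file")

bzip2 "$file"                        # notes.txt is replaced by notes.txt.bz2
ls "$workdir"
bunzip2 "$file.bz2"                  # notes.txt comes back
after=$(stat -c '%a %Y' "$file")
echo "before: $before"
echo "after:  $after"

# No file names: standard input to standard output.
# The output goes into a pipe here, so bzip2 does not refuse to write it.
roundtrip=$(printf 'piped data\n' | bzip2 | bzip2 -d)
echo "$roundtrip"
rm -rf "$workdir"
```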
bunzip2 (or bzip2 -d) decompresses all specified files. Files that were not created by bzip2 are detected and ignored, and a warning is issued. bzip2 guesses the name of the decompressed file from the name of the compressed file as follows:

filename.bz2 becomes filename
filename.bz becomes filename
filename.tbz2 becomes filename.tar
filename.tbz becomes filename.tar
anyothername becomes anyothername.out
If the file name does not end in one of the recognized suffixes (.bz2, .bz, .tbz2, or .tbz), bzip2 complains that it cannot guess the original name and uses the compressed file's name with .out appended.
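The naming rules above can be observed by storing the same compressed data under different names. A minimal sketch follows; the names a.bz2, b.tbz2, and c.xyz are hypothetical, and the last one deliberately has an unrecognized suffix.

```shell
workdir=$(mktemp -d)
printf 'payload\n' > "$workdir/data"

# The same compressed bytes under three different names
bzip2 -c "$workdir/data" > "$workdir/a.bz2"
cp "$workdir/a.bz2" "$workdir/b.tbz2"
cp "$workdir/a.bz2" "$workdir/c.xyz"
rm "$workdir/data"

# bunzip2 prints a warning for c.xyz because it cannot guess the original name
bunzip2 "$workdir/a.bz2" "$workdir/b.tbz2" "$workdir/c.xyz"

# Names of the decompressed files, one per line
names=$(ls "$workdir")
echo "$names"
rm -rf "$workdir"
```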
As with compression, if no file names are given, bunzip2 decompresses from standard input to standard output.
bunzip2 correctly decompresses a file that is the concatenation of two or more compressed files. The result is the concatenation of the corresponding uncompressed files.
bzip2 also supports integrity testing (the -t option) of concatenated compressed files.
You can also compress or decompress files to standard output with the -c option. Multiple files can be compressed or decompressed this way, and the outputs are written to standard output one after another. Compressing multiple files in this manner produces a stream that contains multiple compressed files. Such a stream can be decompressed correctly only by bzip2 version 0.9.0 or later. Earlier versions of bzip2 stop after decompressing the first file in the stream.
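The handling of concatenated streams can be illustrated as follows. This is a small sketch with made-up file names (one.txt, two.txt, both.bz2); it builds one file out of three independently compressed streams and decompresses it in a single pass.

```shell
workdir=$(mktemp -d)
printf 'first\n'  > "$workdir/one.txt"
printf 'second\n' > "$workdir/two.txt"

# -c writes one compressed stream per input file to standard output
bzip2 -c "$workdir/one.txt" "$workdir/two.txt" > "$workdir/both.bz2"

# Appending another independently compressed stream works as well
printf 'third\n' | bzip2 -c >> "$workdir/both.bz2"

# -t checks every stream in the concatenated file
bzip2 -t "$workdir/both.bz2" && echo "integrity check passed"

# Decompression yields the original contents joined together
joined=$(bzip2 -dc "$workdir/both.bz2")
echo "$joined"
rm -rf "$workdir"
```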
bzcat (or bzip2 -dc) decompresses all specified files to standard output.
bzip2 reads options from the environment variables BZIP2 and BZIP, in that order, and processes them before any options given on the command line. This is a convenient way to supply default options.
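As a small illustration of default options, the sketch below sets BZIP2 for a single invocation so that -k (keep the input file) applies without appearing on the command line. The file name a.txt is arbitrary.

```shell
workdir=$(mktemp -d)
printf 'keep me\n' > "$workdir/a.txt"

# The variable is set only for this one command; -k is picked up from it
BZIP2=-k bzip2 "$workdir/a.txt"

# Both the original and the compressed file should now exist
kept=$(ls "$workdir")
echo "$kept"
rm -rf "$workdir"
```

To make such a default permanent, the variable would typically be exported from a shell startup file instead.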
Compression is always performed, even if the compressed file is slightly larger than the original. Files smaller than about one hundred bytes tend to get larger, because the compression mechanism has a constant overhead of roughly 50 bytes. Random data (including the output of most file compressors) is coded at about 8.05 bits per byte, an expansion of around 0.5%.
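The overhead is easy to measure. The following sketch compares input and output sizes for a two-byte input and for random data; it assumes /dev/urandom and a head that supports -c, and the exact byte counts will vary between runs and bzip2 versions.

```shell
# A tiny input grows, because the fixed overhead dominates
small_in=$(printf 'hi' | wc -c)
small_out=$(printf 'hi' | bzip2 -c | wc -c)
echo "tiny input:   $small_in bytes in, $small_out bytes out"

# Random data cannot be compressed, so the output is slightly larger
rand_in=100000
rand_out=$(head -c "$rand_in" /dev/urandom | bzip2 -c | wc -c)
echo "random input: $rand_in bytes in, $rand_out bytes out"
```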
bzip2 uses 32-bit CRCs to verify that the decompressed version of a file is identical to the original. This guards against corruption of the compressed data and against undetected bugs in bzip2 (which are hopefully very unlikely). The chance of data corruption going undetected is microscopic, about one in four billion for each file processed. Note, however, that the check occurs on decompression, so it can only tell you that something is wrong; it cannot help you recover the original uncompressed data. You can use bzip2recover to try to recover data from damaged files.
Return values: 0 for a normal exit; 1 for environmental problems (file not found, invalid options, I/O errors, and so on); 2 to indicate a corrupt compressed file; 3 for an internal consistency error (for example, a bug) that caused bzip2 to panic.
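The return values can be observed with the -t option. The sketch below tests an intact file, a missing file, and a deliberately damaged copy. The file names are made up, and the damage is simulated by overwriting 16 bytes with dd; that is only one of many ways a file might be corrupted.

```shell
workdir=$(mktemp -d)
seq 1 20000 > "$workdir/numbers.txt"
bzip2 -k "$workdir/numbers.txt"
good="$workdir/numbers.txt.bz2"
bad="$workdir/damaged.bz2"

# Make a copy and overwrite 16 bytes at offset 100 without truncating it
cp "$good" "$bad"
printf 'XXXXXXXXXXXXXXXX' | dd of="$bad" bs=1 seek=100 conv=notrunc 2>/dev/null

# Capture each exit status; error messages are discarded for brevity
rc_ok=0;      bzip2 -t "$good" 2>/dev/null                || rc_ok=$?
rc_missing=0; bzip2 -t "$workdir/missing.bz2" 2>/dev/null || rc_missing=$?
rc_bad=0;     bzip2 -t "$bad" 2>/dev/null                 || rc_bad=$?

echo "intact file:  $rc_ok"         # normal exit
echo "missing file: $rc_missing"    # environmental problem
echo "damaged file: $rc_bad"        # corrupt compressed file, expected to be 2
rm -rf "$workdir"
```

In a script, checking for a nonzero status after bzip2 -t is usually enough to decide whether an archive can be trusted.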

Parameters
-c --stdout
Compress or decompress to standard output.
-d --decompress
Force decompression. bzip2, bunzip2, and bzcat are really the same program, and the decision about what to do is made from the name under which the program is invoked. This option overrides that mechanism and forces bzip2 to decompress.
-z --compress
The complement to -d: forces compression, regardless of the name under which the program is invoked.
-t --test
Check the integrity of the specified files without decompressing them. In effect, this performs a trial decompression and discards the result.
-f --force
Force overwriting of output files. Normally, bzip2 does not overwrite existing output files. This option also forces bzip2 to break hard links to files, which it otherwise would not do.
-k --keep
Keep (do not delete) the input files during compression or decompression.
-s --small
Reduce memory usage for compression, decompression, and testing. Files are decompressed and tested using a modified algorithm that requires only 2.5 bytes per block byte. This means that any file can be decompressed in 2300k of memory, albeit at about half the normal speed.
During compression, -s selects a block size of 200k, which limits memory use to about 200k at the expense of the compression ratio. In short, if your machine is low on memory (8 megabytes or less), use -s for everything. See the MEMORY MANAGEMENT section of the bzip2 man page.
-q --quiet
Suppress non-essential warning messages. Messages about I/O errors and other critical events are not suppressed.
-v --verbose
Verbose mode: show the compression ratio for each file processed. Additional -v options increase the verbosity level, producing a large amount of information that is mainly of interest for diagnostic purposes.
-L --license -V --version
Display the software version and the license terms and conditions.
-1 to -9
Set the block size to 100 k, 200 k, ... 900 k when compressing. Has no effect on decompression. See the MEMORY MANAGEMENT section of the bzip2 man page.
--
Treat all subsequent command-line arguments as file names, even if they start with a dash "-". This lets you handle files whose names begin with a dash, for example: bzip2 -- -myfilename.
--repetitive-fast --repetitive-best
These options are redundant in versions 0.9.5 and above. In earlier versions they gave some coarse control over the behaviour of the sorting algorithm, which was sometimes useful. Versions 0.9.5 and above have an improved algorithm that renders these options irrelevant.
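Several of these options can be tried together. The sketch below exercises -k, -f, the "--" separator, and two block sizes. The file names are hypothetical, and the sizes printed at the end depend on the input and the bzip2 version, so no particular numbers are implied.

```shell
workdir=$(mktemp -d)
printf 'example data\n' > "$workdir/a.txt"

# -k keeps a.txt next to the new a.txt.bz2
bzip2 -k "$workdir/a.txt"

# A second run is refused because a.txt.bz2 already exists
rc_again=0
bzip2 -k "$workdir/a.txt" 2>/dev/null || rc_again=$?

# Adding -f overwrites the existing output file
rc_force=0
bzip2 -kf "$workdir/a.txt" || rc_force=$?
echo "without -f: $rc_again, with -f: $rc_force"

# "--" ends option parsing, so -odd is treated as a file name
dash_name=$(cd "$workdir" && printf 'dash\n' > -odd && bzip2 -- -odd && ls | grep odd)
echo "$dash_name"

# -1 and -9 select the smallest and largest block size
seq 1 200000 > "$workdir/big.txt"
size1=$(bzip2 -1 -c "$workdir/big.txt" | wc -c)
size9=$(bzip2 -9 -c "$workdir/big.txt" | wc -c)
echo "block size 100k: $size1 bytes, block size 900k: $size9 bytes"
rm -rf "$workdir"
```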

Examples:
Example A, compression

The code is as follows:

[root@localhost ~]# bzip2 -z abc.sh  # compress

The code is as follows:

[root@localhost ~]# bzip2 -kv abc.sh  # compress and keep the original

abc.sh: 1.220:1, 6.557 bits/byte, 18.04% saved, 255 in, 209 out.

The code is as follows:
[root@localhost ~]# bzip2 -9 -c abc.sh > abc.bz2  # compress to standard output, keeping the original

Example B, decompression

The code is as follows:
root@tnak-virtualbox:/home/tnak# bzip2 -dv abc.sh.bz2

abc.sh.bz2: done
