Common compression format comparison, and Linux compression-related instructions

Source: Internet
Author: User
Tags bz2 gz file rar


I. Common compressed file formats

*.zip | Files archived and compressed by the zip program (very common, but the traditional format does not record the file-name encoding, so names may be garbled across platforms)
*.rar | Files archived and compressed by WinRAR (common on Windows, but commercial software)
*.gz | Files compressed by gzip (currently the most widely used compression format on Linux)
*.bz2 | Files compressed by bzip2
*.xz | Files compressed by xz
*.tar | Data archived by tar, not compressed
*.tar.gz | Files archived by tar, then compressed with gzip (the most common)
*.tar.bz2 | Files archived by tar, then compressed with bzip2
*.tar.xz | Files archived by tar, then compressed with xz (the next-generation choice)
*.7z | Files archived and compressed by 7-Zip

II. Classification by whether multiple files can be compressed
    1. gzip, bzip2, and xz can each compress only a single file. (In other words, their input and output are plain streams that carry no directory-tree information.)
      To compress multiple files or directories with them, you first need another program to pack those files into a single file that does contain the directory-tree information. That program is tar.
      Archive the files with tar, then run one of the compressors above on the resulting *.tar (or pipe tar's output straight into the compressor). This is how multiple files are compressed on Linux.

    2. The 7z, zip, and RAR formats combine archiving (what tar does) and compression: the format itself contains the directory-tree information, so they can compress multiple files directly.
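
The two-step approach from item 1 can be sketched as follows. This is only an illustration: the directory name project and the archive names are made up for the demo, and everything runs inside a throwaway directory created with mktemp.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory for the demo
cd "$work"
mkdir -p project/docs out
printf 'hello\n' > project/a.txt
printf 'world\n' > project/docs/b.txt

# Two explicit steps: archive first, then compress the single .tar file.
tar -cf project.tar project
gzip -c project.tar > two-step.tar.gz

# The same thing in one pass: tar writes the archive to stdout (-f -)
# and gzip compresses the stream.
tar -cf - project | gzip > piped.tar.gz

# The reverse direction: gzip decompresses to stdout, tar reads from stdin.
gzip -dc piped.tar.gz | tar -xf - -C out
```

The pipe avoids writing the intermediate project.tar to disk, which matters mostly for large directory trees.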

III. Differences in the algorithms used by each format
    1. gzip is a mature format based on the DEFLATE algorithm. (Moderate compression ratio.)
    2. 7z is a newer-generation format whose compression algorithm is replaceable. The default is LZMA/LZMA2, with AES-256 for encryption.
    3. xz also uses the LZMA family (LZMA2), but it can compress only a single file. (Very high compression ratio, at the cost of more time.)
    4. zip also supports multiple algorithms; the default is DEFLATE. It was born early and has many defects (garbled names across platforms, encryption that is easy to crack, and so on).
    5. RAR uses a proprietary algorithm of the same general family as DEFLATE, with AES encryption. (RAR 5.0 and later use AES-256 in CBC mode.)

Even so, zip is widely used: in the Android apk format, Java jar files, epub e-books, and GitHub's multi-file downloads. The reason is probably that zip is so widespread that there is no need to worry about the target platform lacking software to extract it.

IV. How to choose a compression scheme
    1. tar.gz is the most common on Linux and strikes a good balance between compression ratio and compression time. When in doubt, choose it; you cannot go wrong.
    2. tar.xz is the newer-generation format. It has a better compression ratio, but compression and decompression are several times slower. Choose it when the computer is fast enough.
    3. 7z, like xz, is a newer-generation format, but it is more complex and supports multiple files. It is also better suited to cross-platform use, and is recommended.
    4. zip easily produces garbled file names across platforms, so it is not recommended. (Despite this flaw it is surprisingly widely used, as mentioned in the previous section.)
    5. RAR performs well, but it is a commercial, non-open-source format, so it is not recommended. (What it does better is its recovery record: on a poor network, where archives are easily damaged in transit, this feature is especially useful.)
    6. tar.bz2 is a transitional product in the history of Linux compression. Its performance sits between gz and xz, and it generally does not need to be considered.

Overall, 7z is recommended on Windows, and one of tar.gz, tar.xz, or 7z on Linux. RAR (easy to repair when damaged) and zip (supported almost everywhere, but watch for garbled file names) can also be considered.

V. Compression commands on Linux

1. The tar command

As the earlier sections show, the commonly used tools are tar, gzip, xz, and so on. To compress multiple files you have to run tar first and then pipe its output to gzip or xz, which is tedious for something done so often. So tar was later enhanced.
tar was originally a pure archiver, and compression was left to other programs (one program doing one thing). Later, for convenience, it went all out and integrated the various compressors. So tar is the only command covered in detail here (it encompasses the rest).
tar has a great many options and parameters! Only a few common ones are covered here; for the rest, check man tar.

$ tar [-z|-j|-J] [cv] [-f new archive name] filename...            <== archive and compress
$ tar [-z|-j|-J] [tv] [-f existing tar file name]                  <== list file names
$ tar [-z|-j|-J] [xv] [-f existing tar file name] [-C directory]   <== extract

Options and Parameters:

-c: Create an archive; can be combined with -v to show the file names being archived
-t: List the file names contained in the archive; the focus is on the file names
-x: Extract or decompress; can be combined with -C (uppercase) to extract into a specific directory
Note in particular that -c, -t, and -x cannot appear together in the same command.
-z: Compress/decompress through gzip; the file name is preferably *.tar.gz
-j: Compress/decompress through bzip2; the file name is preferably *.tar.bz2
-J: Compress/decompress through xz; the file name is preferably *.tar.xz
Note in particular that -z, -j, and -J cannot appear together in the same command.
-v: Show the name of each file as it is processed
-f filename: -f must be followed immediately by the archive file name. Writing -f as a separate option is recommended, so it is not forgotten.
-C directory: Used when extracting, to extract into a specific directory.

Later exercises also use these options:

-p (lowercase): Preserve the original permissions and attributes of the data; often used when backing up (-c) important configuration files
-P (uppercase): Preserve absolute paths, that is, allow the archived names to keep the leading root directory /
--exclude=FILE: Do not archive FILE during compression

In practice, the simplest way to use tar is to memorize the following:

Compress: tar -zcv -f filename.tar.gz <file or directory to compress>
List contents: tar -ztv -f filename.tar.gz
Extract: tar -zxv -f filename.tar.gz -C <directory to extract into>
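
As a runnable sketch of these three commands, assuming a tar that supports -z (GNU tar and bsdtar both do): the directory site and the target directory restore are hypothetical names used only for this demo.

```shell
set -eu
work=$(mktemp -d)                     # scratch directory for the demo
cd "$work"
mkdir -p site/css restore
printf 'body {}\n' > site/css/main.css
printf '<html></html>\n' > site/index.html

tar -zcv -f site.tar.gz site          # compress: create (-c) through gzip (-z)
tar -ztv -f site.tar.gz               # list contents without extracting
tar -zxv -f site.tar.gz -C restore    # extract into an existing directory

listing=$(tar -zt -f site.tar.gz)     # plain member names, one per line
```

Note that the directory given to -C must already exist; tar does not create it.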

The commands above require choosing -z, -j, or -J to match the compression format, yet the file suffix already states the format, so spelling it out again feels redundant.
So there is this general-purpose option:

-a, --auto-compress
Use archive suffix to determine the compression program.

With it, a generic compression command is available:

tar -acv -f file.tar.* <file or directory to compress> (works for all three compression formats above)

For extraction no such option is needed: GNU tar detects the compression format when it reads an archive, so tar -xv -f file.tar.* is enough.
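
A small sketch of suffix-based selection, assuming GNU tar (bsdtar behaves similarly). Only the .tar.gz suffix is exercised here so that the demo depends on nothing but gzip; .tar.bz2 and .tar.xz should work the same way when bzip2 and xz are installed. The names data and out are made up for the example.

```shell
set -eu
work=$(mktemp -d)               # scratch directory for the demo
cd "$work"
mkdir -p data out
printf 'sample\n' > data/readme.txt

tar -acf data.tar.gz data       # -a: the .gz suffix selects gzip, no -z needed
gzip -t data.tar.gz             # succeeds only if the file really is gzip data
tar -xf data.tar.gz -C out      # no -z either: the format is detected on read
```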

Extract only a specified file
    1. First list the archive contents to find the exact name of the file you need
    2. tar -zxv -f archive.tar.gz <name of the file to extract>
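
The two steps can be tried out like this. The archive etc.tar.gz and the files inside it are invented for the demo; the member name passed to tar must match the listing exactly, including the leading directories.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory for the demo
cd "$work"
mkdir -p etc/app out
printf 'port=8080\n' > etc/app/app.conf
printf 'debug=0\n' > etc/app/extra.conf
tar -zcf etc.tar.gz etc

# Step 1: list the archive to find the exact member name.
tar -ztf etc.tar.gz

# Step 2: extract only that member; everything else stays in the archive.
tar -zxf etc.tar.gz -C out etc/app/app.conf
```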

Archive a directory, but leave out some of the files in it

Use the --exclude=FILE option (it supports pattern matching on file names and can be repeated). Quote patterns so the shell does not expand them, and place the options before the directory name:

tar -zcv -f filename.tar.gz --exclude=file1 --exclude='func*' directory
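
A quick way to check what --exclude actually leaves out is to list the resulting archive. The sketch below assumes GNU tar's default exclude matching; the directory src and its file names are hypothetical.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory for the demo
cd "$work"
mkdir -p src
printf 'a\n' > src/main.c
printf 'b\n' > src/func_old.c
printf 'c\n' > src/file1

# The pattern is quoted so tar, not the shell, interprets the wildcard.
tar -zcf src.tar.gz --exclude='file1' --exclude='func*' src

listing=$(tar -ztf src.tar.gz)    # member names that made it into the archive
printf '%s\n' "$listing"
```

Only src/ and src/main.c should remain in the listing; the two excluded files are never written to the archive.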

Archive only the files in a directory that were modified after a specified time

Use the --newer-mtime="2015/06/17" option.
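
To see the date filter at work, the sketch below backdates one file with touch and archives the directory with a cutoff. It assumes GNU tar; the ISO date form 2015-06-17 is used here, and the logs directory and file names are made up for the demo.

```shell
set -eu
work=$(mktemp -d)                       # scratch directory for the demo
cd "$work"
mkdir -p logs
printf 'old\n' > logs/old.log
printf 'new\n' > logs/new.log
touch -t 201501010000 logs/old.log      # set mtime to 2015-01-01, before the cutoff

# Only files modified after the given date are archived.
tar -zcf recent.tar.gz --newer-mtime='2015-06-17' logs

listing=$(tar -ztf recent.tar.gz)
printf '%s\n' "$listing"
```

Directories themselves are still recorded in the archive; the filter applies to the files inside them.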

tarfile, tarball

tarfile: a plain tar archive, not compressed
tarball: a compressed tar archive

2. zip format (usually preinstalled on Linux; see the man pages for details)
    1. Compress: zip
      • Compress a directory: zip -r filename.zip directory (-r compresses recursively, and the archive will contain the directory itself)
    2. Extract: unzip
      • Extract into a directory: unzip -d directory filename.zip (-d dir extracts the contents into dir)
        • -t tests the integrity of the archive
        • -x filename excludes a file
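
The zip and unzip options above can be combined as in the following sketch. It assumes the Info-ZIP zip and unzip programs; because they are not installed everywhere, the demo checks for them first and skips itself otherwise. The photos directory and the zip_demo variable are hypothetical names for this example.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory for the demo
cd "$work"
mkdir -p photos out
printf 'x\n' > photos/a.txt
printf 'y\n' > photos/b.tmp

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
  zip -r -q photos.zip photos                      # recursive, includes the directory itself
  unzip -t -q photos.zip                           # test archive integrity
  unzip -q photos.zip -x 'photos/*.tmp' -d out     # extract into out/, skipping *.tmp
  zip_demo=ran
else
  echo "zip/unzip not installed; skipping demo"
  zip_demo=skipped
fi
```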
3. 7z format (requires p7zip, which Deepin includes; see the man page for more)
    1. List the directory tree: 7z l file.7z (l: list contents of archive)
    2. Compress: 7z a file.7z file1 directory1 (a creates an archive or adds files/directories to an existing one; several files or directories can be given at once)
    3. Extract: 7z x file.7z -o<directory> (extracts into the specified directory; there is no space between -o and the directory name)
    4. Test integrity: 7z t file.7z

Once installed, p7zip provides three commands: 7z, 7za, and 7zr. Using 7z directly is generally fine.

P.S. 7z does not store the owner and group information of Linux files, so it cannot be used directly for Linux system backups. Use tar.xz or tar.7z instead (that is, archive with tar first).

4. rar format (again, see the man page for more)

RAR is not an open-source format, and Linux distributions do not include RAR compression software by default. Its decompressor, unrar, is freely available (the source is published, though under a restrictive license) and ships with Deepin; 7zip can also extract RAR archives.
To create RAR archives on Linux, download the Linux version from Rarlab (Deepin includes it). Note that the Linux version is a 40-day trial; long-term use requires a license.

    1. Compress: rar a file.rar file (this is the trial version)
    2. Extract: unrar x file.rar (this one is free)

I actually like RAR's repair function, and I do not know why newer formats such as 7z and xz have not added a similar recovery record. The last time I downloaded the IDEA tarball, it took four or five attempts to get a complete copy. With RAR it could probably have been repaired in one click, but I do not know how to repair a tar.gz, so I had to download it again and again.

This article may be reprinted; when reprinting, please credit the source: Common compression format comparison, and Linux compression-related instructions
