Talk about compression in Linux

Source: Internet
Author: User
Tags: bz2, gz, file

1. Uses and techniques of compression

  1.1 Why compression is needed:

        ① Have you ever had a file so large that it could not be sent by ordinary email? (Many mail services limit each message to about 25 MB.)

        ② Have you ever needed to back up important data, only to find that it takes up a lot of disk space?

        ......

        In situations like these, "file compression" comes in handy!

1.2 Compression principle:

    We all know that 1 byte = 8 bits. So how does a computer actually store file data?

    Picture one byte as a row of eight slots: --------

    Because 1 byte = 8 bits, each byte has 8 slots, and each slot holds either 0 or 1. That is all the background we need here.

    Suppose we want to record the number 1. In binary, the 1 occupies the rightmost bit and the other 7 bits are filled with 0. Those 7 bits carry no information, so in a sense they are "empty". However, operating systems access data in units of bytes, so the value is still stored as a whole byte. Some clever engineers devised rather elaborate algorithms to squeeze out this unused space so that the file occupies less room. That is compression.

    Put simply, a file contains quite a lot of "space" that is not fully used, and compression packs the data so that this slack disappears and the file as a whole takes up less capacity. A compressed archive cannot be used directly by the system, though; to use it you must first restore it to its uncompressed form, which is called "decompression". The ratio between the disk space a file occupies after compression and the space it occupied before compression is called the "compression ratio".
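To make the idea concrete, here is a minimal sketch that compresses a file consisting almost entirely of slack and computes the ratio just defined. The file name zeros.bin, the 1 MiB size, and the temporary directory are assumptions chosen for illustration, and gzip simply stands in for compression tools in general.

```shell
# Work in a throwaway directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample: 1 MiB of zero bytes, a file that is all "empty space".
head -c 1048576 /dev/zero > zeros.bin

# Compress to a new file and keep the original for comparison.
gzip -c zeros.bin > zeros.bin.gz

original=$(wc -c < zeros.bin)
compressed=$(wc -c < zeros.bin.gz)

# Compression ratio: size after / size before, in percent.
# Shell arithmetic is integer-only, so a very small result is rounded down.
ratio=$(( 100 * compressed / original ))
echo "original=$original bytes, compressed=$compressed bytes, ratio=${ratio}%"

# Remove "$workdir" when you are finished.
```

Real files are rarely this redundant, so expect far less dramatic savings on ordinary data.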

2. Common compression commands on Linux systems

2.1 Command overview:

      In a Linux environment, compressed archives mostly carry one of these file name extensions: "*.tar, *.tar.gz, *.tgz, *.gz, *.Z, *.bz2, *.xz"

      The common compression commands on Linux are gzip, bzip2, and the newer xz; compress has been retired. To support the zip format that is common on Windows, Linux also provides a zip command.

    However, these commands usually compress and decompress only one file at a time, so handling a large pile of files one by one would be tedious. This is where the packaging tool tar becomes important. Packaging bundles the files you specify into a single file without compressing them; that one package can then be compressed in a single step instead of compressing each file separately.
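The package-then-compress idea can be sketched as follows. The directory notes and its two files are made up for the example; only the tar and gzip commands themselves come from the text above.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input: a small directory with two files.
mkdir notes
printf 'first file\n'  > notes/a.txt
printf 'second file\n' > notes/b.txt

# Step 1: package only. notes.tar bundles the files but is not compressed.
tar -cf notes.tar notes

# Step 2: compress the single package. gzip replaces notes.tar with notes.tar.gz.
gzip -v notes.tar

# List the members of the compressed package to confirm both files are inside.
tar -tzf notes.tar.gz
```

This is why extensions such as *.tar.gz and *.tgz are so common: they mark a tar package that was then compressed with gzip.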

2.2 Compression commands

gzip options:

Options and parameters:
    -c : write the compressed data to standard output (the screen), so it can be handled with redirection;
    -d : decompress;
    -t : test the integrity of a compressed file to see whether it contains errors;
    -v : show information such as the compression ratio of the original file to the compressed file;
    -# : # stands for a number giving the compression level; -1 is the fastest but compresses the least, -9 is the slowest but compresses the best. The default is -6.
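As a small illustration of the -c and -t options, the sketch below checks one intact archive and one deliberately truncated copy. The file names and the 100-byte cut are assumptions for the demo; the point is only that -t reports success for a good file and a non-zero exit status for a damaged one.

```shell
workdir=$(mktemp -d)
cd "$workdir"

seq 1 1000 > sample.txt          # hypothetical sample data
gzip -c sample.txt > good.gz     # -c writes to stdout, so sample.txt stays in place

# Simulate a damaged download by keeping only the first 100 bytes.
head -c 100 good.gz > bad.gz

good_status=0
gzip -t good.gz || good_status=$?

bad_status=0
gzip -t bad.gz 2>/dev/null || bad_status=$?

echo "good.gz exit status: $good_status, bad.gz exit status: $bad_status"
```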

2.3 gzip examples:

2.3.1 Find the largest files in the /etc directory
      ls -lrS /etc/ | tail -n 10

2.3.2 Copy services to /tmp
      cd /tmp && cp /etc/services .

2.3.3 Compress /tmp/services
      gzip -v services
      Note: services.gz is created and the original file no longer exists.

2.3.4 Compare the file before and after compression
      ll /etc/services /tmp/services*
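If you would rather not touch /tmp or depend on /etc/services, the compress-and-compare steps can be tried on a stand-in file. In this sketch the file named services is generated with seq and is purely hypothetical; it only needs to be reasonably large text.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for /etc/services: 20000 lines of text.
seq 1 20000 > services
before=$(wc -c < services)

# -v reports the compression ratio; the original file is replaced by services.gz.
gzip -v services

ls -l services*
after=$(wc -c < services.gz)
echo "before=$before bytes, after=$after bytes"
```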

2.3.5 Because the original services file is plain text, we can read the compressed file directly with zcat/zmore/zless
      zmore services.gz

2.3.6 Decompress services.gz in /tmp
      gzip -dv services.gz
      Note: the services.gz file is deleted after decompression.
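Reading and then decompressing can be checked with a tiny round trip. The three-line services file below is invented for the example; zcat is used instead of zmore so that the script does not wait for keyboard input.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical miniature services file.
printf 'ftp 21/tcp\nssh 22/tcp\nhttp 80/tcp\n' > services
gzip services

# Read the text without creating an uncompressed file on disk.
contents=$(zcat services.gz)
echo "$contents"

# Decompress: services comes back and services.gz is removed.
gzip -dv services.gz
ls -l
```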

          

2.3.7 Compress the restored services file with the best compression ratio and keep the original file
      gzip -9 -cv services > services.gz

      Compression levels: gzip provides levels 1 to 9, with compression strength increasing as the number rises.

      

2.3.8 In the newly created services.gz, find which lines contain the keyword http
      zgrep -n 'http' services.gz
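Here is the same search on a four-line file whose contents are made up for the example, which makes the line numbers in the output easy to verify by eye.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical miniature services file.
printf '%s\n' 'ftp 21/tcp' 'ssh 22/tcp' 'http 80/tcp' 'https 443/tcp' > services
gzip -c services > services.gz

# -n prefixes every matching line with its line number.
matches=$(zgrep -n 'http' services.gz)
echo "$matches"
# prints:
# 3:http 80/tcp
# 4:https 443/tcp
```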

         

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Objective:

If gzip was created to replace compress and provide a better compression ratio, then bzip2 was created to improve on gzip in the same way. bzip2 is a really handy tool: its compression ratio is even better than gzip's, and it is used in almost exactly the same way. See the usage below.

2.4 bzip2 examples (bzip2, bzcat/bzmore/bzless/bzgrep)

    2.4.1 bzip2 options

    Options and parameters:
    -c : write the data produced by compression to standard output (the screen);
    -d : decompress;
    -k : keep the original file instead of deleting it;
    -z : compress (the default, so it can be omitted);
    -v : show information such as the compression ratio of the original file to the compressed file;
    -# : as with gzip, a number that sets the compression level; -9 compresses the best, -1 is the fastest.

2.4.2 Compress /tmp/services with bzip2, leaving the gzip sample from the previous section in place
      bzip2 -v services

2.4.3 At this point you will find that bzip2 compresses better than gzip
      ls -l services*

2.4.4 Read the sample /tmp/services.bz2 file
      bzcat services.bz2

2.4.5 Decompress the sample /tmp/services.bz2 file
      bzip2 -d services.bz2

2.4.6 Compress the restored services file with the best compression ratio and keep the original file
      bzip2 -9 -c services > services.bz2
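There are two ways to keep the original file with bzip2: the -k option and the -c redirection used in 2.4.6. The sketch below tries both on a hypothetical stand-in file; the name services.best.bz2 is chosen only so the two results do not collide.

```shell
workdir=$(mktemp -d)
cd "$workdir"

seq 1 20000 > services                      # stand-in text file

# Way 1: -k keeps services and writes services.bz2 next to it.
bzip2 -k -v services

# Way 2: -c writes to stdout, so the input is untouched and you pick the output name.
bzip2 -9 -c services > services.best.bz2

# bzcat reads the compressed text without unpacking it on disk.
lines=$(bzcat services.bz2 | wc -l)
echo "services.bz2 holds $lines lines"
ls -l services*
```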
      
    
    Notes:
    As the examples above show, bzip2's options are nearly identical to gzip's; only the file extension changes from .gz to .bz2. The remaining usage is similar, so it is not covered here. You can also see that bzip2's compression ratio is indeed better than gzip's. For large files, however, bzip2 takes noticeably longer than gzip. That is the trade-off: to gain more usable space, you pay with time.
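A quick way to check the size claim on your own data is to compress one file with both tools and print the results side by side. The input here is a hypothetical stand-in generated with seq, so the exact numbers will differ from those for /etc/services, and the winner can vary with the kind of data.

```shell
workdir=$(mktemp -d)
cd "$workdir"

seq 1 50000 > services                 # stand-in text file
original=$(wc -c < services)

gzip  -c services > services.gz
bzip2 -c services > services.bz2

gz_size=$(wc -c < services.gz)
bz2_size=$(wc -c < services.bz2)
echo "original=$original gzip=$gz_size bzip2=$bz2_size (bytes)"
```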

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Objective:

Although bzip2 already achieves a very good compression ratio, some free software developers were evidently not satisfied, so xz was later released with an even higher compression ratio. Its usage is almost identical to that of gzip/bzip2. Take a look at the following examples.


2.5 xz examples (xz, xzcat/xzmore/xzless/xzgrep)

2.5.1 xz options

Options and parameters:
    -d : decompress;
    -t : test the integrity of the compressed file to see whether it contains errors;
    -l : list information about the compressed file;
    -k : keep the original file instead of deleting it;
    -c : write the data to standard output (the screen);
    -# : as above, a number that sets the compression level; a higher number gives a better compression ratio.

2.5.2 Compress the /tmp/services file that bzip2 just left behind, this time with xz
      xz -v services
      (The output shows the compression ratio; the file size shrinks further.)

2.5.3 List information about this archive
      xz -l services.xz

2.5.4 View the contents of the compressed archive
      xzcat services.xz
      (The commands are very similar, so not all of them are shown.)

2.5.5 Decompress
      xz -d services.xz

2.5.6 Compress while keeping the original file
      xz -k services
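The xz steps can be strung together into one round trip. As before, the services file is a hypothetical stand-in generated with seq; the original is removed before decompressing because xz -d will not overwrite an existing file.

```shell
workdir=$(mktemp -d)
cd "$workdir"

seq 1 20000 > services          # stand-in text file

xz -k services                  # keep services, create services.xz
xz -l services.xz               # show sizes and ratio for the archive
xz -t services.xz               # integrity test; silent when the file is fine

# Read the compressed text without unpacking it on disk.
lines=$(xzcat services.xz | wc -l)
echo "services.xz holds $lines lines"

rm services                     # make room so decompression can recreate it
xz -d services.xz               # services comes back, services.xz is removed
```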



Summary of the compression commands:
The following is a set of timing data.
Running "time [gzip|bzip2|xz] -c services > services.[gz|bz2|xz]" gave execution times of 0.019 s, 0.042 s, and 0.261 s for the three commands respectively. Look at the last number: xz is more than ten times slower than gzip.

These examples show that a higher compression ratio costs more time. xz compresses much better than gzip, but it also takes far longer. If time is not a concern, xz is the better choice; if time is an important cost, gzip is probably the more suitable tool.
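To repeat the measurement on your own machine, something like the following can be used. It assumes a bash shell (where time is a keyword that reports to the terminal) and a hypothetical stand-in file generated with seq, so the timings and sizes will not match the figures quoted above.

```shell
workdir=$(mktemp -d)
cd "$workdir"

seq 1 200000 > services         # stand-in text file, a little over 1 MB

# time prints real/user/sys for each command; -c keeps the original file.
time gzip  -c services > services.gz
time bzip2 -c services > services.bz2
time xz    -c services > services.xz

# Compare the resulting sizes alongside the timings printed above.
ls -l services*
```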


Please leave a comment if you have any questions!
