Gzip was originally created by Jean-loup Gailly and Mark Adler for file compression on UNIX systems. The .gz file suffix commonly seen on Linux denotes the gzip format. Today gzip is a very popular data compression format, or file format, on the Internet. Gzip encoding over HTTP is a technique for improving the performance of Web applications: high-traffic Web sites often use gzip compression so that pages reach users faster.

Gzip itself is only a file format. It typically wraps data in the deflate format, and deflate in turn compresses data with the LZ77 algorithm combined with Huffman coding.

A gzip file consists of one or more "blocks" (RFC 1952 calls them members); in practice there is usually only one. Each block has three parts: the header, the data, and the tail. The general outline of a block is as follows:

+---+---+---+---+---+---+---+---+---+---+==========//=========+===========//==========+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG|     MTIME     |XFL|OS | Extra header fields |    Compressed data    |     CRC32     |     ISIZE     |
+---+---+---+---+---+---+---+---+---+---+==========//=========+===========//==========+---+---+---+---+---+---+---+---+

1. The header part

ID1 and ID2: 1 byte each. Fixed values, ID1 = 31 (0x1F) and ID2 = 139 (0x8B), identifying the gzip format.

CM: 1 byte. Compression method. At present only one is in use: CM = 8, indicating the deflate method.

FLG: 1 byte. Flags.
    Bit 0    FTEXT     - the data is probably text
    Bit 1    FHCRC     - a CRC16 header checksum field is present
    Bit 2    FEXTRA    - an extra (option) field is present
    Bit 3    FNAME     - the original file name field is present
    Bit 4    FCOMMENT  - a comment field is present
    Bit 5-7  reserved

MTIME: 4 bytes. Modification time of the original file, in Unix format (seconds since 1970-01-01 00:00:00 GMT).

XFL: 1 byte. Extra flags. When CM = 8, XFL = 2 means the compressor used maximum compression (the slowest algorithm) and XFL = 4 means it used the fastest algorithm (the least compression).

OS: 1 byte. The operating system or, more precisely, the file system on which the compression took place. The following values are defined:
    0   - FAT file system (MS-DOS, OS/2, NT/Win32)
    1   - Amiga
    2   - VMS/OpenVMS
    3   - Unix
    4   - VM/CMS
    5   - Atari TOS
    6   - HPFS file system (OS/2, NT)
    7   - Macintosh
    8   - Z-System
    9   - CP/M
    10  - TOPS-20
    11  - NTFS file system (NT)
    12  - QDOS
    13  - Acorn RISCOS
    255 - unknown

Extra header fields, in the order in which they appear:

(If FLG.FEXTRA = 1)
+---+---+================//================+
| XLEN  |   XLEN bytes of extra field      |
+---+---+================//================+

The extra field is itself a sequence of option subfields, each laid out as:
+---+---+---+---+===============//================+
|SI1|SI2|  LEN  |   LEN bytes of subfield data    |
+---+---+---+---+===============//================+

(If FLG.FNAME = 1)
+=======================//========================+
|      Original file name (null-terminated)       |
+=======================//========================+

(If FLG.FCOMMENT = 1)
+=======================//========================+
|   Comment text (ISO 8859-1, null-terminated)    |
+=======================//========================+

(If FLG.FHCRC = 1)
+---+---+
| CRC16 |
+---+---+

When extra options are present, XLEN gives the total number of bytes of option data that follow. Within each subfield, SI1 and SI2 identify the option and LEN gives the number of bytes of data it carries. For example, SI1 = 0x41 ('A') and SI2 = 0x70 ('P') indicate that the option holds Apollo file type information.

2. The data part

The data is in the deflate format, which consists of a series of sub-blocks. The outline of a sub-block is as follows:

+......+......+......+=============//============+
|BFINAL|    BTYPE    |            Data           |
+......+......+......+=============//============+

BFINAL: 1 bit. 0 - more sub-blocks follow; 1 - this is the last sub-block.

BTYPE: 2 bits. 00 - no compression; 01 - compressed with static (fixed) Huffman codes; 10 - compressed with dynamic Huffman codes; 11 - reserved. For the details of each case, refer to the RFC documents listed below.

3. The tail part

CRC32: 4 bytes. The 32-bit checksum of the original (uncompressed) data.

ISIZE: 4 bytes. The low 32 bits of the length of the original (uncompressed) data, that is, the length modulo 2^32.

Multi-byte values in gzip are stored least significant byte first (little-endian), unlike zlib, which stores them most significant byte first.

Gzip and zlib are closely related. For more detailed information on zlib, gzip and deflate, refer to RFC 1950-1952; further references can be found in those documents. Gzip has become an integral part of the GNU project, and its official site is www.gzip.org, where the gzip source code can be downloaded. The latest version listed there is 1.2.4, together with a beta version, 1.3.3.
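
The header, data and tail layout described above can be checked with a short Python sketch. It is only an illustration, not a full gzip reader: it assumes the buffer holds exactly one block, and the names parse_gzip_member, read_cstring and the sample file name example.txt are made up for this example. The sample is built in memory with Python's standard gzip module.

```python
import gzip
import io
import struct
import zlib

# FLG bit masks, as listed in the header description above.
FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT = 0x01, 0x02, 0x04, 0x08, 0x10


def read_cstring(buf, pos):
    """Read a null-terminated ISO 8859-1 string; return (text, next position)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("latin-1"), end + 1


def parse_gzip_member(buf):
    """Parse the header and tail of a buffer assumed to hold one gzip block."""
    # Fixed 10-byte header; "<" selects little-endian with no padding.
    id1, id2, cm, flg, mtime, xfl, os_id = struct.unpack_from("<BBBBIBB", buf, 0)
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    if cm != 8:
        raise ValueError("unsupported compression method")
    info = {"flg": flg, "mtime": mtime, "xfl": xfl, "os": os_id}
    pos = 10
    # Optional fields follow in this fixed order.
    if flg & FEXTRA:
        (xlen,) = struct.unpack_from("<H", buf, pos)
        info["extra"] = buf[pos + 2:pos + 2 + xlen]
        pos += 2 + xlen
    if flg & FNAME:
        info["name"], pos = read_cstring(buf, pos)
    if flg & FCOMMENT:
        info["comment"], pos = read_cstring(buf, pos)
    if flg & FHCRC:
        (info["hcrc16"],) = struct.unpack_from("<H", buf, pos)
        pos += 2
    info["data_offset"] = pos
    # First deflate sub-block: bits are packed from the least significant bit up.
    first = buf[pos]
    info["bfinal"] = first & 1
    info["btype"] = (first >> 1) & 3
    # Tail: CRC32 and ISIZE occupy the last 8 bytes.
    info["crc32"], info["isize"] = struct.unpack_from("<II", buf, len(buf) - 8)
    return info


# Build a small sample in memory; GzipFile stores the given file name (FNAME).
payload = b"hello, gzip! " * 100
sink = io.BytesIO()
with gzip.GzipFile(filename="example.txt", mode="wb", fileobj=sink, mtime=0) as f:
    f.write(payload)
blob = sink.getvalue()

info = parse_gzip_member(blob)
# Everything between the header and the tail is a raw deflate stream (wbits=-15).
raw = zlib.decompress(blob[info["data_offset"]:-8], wbits=-15)
print(info)
print(raw == payload, info["crc32"] == zlib.crc32(payload), info["isize"] == len(payload))
```

The final line compares the decompressed data, the stored CRC32 and the stored ISIZE against values computed from the original payload, which is the same check a decompressor performs on the tail. A file with several blocks would need the end of each deflate stream to be located first, which this sketch does not attempt.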