TCP/IP packet encoding parsing (chunked and gzip) _ space of jialy

Last Update:2018-12-07 Source: Internet

Author: User

Once the message-body data has been extracted from an HTTP message, the next step is to process it. My approach is to save the data to a file and then decode the file.

Most object data sent over HTTP is compressed before transmission, so what we capture is not plain HTML text but compressed or encoded data, and it has to be decoded. HTTP involves two kinds of encoding. Content encoding compresses the object data to reduce its size. Transfer encoding changes how the data is framed on the wire so that it can be transported safely and reliably. Both are declared in the header of the HTTP response message: Transfer-Encoding: chunked means the body is sent in chunks, and Content-Encoding: gzip means the object data is compressed according to the gzip specification. For details, see the HTTP protocol description at http://www.w3.org/Protocols/rfc2616/rfc2616.html

Note also the order in which gzip and chunked are applied, that is, how the two work together: 1. the original object data (here, an HTML text file) is first compressed with gzip; 2. the compressed data is then split into chunks. Decoding therefore runs in the opposite order: reassemble the chunks first, then decompress.
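As a rough illustration of how a decoder might decide which steps to run, the sketch below scans a saved response header block for the two header fields. The function names header_has_token and starts_with_ci are hypothetical and not part of the original article. The sketch assumes the header block is NUL-terminated with CRLF line endings, and it uses a simple substring match on the field value.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive test: does s begin with prefix? */
static int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Return 1 if the header block contains a line "name: ...token...".
 * Assumption: headers is NUL-terminated and uses CRLF line endings. */
int header_has_token(const char *headers, const char *name, const char *token)
{
    size_t nlen = strlen(name);
    const char *line = headers;

    while (*line) {
        const char *eol = strstr(line, "\r\n");
        const char *end = eol ? eol : line + strlen(line);

        if (starts_with_ci(line, name) && line[nlen] == ':') {
            const char *p;
            /* substring search for the token inside this header line */
            for (p = line + nlen + 1; p < end; p++)
                if (starts_with_ci(p, token))
                    return 1;
        }
        line = eol ? eol + 2 : end;
    }
    return 0;
}
```

A caller would first check header_has_token(h, "Transfer-Encoding", "chunked") and remove the chunking if it is present, then check header_has_token(h, "Content-Encoding", "gzip") and decompress the result.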

The following describes how to implement chunked and gzip decoding, quoted from http://nblive99.spaces.live.com/blog/cns!74a0072781b23dfb!130.entry

I ran into some encoding problems when reassembling packets from the TCP/IP protocol stack, mainly chunked and gzip encoding. First, chunked. RFC 2616 defines it as follows:

    Chunked-Body    = *chunk
                      last-chunk
                      trailer
                      CRLF

    chunk           = chunk-size [ chunk-extension ] CRLF
                      chunk-data CRLF
    chunk-size      = 1*HEX
    last-chunk      = 1*("0") [ chunk-extension ] CRLF

    chunk-extension = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
    chunk-ext-name  = token
    chunk-ext-val   = token | quoted-string
    chunk-data      = chunk-size(OCTET)
    trailer         = *(entity-header CRLF)
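
To make the grammar concrete, here is a minimal in-memory sketch of a decoder for it. It is not the article's code: the names dechunk and hex_val are hypothetical, the whole chunked body is assumed to be already in one buffer, chunk extensions are skipped, and any trailer after the last chunk is ignored.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode the chunked body in in[0..in_len) into out (capacity out_cap).
 * Returns the decoded length, or -1 on malformed input or if out is too small. */
long dechunk(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t pos = 0, length = 0;

    for (;;) {
        size_t size = 0;
        int digits = 0, v;

        /* chunk-size = 1*HEX (capped at 8 digits to avoid overflow) */
        while (pos < in_len && (v = hex_val((unsigned char)in[pos])) >= 0) {
            if (++digits > 8)
                return -1;
            size = size * 16 + (size_t)v;
            pos++;
        }
        if (digits == 0)
            return -1;

        /* skip the optional chunk-extension up to the CRLF */
        while (pos + 1 < in_len && !(in[pos] == '\r' && in[pos + 1] == '\n'))
            pos++;
        if (pos + 1 >= in_len)
            return -1;
        pos += 2;

        if (size == 0)                      /* last-chunk reached */
            return (long)length;

        /* chunk-data must be followed by CRLF and must fit in out */
        if (size > in_len - pos || in_len - pos - size < 2 ||
            size > out_cap - length)
            return -1;
        if (in[pos + size] != '\r' || in[pos + size + 1] != '\n')
            return -1;
        memcpy(out + length, in + pos, size);
        length += size;
        pos += size + 2;
    }
}
```

For example, the body "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n" decodes to the 9 bytes "Wikipedia".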

The following is the decoding process in pseudocode:

    length := 0                                         // length of the decoded data body
    read chunk-size, chunk-extension (if any) and CRLF  // read the size of the first chunk
    while (chunk-size > 0) {                            // repeat until a chunk of size 0 is read
        read chunk-data and CRLF                        // read the chunk data, which ends with CRLF
        append chunk-data to entity-body                // add the chunk data to the decoded object data
        length := length + chunk-size                   // update the decoded object length
        read chunk-size and CRLF                        // read the size of the next chunk
    }
    read entity-header                                  // the lines below read all trailing header fields
    while (entity-header not empty) {
        append entity-header to existing header fields
        read entity-header
    }
    Content-Length := length                            // add the content length to the header
    Remove "chunked" from Transfer-Encoding             // remove the chunked coding from the header

The pseudocode logic is a little confusing. After working through it, I wrote the following C decoding code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Decode the chunked file "<name>.trunk" into "<name>".
     * Returns the decoded file name, or the input name if nothing was decoded.
     * The result lives in a static buffer, because returning a local
     * array (as the first version did) is undefined behavior. */
    char *unchunk(char *filename)
    {
        static char newfile[128];
        char chunk_head[64];
        char *ptr;
        FILE *fp, *fp_unchunk;
        long chunk_size;

        if (strlen(filename) >= sizeof(newfile))
            return filename;
        strcpy(newfile, filename);
        ptr = strstr(newfile, ".trunk");
        if (ptr == NULL)                /* the input must carry the .trunk suffix */
            return filename;
        *ptr = '\0';
        printf("%s\n", newfile);

        fp = fopen(filename, "rb");
        if (fp == NULL)
            return filename;
        fp_unchunk = fopen(newfile, "wb");
        if (fp_unchunk == NULL) {
            fclose(fp);
            return filename;
        }

        /* read the first chunk head; it must end with CRLF */
        if (fgets(chunk_head, sizeof(chunk_head), fp) == NULL ||
            strstr(chunk_head, "\r\n") == NULL) {
            fclose(fp_unchunk);
            fclose(fp);
            return filename;
        }

        chunk_size = strtol(chunk_head, NULL, 16);
        while (chunk_size > 0) {
            char *chunk_data = malloc((size_t)chunk_size);
            if (chunk_data == NULL)
                break;
            if (fread(chunk_data, 1, (size_t)chunk_size, fp) != (size_t)chunk_size) {
                free(chunk_data);
                break;
            }
            fwrite(chunk_data, 1, (size_t)chunk_size, fp_unchunk);
            free(chunk_data);
            fseek(fp, 2, SEEK_CUR);     /* skip the CRLF after the chunk data */

            /* read the next chunk head */
            if (fgets(chunk_head, sizeof(chunk_head), fp) == NULL ||
                strstr(chunk_head, "\r\n") == NULL)
                break;
            chunk_size = strtol(chunk_head, NULL, 16);
        }

        fclose(fp_unchunk);
        fclose(fp);
        remove(filename);               /* remove the old chunked file */
        return newfile;
    }

Next, let's look at gzip decoding, which is simpler. There are two ways to do it: call the system gzip command directly, which takes no real effort, or use the zlib library, which is more portable but takes a little more development work. The following C code decompresses a gzip file.

    /* Call the system gzip command (nothing difficult here) */
    void ungzip(char *filename)
    {
        char cmdbuf[1024];

        /* give the file a .gz suffix if it does not have one */
        if (strstr(filename, ".gz") == NULL) {
            snprintf(cmdbuf, sizeof(cmdbuf), "mv %s %s.gz", filename, filename);
            system(cmdbuf);
        }
        snprintf(cmdbuf, sizeof(cmdbuf), "gzip -d %s", filename);
        system(cmdbuf);
    }

    /* Use the zlib library */
    #include <stdio.h>
    #include "zlib/zlib.h"              /* or <zlib.h> for a system-wide install */

    #define CHUNK 16384                 /* read buffer size; not defined in the first version */

    void uncompresstorrent(char *src, char *dst)
    {
        gzFile gzfp = gzopen(src, "rb");    /* gzFile is already a pointer type */
        FILE *fp = fopen(dst, "wb");
        char in[CHUNK];
        int retlen;

        if (gzfp == NULL || fp == NULL) {
            if (gzfp) gzclose(gzfp);
            if (fp) fclose(fp);
            return;
        }
        /* gzread returns 0 at end of file and -1 on error */
        while ((retlen = gzread(gzfp, in, CHUNK)) > 0)
            fwrite(in, 1, (size_t)retlen, fp);

        gzclose(gzfp);
        fclose(fp);
    }

(Compile with the -lz linker flag.)
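
Before handing a de-chunked body to gzip or zlib, it can help to confirm that it really is gzip data. The sketch below checks the first three bytes of the gzip member header as laid out in RFC 1952 (ID1 = 0x1f, ID2 = 0x8b, CM = 8 for deflate). The function name looks_like_gzip is hypothetical and not part of the original article.

```c
#include <stddef.h>

/* Return 1 if buf starts with a gzip member header (RFC 1952):
 * ID1 = 0x1f, ID2 = 0x8b, CM = 8 (deflate). Otherwise return 0. */
int looks_like_gzip(const unsigned char *buf, size_t len)
{
    return len >= 3 && buf[0] == 0x1f && buf[1] == 0x8b && buf[2] == 8;
}
```

A caller could read the first few bytes of the file produced by the chunked decoder and skip decompression when the check fails, for instance when the server sent the body uncompressed.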

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
