Implementing compression with GZipStream, and a pitfall that arises

Source: Internet
Author: User

While crawling pages, I need to compress each page's content before storing it. For convenience, I used the .NET 2.0 GZipStream class to do the compression.

The code I used was as follows:

using System.IO;
using System.IO.Compression;
......
public static byte[] Compress(byte[] data)
{
  MemoryStream stream = new MemoryStream();
  GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress);
  gZipStream.Write(data, 0, data.Length);
  // .... (I'll keep you in suspense at this commented spot for now)
  return stream.ToArray();
}
public static byte[] Decompress(byte[] data)
{
  MemoryStream stream = new MemoryStream();
  GZipStream gZipStream = new GZipStream(new MemoryStream(data), CompressionMode.Decompress);
  byte[] bytes = new byte[4096];
  int n;
  while ((n = gZipStream.Read(bytes, 0, bytes.Length)) != 0)
  {
    stream.Write(bytes, 0, n);
  }
  return stream.ToArray();
}

The code above looks fine (if you don't test it carefully). But when I tested pages of various sizes, I found that whenever the compressed byte[] is shorter than 4 KB, the problem appears: the data cannot be decompressed, and Read inside the Decompress function always returns 0. Depressing, thoroughly depressing; is this the glory of Microsoft? (Author's note: at the time I thought it was a Microsoft bug.)

I suspected the bytes buffer in Decompress was the wrong size, so I tried making it very large, then very small; after every attempt, Read still returned 0. I was about ready to take it up with Gates himself.

Luckily, I had pinned down the 4 KB threshold, so I googled "GZipStream 4K", and in a foreign forum thread (http://www.dotnetmonster.com/Uwe/Forum.aspx/Dotnet-framework/19787/problem-with-the-gzipstream-class-and-small-streams) I finally found the answer: GZipStream processes its data in 4 KB blocks. So in Compress, before returning stream.ToArray(), you must first call gZipStream.Close() (that is the suspense I left in the comment above), because GZipStream only writes its buffered data out in full when it is disposed. Isn't that an injustice? I had already called Write, yet it still insists that I Close first.
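With the fix in place, here is a minimal sketch of the corrected pair. The class name GZipHelper is mine, and the using blocks stand in for the explicit gZipStream.Close() call, since Dispose performs the same flush:

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipHelper
{
    public static byte[] Compress(byte[] data)
    {
        MemoryStream stream = new MemoryStream();
        using (GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress))
        {
            gZipStream.Write(data, 0, data.Length);
        } // Dispose (equivalent to Close) flushes GZipStream's internal buffer into stream
        // MemoryStream.ToArray() still works after the stream has been closed
        return stream.ToArray();
    }

    public static byte[] Decompress(byte[] data)
    {
        MemoryStream output = new MemoryStream();
        using (GZipStream gZipStream = new GZipStream(new MemoryStream(data), CompressionMode.Decompress))
        {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gZipStream.Read(buffer, 0, buffer.Length)) != 0)
            {
                output.Write(buffer, 0, n);
            }
        }
        return output.ToArray();
    }
}
```

The key difference from the broken version is only in Compress: ToArray() runs after the GZipStream has been disposed, so even payloads whose compressed form is under 4 KB round-trip correctly.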
