Use C# to process bitstream-based data

Source: Internet
Author: User

0x00 Background

Recently I needed to process some bit stream-based data. Computers generally process data in bytes (8 bits), and the same is true of data read with BinaryReader: even a bool is read as a whole byte. Still, with the help of a few methods in the C# base class library, bit-level reading is achievable. Once the task was done, I found bit-based data quite interesting, so I also tried encoding common ASCII characters with 7-bit and 6-bit codes. I wrote this post partly as a record for myself and partly in the hope that it helps readers with similar needs.

0x01 Reading bit stream data

Suppose we have a byte b = 35 and need to read its first 4 bits and its last 4 bits as two separate numbers. How can we do that? The base class library has no ready-made method for this, but going through a binary string gets us there in two steps.

1. Express b as the binary string 00100011.

2. Convert the first 4 characters and the last 4 characters of that string into numbers. The core method is Convert.ToInt32 with base 2:

Convert.ToInt32("0010", 2);

This gives us bit-level reads.
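The two steps above can be sketched as follows. This is a minimal illustration rather than the author's code: the class name NibbleDemo and both method names are made up for the example, and the shift-and-mask variant is included only for comparison.

```csharp
using System;

// Hypothetical helper; the names are not from the article.
public static class NibbleDemo
{
    // Splits a byte into its high and low 4 bits by way of a binary string.
    public static (int High, int Low) SplitViaString(byte b)
    {
        // Convert.ToString(b, 2) drops leading zeros, so pad back to 8 characters.
        string bin = Convert.ToString(b, 2).PadLeft(8, '0');
        int high = Convert.ToInt32(bin.Substring(0, 4), 2);
        int low = Convert.ToInt32(bin.Substring(4, 4), 2);
        return (high, low);
    }

    // The same split with a shift and a mask, without any strings.
    public static (int High, int Low) SplitViaMask(byte b)
    {
        return (b >> 4, b & 0x0F);
    }
}
```

For b = 35 (00100011), both methods return (2, 3).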

There are several ways to convert a byte into a binary string in step 1:

1. The simplest is Convert.ToString(b, 2). The result is not zero-padded, so if it is shorter than 8 characters, pad the high-order positions with '0'.

2. You can AND the byte with 1, 2, 4, 8, ... 128 in turn, which yields the bits from low to high.

3. You can AND the byte with 128 to test the highest bit, shift the byte left by one, and repeat, which yields the bits from high to low.

The first method produces a large number of string objects. There is little difference between methods 2 and 3; I chose method 3 simply by feel. The code is as follows:

public static char[] ByteToBinString(byte b)
{
    var result = new char[8];
    for (int i = 0; i < 8; i++)
    {
        // Test the highest bit, then shift the next bit into its place.
        var temp = b & 128;
        result[i] = temp == 0 ? '0' : '1';
        b = (byte)(b << 1);
    }
    return result;
}

To convert a byte[] into a binary string, you can do this:

public string BitReader(byte[] data)
{
    var binString = new StringBuilder(data.Length * 8);
    for (int i = 0; i < data.Length; i++)
    {
        binString.Append(ByteToBinString(data[i]));
    }
    return binString.ToString();
}

In this way, once the byte[] data is obtained, it can be converted to a binary string and kept. Data can then be read from that string given a bit offset and a bit length, and the extracted substring can be converted to bool, Int16, Int32 and so on. Based on this idea we can write a BitReader class that stores the binary string in a StringBuilder and provides Read methods for reading from it. To handle the data as a stream, the class also keeps a Position that records the current offset. Read methods that start from the current Position advance it: for example, ReadInt16 reads 16 bits from the current Position, converts them to an Int16, and moves Position forward by 16 bits. By contrast, overloads that take an explicit starting offset leave Position unchanged. The BitReader class code is as follows:

public class BitReader
{
    public readonly StringBuilder BinString;
    public int Position { get; set; }

    public BitReader(byte[] data)
    {
        BinString = new StringBuilder(data.Length * 8);
        for (int i = 0; i < data.Length; i++)
        {
            BinString.Append(ByteToBinString(data[i]));
        }
        Position = 0;
    }

    // Reads 8 bits at the given offset; Position is not moved.
    public byte ReadByte(int offset)
    {
        var bin = BinString.ToString(offset, 8);
        return Convert.ToByte(bin, 2);
    }

    // Reads 8 bits at the current Position and advances it.
    public byte ReadByte()
    {
        var result = ReadByte(Position);
        Position += 8;
        return result;
    }

    // Reads bitLength bits at the given offset; Position is not moved.
    public int ReadInt(int offset, int bitLength)
    {
        var bin = BinString.ToString(offset, bitLength);
        return Convert.ToInt32(bin, 2);
    }

    // Reads bitLength bits at the current Position and advances it.
    public int ReadInt(int bitLength)
    {
        var result = ReadInt(Position, bitLength);
        Position += bitLength;
        return result;
    }

    public static char[] ByteToBinString(byte b)
    {
        var result = new char[8];
        for (int i = 0; i < 8; i++)
        {
            var temp = b & 128;
            result[i] = temp == 0 ? '0' : '1';
            b = (byte)(b << 1);
        }
        return result;
    }
}

With BitReader you can read data from byte[] buff = {35, 12}; at bit granularity:

// The binary string is 0010001100001100.
var reader = new BitReader(buff);

// Read 4 bits from the current Position as an int and advance Position by 4.
// Result is 2, current Position = 4.
var num1 = reader.ReadInt(4);

// Read 6 bits starting at offset 5 as an int; Position is not moved.
// Result is 24 (011000), current Position = 4.
var num2 = reader.ReadInt(5, 6);

// Read 1 bit from the current Position as a bool and advance Position by 1.
// Result is false, current Position = 5.
var b = reader.ReadBool();
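The usage above calls ReadBool, and the Decode methods later in the article rely on a Remain property; neither appears in the BitReader listing. A minimal sketch of how they might look, assuming Remain means "bits left after Position", is shown here on a cut-down stand-in class. BitReaderSketch and its private field are hypothetical names, and the real implementations in the author's sample code may differ.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for BitReader, reduced to what this sketch needs.
public class BitReaderSketch
{
    private readonly StringBuilder _bits;
    public int Position { get; set; }

    public BitReaderSketch(byte[] data)
    {
        _bits = new StringBuilder(data.Length * 8);
        foreach (byte b in data)
            _bits.Append(Convert.ToString(b, 2).PadLeft(8, '0'));
    }

    // Assumed meaning of Remain: number of bits between Position and the end.
    public int Remain => _bits.Length - Position;

    // Reads one bit at the given offset; Position is not moved.
    public bool ReadBool(int offset) => _bits[offset] == '1';

    // Reads one bit at the current Position and advances it by one.
    public bool ReadBool()
    {
        bool result = ReadBool(Position);
        Position += 1;
        return result;
    }

    // Same contract as the article's ReadInt overloads.
    public int ReadInt(int offset, int bitLength) =>
        Convert.ToInt32(_bits.ToString(offset, bitLength), 2);

    public int ReadInt(int bitLength)
    {
        int result = ReadInt(Position, bitLength);
        Position += bitLength;
        return result;
    }
}
```

With the buffer {35, 12}, reading 4 bits and then one bool leaves Position at 5 and Remain at 11.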

0x02 Writing bit stream data

Writing data to a bit stream is the reverse process. We use a BitWriter class that keeps the binary string in a StringBuilder. When writing, you pass in the value and the number of bits it should occupy. When writing is finished, the binary string held in the StringBuilder is cut into 8-bit groups and returned as a byte[]. The core part of BitWriter is as follows:

public class BitWriter
{
    public readonly StringBuilder BinString;

    public BitWriter()
    {
        BinString = new StringBuilder();
    }

    public BitWriter(int bitLength)
    {
        // Reserve room for the padding that aligns the string to 8 bits.
        var add = 8 - bitLength % 8;
        BinString = new StringBuilder(bitLength + add);
    }

    public void WriteByte(byte b, int bitLength = 8)
    {
        var bin = Convert.ToString(b, 2);
        AppendBinString(bin, bitLength);
    }

    public void WriteInt(int i, int bitLength)
    {
        var bin = Convert.ToString(i, 2);
        AppendBinString(bin, bitLength);
    }

    public void WriteChar7(char c)
    {
        var b = Convert.ToByte(c);
        var bin = Convert.ToString(b, 2);
        AppendBinString(bin, 7);
    }

    public byte[] GetBytes()
    {
        Check8();
        var len = BinString.Length / 8;
        var result = new byte[len];

        for (int i = 0; i < len; i++)
        {
            var bits = BinString.ToString(i * 8, 8);
            result[i] = Convert.ToByte(bits, 2);
        }

        return result;
    }

    public string GetBinString()
    {
        Check8();
        return BinString.ToString();
    }

    // Left-pads bin with '0' to bitLength characters and appends it.
    private void AppendBinString(string bin, int bitLength)
    {
        if (bin.Length > bitLength)
            throw new Exception("len is too short");
        var add = bitLength - bin.Length;
        for (int i = 0; i < add; i++)
        {
            BinString.Append('0');
        }
        BinString.Append(bin);
    }

    // Pads with '0' up to the next multiple of 8 bits.
    // The outer % 8 avoids appending a whole extra byte when already aligned.
    private void Check8()
    {
        var add = (8 - BinString.Length % 8) % 8;
        for (int i = 0; i < add; i++)
        {
            BinString.Append("0");
        }
    }
}

The following is a simple example:

var writer = new BitWriter();
writer.WriteInt(12, 5);          // write 12 in 5 bits; the binary string is 01100
writer.WriteInt(8, 16);          // write 8 in 16 bits; the binary string is now 011000000000000001000
var result = writer.GetBytes();  // after 8-bit alignment: 011000000000000001000000
                                 // the returned result is [96, 0, 64]
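For comparison, the same write-then-align behaviour can be had without an intermediate string by accumulating bits with shifts. The sketch below is an alternative technique, not the article's BitWriter: ShiftBitWriter and its fields are hypothetical names, and it only accepts non-negative values that fit in the requested width.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical string-free writer; bits are packed most significant bit first.
public class ShiftBitWriter
{
    private readonly List<byte> _bytes = new List<byte>();
    private int _current;   // bits collected for the byte in progress
    private int _filled;    // how many bits of _current are in use (0..7)

    // Appends the low bitLength bits of value, most significant bit first.
    public void WriteInt(int value, int bitLength)
    {
        if (bitLength < 1 || bitLength > 31)
            throw new ArgumentOutOfRangeException(nameof(bitLength));
        if (value < 0 || (value >> bitLength) != 0)
            throw new ArgumentException("value does not fit in bitLength bits");

        for (int i = bitLength - 1; i >= 0; i--)
        {
            int bit = (value >> i) & 1;
            _current = (_current << 1) | bit;
            _filled++;
            if (_filled == 8)
            {
                // A full byte has been collected; store it and start over.
                _bytes.Add((byte)_current);
                _current = 0;
                _filled = 0;
            }
        }
    }

    // Returns the bytes written so far, zero-padding the last partial byte.
    public byte[] GetBytes()
    {
        var result = new List<byte>(_bytes);
        if (_filled > 0)
            result.Add((byte)(_current << (8 - _filled)));
        return result.ToArray();
    }
}
```

Writing 12 in 5 bits followed by 8 in 16 bits and calling GetBytes yields the bytes 96, 0, 64, the same aligned bit pattern as in the example above.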

0x03 7-bit encoding

ASCII characters are normally stored in 8 bits each, but the standard set only uses 7; the highest bit is always 0. For an English text we can therefore re-encode each character in 7 bits without losing information. The encoding process is to take the characters of the text in sequence, write each one to a BitWriter using 7 bits, and finally obtain the newly encoded byte[]. To be able to read the data back correctly, we stipulate that an 8-bit value of 2 marks the start of the data, and the 16 bits after it hold the number of characters that follow. The code is as follows:

public byte[] Encode(string text)
{
    // 7 bits per character plus a 24-bit header (8-bit marker + 16-bit count).
    var len = text.Length * 7 + 24;
    var writer = new BitWriter(len);
    writer.WriteByte(2);
    writer.WriteInt(text.Length, 16);
    for (int i = 0; i < text.Length; i++)
    {
        var b = Convert.ToByte(text[i]);
        writer.WriteByte(b, 7);
    }
    return writer.GetBytes();
}

To read the data back, we first look for the start marker, then read the character count, and then read that many characters in turn. The code is as follows:

public string Decode(byte[] data)
{
    var reader = new BitReader(data);
    // Skip bytes until the start marker (2) is found.
    while (reader.Remain > 8)
    {
        var start = reader.ReadByte();
        if (start == 2)
            break;
    }
    var len = reader.ReadInt(16);
    var result = new StringBuilder(len);
    for (int i = 0; i < len; i++)
    {
        var b = reader.ReadInt(7);
        var ch = Convert.ToChar(b);
        result.Append(ch);
    }
    return result.ToString();
}

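Because of the data header, the encoded data is actually longer than the original when only a few characters are encoded: n characters take 24 + 7n bits instead of 8n, so the two break even at 24 characters.

Beyond that point, the more characters there are, the greater the saving, approaching one eighth of the original size.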
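The size trade-off can be checked with a few lines of arithmetic. This is an illustrative sketch only: SevenBitSize and its members are hypothetical names, and the figures are in bits before the final padding to a whole byte.

```csharp
using System;

// Hypothetical helper for comparing sizes in bits.
public static class SevenBitSize
{
    // Header used by the encoding: 8-bit start marker + 16-bit character count.
    public const int HeaderBits = 24;

    // Bits used by the 7-bit encoding of charCount characters, before byte padding.
    public static int EncodedBits(int charCount) => HeaderBits + charCount * 7;

    // Bits used by the plain one-byte-per-character form.
    public static int PlainBits(int charCount) => charCount * 8;
}
```

For 10 characters this gives 94 bits encoded against 80 plain, for 24 characters both are 192, and for 1000 characters it is 7024 against 8000.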

 

0x04 6-bit character encoding

To shrink the data further, we can accept losing some information, such as letter case, and reduce the number of bits per character. 26 letters + 10 digits + some symbols fit into 6 bits (64 values). This encoding no longer matches the ASCII mapping, so we have to define our own mapping: for example, map 0 through 9 to the ten digits, or use a fully custom dictionary, that is, the legendary codebook. Anyone who watches domestic spy dramas will know the idea. A codebook is a dictionary that remaps characters, and the plaintext is recovered by looking entries up in it. This is a simple monoalphabetic substitution with little cryptographic strength; given enough data samples, it is easy to crack with frequency statistics. Next we re-encode text in 6 bits using a custom dictionary.

Encoding process:

Write the message header in the same way as for the 7-bit encoding. Then take each character of the text in turn, look up its index in the dictionary, and write that index to the BitWriter using 6 bits.

public byte[] Encode(string text)
{
    // Case is discarded: the dictionary only holds uppercase letters.
    text = text.ToUpper();
    var len = text.Length * 6 + 24;
    var writer = new BitWriter(len);
    writer.WriteByte(2);
    writer.WriteInt(text.Length, 16);
    for (int i = 0; i < text.Length; i++)
    {
        var index = GetChar6Index(text[i]);
        writer.WriteInt(index, 6);
    }
    return writer.GetBytes();
}

private int GetChar6Index(char c)
{
    for (int i = 0; i < 64; i++)
    {
        if (Dict.Custom[i] == c)
            return i;
    }
    return 10; // character not in the dictionary: return the index of '*'
}

Decoding process:

Decoding is just as simple. Find the message header, read the data 6 bits at a time, and look up the corresponding characters in the dictionary:

public string Decode(byte[] data)
{
    var reader = new BitReader(data);
    // Skip bytes until the start marker (2) is found.
    while (reader.Remain > 8)
    {
        var start = reader.ReadByte();
        if (start == 2)
            break;
    }
    var len = reader.ReadInt(16);
    var result = new StringBuilder(len);
    for (int i = 0; i < len; i++)
    {
        var index = reader.ReadInt(6);
        var ch = Dict.Custom[index];
        result.Append(ch);
    }
    return result.ToString();
}

After a piece of text is encoded with the 6-bit custom dictionary, the data is shorter, but letter case and line breaks are lost.
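The code above refers to Dict.Custom, which is not shown. One possible 64-entry dictionary is sketched below; it is an assumption made for illustration, arranged so that the digits sit at indexes 0 to 9 and '*' sits at index 10, matching the fallback in GetChar6Index. The class name DictSketch, the IndexOf helper, and the particular choice of symbols are all hypothetical.

```csharp
using System;

// Hypothetical dictionary; the article's Dict.Custom may differ.
public static class DictSketch
{
    // 10 digits + '*' + 26 uppercase letters + 27 symbols = 64 entries.
    public static readonly char[] Custom =
        ("0123456789" + "*" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
         " .,;:!?'^-()[]/+=<>@#$%&_~|").ToCharArray();

    // Index of c in the dictionary (case-insensitive),
    // or 10 (the index of '*') when c has no entry.
    public static int IndexOf(char c)
    {
        int index = Array.IndexOf(Custom, char.ToUpperInvariant(c));
        return index >= 0 ? index : 10;
    }
}
```

With this layout a line feed has no entry and is mapped to '*', which is how line breaks get lost.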

From the encryption point of view, you can prepare N custom dictionaries (say 10) and use M bits (for example, 4 bits) in the message header to indicate which dictionary was used. A dictionary is then picked at random for each encoding, and the decoder selects the matching dictionary from the 4-bit value. Changing the dictionaries regularly makes cracking harder. You can try this yourself.
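One way this multi-dictionary idea could look is sketched below. Everything in it is an assumption made for illustration: the dictionaries are simple rotations of one hypothetical 64-symbol alphabet, the header layout is a 4-bit dictionary id followed by a 16-bit length (the start marker is omitted), and the output is kept as a binary string instead of a byte[] for brevity. The caller supplies the dictionary id, so a random pick would happen outside Encode.

```csharp
using System;
using System.Text;

// Hypothetical sketch of 6-bit encoding with a selectable dictionary.
public static class MultiDictSketch
{
    // Hypothetical base alphabet of exactly 64 symbols.
    private const string Base =
        "0123456789*ABCDEFGHIJKLMNOPQRSTUVWXYZ .,;:!?'^-()[]/+=<>@#$%&_~|";

    // Dictionary number id is the base alphabet rotated left by id * 5 places
    // (an arbitrary choice for illustration).
    public static string GetDict(int id)
    {
        int shift = (id * 5) % Base.Length;
        return Base.Substring(shift) + Base.Substring(0, shift);
    }

    // Binary string of value, left-padded with '0' to width characters.
    private static string Bits(int value, int width) =>
        Convert.ToString(value, 2).PadLeft(width, '0');

    // Assumed layout: 4-bit dictionary id, 16-bit length, then 6 bits per character.
    public static string Encode(string text, int dictId)
    {
        if (dictId < 0 || dictId > 15)
            throw new ArgumentOutOfRangeException(nameof(dictId));
        if (text.Length > 65535)
            throw new ArgumentException("text too long for a 16-bit length");

        string dict = GetDict(dictId);
        text = text.ToUpperInvariant();
        var sb = new StringBuilder(20 + text.Length * 6);
        sb.Append(Bits(dictId, 4)).Append(Bits(text.Length, 16));
        foreach (char c in text)
        {
            int index = dict.IndexOf(c);
            if (index < 0) index = dict.IndexOf('*');   // unknown characters become '*'
            sb.Append(Bits(index, 6));
        }
        return sb.ToString();
    }

    // Reads the dictionary id from the header and maps each 6-bit index back.
    public static string Decode(string bits)
    {
        string dict = GetDict(Convert.ToInt32(bits.Substring(0, 4), 2));
        int len = Convert.ToInt32(bits.Substring(4, 16), 2);
        var sb = new StringBuilder(len);
        for (int i = 0; i < len; i++)
            sb.Append(dict[Convert.ToInt32(bits.Substring(20 + i * 6, 6), 2)]);
        return sb.ToString();
    }
}
```

The same text encoded with two different dictionary ids produces different bit strings, and both decode back to the uppercase text.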

0x05 Closing thoughts

The above is my experience with processing bit stream data. It is simply the approach I could think of that met my needs. If you know a more efficient or more sensible method, I would welcome your advice. The two encoding and decoding examples were written for fun and are not meant for practical use. After all, bandwidth is plentiful these days, and there are plenty of mature and reliable methods for data encryption.

Sample Code: https://github.com/durow/TestArea/tree/master/BitStream
