Information encoding: primitive integers


First, let's consider how simple data types such as int, long, char, and String are sent and received through sockets. As we saw in the previous chapter, bytes can be written to an OutputStream instance (one associated with a Socket) or encapsulated in a DatagramPacket instance (which is then sent by a DatagramSocket). However, the only data types these operations can handle are bytes and byte arrays. Because Java is a strongly typed language, other types (such as int and String) must be explicitly converted to byte arrays. Fortunately, Java has built-in facilities that help with these conversions. In the TCPEchoClient.java sample program in Chapter 2, we saw the getBytes() method of the String class, which is the standard way to convert the characters of a String instance to bytes. Before going into the details of such conversions, let's look at how the most basic data types are represented.

As we have seen, TCP and UDP sockets let us send and receive sequences (arrays) of bytes, that is, integers in the range 0 to 255. Using this ability, we can encode primitive integers with larger values. However, the sender and receiver must first agree on several things. The first is the size, in bytes, of each integer to be transmitted. For example, a Java int is represented with 32 bits, so any int variable or constant can be transmitted in 4 bytes. A short is 16 bits, so it needs only 2 bytes, and a 64-bit long needs 8 bytes. Next, consider how to encode a sequence of four integers (a byte, a short, an int, and a long) to be transmitted from sender to receiver in that order.
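The sizes quoted above can be checked directly. The following is a minimal sketch, assuming Java 8 or later (for the `BYTES` constants); the class name `IntegerSizes` and the helper `totalBytes()` are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original sample programs.
class IntegerSizes {

    // Bytes needed to send one byte, one short, one int and one long back to back.
    static int totalBytes() {
        return Byte.BYTES + Short.BYTES + Integer.BYTES + Long.BYTES;
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.BYTES);     // 1
        System.out.println("short: " + Short.BYTES);    // 2
        System.out.println("int:   " + Integer.BYTES);  // 4
        System.out.println("long:  " + Long.BYTES);     // 8
        System.out.println("total: " + totalBytes());   // 15

        // Strings are converted with getBytes(); naming the charset explicitly
        // avoids depending on the platform default encoding.
        byte[] text = "Hi".getBytes(StandardCharsets.US_ASCII);
        System.out.println("\"Hi\" as bytes: " + text[0] + " " + text[1]); // 72 105
    }
}
```

Passing an explicit charset to `getBytes()` is a choice made for this sketch; the original sample program may rely on the default encoding.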
We need a total of 15 bytes: the first byte holds the byte value, the next two bytes hold the short, the next four bytes hold the int, and the last eight bytes hold the long. Are we ready to go? Not quite. For types that need more than one byte, we must also know the order in which the bytes are sent. There are two obvious choices: start from the right side of the integer, with the least significant byte, which is little-endian order; or start from the left, with the most significant byte, which is big-endian order. (Fortunately, the order of the bits within a byte is handled in a standard way.)

Consider the long value 123456787654321L. Its 64-bit representation, in hexadecimal, is 0x0000704885F926B1. If we transmit this integer in big-endian order, the sequence of decimal byte values, in order of transmission, is:

    0 0 112 72 133 249 38 177

If we transmit it in little-endian order, the sequence is:

    177 38 249 133 72 112 0 0

The key point is that for any multi-byte integer, the sender and receiver must agree on whether big-endian or little-endian order is used. If the sender uses little-endian order to send the integer above and the receiver expects big-endian order, the receiver gets the wrong value: it interprets the 8-byte sequence as 12765164544669515776 (reading the bytes as an unsigned 64-bit number).

The last detail on which the sender and receiver must agree is whether the transmitted values are signed or unsigned. The four primitive integer types in Java are signed, and their values are stored in two's-complement form, the usual representation for signed numbers. With $k$-bit numbers, the two's-complement representation of the negative integer $-n$, where $1 \le n \le 2^{k-1}$, is the binary value of $2^k - n$. A non-negative integer $p$, where $0 \le p \le 2^{k-1} - 1$, is simply represented by the $k$-bit binary value of $p$.
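The byte-order and two's-complement points above can be reproduced in a few lines. This is a sketch only, assuming Java 8 or later; it uses `java.nio.ByteBuffer` as a convenient way to pick the byte order, and the class name `ByteOrderDemo` and its helper methods are hypothetical names introduced for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of the original sample programs.
class ByteOrderDemo {

    // Serialize a long into 8 bytes using the given byte order.
    static byte[] toBytes(long value, ByteOrder order) {
        return ByteBuffer.allocate(Long.BYTES).order(order).putLong(value).array();
    }

    // Interpret 8 bytes as a long using the given byte order.
    static long fromBytes(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }

    // Render each byte as an unsigned decimal number (0 to 255).
    static String unsignedDecimals(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(b & 0xFF); // mask to avoid sign extension
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long value = 123456787654321L; // 0x0000704885F926B1

        byte[] big = toBytes(value, ByteOrder.BIG_ENDIAN);
        byte[] little = toBytes(value, ByteOrder.LITTLE_ENDIAN);
        System.out.println("big-endian:    " + unsignedDecimals(big));    // 0 0 112 72 133 249 38 177
        System.out.println("little-endian: " + unsignedDecimals(little)); // 177 38 249 133 72 112 0 0

        // Sender used little-endian, receiver assumes big-endian: wrong value.
        long misread = fromBytes(little, ByteOrder.BIG_ENDIAN);
        System.out.println("misread (unsigned): " + Long.toUnsignedString(misread)); // 12765164544669515776
        System.out.println("misread (signed):   " + misread); // negative, because the top bit is set

        // Two's complement with k = 8: the bit pattern 245 is 2^8 - 11, i.e. -11 as a signed byte.
        byte pattern = (byte) 245;
        System.out.println("as signed byte: " + pattern);          // -11
        System.out.println("as unsigned:    " + (pattern & 0xFF)); // 245
    }
}
```

The misread value prints as a negative number when treated as a Java long, which is the signed/unsigned distinction described above seen on the same 64 bits.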
Therefore, $k$ bits in two's-complement form can represent values in the range $-2^{k-1}$ to $2^{k-1} - 1$. Note that the most significant bit (msb) indicates whether the number is non-negative (msb = 0) or negative (msb = 1). With unsigned encoding, on the other hand, $k$ bits directly represent the values from 0 to $2^k - 1$. For example, the 32-bit value 0xffffffff (all bits set) represents -1 when interpreted as a signed two's-complement integer and 4294967295 when interpreted as unsigned. Java has no unsigned versions of these integer types, so encoding and decoding unsigned numbers in Java takes some extra work. Assume for now that we are dealing only with signed integers.

So how do we get the correct values into the byte array of a message? To show exactly what has to be done, we first do the encoding explicitly with "bit-diddling" (shifting and masking). The sample program BruteForceCoding.java has a method encodeIntBigEndian() that can encode a value of any of the primitive integer types. Its parameters are the byte array in which to place the value, the value to encode (passed as a long, which is the largest type and can hold any of the others), the offset in the array at which the value should start, and the number of bytes to write. If we encode at the sender, we must decode at the receiver, so the BruteForceCoding class also provides decodeIntBigEndian(), which decodes a subrange of a byte array into a Java long.

BruteForceCoding.java

```java
public class BruteForceCoding {
    private static byte byteVal = 101;             // one hundred and one
    private static short shortVal = 10001;         // ten thousand and one
    private static int intVal = 100000001;         // one hundred million and one
    private static long longVal = 1000000000001L;  // one trillion and one

    // Size of each primitive integer type, in bytes
    private final static int BSIZE = Byte.SIZE / Byte.SIZE;
    private final static int SSIZE = Short.SIZE / Byte.SIZE;
    private final static int ISIZE = Integer.SIZE / Byte.SIZE;
    private final static int LSIZE = Long.SIZE / Byte.SIZE;

    private final static int BYTEMASK = 0xFF; // 8 bits

    public static String byteArrayToDecimalString(byte[] bArray) {
        StringBuilder rtn = new StringBuilder();
        for (byte b : bArray) {
            rtn.append(b & BYTEMASK).append(" ");
        }
        return rtn.toString();
    }

    // Warning: Untested preconditions (e.g., 0 <= size <= 8)
    public static int encodeIntBigEndian(byte[] dst, long val, int offset, int size) {
        for (int i = 0; i < size; i++) {
            // Shift the wanted byte into the low 8 bits, then truncate to byte
            dst[offset++] = (byte) (val >> ((size - i - 1) * Byte.SIZE));
        }
        return offset;
    }

    // Warning: Untested preconditions (e.g., 0 <= size <= 8)
    public static long decodeIntBigEndian(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            // Make room for the next byte, then OR it in without sign extension
            rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
        }
        return rtn;
    }

    public static void main(String[] args) {
        byte[] message = new byte[BSIZE + SSIZE + ISIZE + LSIZE];
        // Encode the fields in the target byte array
        int offset = encodeIntBigEndian(message, byteVal, 0, BSIZE);
        offset = encodeIntBigEndian(message, shortVal, offset, SSIZE);
        offset = encodeIntBigEndian(message, intVal, offset, ISIZE);
        encodeIntBigEndian(message, longVal, offset, LSIZE);
        System.out.println("Encoded message: " + byteArrayToDecimalString(message));

        // Decode several fields
        long value = decodeIntBigEndian(message, BSIZE, SSIZE);
        System.out.println("Decoded short = " + value);
        value = decodeIntBigEndian(message, BSIZE + SSIZE + ISIZE, LSIZE);
        System.out.println("Decoded long = " + value);

        // Demonstrate dangers of conversion
        offset = 4;
        value = decodeIntBigEndian(message, offset, BSIZE);
        System.out.println("Decoded value (offset " + offset + ", size " + BSIZE + ") = "
            + value);
        byte bVal = (byte) decodeIntBigEndian(message, offset, BSIZE);
        System.out.println("Same value as byte = " + bVal);
    }
}
```

BruteForceCoding.java

1. Data items to encode: the fields byteVal, shortVal, intVal, and longVal.
2. Number of bytes in each Java primitive integer type: the constants BSIZE, SSIZE, ISIZE, and LSIZE.
3. byteArrayToDecimalString(): This method prints each byte of the given array as an unsigned decimal number. BYTEMASK prevents sign extension when a byte value is converted to int.
4. encodeIntBigEndian(): The right-hand side of the assignment first shifts the value to the right so that the byte we want is in the low-order 8 bits. The shifted value is then cast to byte, which discards everything except the low-order 8 bits, and stored at the appropriate location in the array. This is repeated once for each of the size bytes of the value. The method returns the new offset into the array after the value has been stored, so we do not have to track the offset separately.
5. decodeIntBigEndian(): This method iterates over size bytes of the array. On each iteration it shifts the accumulated long value left by one byte and ORs in the next byte.
6. main():
   - Prepare the array that will receive the sequence of integers.
   - Encode each item: the byte, short, int, and long values are encoded and stored in the byte array in the order described above.
   - Print the content of the encoded array.
   - Decode some fields of the encoded array: each decoded value equals the original value before encoding.
   - Conversion problem: the byte at offset 4 in the array has the decimal value 245.
However, when that byte is read as a signed byte, its value is -11 (recall the two's-complement representation of signed integers). If we place the return value directly into a long, it simply becomes the last byte of that long, giving the value 245. If we place the return value into a byte, the value is -11. Which value is correct depends on your application. If you expect a signed value from decoding N bytes, you must place the (long) result in a primitive integer type that occupies exactly N bytes. If you expect an unsigned value, you must place the result in a longer primitive integer type, one that occupies at least N + 1 bytes.

Note that we might want to add some precondition checks at the beginning of encodeIntBigEndian() and decodeIntBigEndian(), such as 0 ≤ size ≤ 8 and dst ≠ null. Can you name any others?

Running the program prints the byte values in decimal: the encoded message is 101 39 17 5 245 225 1 0 0 0 232 212 165 16 1, the decoded short is 10001, the decoded long is 1000000000001, the value decoded at offset 4 with size 1 is 245, and the same value stored in a byte is -11.

As you can see, the brute-force method requires the programmer to do quite a bit of work: computing and naming the offset and size of each value, and calling the encoding routine with the appropriate arguments. It would be even worse if encodeIntBigEndian() were not factored out as a separate method. For these reasons the brute-force approach is not recommended, and Java provides built-in mechanisms that are easier to use. It is worth noting, however, that the brute-force approach has its advantages: in addition to the standard Java integer sizes, encodeIntBigEndian() works with any integer size from 1 to 8 bytes. For example, you can encode a 7-byte integer if you like.

A relatively easy way to construct the message in this example is to use the DataOutputStream and ByteArrayOutputStream classes.
The DataOutputStream class lets you write primitive types, such as the integers above, to a stream. It provides the methods writeByte(), writeShort(), writeInt(), and writeLong(), which write the integer to the stream in big-endian order as a two's-complement value of the appropriate size. The ByteArrayOutputStream class collects the bytes written to the stream and converts them to a byte array. The code to build our message with these two classes looks like this:

```java
// byteVal, shortVal, intVal and longVal are the fields from BruteForceCoding.java
ByteArrayOutputStream buf = new ByteArrayOutputStream();
DataOutputStream out = new DataOutputStream(buf);
out.writeByte(byteVal);
out.writeShort(shortVal);
out.writeInt(intVal);
out.writeLong(longVal);
out.flush();
byte[] msg = buf.toByteArray();
```

You may want to run this code to convince yourself that it produces the same bytes as BruteForceCoding.java.

So much for the sending side. How does the receiver recover the transmitted values? As you might expect, Java provides input counterparts of the output classes: DataInputStream and ByteArrayInputStream. We will use these two classes as an example when we discuss how to parse incoming messages. Also, in Chapter 5 we will see another way to convert primitive types to byte sequences, using the ByteBuffer class.

Finally, everything in this section also applies to the BigInteger class, which supports arbitrarily large integers. As with the primitive integer types, the sender and receiver must agree on how much space (how many bytes) is used to represent a value. However, this conflicts with the point of using BigInteger, which can be arbitrarily large. One solution is to use length-based framing, which we will see in Section 3.3.
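As a preview of the receiving side, the sketch below round-trips the four sample values through DataOutputStream and DataInputStream, and then shows one possible way to send a BigInteger with a length prefix. The class name `StreamCodingDemo`, its method names, and the 4-byte length prefix are assumptions made for this illustration; they are not the framing scheme of Section 3.3.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.math.BigInteger;

// Hypothetical demo class; not part of the original sample programs.
class StreamCodingDemo {

    // Sender: write the four integers in big-endian two's-complement form.
    static byte[] encode(byte b, short s, int i, long l) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(b);
            out.writeShort(s);
            out.writeInt(i);
            out.writeLong(l);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
    }

    // Receiver: read the fields back in the same order and with the same sizes.
    static long[] decode(byte[] msg) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(msg));
            long b = in.readByte();
            long s = in.readShort();
            long i = in.readInt();
            long l = in.readLong();
            return new long[] {b, s, i, l};
        } catch (IOException e) {
            throw new UncheckedIOException(e); // e.g. message shorter than expected
        }
    }

    // Assumed framing for this sketch: 4-byte length, then the two's-complement bytes.
    static byte[] encodeBig(BigInteger value) {
        try {
            byte[] body = value.toByteArray(); // big-endian two's complement
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(body.length);
            out.write(body);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static BigInteger decodeBig(byte[] msg) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(msg));
            byte[] body = new byte[in.readInt()];
            in.readFully(body);
            return new BigInteger(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] msg = encode((byte) 101, (short) 10001, 100000001, 1000000000001L);
        System.out.println("Message length: " + msg.length); // 15
        long[] fields = decode(msg);
        System.out.println("Decoded: " + fields[0] + " " + fields[1] + " "
            + fields[2] + " " + fields[3]);

        BigInteger big = new BigInteger("123456789012345678901234567890");
        System.out.println("BigInteger round trip ok: " + decodeBig(encodeBig(big)).equals(big));
    }
}
```

The checked IOException is wrapped in UncheckedIOException only to keep the calls short; code that reads from a real socket stream would normally let IOException propagate.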
