The storage difference between character streams and byte streams in Java

Source: Internet
Author: User

This article compares how character streams and byte streams store data in Java, using several common data types as examples:
int a = 5;
boolean b = true;
char c = 'G';
String d = "你好";   // two Chinese characters meaning "hello"
 
Print the data of the above types to a file using a character stream:
PrintWriter dos = new PrintWriter(new BufferedWriter(new FileWriter("C://buffertest.txt")));
dos.print(a);
dos.print(b);
dos.print(c);
dos.print(d);
dos.close();   // flush the buffer and close the file
 
The result is as follows:
a is 5
b is true
c is G
d is 你好
The character stream output is exactly the text of the values we wrote.
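
The character-stream behavior can be reproduced without touching the disk. The sketch below is an illustration only: the class name CharStreamDemo and the method render are hypothetical, a StringWriter stands in for the FileWriter, and d is assumed to be the two Chinese characters U+4F60 U+597D (你好), written as Unicode escapes so the source file encoding does not matter.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class CharStreamDemo {
    // Writes the four sample values through a PrintWriter and returns the text produced.
    static String render() {
        int a = 5;
        boolean b = true;
        char c = 'G';
        String d = "\u4f60\u597d"; // assumed value of d: the characters 你好
        StringWriter sink = new StringWriter(); // in-memory stand-in for the FileWriter
        try (PrintWriter out = new PrintWriter(sink)) {
            out.print(a); // writes the text "5", not a 4-byte binary int
            out.print(b); // writes the text "true"
            out.print(c);
            out.print(d);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

Because print is used rather than println, the values are written back to back on one line with no separators.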
 
Let's look at the byte stream.
DataOutputStream dos = new DataOutputStream(new FileOutputStream("C://streamtest.txt"));
dos.writeInt(a);
dos.writeBoolean(b);
dos.writeChar(c);
dos.writeUTF(d);
dos.writeChars(d);
dos.writeBytes(d);
dos.close();   // flush and close the file
 
The result is a binary file. Open it in a hexadecimal editor.
a is 00 00 00 05; an int is four bytes.
b is 01; a boolean is one byte.
c is 00 47; a char is two bytes.
d is written to the file by three different methods.
The first (writeUTF) gives 00 06 E4 BD A0 E5 A5 BD. The leading 00 06 is added by writeUTF and is the number of bytes that follow. The next six bytes are the UTF-8 encoding of "你好", 3 bytes for each Chinese character. (writeUTF uses Java's modified UTF-8, which is identical to standard UTF-8 for these characters.)
The second (writeChars) gives 4F 60 59 7D. This is the big-endian UTF-16 encoding of "你好", 2 bytes for each Chinese character.
The third (writeBytes) gives 60 7D, which is only the low byte of each of the two characters 4F 60 and 59 7D; the high bytes are discarded.
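
The byte values above can be checked without a hex editor by writing to memory and printing the bytes. This is a minimal sketch, not the article's original program: the class DataStreamDemo, the Step interface, and the helpers dump and hex are hypothetical names, a ByteArrayOutputStream replaces the FileOutputStream, and d is assumed to be the characters U+4F60 U+597D (你好).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class DataStreamDemo {
    // One write operation against a DataOutputStream.
    interface Step {
        void write(DataOutputStream out) throws IOException;
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    // Runs one write step against an in-memory stream and returns the bytes it produced.
    static String dump(Step step) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            step.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hex(sink.toByteArray());
    }

    public static void main(String[] args) {
        String d = "\u4f60\u597d"; // assumed value of d: the characters 你好
        System.out.println(dump(o -> o.writeInt(5)));        // 00 00 00 05
        System.out.println(dump(o -> o.writeBoolean(true))); // 01
        System.out.println(dump(o -> o.writeChar('G')));     // 00 47
        System.out.println(dump(o -> o.writeUTF(d)));        // 00 06 E4 BD A0 E5 A5 BD
        System.out.println(dump(o -> o.writeChars(d)));      // 4F 60 59 7D
        System.out.println(dump(o -> o.writeBytes(d)));      // 60 7D
    }
}
```

Note that writeBytes is lossy for any character above 0xFF, so data written that way cannot be read back as the original string.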
 
Further description
When Notepad saves a file in a particular encoding, it writes a marker at the start of the file that identifies the encoding, which is how Notepad recognizes the encoding correctly when the file is opened again. If you open such a file in a hexadecimal editor, you will see this marker (the byte order mark) at the very beginning of the file. The markers are as follows:
EF BB BF       UTF-8
FF FE          UTF-16/UCS-2, little endian
FE FF          UTF-16/UCS-2, big endian
FF FE 00 00    UTF-32/UCS-4, little endian
00 00 FE FF    UTF-32/UCS-4, big endian
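
The marker table translates directly into a small lookup. The following is a sketch under stated assumptions: BomSniffer, detect, and startsWith are hypothetical names, the returned labels are just strings chosen for this example, and a file without a marker is reported as "unknown" rather than guessed.

```java
public class BomSniffer {
    // Returns the encoding implied by a leading byte order mark, or "unknown" if none matches.
    static String detect(byte[] head) {
        // Check the 4-byte marks first: FF FE 00 00 also begins with the UTF-16 LE mark FF FE.
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        return "unknown";
    }

    // True if data begins with the given unsigned byte values.
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detect(sample));
    }
}
```

One limitation of this approach: a little-endian UTF-16 file whose first character after the mark is U+0000 starts with the same four bytes as the UTF-32 little-endian mark, so the result is a heuristic rather than a guarantee.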
 
When UTF-8 stores a character with a code up to 0xFFFF, it takes 1 to 3 bytes, that is, 8 to 24 bits. (Characters with codes above 0xFFFF take 4 bytes.)
Code <= 0x007F: saved as 1 byte
0x0080 <= code <= 0x07FF: saved as 2 bytes
0x0800 <= code <= 0xFFFF: saved as 3 bytes

The GB2312 encoding of "你好" is C4 E3 BA C3, 2 bytes per character. The Unicode codes of Chinese characters are greater than 0x0800, so UTF-8 saves each Chinese character as 3 bytes.
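
The length rule can be compared against what the JDK's UTF-8 encoder actually produces. This is an illustrative sketch: Utf8Length, expectedLength, and actualLength are hypothetical names, and the 4-byte case for codes above 0xFFFF is included for completeness. Surrogate values (0xD800 to 0xDFFF) are not valid characters on their own and are outside what this sketch handles.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Length {
    // Number of bytes UTF-8 needs for one code point, following the ranges in the text.
    static int expectedLength(int codePoint) {
        if (codePoint <= 0x007F) return 1;
        if (codePoint <= 0x07FF) return 2;
        if (codePoint <= 0xFFFF) return 3;
        return 4; // supplementary characters above 0xFFFF
    }

    // Number of bytes the JDK's UTF-8 encoder produces for one code point.
    static int actualLength(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        int[] samples = {0x47, 0xE9, 0x4F60, 0x597D}; // 'G', 'é', '你', '好'
        for (int cp : samples) {
            System.out.printf("U+%04X expected=%d actual=%d%n",
                    cp, expectedLength(cp), actualLength(cp));
        }
    }
}
```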
 
Little endian: the low address stores the low byte; x86 uses this order.
Big endian: the low address stores the high byte; network byte order uses this order, as does DataOutputStream.
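
The two byte orders can be seen side by side with java.nio.ByteBuffer, which lets the caller choose the order explicitly. This is a small sketch; EndianDemo and layout are hypothetical names introduced for the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    // Lays out a 32-bit int in the given byte order and returns the bytes as hex,
    // lowest address first.
    static String layout(int value, ByteOrder order) {
        byte[] bytes = ByteBuffer.allocate(4).order(order).putInt(value).array();
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(layout(5, ByteOrder.BIG_ENDIAN));    // 00 00 00 05
        System.out.println(layout(5, ByteOrder.LITTLE_ENDIAN)); // 05 00 00 00
    }
}
```

The big-endian result matches the bytes DataOutputStream.writeInt produced for a = 5.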

This article is from IT Knowledge Network (http://www.itwis.com). Original source: http://www.itwis.com/html/java/j2se/20080428/1367.html
