Core Science Garden: A brief analysis of Java byte stream and character stream

Source: Internet
Author: User

1. What is a stream

A stream in Java is an abstraction over a sequence of bytes. Picture a water pipe in which a sequence of bytes, rather than water, flows. Like a current, a Java stream has a direction: an object from which a sequence of bytes can be read is called an input stream, and an object to which a sequence of bytes can be written is called an output stream.
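The two directions can be sketched with in-memory streams. This is a minimal illustration, not code from the original article; the class name StreamDirectionDemo and the helper copy are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class StreamDirectionDemo {

    // Hypothetical helper: moves every byte from the input stream to the output stream.
    static void copy(InputStream in, OutputStream out) {
        try {
            int b;
            while ((b = in.read()) != -1) { // pull one byte out of the source
                out.write(b);               // push it into the sink
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(in, out);
        System.out.println(out.size() + " bytes copied");
    }
}
```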

2. Byte stream

The basic unit that a byte stream processes is a single byte, so byte streams are typically used for binary data. The two most basic byte stream classes in Java are InputStream and OutputStream, which represent the basic input byte stream and output byte stream, respectively. Both are abstract classes; in practice we usually use one of the subclasses provided by the Java class library. Let's take InputStream as an example to introduce byte streams in Java.

The InputStream class defines the basic method for reading bytes from a byte stream, read, which is declared as follows:

public abstract int read() throws IOException;

This is an abstract method, so every input byte stream derived from InputStream must implement it. It reads one byte from the stream and returns it as an int in the range 0 to 255, or returns -1 if the end of the stream has been reached. Note that this method blocks until a byte is available, the end of the stream is detected, or an exception is thrown.
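The return convention can be seen in a small sketch. It uses ByteArrayInputStream so that it runs without any file; the class name ReadReturnValueDemo and the helper readAll are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

class ReadReturnValueDemo {

    // Collects every value returned by read(), including the final -1.
    static List<Integer> readAll(ByteArrayInputStream in) {
        List<Integer> values = new ArrayList<>();
        int b;
        do {
            b = in.read();   // 0..255 for a data byte, -1 at end of stream
            values.add(b);
        } while (b != -1);
        return values;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, 0x41};
        // The byte 0xFF comes back as 255, so it cannot be confused with -1.
        System.out.println(readAll(new ByteArrayInputStream(data))); // prints [255, 65, -1]
    }
}
```

Returning an int rather than a byte is what leaves room for the -1 end marker alongside all 256 possible byte values.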

In addition, byte streams do not buffer by default. For a file stream, this means that each call to read() asks the operating system for one byte, which often involves disk I/O and is therefore inefficient. Some readers may think that the overload of read that takes a byte array as its parameter can read many bytes at a time without frequent disk I/O.

So is this the case? Let's take a look at the source code for this method:


public int read(byte b[]) throws IOException {
    return read(b, 0, b.length);
}

It calls another overload of read, so let's follow that one:

public int read(byte b[], int off, int len) throws IOException {
    if (b == null) {
        throw new NullPointerException();
    } else if (off < 0 || len < 0 || len > b.length - off) {
        throw new IndexOutOfBoundsException();
    } else if (len == 0) {
        return 0;
    }

    int c = read();
    if (c == -1) {
        return -1;
    }
    b[off] = (byte) c;

    int i = 1;
    try {
        for (; i < len; i++) {
            c = read();
            if (c == -1) {
                break;
            }
            b[off + i] = (byte) c;
        }
    } catch (IOException ee) {
    }
    return i;
}

The code above shows that the default read(byte[]) in InputStream also fills the array by calling read() in a loop, so this base implementation does not use a memory buffer either. Subclasses may override it with a more efficient bulk read (FileInputStream, for example, does), but to use a memory buffer to improve read efficiency we should wrap the stream in a BufferedInputStream.
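As a rough illustration of what BufferedInputStream saves, the sketch below counts how often the underlying stream is asked for data when a caller reads one byte at a time. CountingInputStream and callsNeeded are hypothetical helpers written for this example, not JDK classes, and an in-memory source stands in for a file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferedReadDemo {

    // Hypothetical helper: counts every request that reaches the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `size` bytes one at a time and reports how many requests hit the source.
    static int callsNeeded(int size, boolean buffered) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        try {
            while (in.read() != -1) {
                // discard the byte; only the number of requests matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered requests: " + callsNeeded(10_000, false));
        System.out.println("buffered requests:   " + callsNeeded(10_000, true));
    }
}
```

With the buffer in place, the caller still invokes read() once per byte, but most of those calls are served from memory and only a handful of bulk requests reach the source.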

3. Character Stream

The basic unit that a character stream processes is the Unicode code unit (2 bytes in size), so character streams are typically used for text data. A Unicode code unit here is a 16-bit UTF-16 unit whose value ranges from 0x0000 to 0xFFFF. Most commonly used characters correspond to a single value in this range, while characters outside it are represented by a pair of code units. Java's String type represents text in memory as a sequence of these code units.
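A short sketch of the code unit idea, using only String methods; the class name CodeUnitDemo and its two helpers are made up for the example.

```java
class CodeUnitDemo {

    // Number of 16-bit code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (whole characters) in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u4E2D";                 // U+4E2D fits in one code unit
        String supplementary = "\uD83D\uDE00"; // U+1F600 needs a surrogate pair

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));                     // prints 1 1
        System.out.println(codeUnits(supplementary) + " " + codePoints(supplementary)); // prints 2 1
    }
}
```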

However, unlike storage in memory, data stored on disk is often encoded in a variety of ways. Using different encodings, the same characters will have different binary representations.
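To make the point concrete, the sketch below encodes the same character with two different encodings and prints the resulting bytes in hex. The class name EncodingBytesDemo and the helper hex are assumptions for the example.

```java
import java.nio.charset.StandardCharsets;

class EncodingBytesDemo {

    // Formats a byte array as space-separated, two-digit uppercase hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u4E2D"; // one character, U+4E2D
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));    // prints E4 B8 AD
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16BE))); // prints 4E 2D
    }
}
```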

In fact, a character stream works like this:

Output character stream: converts the sequence of characters to be written (actually a sequence of Unicode code units) into a sequence of bytes in the specified encoding, then writes those bytes to the file.

Input character stream: decodes the sequence of bytes being read, according to the specified encoding, into the corresponding sequence of characters (actually a sequence of Unicode code units) so that it can be held in memory.
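The two directions described above can be sketched with OutputStreamWriter and InputStreamReader, which sit on top of byte streams and take the encoding as a parameter. The class name CharStreamRoundTrip and the helpers encode and decode are hypothetical, and in-memory byte streams stand in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharStreamRoundTrip {

    // Output direction: characters -> bytes in the given encoding.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray(); // the writer is closed, so all bytes are present
    }

    // Input direction: bytes in the given encoding -> characters.
    static String decode(byte[] data, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) { // one code unit per call
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] stored = encode("h\u00E9llo", StandardCharsets.UTF_8);
        System.out.println(stored.length + " bytes on the byte side");
        System.out.println(decode(stored, StandardCharsets.UTF_8).length() + " code units on the character side");
    }
}
```

Decoding with a different encoding than the one used for writing would generally produce different characters, which is why the encoding has to be agreed on by both sides.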

Let's use a demo to deepen our understanding of this process. The sample code is as follows:

import java.io.FileWriter;
import java.io.IOException;

public class FileWriterDemo {
    public static void main(String[] args) {
        // try-with-resources closes the writer even if write() fails, and
        // avoids calling close() on null when the constructor itself throws.
        try (FileWriter fileWriter = new FileWriter("demo.txt")) {
            fileWriter.write("demo");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

In the code above, we use FileWriter to write the four characters "demo" to demo.txt, and then view the contents of demo.txt with the hex editor WinHex.

The editor shows that the "demo" we wrote is encoded as "64 65 6D 6F". We did not explicitly specify an encoding in the code above; when none is specified, FileWriter uses the platform's default character encoding to encode the characters being written.
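The same experiment can be repeated without a hex editor, and with the encoding stated explicitly rather than left to the default. The sketch below is an assumption-laden variant of the demo: it writes to a temporary file instead of demo.txt, fixes the encoding to UTF-8, and uses the hypothetical helper writeAndDump to read the bytes back.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ExplicitCharsetDemo {

    // Writes the text as UTF-8 to a temporary file and returns the file's bytes as hex.
    static String writeAndDump(String text) {
        try {
            Path file = Files.createTempFile("demo", ".txt");
            try (Writer writer = new OutputStreamWriter(
                    new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
                writer.write(text);
            }
            byte[] bytes = Files.readAllBytes(file);
            Files.delete(file);

            StringBuilder sb = new StringBuilder();
            for (byte b : bytes) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(String.format("%02X", b & 0xFF));
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndDump("demo")); // prints 64 65 6D 6F
    }
}
```

Naming the encoding makes the output independent of the machine the program runs on.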

Because a character stream must convert the sequence of Unicode code units into a sequence of bytes in the target encoding before output, it uses a memory buffer to hold the converted bytes and writes them to the disk file after the conversion.
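The buffering can be observed with a small sketch that checks how many bytes have reached the underlying stream before and after flush(). The class name WriterBufferDemo and the helper sizesAroundFlush are made up for the example, and a ByteArrayOutputStream stands in for the disk file. For a short text the first number is expected to be 0, though the exact moment at which bytes are passed on is an implementation detail.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WriterBufferDemo {

    // Returns {bytes visible in the sink before flush(), bytes visible after flush()}.
    static int[] sizesAroundFlush(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(text);       // encoded bytes are held in the writer's buffer
            int before = sink.size();
            writer.flush();           // forces the buffered bytes into the sink
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAroundFlush("demo");
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

This is also why a character stream that is never flushed or closed can leave a file shorter than expected.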

4. The difference between a character stream and a byte stream

From the description above, the main differences between byte streams and character streams are as follows:

The basic unit of a byte stream operation is the byte, and the basic unit of a character stream operation is the Unicode code unit.

Byte streams do not use a buffer by default, while character streams do.

A byte stream is typically used for binary data and can in fact handle any type of data, but it does not support reading or writing Unicode code units directly. A character stream is typically used for text data and supports reading and writing Unicode code units.
