Stream-based data reading and writing sounds abstract: what does "stream-based" mean, and what is a stream? Hadoop is written in Java, so to understand Hadoop's streaming data access you have to start with Java's stream mechanism. Streams are also an important mechanism in Java and C++; they let us manipulate data freely across files, memory, I/O devices, and so on.
First, what is a stream?
A stream is an abstract concept: it is an abstraction of input and output devices. In Java programs, data input and output are carried out in a "stream" manner. The device can be a file, the network, memory, and so on.
A stream has a direction. Whether it is an input stream or an output stream is relative, and the program is usually taken as the reference point: if the data flows from the program to the device, we call it an output stream; if it flows from the device to the program, we call it an input stream.
You can picture a stream as a "water pipe": the flow forms inside the pipe, so it naturally has a direction.
When a program needs to read data from a data source, it opens an input stream; the data source can be a file, memory, the network, and so on. Conversely, when a program needs to write data to a destination, it opens an output stream; the destination can likewise be a file, memory, the network, and so on.
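The direction of a stream can be sketched with in-memory streams. This is only an illustration: the class name StreamDirectionDemo and the helper copy are made up for this example, and byte arrays stand in for the data source and the destination.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical demo class, not part of the original article.
public class StreamDirectionDemo {

    // Reads from an input stream (source -> program) and writes every byte
    // to an output stream (program -> destination).
    public static void copy(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) { // read() returns -1 at end of stream
            out.write(b);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {10, 20, 30};                                    // memory as the data source
        ByteArrayOutputStream destination = new ByteArrayOutputStream(); // memory as the destination

        try (InputStream in = new ByteArrayInputStream(source)) {
            copy(in, destination);
        }
        System.out.println(destination.toByteArray().length); // prints 3
    }
}
```

The same copy method would work unchanged if the source were a FileInputStream or the destination a FileOutputStream, because it depends only on the abstract InputStream and OutputStream types.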
What are the categories of streams?
Streams can be classified from different angles:
1. By the unit of data processed: character streams and byte streams
2. By the direction of data flow: input streams and output streams
3. By function: node streams and processing streams
Categories 1 and 2 are easy to understand. The classification by function can be understood as follows:
Node stream: A node stream reads data from or writes data to a specific data source. That is, a node stream operates directly on a file, the network, and so on. Examples are FileInputStream and FileOutputStream, which read directly from a file or write directly to a file.
Processing stream: A processing stream is "connected" on top of an existing stream (either a node stream or another processing stream) and gives the program more powerful read and write capabilities by processing the data that passes through it. A filter stream is created by wrapping an existing input or output stream; it is a layer of decoration around a node stream. For example, BufferedInputStream and BufferedOutputStream are constructed from existing streams and provide buffered reads and writes, which improves read and write efficiency. DataInputStream and DataOutputStream are likewise constructed from existing streams and provide the ability to read and write Java primitive data types. All of them are filter streams.
To give a simple example:
import java.io.*;

public class StreamExample {
    public static void main(String[] args) throws IOException {
        // Node stream: FileOutputStream operates directly on A.txt
        FileOutputStream fileOutputStream = new FileOutputStream("A.txt");
        // Filter stream: BufferedOutputStream decorates the node stream and adds buffered writes
        BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
        // Filter stream: DataOutputStream decorates the buffered stream and adds writes of primitive types
        DataOutputStream out = new DataOutputStream(bufferedOutputStream);
        out.writeInt(3);
        out.writeBoolean(true);
        out.flush();
        out.close();

        // The input side mirrors the output side exactly: a node stream wrapped by filter streams
        DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream("A.txt")));
        System.out.println(in.readInt());
        System.out.println(in.readBoolean());
        in.close();
    }
}
Introduction to the stream class structure:
All Java stream classes are located in the java.io package, and each inherits from one of the following four abstract stream types.
| | Byte stream | Character stream |
| Input stream | InputStream | Reader |
| Output stream | OutputStream | Writer |
1. Streams that inherit from InputStream/OutputStream move data into or out of the program in units of bytes (1 byte = 8 bits). In the accompanying class diagram, the darker entries are node streams and the lighter entries are processing streams.
2. Streams that inherit from Reader/Writer move data into or out of the program in units of characters (a Java char is 2 bytes = 16 bits). In the accompanying class diagram, the darker entries are node streams and the lighter entries are processing streams.
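The difference in units can be made concrete with a small sketch. It assumes the text is encoded as UTF-8 for the byte stream; the class name ByteVsCharDemo and the countUnits helpers are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
public class ByteVsCharDemo {

    // Counts how many units (bytes) a byte stream delivers.
    public static int countUnits(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Counts how many units (chars) a character stream delivers.
    public static int countUnits(Reader reader) throws IOException {
        int count = 0;
        while (reader.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "h\u00e9"; // 'h' followed by 'e' with an acute accent
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // the accented letter takes 2 bytes in UTF-8

        System.out.println(countUnits(new ByteArrayInputStream(utf8))); // prints 3
        System.out.println(countUnits(new StringReader(text)));         // prints 2
    }
}
```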
Description of common stream classes:
Common node stream types:
For file operations, the character streams are FileReader/FileWriter, and the byte streams are FileInputStream/FileOutputStream.
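As a rough sketch of the four file node streams, the following writes to a temporary file and reads the content back, once as bytes and once as characters. The class name FileNodeStreamDemo, the helper names, and the temporary file are assumptions made for this example; FileReader and FileWriter use the platform default charset here, so the text is kept to plain ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical demo class, not part of the original article.
public class FileNodeStreamDemo {

    // Byte node streams: write raw bytes to the file, then read them back.
    public static byte[] roundTripBytes(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
        }
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (FileInputStream in = new FileInputStream(file)) {
            int b;
            while ((b = in.read()) != -1) {
                collected.write(b);
            }
        }
        return collected.toByteArray();
    }

    // Character node streams: write text to the file, then read it back.
    public static String roundTripText(File file, String text) throws IOException {
        try (FileWriter writer = new FileWriter(file)) {
            writer.write(text);
        }
        StringBuilder collected = new StringBuilder();
        try (FileReader reader = new FileReader(file)) {
            int c;
            while ((c = reader.read()) != -1) {
                collected.append((char) c);
            }
        }
        return collected.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("node-stream-demo", ".tmp");
        file.deleteOnExit();

        System.out.println(roundTripBytes(file, new byte[] {1, 2, 3}).length); // prints 3
        System.out.println(roundTripText(file, "hello"));                      // prints hello
    }
}
```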
Common processing stream types:
Buffered streams: A buffered stream is "socketed" onto the corresponding node stream and buffers the data being read or written, which improves read and write efficiency; it also adds some new methods.
The byte buffered streams are BufferedInputStream/BufferedOutputStream, and the character buffered streams are BufferedReader/BufferedWriter. The character buffered streams provide line-oriented methods: readLine on BufferedReader reads one line, and newLine on BufferedWriter writes a line separator.
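For an output buffered stream, data is first written to a buffer in memory, and the flush method then pushes the buffered data to the underlying stream (for example, a file on disk). The close method flushes the buffer before closing, so data is lost mainly when a buffered stream is never flushed or closed. Calling flush before close is still a safe habit, and an explicit flush is required whenever the data must reach the destination while the stream stays open.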
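The buffering behavior can be observed with a small sketch that wraps an in-memory StringWriter, so the destination can be inspected before and after flush. The class name BufferedFlushDemo and its helpers are made up for this example, and it assumes the default BufferedWriter buffer size, which is far larger than the two short lines written here.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
public class BufferedFlushDemo {

    // Returns the content of the destination before and after flush().
    public static String[] writeTwoLines() throws IOException {
        StringWriter target = new StringWriter();            // in-memory destination
        BufferedWriter writer = new BufferedWriter(target);  // buffered stream on top of it
        writer.write("first line");
        writer.newLine();                                    // writes the platform line separator
        writer.write("second line");

        String beforeFlush = target.toString(); // still empty: the data sits in the buffer
        writer.flush();                         // push the buffer to the destination
        String afterFlush = target.toString();
        writer.close();
        return new String[] {beforeFlush, afterFlush};
    }

    // Reads text line by line with BufferedReader.readLine().
    public static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) { // null signals end of stream
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String[] result = writeTwoLines();
        System.out.println(result[0].isEmpty());  // prints true
        System.out.println(readLines(result[1])); // prints [first line, second line]
    }
}
```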
Conversion streams: Used to convert between byte data and character data.
There are only character-stream classes here: InputStreamReader and OutputStreamWriter. InputStreamReader must be "socketed" onto an InputStream, and OutputStreamWriter must be "socketed" onto an OutputStream.
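A minimal sketch of the two conversion streams follows, assuming UTF-8 as the encoding. The class name ConversionStreamDemo and the helpers encode and decode are made up for this example; byte arrays stand in for whatever byte stream is being wrapped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
public class ConversionStreamDemo {

    // Characters -> bytes: OutputStreamWriter is socketed onto an OutputStream.
    public static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } // closing the writer flushes the encoded bytes into the byte stream
        return bytes.toByteArray();
    }

    // Bytes -> characters: InputStreamReader is socketed onto an InputStream.
    public static String decode(byte[] data) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = encode("h\u00e9llo");
        System.out.println(encoded.length);                      // prints 6
        System.out.println(decode(encoded).equals("h\u00e9llo")); // prints true
    }
}
```

Naming the charset explicitly on both sides keeps the round trip independent of the platform default encoding.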
Data streams: Provide the ability to read and write Java primitive data types.
DataInputStream and DataOutputStream inherit from InputStream and OutputStream, respectively, and must be "socketed" onto an existing InputStream or OutputStream, typically a node stream.
Object streams: Used to write an object out directly and to read it back in.
The stream classes are ObjectInputStream and ObjectOutputStream. The two classes themselves are simple to use, but there is a requirement on the object being written: its class must implement the Serializable interface to declare that it can be serialized. Otherwise, the object cannot be read or written through an object stream.
There is one more important keyword, transient, which modifies fields of a class that implements the Serializable interface. A field marked transient is ignored when the object is written out through an object stream.
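The following sketch shows both points: a Serializable class written and read back through object streams, and a transient field that is skipped. The class names ObjectStreamDemo and User and the field names are hypothetical, and an in-memory byte array replaces a file to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the original article.
public class ObjectStreamDemo {

    // Hypothetical serializable class with one transient field.
    public static class User implements Serializable {
        private static final long serialVersionUID = 1L;

        public String name;               // written to the stream
        public transient String password; // ignored when the object is written

        public User(String name, String password) {
            this.name = name;
            this.password = password;
        }
    }

    // Writes the object to memory with ObjectOutputStream and reads it back.
    public static User roundTrip(User user) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (User) in.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        User copy = roundTrip(new User("alice", "secret"));
        System.out.println(copy.name);     // prints alice
        System.out.println(copy.password); // prints null
    }
}
```

If User did not implement Serializable, writeObject would throw a NotSerializableException.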
[Repost] Input/output streams: an in-depth understanding of streams (Stream) in Java