Hadoop in Detail (10): Serialization and the Writable Interface

Source: Internet
Author: User
Tags: serialization

Brief introduction

Serialization converts a structured object into a byte stream, and deserialization converts the byte stream back into a structured object. Both are used mainly for inter-process communication and for persistent storage.

Communication Format Requirements

Hadoop nodes communicate with each other through RPC. The RPC protocol serializes a message into a binary byte stream and sends it to the remote node, which then deserializes the byte stream back into the original message. RPC serialization should have the following properties:

1. Compact: a compact format uses as little network bandwidth as possible.

2. Fast: inter-process communication forms the backbone of a distributed system, so serialization and deserialization must be fast and must not become the bottleneck for transmission speed.

3. Extensible: the protocol must be able to evolve. For example, when a new server adds a parameter for new clients, old clients should still be able to use that server.

4. Interoperable: clients written in multiple languages are supported.
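The round trip described above (message to byte stream, then byte stream back to message) can be sketched with plain JDK streams. This is only an illustration, not Hadoop's actual RPC code: the CallMessage type, its fields, and the "getBlock" method name are assumptions made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical message type used only for this illustration; not a Hadoop class.
class CallMessage {
    final String method;
    final int argument;

    CallMessage(String method, int argument) {
        this.method = method;
        this.argument = argument;
    }
}

class RpcRoundTripSketch {
    // Sender side: turn the structured object into a binary byte stream.
    static byte[] serialize(CallMessage msg) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(msg.method);   // 2-byte length prefix + modified UTF-8 bytes
            out.writeInt(msg.argument); // 4 bytes, big-endian
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Receiver side: rebuild the object by reading the fields in the same order.
    static CallMessage deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String method = in.readUTF();
            int argument = in.readInt();
            return new CallMessage(method, argument);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new CallMessage("getBlock", 42));
        CallMessage copy = deserialize(wire);
        // Prints: 14 bytes -> getBlock(42)
        System.out.println(wire.length + " bytes -> " + copy.method + "(" + copy.argument + ")");
    }
}
```

Note that the receiver must read the fields in exactly the order the sender wrote them. The byte stream carries no field names or type tags, which keeps it compact but also makes the format hard to evolve.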

Storage Format Requirements

On the surface, a serialization framework might seem to need different features for persistent storage, but in fact the same four properties apply:

1. Compact: stored data occupies less space.

2. Fast: data can be read and written quickly.

3. Extensible: data written in an older format can still be read.

4. Interoperable: data can be read and written from multiple languages.

Serialization format for Hadoop

Hadoop's own serialization format is built on classes that implement the Writable interface. It achieves only the first two properties, compactness and speed; it is not easy to extend and it does not work across languages.
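To make the compactness claim concrete, the sketch below compares a field-by-field encoding (the style a Writable uses) with standard Java serialization for the same two-integer value. The comparison with Java serialization is an illustration added here, not something the text above measures, and the IntPoint class is a hypothetical type invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value type for this comparison only; not a Hadoop class.
class IntPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    IntPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class SizeComparisonSketch {
    // Field-by-field encoding: only the two ints are written, 4 bytes each.
    static int fieldByFieldSize(IntPoint p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Standard Java serialization: also writes a stream header and class metadata.
    static int javaSerializationSize(IntPoint p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        IntPoint p = new IntPoint(3, 7);
        System.out.println("field-by-field: " + fieldByFieldSize(p) + " bytes");
        System.out.println("java.io.Serializable: " + javaSerializationSize(p) + " bytes");
    }
}
```

The field-by-field form is always 8 bytes here. The Java serialization form is larger because it describes the class inside the stream. That self-description is what the compact form gives up.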

Let's take a look at the Writable interface, which defines two methods:

1. Write the object's data to a binary stream

2. Read the object's data from a binary stream

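package org.apache.hadoop.io;

public interface Writable {
    // Serialize this object's fields to the binary stream.
    void write(java.io.DataOutput out) throws java.io.IOException;

    // Deserialize this object's fields from the binary stream.
    void readFields(java.io.DataInput in) throws java.io.IOException;
}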
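As a minimal sketch of how a class might implement these two methods, the example below defines a pair of integers and round-trips it through a byte array. To stay runnable without Hadoop on the classpath it declares a local stand-in interface that mirrors Writable. IntPairWritable and WritableSketch are hypothetical names invented for the example, not Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in mirroring org.apache.hadoop.io.Writable (assumption: Hadoop is not on the classpath).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical example type; not part of Hadoop.
class IntPairWritable implements Writable {
    int first;
    int second;

    // No-arg constructor so an empty instance can be created and then filled by readFields.
    IntPairWritable() {}

    IntPairWritable(int first, int second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(first);
        out.writeInt(second);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        first = in.readInt();
        second = in.readInt();
    }
}

class WritableSketch {
    // Serialize any Writable into a byte array.
    static byte[] toBytes(Writable w) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Populate an existing Writable instance from a byte array.
    static void fromBytes(Writable w, byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            w.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = toBytes(new IntPairWritable(3, 7));
        IntPairWritable copy = new IntPairWritable();
        fromBytes(copy, data);
        // Prints: 8 bytes -> (3, 7)
        System.out.println(data.length + " bytes -> (" + copy.first + ", " + copy.second + ")");
    }
}
```

In a real Hadoop project the stand-in interface would be dropped and the class would import org.apache.hadoop.io.Writable instead. Because readFields fills an existing object rather than returning a new one, the same instance can be reused across many records.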
