Hadoop-2.4.1 Study Notes: Writable and Its Implementations

Source: Internet
Author: User

Hadoop implements a simple, efficient serialization protocol based on DataInput and DataOutput. The Writable interface defines the methods for Hadoop serialization, and every key and value type in the MapReduce framework implements it, for example IntWritable and LongWritable. The detailed class relationships are shown below:

[Figure: class hierarchy of the Writable types]

As the hierarchy shows, the common key and value types in MapReduce do not implement Writable directly. They implement WritableComparable, which extends both Writable and Comparable. Such classes can therefore be compared with one another as well as serialized and deserialized, which matters because keys in MapReduce are compared during the sort phase. This does not mean that every custom serialization class must implement WritableComparable: that is required only when the class is used as a key. A class used only as a value needs to implement only Writable.
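As a minimal sketch of the value-only case, the class below implements just the two serialization methods and round-trips through a byte array. To keep it runnable without Hadoop on the classpath, SimpleWritable is a hypothetical stand-in that mirrors the two methods of org.apache.hadoop.io.Writable; PointValue and WritableRoundTrip are illustrative names, not Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in mirroring the two methods of org.apache.hadoop.io.Writable.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Used only as a value, so it does not need compareTo().
class PointValue implements SimpleWritable {
    int x;
    int y;

    // No-arg constructor so an empty instance can be filled by readFields().
    PointValue() {}

    PointValue(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order in which they were written.
        x = in.readInt();
        y = in.readInt();
    }
}

class WritableRoundTrip {
    // Serializes a value into a byte array.
    static byte[] toBytes(SimpleWritable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            value.write(out);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fills an existing instance from previously serialized bytes.
    static void fromBytes(SimpleWritable target, byte[] data) {
        try {
            target.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With real Hadoop classes the shape is the same: the framework creates an empty instance and calls readFields() on it, which is why the no-argument constructor is kept.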

When a custom serialization class is used as a key, remember that partitioning by key usually calls hashCode(), so the method must return the same result in different JVM instances. The default hashCode() in Object does not guarantee this. A custom key class therefore needs to override hashCode(). In addition, if two objects are equal according to equals(), their hashCode() values must be the same, so equals(Object obj) should be overridden together with hashCode().
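The following sketch illustrates the point about hash codes and partitioning. UserKey and PartitionSketch are hypothetical names introduced for illustration; the partition formula is intended to mirror the one used by Hadoop's default HashPartitioner (mask off the sign bit, then take the remainder), but treat it as a sketch rather than the framework's code.

```java
import java.util.Objects;

// Hypothetical key class with value-based equals() and hashCode().
final class UserKey {
    private final String name;
    private final int age;

    UserKey(String name, int age) {
        this.name = Objects.requireNonNull(name);
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof UserKey)) {
            return false;
        }
        UserKey other = (UserKey) o;
        return name.equals(other.name) && age == other.age;
    }

    @Override
    public int hashCode() {
        // String.hashCode() is specified by the Java API, so the result depends
        // only on the field values, not on the JVM instance or object identity.
        return 31 * name.hashCode() + age;
    }
}

final class PartitionSketch {
    // Clears the sign bit so the result is never negative, then maps to a partition.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Two UserKey instances with the same name and age always land in the same partition. A class that inherits Object.hashCode() gives no such guarantee, because that value is typically derived from object identity.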

Besides the classes that implement WritableComparable, several classes implement Writable directly. One is ObjectWritable, a polymorphic Writable that can handle arrays, strings, and Java primitive types without a Writable wrapper. There are also Writable collection classes: ArrayWritable, EnumSetWritable, MapWritable, TwoDArrayWritable, and SortedMapWritable. ArrayWritable wraps an array of Writables of a single type: every element must have the same type, so an instance can hold all IntWritable, for example, but cannot mix IntWritable and LongWritable. TwoDArrayWritable wraps a two-dimensional array (a matrix), and its elements must likewise all have the same Writable type. EnumSetWritable is a Writable wrapper for EnumSet. MapWritable implements the Map<Writable,Writable> interface and SortedMapWritable implements the SortedMap<WritableComparable,Writable> interface; both, of course, also implement Writable. Internally, both use a byte to identify each class, so a single map instance can contain at most 127 different classes:

    /* Class to ID mappings */
    @VisibleForTesting
    Map<Class, Byte> classToIdMap = new ConcurrentHashMap<Class, Byte>();

    /* Id to Class mappings */
    @VisibleForTesting
    Map<Byte, Class> idToClassMap = new ConcurrentHashMap<Byte, Class>();
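To see why a byte-sized ID caps the number of distinct classes at 127, here is a simplified sketch of a two-way class registry. ClassIdRegistry, idFor, and classFor are hypothetical names; this is not the Hadoop implementation, only an illustration of the idea that positive byte IDs run out at Byte.MAX_VALUE.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified registry: each class gets a positive byte ID.
final class ClassIdRegistry {
    private final Map<Class<?>, Byte> classToId = new ConcurrentHashMap<>();
    private final Map<Byte, Class<?>> idToClass = new ConcurrentHashMap<>();
    private byte lastId = 0;

    // Returns the existing ID for a class, or assigns the next free one.
    synchronized byte idFor(Class<?> type) {
        Byte known = classToId.get(type);
        if (known != null) {
            return known;
        }
        if (lastId == Byte.MAX_VALUE) {
            // IDs 1..127 are used up, so a 128th class cannot be encoded.
            throw new IllegalStateException("no free byte IDs left");
        }
        byte id = ++lastId;
        classToId.put(type, id);
        idToClass.put(id, type);
        return id;
    }

    // Looks up the class for an ID read back from a serialized stream.
    Class<?> classFor(byte id) {
        return idToClass.get(id);
    }
}
```

Writing a one-byte ID in front of each entry, instead of a full class name, keeps the serialized map compact; the price is the limit on how many different classes one instance can hold.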

Next, we analyze the source code of IntWritable and Text to learn how to write Writable and WritableComparable classes. First, the IntWritable source:

    public class IntWritable implements WritableComparable<IntWritable> {
      private int value;

      public IntWritable() {}

      public IntWritable(int value) { set(value); }

      /** Set the value of this IntWritable. */
      public void set(int value) { this.value = value; }

      /** Return the value of this IntWritable. */
      public int get() { return value; }

      @Override  // overrides readFields(DataInput in) in Writable
      public void readFields(DataInput in) throws IOException {
        value = in.readInt();
      }

      @Override  // overrides write(DataOutput out) in Writable
      public void write(DataOutput out) throws IOException {
        out.writeInt(value);
      }

      /** Returns true iff <code>o</code> is a IntWritable with the same value. */
      @Override
      public boolean equals(Object o) {
        if (!(o instanceof IntWritable))
          return false;
        IntWritable other = (IntWritable) o;
        return this.value == other.value;
      }

      @Override
      public int hashCode() {
        return value;
      }

      /** Compares two IntWritables. */
      @Override  // overrides compareTo in the Comparable interface
      public int compareTo(IntWritable o) {
        int thisValue = this.value;
        int thatValue = o.value;
        return (thisValue < thatValue ? -1 : (thisValue == thatValue ? 0 : 1));
      }

      @Override
      public String toString() {
        return Integer.toString(value);
      }

      // The inner class Comparator, which extends WritableComparator, is omitted here

      static {
        // register this comparator
        WritableComparator.define(IntWritable.class, new Comparator());
      }
    }

The IntWritable source is fairly simple. Besides implementing the interface methods, it also overrides hashCode(), equals(), and toString(), which is worth noting. Next is the Text class. Text stores a string in standard UTF-8 encoding and provides methods for serializing, deserializing, and comparing strings at the byte level, such as decode(byte[] utf8), encode(String string), readFields(DataInput in), and write(DataOutput out). In addition to implementing WritableComparable, the class extends the abstract class BinaryComparable. The relevant methods are as follows:

    private byte[] bytes;
    private int length;

    @Override
    public void readFields(DataInput in) throws IOException {
      // Read the length as a variable-length integer from the input stream;
      // see the WritableUtils utility class for more helper methods
      int newLength = WritableUtils.readVInt(in);
      setCapacity(newLength, false);
      // Read newLength bytes of data into bytes
      in.readFully(bytes, 0, newLength);
      length = newLength;
    }

    @Override
    public void write(DataOutput out) throws IOException {
      WritableUtils.writeVInt(out, length);
      out.write(bytes, 0, length);
    }

    // compareTo is implemented in BinaryComparable and inherited by Text
    @Override
    public int compareTo(BinaryComparable other) {
      if (this == other)
        return 0;
      return WritableComparator.compareBytes(getBytes(), 0, getLength(),
          other.getBytes(), 0, other.getLength());
    }
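The pattern used here (write a length, then the raw UTF-8 bytes, and compare byte by byte) can be tried out with plain JDK classes. The sketch below makes two assumptions that differ from Text: it writes a fixed 4-byte length with writeInt() instead of a variable-length integer, and Utf8Record with its methods is a hypothetical helper, not part of Hadoop. The comparison treats bytes as unsigned values and orders them lexicographically, which is the behavior expected of a byte-level comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating length-prefixed UTF-8 serialization.
final class Utf8Record {
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Writes the byte length first, then the UTF-8 bytes.
    static byte[] serialize(String s) {
        byte[] bytes = utf8(s);
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(bytes.length); // assumption: fixed 4-byte length, not a vint
            out.write(bytes, 0, bytes.length);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the length, then exactly that many bytes.
    static String deserialize(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int length = in.readInt();
            byte[] bytes = new byte[length];
            in.readFully(bytes, 0, length);
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lexicographic comparison that treats each byte as an unsigned value.
    static int compareBytes(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal, so the shorter array sorts first.
        return a.length - b.length;
    }
}
```

The `& 0xff` mask matters for non-ASCII text: UTF-8 lead bytes are negative as signed Java bytes, so a signed comparison would sort accented characters before plain ASCII letters.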

Drawing on the implementations of IntWritable and Text, we can write a custom WritableComparable. Here is a simple example. It uses name and age as a composite key: two keys are considered the same object only if both fields are equal.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.WritableComparable;

    public class CompositeWritable implements WritableComparable<CompositeWritable> {
      private String name;
      private int age;

      public CompositeWritable() {}

      public CompositeWritable(String name, int age) {
        set(name, age);
      }

      @Override
      public void readFields(DataInput in) throws IOException {
        // Read the fields in the same order in which write() wrote them
        name = in.readUTF();
        age = in.readInt();
      }

      @Override
      public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(age);
      }

      @Override
      public int compareTo(CompositeWritable o) {
        // Order by name first, then by age
        int cmp = name.compareTo(o.getName());
        if (cmp != 0)
          return cmp;
        return age < o.getAge() ? -1 : (age == o.getAge() ? 0 : 1);
      }

      @Override
      public boolean equals(Object o) {
        if (o instanceof CompositeWritable) {
          CompositeWritable other = (CompositeWritable) o;
          return this.name.equals(other.name) && this.age == other.age;
        }
        return false;
      }

      @Override
      public int hashCode() {
        return name.hashCode() + age;
      }

      @Override
      public String toString() {
        return name + "\t" + age;
      }

      public void set(String name, int age) {
        this.name = name;
        this.age = age;
      }

      public String getName() {
        return this.name;
      }

      public int getAge() {
        return this.age;
      }
    }
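To check the ordering that such a composite key produces in the sort phase, the comparison logic can be exercised on its own. The sketch below copies only the compareTo() and toString() logic into a plain Comparable class so that it runs without Hadoop; PersonKey and CompositeKeyOrder are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in that keeps only the ordering logic of the composite key.
final class PersonKey implements Comparable<PersonKey> {
    final String name;
    final int age;

    PersonKey(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public int compareTo(PersonKey o) {
        // Order by name first; age only breaks ties between equal names.
        int cmp = name.compareTo(o.name);
        if (cmp != 0) {
            return cmp;
        }
        return Integer.compare(age, o.age);
    }

    @Override
    public String toString() {
        return name + "\t" + age;
    }
}

final class CompositeKeyOrder {
    // Returns a sorted copy, leaving the caller's keys untouched.
    static List<PersonKey> sorted(PersonKey... keys) {
        List<PersonKey> copy = new ArrayList<>(Arrays.asList(keys));
        Collections.sort(copy);
        return copy;
    }
}
```

Sorting the keys (bob, 25), (alice, 30), and (alice, 22) this way yields alice 22, alice 30, bob 25: records with the same name end up adjacent, ordered by age.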
