Hadoop in Detail (11): Serialization and Writable Implementations


Brief introduction

In Hadoop, the classes that implement Writable form a large family. This article briefly describes the ones most often used for serialization.

Java primitive types

Every Java primitive type except char has a corresponding Writable class (a char can be stored in an IntWritable), and each one exposes its value through get() and set() methods.
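To make the get/set and serialization pattern concrete without requiring Hadoop on the classpath, here is a minimal sketch that uses only the JDK. IntBox is a hypothetical stand-in invented for this example, not a Hadoop class; it only mirrors the general shape of a primitive wrapper (get, set, write to a DataOutput, read from a DataInput). The helper names serialize and deserialize are also assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a primitive wrapper such as IntWritable.
// It is NOT the Hadoop class; it only mirrors the get/set plus
// write/readFields shape so the example runs without Hadoop.
final class IntBox {
    private int value;

    IntBox() { }

    IntBox(int value) { this.value = value; }

    int get() { return value; }

    void set(int value) { this.value = value; }

    // Serialize the wrapped value as a fixed 4-byte big-endian int.
    void write(DataOutput out) throws IOException {
        out.writeInt(value);
    }

    // Overwrite the wrapped value with the next 4 bytes from the stream.
    void readFields(DataInput in) throws IOException {
        value = in.readInt();
    }

    // Helper (assumed name): serialize a box to a byte array.
    static byte[] serialize(IntBox box) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            box.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Helper (assumed name): populate an existing box from a byte array.
    static void deserialize(IntBox box, byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            box.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        IntBox original = new IntBox(163);
        byte[] data = serialize(original);

        IntBox copy = new IntBox();   // reusable, mutable container
        deserialize(copy, data);
        System.out.println(data.length + " bytes, value " + copy.get());
    }
}
```

The point of the mutable set() and readFields() pair is that one object can be reused for many records instead of allocating a new wrapper each time.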

IntWritable and LongWritable also have variable-length counterparts, VIntWritable and VLongWritable.

Choosing between a fixed-length and a variable-length encoding is similar to choosing between CHAR and VARCHAR in a database, so it is not discussed further here.
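The size trade-off can be sketched with the JDK alone. Note the substitution: the variable-length encoder below is a generic 7-bits-per-byte scheme chosen for brevity, not the byte layout that VIntWritable actually uses, and the class and method names are hypothetical. It only illustrates why small values benefit from a variable-length form while large values can end up bigger than the fixed form.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; not part of Hadoop.
final class VarIntSketch {
    // Fixed-length form: always 4 bytes, big-endian.
    static byte[] encodeFixed(int value) {
        return new byte[] {
            (byte) (value >>> 24), (byte) (value >>> 16),
            (byte) (value >>> 8), (byte) value
        };
    }

    // Variable-length form (generic scheme, NOT Hadoop's format):
    // 7 payload bits per byte, with the high bit set on every byte
    // except the last one.
    static byte[] encodeVariable(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] samples = {1, 163, 1_000_000, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + ": fixed=" + encodeFixed(v).length
                    + " bytes, variable=" + encodeVariable(v).length + " bytes");
        }
    }
}
```

In this sketch a value below 128 takes a single byte, while Integer.MAX_VALUE takes five, one more than the fixed form. A fixed-length encoding suits values spread evenly over the whole range; a variable-length one suits data dominated by small numbers.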

Text type

Text stores the number of bytes in its encoded string as an int (using a variable-length encoding), so the maximum size of a Text value is 2 GB.

Text uses standard UTF-8 encoding, so it interoperates well with other tools that understand UTF-8. Note, however, that this makes it behave quite differently from Java's String type.

Differences in indexing

Text.charAt takes a byte offset into the UTF-8 encoded data and returns an int representing a Unicode code point, unlike String.charAt, which returns a char.

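import org.apache.hadoop.io.Text;
import org.junit.Assert;
import org.junit.Test;

@Test
public void testTextIndex() {
    Text text = new Text("hadoop");
    // Six ASCII characters occupy six bytes in UTF-8.
    Assert.assertEquals(6, text.getLength());
    Assert.assertEquals(6, text.getBytes().length);
    // charAt takes a byte offset and returns the code point as an int.
    Assert.assertEquals((int) 'd', text.charAt(2));
    // The original index was lost; 100 is an arbitrary out-of-range offset.
    Assert.assertEquals("Out of bounds", -1, text.charAt(100));
}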

Text also has a find method, similar to the indexOf method of String:

@Test
public void testTextFind() {
    Text text = new Text("hadoop");
    // find returns the byte offset of the first match, or -1 if there is none.
    Assert.assertEquals("Find a substring", 2, text.find("do"));
    Assert.assertEquals("Find first 'o'", 3, text.find("o"));
    Assert.assertEquals("Find 'o' from position 4 or later", 4, text.find("o", 4));
    Assert.assertEquals("No match", -1, text.find("pig"));
}

Differences in Unicode handling

Once a string contains characters that need more than one byte in UTF-8, the difference between Text and String becomes clearer: String is indexed by Java char (a UTF-16 code unit), while Text is indexed by byte offset in the UTF-8 encoding.

Let's look at Unicode characters that take 1 to 4 bytes in UTF-8.
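The 1-to-4-byte cases can be checked with the JDK alone, without Hadoop. The sketch below compares String.length(), which counts Java chars, with the UTF-8 byte count, which is the unit Text works in. The four sample characters and the class and method names are assumptions chosen for this example; the source does not say which characters it had in mind.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Hadoop.
final class Utf8Lengths {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Sample characters picked for this sketch, one per UTF-8 width.
        String[] samples = {
            "\u0041",        // U+0041, Latin capital A
            "\u00DF",        // U+00DF, Latin small sharp s
            "\u6771",        // U+6771, a CJK ideograph
            "\uD801\uDC00"   // U+10400, written as a surrogate pair
        };

        StringBuilder all = new StringBuilder();
        for (String s : samples) {
            System.out.printf("U+%04X: chars=%d, utf8 bytes=%d%n",
                    s.codePointAt(0), s.length(), utf8Length(s));
            all.append(s);
        }

        // The combined string has fewer chars than bytes, so a char index
        // and a byte offset point at different places after the first character.
        String joined = all.toString();
        System.out.println("joined: chars=" + joined.length()
                + ", utf8 bytes=" + utf8Length(joined));
    }
}
```

The four samples encode to 1, 2, 3 and 4 bytes respectively, while String reports lengths of 1, 1, 1 and 2 because the last character needs a surrogate pair. Joined together, that is 5 chars but 10 bytes, which is why an index that is valid for a String cannot be reused as an offset into the corresponding Text.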
