Anatomy of the Java serialization mechanism

Source: Internet
Author: User
Tags: object serialization

Analysis of Java serialization algorithm

Serialization is the process of describing an object as a sequence of bytes, and deserialization is the process of rebuilding an object from those bytes. The Java Serialization API provides a standard mechanism for both. This article covers how to serialize an object, when serialization is needed, and how the Java serialization algorithm works, using a concrete example to show how the serialized bytes describe an object.

The necessity of serialization

In Java, everything is an object, and in a distributed environment it is often necessary to pass an object from one end of a network, or from one device, to another. That requires a protocol both ends understand for transmitting the data. The Java serialization mechanism was created to solve this problem.

How to serialize an object

To be serializable, a class must implement the Serializable interface. Serializable declares no methods; it is a marker interface, and only classes that carry this marker can be processed by the serialization mechanism.

    import java.io.Serializable;

    class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }
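To make the marker role concrete, the following minimal sketch tries to serialize one class that implements Serializable and one that does not. The class names Marked and Unmarked and the helper canSerialize are hypothetical and exist only for this illustration; the only library behavior relied on is that ObjectOutputStream.writeObject throws NotSerializableException for an object whose class lacks the marker.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical classes for illustration; neither appears in the article.
class Marked implements Serializable {
    byte version = 100;
}

class Unmarked {
    byte version = 100;
}

final class MarkerDemo {
    /** Returns true if writeObject accepts the object, false if it is rejected. */
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false; // the class does not carry the Serializable marker
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Marked:   " + canSerialize(new Marked()));
        System.out.println("Unmarked: " + canSerialize(new Unmarked()));
    }
}
```

The two classes have identical fields; the only difference is the implements clause, which is what decides whether the mechanism will process them.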

Next we write a program that serializes the object and writes it out. ObjectOutputStream writes an object as a byte stream; here we store that byte stream in the file temp.out.

    // Needs: java.io.FileOutputStream, java.io.IOException, java.io.ObjectOutputStream
    public static void main(String[] args) throws IOException {
        FileOutputStream fos = new FileOutputStream("temp.out");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        TestSerial ts = new TestSerial();
        oos.writeObject(ts);  // write the object as a byte stream
        oos.flush();
        oos.close();
    }

To rebuild the object from the bytes in the persisted file, use ObjectInputStream.

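    // Needs: java.io.FileInputStream, java.io.IOException, java.io.ObjectInputStream
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        FileInputStream fis = new FileInputStream("temp.out");
        ObjectInputStream oin = new ObjectInputStream(fis);
        // readObject returns Object and can throw ClassNotFoundException
        TestSerial ts = (TestSerial) oin.readObject();
        oin.close();
        System.out.println("version=" + ts.version);
    }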

Running it prints:

version=100
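The same round trip can be tried without touching the file system. The sketch below is one possible variant, not the article's program: it swaps the file streams for in-memory byte array streams and uses try-with-resources so the streams are always closed. The helper names toBytes and fromBytes are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class TestSerial implements Serializable {
    public byte version = 100;
    public byte count = 0;
}

final class RoundTripDemo {
    /** Serializes an object into a byte array (hypothetical helper). */
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray(); // read only after the object stream is closed
    }

    /** Rebuilds an object from bytes produced by toBytes (hypothetical helper). */
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TestSerial original = new TestSerial();
        original.count = 7;
        TestSerial copy = (TestSerial) fromBytes(toBytes(original));
        System.out.println("version=" + copy.version + ", count=" + copy.count);
        System.out.println("same instance: " + (copy == original));
    }
}
```

The rebuilt object carries the same field values but is a distinct instance, which is the behavior one wants when the bytes travel to another process.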

Serialization format of the object

What does a serialized object look like? Open the temp.out file we just wrote and view it in hexadecimal. The content should read as follows:

AC ED (0A), 6C 6573 A0 0C 0563 B1 DD F9, 76, 6, 07, 6E,, 74 42 00 5 6F 6E 7870 00 64

These bytes describe the serialized TestSerial object. Note that the TestSerial class has only two fields:

public byte version = 100;

public byte count = 0;

Both are of type byte, so in theory the two fields need only 2 bytes of storage. Yet temp.out occupies 51 bytes, which means the stream contains a description of the serialized object in addition to its data.
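A hex viewer is not strictly needed to inspect the stream. As a rough sketch, the program below serializes a TestSerial instance in memory and prints its length and a hexadecimal dump, 16 bytes per line. The helpers serialize and toHex are hypothetical names introduced here. For a top-level class named TestSerial in the default package, the length should work out to the 51 bytes quoted above (4 header bytes, 2 type codes, 12 for the class name, 8 for the serialVersionUID, 1 flag byte, 2 for the field count, 18 for the two field descriptions, 2 closing markers and 2 bytes of data); a different class or package name changes the total.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class TestSerial implements Serializable {
    public byte version = 100;
    public byte count = 0;
}

final class HexDumpDemo {
    /** Serializes an object into a byte array (hypothetical helper). */
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Formats bytes as two-digit uppercase hex, 16 per line (hypothetical helper). */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(i % 16 == 0 ? '\n' : ' ');
            }
            // Mask to 0..255 so negative bytes such as 0xAC print as two digits.
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new TestSerial());
        System.out.println(bytes.length + " bytes");
        System.out.println(toHex(bytes));
    }
}
```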

The serialization algorithm of Java

The serialization algorithm generally performs the following steps:

    1. Write the class metadata for the object instance.
    2. Recursively write the description of each superclass until there are no more superclasses.
    3. Once the class metadata is complete, write the actual data values of the instance, starting from the topmost superclass.
    4. Continue from the top of the hierarchy down to the instance's own class, writing the data of each level in turn.
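The two directions in these steps can be modelled in a few lines. The sketch below does not serialize anything; it only walks a class hierarchy with reflection to list the order in which class descriptions would be written (the object's class first, then each serializable superclass) and the reverse order in which field data would be written. The classes Root and Leaf and the methods descriptionOrder and dataOrder are hypothetical names for this illustration.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical two-level hierarchy for illustration.
class Root implements Serializable {
    int rootVersion = 10;
}

class Leaf extends Root {
    int version = 66;
}

final class OrderSketch {
    /** Class descriptions: the object's own class first, then serializable superclasses. */
    static List<String> descriptionOrder(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = type;
             c != null && Serializable.class.isAssignableFrom(c);
             c = c.getSuperclass()) {
            names.add(c.getSimpleName());
        }
        return names;
    }

    /** Field data: topmost serializable superclass first, the object's own class last. */
    static List<String> dataOrder(Class<?> type) {
        List<String> names = descriptionOrder(type);
        Collections.reverse(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("descriptions: " + descriptionOrder(Leaf.class));
        System.out.println("data:         " + dataOrder(Leaf.class));
    }
}
```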

We illustrate with another example that covers these situations more fully:

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    // The class names parent and contain are lowercase, matching the names in the byte dump.
    class parent implements Serializable {
        int parentVersion = 10;
    }

    class contain implements Serializable {
        int containVersion = 11;
    }

    public class SerialTest extends parent implements Serializable {
        int version = 66;
        contain con = new contain();

        public int getVersion() {
            return version;
        }

        public static void main(String[] args) throws IOException {
            FileOutputStream fos = new FileOutputStream("temp.out");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            SerialTest st = new SerialTest();
            oos.writeObject(st);
            oos.flush();
            oos.close();
        }
    }

This example is quite straightforward. The SerialTest class extends the parent superclass and also holds a contain object as a field.

The serialized format is as follows:

AC ED 7372 0A, 6C 54 65

5A AC F6 02 00 0249 00 07

6F 6e4c00 6F 6E74 00 09

4c63 6F 6E 7872 00 06 70 61 72

6E 0E DB D2 BD-EE 02 00 0149 00

0D, 6E, 78, 6F, 6E, 70

0000000A 0000004273, 6F 6E 74

6E FC BB E6 0E FB CB C7 02 00 0149 00

0E 6F 6E 78----6E

700000000B

Let's take a closer look at these bytes, starting from the beginning:

    1. AC ED: STREAM_MAGIC. Declares that the serialization protocol is in use.
    2. 00 05: STREAM_VERSION. The serialization protocol version.
    3. 0x73: TC_OBJECT. Declares that this is a new object.
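These three values are published as constants in java.io.ObjectStreamConstants, so they can be checked against a real stream. The sketch below serializes a small object in memory and reads the first five bytes back with a DataInputStream. The class Sample and the method readHeader are hypothetical names used only here. STREAM_MAGIC is declared as a short, so it is masked to an unsigned value before comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class used only to have something to serialize.
class Sample implements Serializable {
    int version = 66;
}

final class HeaderDemo {
    /** Serializes obj and returns {magic, version, first type code} read from the stream. */
    static int[] readHeader(Serializable obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            int magic = in.readUnsignedShort();   // 2 bytes: AC ED
            int version = in.readUnsignedShort(); // 2 bytes: 00 05
            int typeCode = in.readUnsignedByte(); // 1 byte: type code of the first item
            return new int[] {magic, version, typeCode};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = readHeader(new Sample());
        System.out.printf("magic=%04X version=%d typeCode=0x%02X%n", h[0], h[1], h[2]);
        System.out.println((ObjectStreamConstants.STREAM_MAGIC & 0xFFFF) == h[0]);
        System.out.println(ObjectStreamConstants.STREAM_VERSION == h[1]);
        System.out.println(ObjectStreamConstants.TC_OBJECT == h[2]);
    }
}
```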

The first step of the serialization algorithm is to write the description of the class associated with the object. The object in the example is an instance of the SerialTest class, so the next output is the description of SerialTest:

    1. 0x72: TC_CLASSDESC. Declares that a new class description starts here.
    2. 00 0A: Length of the class name.
    3. 53 65 72 69 61 6C 54 65 73 74: SerialTest, the class name.
    4. 5A AC F6: serialVersionUID, the serialization ID. If the class does not declare one, an 8-byte ID is computed by a hash algorithm from the class definition.
    5. 0x02: Flags. This value (SC_SERIALIZABLE) declares that the object supports serialization.
    6. 00 02: The number of fields the class contains.
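The serialVersionUID that ends up in the class description can be inspected through java.io.ObjectStreamClass. In the sketch below, WithExplicitUid declares its own ID while WithDefaultUid leaves it to the runtime; both class names and the helper uidOf are hypothetical. The value computed for WithDefaultUid depends on the class's structure, so it is printed rather than predicted.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical classes for illustration.
class WithExplicitUid implements Serializable {
    private static final long serialVersionUID = 42L; // declared ID, written as-is
    int version = 66;
}

class WithDefaultUid implements Serializable {
    int version = 66; // no declared ID, so one is computed from the class definition
}

final class UidDemo {
    /** Returns the serialVersionUID the serialization runtime uses for a class. */
    static long uidOf(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        if (desc == null) {
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("explicit: " + uidOf(WithExplicitUid.class));
        System.out.println("computed: " + uidOf(WithDefaultUid.class));
        // Asking again gives the same value: the computed ID is not random.
        System.out.println(uidOf(WithDefaultUid.class) == uidOf(WithDefaultUid.class));
    }
}
```

Declaring the ID explicitly is the usual way to keep old streams readable after the class changes in compatible ways, since a computed ID changes whenever the class definition does.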

Next, the algorithm writes one of the fields, int version = 66:

    1. 0x49: Field type. 0x49 is "I", which stands for int.
    2. 00 07: Length of the field name.
    3. 76 65 72 73 69 6F 6E: version, the field name.

The algorithm then writes the next field, contain con = new contain(). This one is a little special because it is an object, so its type must be described with the JVM's standard signature notation for object types:

    1. 0x4C: Field type. 0x4C is "L", which marks an object reference.
    2. 00 03: Length of the field name.
    3. 63 6F 6E: con, the field name.
    4. 0x74: TC_STRING. Represents a new string; the type of the field is given as a string.
    5. 00 09: Length of the string.
    6. 4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the JVM's standard object signature notation.
    7. 0x78: TC_ENDBLOCKDATA, the end-of-block marker that closes this class description.
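The type codes and signatures described above are also visible through java.io.ObjectStreamField. The sketch below lists the serializable fields of a class with one int field and one object field. The classes Part and Holder and the method describe are hypothetical stand-ins for the article's contain and SerialTest. The exact signature string depends on the package the classes live in, so the example prints it instead of assuming a value.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical classes for illustration.
class Part implements Serializable {
    int partVersion = 11;
}

class Holder implements Serializable {
    int version = 66;
    Part part = new Part();
}

final class FieldDescDemo {
    /** Lists each serializable field as "typeCode name [signature]", one per line. */
    static String describe(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        for (ObjectStreamField f : ObjectStreamClass.lookup(type).getFields()) {
            sb.append(f.getTypeCode()).append(' ').append(f.getName());
            if (!f.isPrimitive()) {
                // Object fields also carry a JVM type signature such as "Lsome/Class;".
                sb.append(' ').append(f.getTypeString());
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Holder.class));
    }
}
```

Primitive fields are listed before object fields, which matches the order in the byte dump: the int field version comes first, then the object field.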

Next the algorithm writes the description of the superclass, the parent class:

    1. 0x72: TC_CLASSDESC. Declares that this is a new class description.
    2. 00 06: Length of the class name.
    3. 70 61 72 65 6E 74: parent, the class name.
    4. 0E DB D2 BD-EE 7A: serialVersionUID, the serialization ID.
    5. 0x02: Flags. This value declares that the object supports serialization.
    6. 00 01: The number of fields in the class.

Next comes the field description for the parent class, int parentVersion = 10:

    1. 0x49: Field type. 0x49 is "I", which stands for int.
    2. 00 0D: Length of the field name.
    3. 70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the field name.
    4. 0x78: TC_ENDBLOCKDATA, the end-of-block marker for this class description.
    5. 0x70: TC_NULL, indicating that there are no more superclasses.

At this point the algorithm has written the descriptions of all the classes in the hierarchy: the class itself and its superclass. Only the contain class, the type of the con field, has not been described yet. The next step is to write the actual values of the object instance, starting with the fields of the parent class:

    1. 00 00 00 0A: 10, the value of the parentVersion field.

Then come the fields of the SerialTest class:

    1. 00 00 00 42: 66, the value of the version field.
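The superclass-first order of the data can be checked on a simplified hierarchy. In the sketch below, Base and Derived are hypothetical classes with one int field each and no object fields, so the instance data forms the last 8 bytes of the stream. The helper lastTwoInts, also an assumption of this example, reads those bytes back as two big-endian ints.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical hierarchy with one int field per class and no object fields.
class Base implements Serializable {
    int baseVersion = 10;
}

class Derived extends Base {
    int version = 66;
}

final class DataOrderDemo {
    /** Serializes obj and decodes the final 8 bytes of the stream as two ints. */
    static int[] lastTwoInts(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] data = bos.toByteArray();
        // ByteBuffer is big-endian by default, like the serialization stream.
        ByteBuffer tail = ByteBuffer.wrap(data, data.length - 8, 8);
        return new int[] {tail.getInt(), tail.getInt()};
    }

    public static void main(String[] args) {
        int[] values = lastTwoInts(new Derived());
        // The superclass field (10) is expected before the subclass field (66).
        System.out.println(values[0] + ", " + values[1]);
    }
}
```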

The next bytes are more interesting. The algorithm now has to describe the contain object, and remember that the contain class (the type of the con field) has not been described yet:

    1. 0x73: TC_OBJECT. Declares that this is a new object.
    2. 0x72: TC_CLASSDESC. Declares that a new class description starts here.
    3. 00 07: Length of the class name.
    4. 63 6F 6E 74 61 69 6E: contain, the class name.
    5. FC BB E6 0E FB CB C7: serialVersionUID, the serialization ID.
    6. 0x02: Flags. This value declares that the object supports serialization.
    7. 00 01: The number of fields in the class.

Then comes the description of contain's only field, int containVersion = 11:

    1. 0x49: Field type. 0x49 is "I", which stands for int.
    2. 00 0E: Length of the field name.
    3. 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the field name.
    4. 0x78: TC_ENDBLOCKDATA, the end-of-block marker for this class description.

At this point the serialization algorithm checks whether contain has a superclass and, if it does, writes that description too.

    1. 0x70: TC_NULL, there is no superclass.

Finally, the actual field value of the contain object is written.

    1. 00 00 00 0B: 11, the value of containVersion.

Summary:

The order of Java serialization:

Description of the class, then the description of its superclass, then the field values of the superclass, then the field values of the class. Descriptions run bottom-up; values run top-down.

Where a field refers to an object type, its description uses the JVM's standard object signature notation, and its value consists of the referenced object's class description followed by its field values (and, if that class has a superclass, the same procedure is applied recursively).

Reference: "The Java Serialization algorithm dialysis"--longdick's Blog
