[Crazy Java] The highest level of I/O: object streams (serialization: manual serialization, automatic serialization, reference serialization, versioning)

Source: Internet
Author: User

1. What is an object stream: the concept of serialization/deserialization

1) Object streams belong to the same conceptual family as byte streams and character streams:

A. A byte stream is a flowing sequence of bytes and a character stream is a flowing sequence of characters, so is an object stream a flowing sequence of objects?

B. Conceptually, yes: an object stream is used specifically to transfer Java objects;

C. Bytes and characters map directly onto binary data (a byte already is binary, and a character is a binary encoding), so a flow of them fits the computer's model. An object, however, is an abstraction; how can objects flow the way binary data does?

D. The answer is simple: the object stream is just a Java API. On the surface (method calls and so on) objects flow; underneath, they are converted to binary data;

E. Specifically, the object is converted into a platform-independent byte stream. The object stream classes are ObjectInputStream and ObjectOutputStream; the Stream suffix indicates that they are byte streams;

F. When you call the object stream's read and write methods you do not convert the object into a byte array yourself; you pass in the object itself and the stream translates it into bytes automatically!!

2) Why object streams are needed:

I. Java is an object-oriented language, so objects are used more widely than raw bytes or characters;

II. Text is naturally passed with character streams, which are widely used for text and string processing and storage. Passing raw bytes (images, audio, binary files and so on) is also common. But Java deals with more than these two; what it handles most is objects;

III. Programs often need to pass Java objects to and from storage nodes (that is, input and output of Java objects). The traditional options are a byte stream or a character stream:

A. The character stream can be ruled out first: an object may contain textual data (strings and so on) as well as non-text data (bytes, images and so on), and converting data such as images to characters is clearly not feasible;

B. That leaves the byte stream, which is cumbersome: you must manually convert every non-byte member of the object into bytes and assemble the results into one byte array to write (output). For input you must read the whole byte array, parse it, and finally restore the original object;

!! This is so laborious that nobody would want to do object I/O this way;
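
To make the cost of the traditional approach concrete, here is a minimal sketch of hand-written byte conversion. It is only an illustration: the Person class is hypothetical, and DataOutputStream/DataInputStream and the in-memory byte array are assumptions chosen to keep the example short (the text above does not name them).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ManualBytesDemo {
    // Hypothetical value class, used only for this illustration.
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Hand-written encoding: every member is converted to bytes explicitly.
    static byte[] toBytes(Person p) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(p.name);
            out.writeInt(p.age);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hand-written decoding: members must be parsed back in the same order.
    static Person fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            int age = in.readInt();
            return new Person(name, age);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes and decodes one Person and returns a printable summary.
    public static String roundTrip(String name, int age) {
        Person copy = fromBytes(toBytes(new Person(name, age)));
        return copy.name + " (" + copy.age + ")";
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Tom", 15)); // prints "Tom (15)"
    }
}
```

Every new member of Person would need matching lines in both toBytes and fromBytes, which is the maintenance burden that object streams remove.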

3) So Java provides object streams (ObjectInputStream, ObjectOutputStream) to serialize and transfer objects automatically:

I. First, understand the concept of serialization:

A. The concept is the same as serialization in C++;

B. Whatever data is passed (bytes, characters, objects), at the lowest level it must be a binary byte stream, because the computer only recognizes binary. Bytes need no conversion, characters must be converted to a binary encoding, and likewise objects must be converted to a sequence of bytes before they can be transmitted;

C. Serialization is the process of converting a program's data (abstract data such as characters and objects) into a binary byte sequence;

D. Serialization is a precondition for data transmission;

!! Deserialization reads serialized data stored on a storage node and restores it to the original Java object!

II. Java's object streams serialize objects automatically:

Output: ObjectOutputStream

A. When an object is written to an object stream, the stream first examines the object's members automatically;

B. It then serializes each member into bytes;

C. The pieces are assembled, member by member, into one complete byte sequence, which is then written out;

Input: ObjectInputStream

A. When writing, ObjectOutputStream also records identifying information for the object itself and for each of its members;

B. That information is essentially the Java type of the data (the type must be known to restore the object) and the size of the data (the reader must know how many bytes to read to finish reading the object);

C. Using this complete information, the object input stream can restore Java objects from the stream;

D. Underneath, the serialized bytes are read, parsed according to that information, and a complete Java object is restored;

E. This identifying information is in effect the protocol for serialization and deserialization; both sides must follow it for deserialization to succeed after serialization;


2. The general process of input and output with object streams:

1) We already know that the object streams are ObjectOutputStream and ObjectInputStream; here is the general process of using them;

2) I/O always has two endpoints: where the data flows from and where it flows to. One endpoint is always the current program, so you must specify the other;

3) The first step is therefore to choose the storage node. ObjectOutputStream and ObjectInputStream are high-level processing streams, so each must wrap a concrete node stream;

4) The two constructors:

I. ObjectInputStream(InputStream in);

II. ObjectOutputStream(OutputStream out);

!! They can be initialized with any node stream;
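
As a small illustration of the two constructors, the sketch below wraps an in-memory node stream instead of a file. ByteArrayOutputStream and ByteArrayInputStream are an assumption made only so the example runs without touching the disk; any other InputStream/OutputStream could be wrapped the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class WrapNodeStreamDemo {
    // Writes one object to an in-memory node stream and reads it back.
    public static Object roundTrip(Object value) {
        ByteArrayOutputStream node = new ByteArrayOutputStream();
        // The object stream wraps the node stream passed to its constructor.
        try (ObjectOutputStream oos = new ObjectOutputStream(node)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(node.toByteArray()))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String copy = (String) roundTrip("hello"); // cast to the actual type
        System.out.println(copy); // prints "hello"
    }
}
```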

5) Then read and write with the object stream's read and write methods:

I. The read and write methods of object streams are simple to use; you do not need to handle bytes or byte arrays yourself;

II. Read family: Xxx ObjectInputStream.readXxx();

!! The result of readObject must be cast to the actual type

III. Write family: void ObjectOutputStream.writeXxx(Xxx val);

IV. Xxx covers almost all of Java's primitive types (byte, int, char, boolean, double, and so on) and, most importantly, Object: readObject and writeObject are the key methods for input and output of Java objects;

!! Note that there is no readString/writeString, because String is a class rather than a primitive type; read and write a String directly with readObject and writeObject (readUTF and writeUTF are also available for strings)!!

!! In other words, object streams can input and output ordinary data as well as Java objects, which shows how powerful they are!

!! If Java objects can be written and read directly, why are versions for primitive types such as byte, int, and double also provided? They exist to support custom serialization!
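
The read and write families can be mixed on one stream, as this minimal sketch shows. It again assumes an in-memory node stream for brevity; the values are arbitrary, and the reads simply mirror the writes in the same order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class MixedValuesDemo {
    // Writes primitives and an object, then reads them back in the same order.
    public static String writeThenRead() {
        ByteArrayOutputStream node = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(node)) {
            oos.writeInt(42);
            oos.writeDouble(3.5);
            oos.writeBoolean(true);
            oos.writeObject("text"); // a String goes through writeObject
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(node.toByteArray()))) {
            int i = ois.readInt();
            double d = ois.readDouble();
            boolean b = ois.readBoolean();
            String s = (String) ois.readObject(); // readObject returns Object
            return i + " " + d + " " + b + " " + s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead()); // prints "42 3.5 true text"
    }
}
```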


3. Custom serialization: a class must implement the Serializable interface to be serialized

1) Calling the object stream's read and write methods does not work for just any object; the object must be serializable and deserializable!

2) The object stream must know how to serialize and deserialize the object in order to input and output it correctly;

3) The Serializable interface:

I. A class must implement this interface to be serialized and deserialized;

II. A class that wants its own algorithm defines two special methods, which specify how to serialize and how to deserialize respectively:

A. Serialization algorithm: private void writeObject(ObjectOutputStream out) throws IOException;

B. Deserialization algorithm: private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException;

!! Deserialization does not build a Serializable object with its constructor. ObjectInputStream.readObject() reconstructs the object directly from the input stream and returns it as Object, which you then cast to the real type. This is why ClassNotFoundException can be thrown: it signals that the class of the serialized object cannot be found.

!! Underneath, object stream input and output actually make these calls:

A. ObjectOutputStream oos: oos.writeObject(obj) calls obj.writeObject(oos)

B. ObjectInputStream ois: ois.readObject() calls obj.readObject(ois)

!! So the algorithm implementations use the oos and ois they receive to perform the actual input and output;
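
The call chain above can be observed with a small sketch. The Tracked class and its two static flags are hypothetical, added only to record that the private methods were invoked by the stream; the in-memory node stream is likewise an assumption for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class HookDemo {
    // Hypothetical class whose private methods record that they were called.
    static class Tracked implements Serializable {
        private static final long serialVersionUID = 1L;
        static boolean writeHookCalled = false;
        static boolean readHookCalled = false;
        String value;

        Tracked(String value) {
            this.value = value;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            writeHookCalled = true;   // reached via oos.writeObject(obj)
            out.writeObject(value);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            readHookCalled = true;    // reached via ois.readObject()
            value = (String) in.readObject();
        }
    }

    // Serializes and deserializes one Tracked object and reports what happened.
    public static String roundTrip(String value) {
        ByteArrayOutputStream node = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(node)) {
            oos.writeObject(new Tracked(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(node.toByteArray()))) {
            Tracked copy = (Tracked) ois.readObject();
            return "write=" + Tracked.writeHookCalled
                 + " read=" + Tracked.readHookCalled
                 + " value=" + copy.value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("secret")); // prints "write=true read=true value=secret"
    }
}
```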

4) Some data should not be read and written in its raw form. Passwords and similar information often need to be encrypted on output and decrypted on input; in such cases you must define your own serialization and deserialization algorithms;

5) The vast majority of Java's basic classes, such as String and Date, already implement Serializable, so they can be read and written directly with the object stream's readObject and writeObject;

6) For all primitive types (int, double, boolean, and so on) the object streams provide the corresponding readXxx and writeXxx methods, so those are covered too;

7) A custom class that needs special handling implements its own serialization and deserialization algorithms, and the primitive read and write methods above are provided for exactly that purpose, for example:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Member implements Serializable {
    String name;
    int age;

    public Member(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    @Override
    public String toString() {
        return name + "(" + age + ")";
    }

    // Serialization algorithm: encrypt before output
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.writeObject(new StringBuffer(name).reverse()); // name: reversed
        out.writeInt((age << 4) + 13);                     // age: shift left 4 bits, add 13
    }

    // Deserialization algorithm: decrypt on input (the inverse of serialization)
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        name = ((StringBuffer) in.readObject()).reverse().toString();
        age = (in.readInt() - 13) >> 4;
    }
}

public class Test {
    public static void print(String s) {
        System.out.println(s);
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("OUT.txt"))) {
            Member m = new Member("lalala", 15); // the age value is illustrative
            oos.writeObject(m);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("OUT.txt"))) {
            Member m = (Member) ois.readObject();
            print("" + m);
        }
    }
}

!! Although the serialized result is binary and looks garbled when opened, the character data is still stored as Unicode text, so you can spot the reversed name "alalal";

8) Guidelines for implementing serialization and deserialization algorithms:

I. Write serialization first and deserialization second. Serialization is an encoding process and deserialization a decoding process; encoding logic normally comes first, and decoding is described as the inverse of encoding rather than the other way around;

!! Simply put, deserialization is written to match serialization;

II. Deserialization must follow the order of serialization: if serialization writes the String member first and the int member second, deserialization must also read the String first and then the int, because the serialized data is sequential;


4. Automatic serialization, also called recursive serialization (Serializable is in fact just a marker interface):

1) The custom serialization described above is manual serialization: "manual" means that you implement the serialization and deserialization algorithms yourself;

2) In fact, if every member of your object is serializable (its class implements Serializable, or it is a primitive), the object can be serialized automatically without implementing any algorithm of its own;

3) For example: object A contains member object B, B contains member object C, C contains member object D, and so on. If B, C, D, ... are all serializable, then A can be serialized automatically without its own algorithm. Serializing A proceeds as A -> B -> C -> D -> ...: serializing A automatically serializes B, serializing B automatically serializes C, and so on. The mechanism recursively descends layer by layer into the member objects, and these calls are automatic;

! Concrete example: an object a of type A (String b, B c (Date d, String e)), that is, a contains member b (type String) and member c (type B), and c contains member d (type Date) and member e (type String). If B implements its own serialization/deserialization algorithm, a can be serialized automatically without A implementing one: b is serialized, then B's writeObject is called for c. If B does not implement its own algorithm (no writeObject in B), that is fine too, because its members d and e are serializable and are serialized automatically in turn;

4) Why can a class be serialized without implementing those methods?

I. Serializable is a marker interface. The serialization and deserialization algorithms need not be implemented; the interface is only a tag saying that the class is serializable!

II. Making a class serializable really just means giving it this tag! Even without the two algorithms, its objects can be serialized!

5) Precise definition of automatic serialization:

I. Declaring only the Serializable tag, without implementing the serialization and deserialization algorithms, means the automatic serialization mechanism is used;

II. If you implement the serialization/deserialization algorithms, your implementations are invoked during serialization/deserialization;

III. If you do not, the mechanism recursively serializes the member objects of the next layer (and if those do not implement their own algorithms either, it continues down to the layer below them);

IV. Automatic serialization is therefore also called recursive serialization;

6) The precondition for recursive serialization: because automatic serialization relies on the members rather than on an algorithm of the class itself, every member must be serializable!

!! This is easy to understand: if a member is not serializable (no Serializable tag), the mechanism has no way to serialize it, and a NotSerializableException is thrown at run time (the code still compiles without error!);

!! In general, automatic serialization is used when a class contains only Java's basic classes or primitive data, or when its member objects (custom types) already implement their own serialization algorithms;

! Manual serialization (implementing the algorithm yourself) is needed only for special requirements, such as encrypting an object;

** Summary: if any member of an object is not serializable, the object cannot be serialized automatically even if its own class carries the Serializable tag; attempting it throws an exception!
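
The failure described in this summary can be reproduced with a short sketch. The Car and Engine classes are hypothetical names invented for the illustration, and the bytes are written to an in-memory stream that is simply discarded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class NotSerializableDemo {
    // Hypothetical member class without the Serializable tag.
    static class Engine {
    }

    // The owner carries the tag, but one of its members does not.
    static class Car implements Serializable {
        private static final long serialVersionUID = 1L;
        Engine engine = new Engine();
    }

    // Attempts to serialize the given object and reports the outcome.
    public static String trySerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return "ok";
        } catch (NotSerializableException e) {
            return "NotSerializableException";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static String serializeCar() {
        return trySerialize(new Car());
    }

    public static void main(String[] args) {
        System.out.println(trySerialize("plain string")); // prints "ok"
        System.out.println(serializeCar());               // prints "NotSerializableException"
    }
}
```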

7) Example of automatic serialization: simple, only the Serializable tag is needed

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Member implements Serializable {
    String name; // serializable: Java has already implemented it for you
    int age;     // primitive types are serializable too

    public Member(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public String toString() {
        return name + "(" + age + ")";
    }
}

public class Test {
    public static void print(String s) {
        System.out.println(s);
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("OUT.txt"))) {
            Member m = new Member("lalala", 15); // the age value is illustrative
            oos.writeObject(m);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("OUT.txt"))) {
            Member m = (Member) ois.readObject();
            print("" + m);
        }
    }
}

!! Because Serializable is only a marker interface, Eclipse's auto-override feature cannot generate the two algorithm methods, so their signatures (including the thrown exceptions) must be memorized and written by hand;
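
The nested case described earlier, A (String b, B c (Date d, String e)), can be sketched in the same way. Neither class defines writeObject or readObject, so everything relies on the recursive mechanism; the field values and the in-memory node stream are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Date;

public class NestedAutoDemo {
    // Inner layer: only serializable members, no custom algorithm.
    static class B implements Serializable {
        private static final long serialVersionUID = 1L;
        Date d;
        String e;

        B(Date d, String e) {
            this.d = d;
            this.e = e;
        }
    }

    // Outer layer: holds a String and a B, again with no custom algorithm.
    static class A implements Serializable {
        private static final long serialVersionUID = 1L;
        String b;
        B c;

        A(String b, B c) {
            this.b = b;
            this.c = c;
        }
    }

    // Serializes one nested object graph and describes the restored copy.
    public static String copyAndDescribe() {
        A original = new A("b", new B(new Date(0L), "e"));
        ByteArrayOutputStream node = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(node)) {
            oos.writeObject(original); // a, then c, then d and e, all automatic
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(node.toByteArray()))) {
            A copy = (A) ois.readObject();
            return copy.b + "," + copy.c.e + "," + copy.c.d.getTime();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyAndDescribe()); // prints "b,e,0"
    }
}
```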

5. Java's serialization mechanism: reference serial numbers:

1) Consider the following situation:

Son son = new Son("Tom", 15); Parent father = new Parent(son, 40); Parent mother = new Parent(son, 39);
! Two objects hold the same member object: here father.son == mother.son (exactly the same address). This kind of shared association is used very widely in Java (especially in database applications);

!! If son, father, and mother are serialized, will son be serialized three times? If so, deserialization would produce three Son objects, and father and mother would hold different sons (different addresses, three separate regions of memory). Would that not defeat the purpose of the association?

! Fortunately, Java serialization does not work that way. It recognizes that the same object is held in several places and ensures that only a single shared object exists;

2) Java's serialization mechanism: reference serial numbers:

I. What is passed to writeObject is a reference (in effect a pointer whose value is the object's memory address);

II. During serialization, each object reference passed in is assigned a serial number (that is, a serial number is mapped to each distinct object being serialized);

III. Before serializing, the stream checks whether the object has already been serialized. If it has, the object is not serialized again and only its serial number is written; if not, the object's content is serialized and written together with its number;

IV. In other words, the number is always written; the content is written only the first time the object is serialized;

3) For the example above, the serialized result is: son as number 1 ("Tom", 15), father as number 2 (ref. 1, 40), mother as number 3 (ref. 1, 39)

4) Deserialization works the same way: objects are identified by number, each number corresponds to exactly one object, and the same number is always restored to the same object without duplication;

5) Example:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Son implements Serializable {
    String name;
    int age;

    public Son(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class Parent implements Serializable {
    Son son;
    String name;
    int age;

    public Parent(Son son, String name, int age) {
        this.son = son;
        this.name = name;
        this.age = age;
    }
}

public class Test {
    public static void print(String s) {
        System.out.println(s);
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("Test.buf"))) {
            Son son = new Son("Tom", 15);
            Parent father = new Parent(son, "Peter", 40);
            Parent mother = new Parent(son, "Mary", 39);
            oos.writeObject(son);
            oos.writeObject(father);
            oos.writeObject(mother);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("Test.buf"))) {
            Son son = (Son) ois.readObject();
            Parent father = (Parent) ois.readObject();
            Parent mother = (Parent) ois.readObject();
            print("" + (son == father.son));        // true
            print("" + (father.son == mother.son)); // true
        }
    }
}


6. Potential danger of the serialization mechanism:

1) Because numbering is based on the reference, the reference (address) alone decides whether an object's content is serialized;

2) Imagine a mutable object that has already been serialized. If you change one of its members and serialize it again, the changed content is not written; only its number is written (because it has been serialized once);

3) Java object serialization must therefore follow this rule: serialize an object only once its state is final, and do not change it after serialization!!

4) Test:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Son implements Serializable {
    String name;
    int age;

    public Son(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class Parent implements Serializable {
    Son son;
    String name;
    int age;

    public Parent(Son son, String name, int age) {
        this.son = son;
        this.name = name;
        this.age = age;
    }
}

public class Test {
    public static void print(String s) {
        System.out.println(s);
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("Test.buf"))) {
            Son son = new Son("Tom", 15);
            Parent father = new Parent(son, "Peter", 40);
            Parent mother = new Parent(son, "Mary", 39);
            oos.writeObject(son);
            son.name = "ChaCha"; // change the object after it has been serialized
            oos.writeObject(father);
            oos.writeObject(mother);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("Test.buf"))) {
            Son son = (Son) ois.readObject();
            Parent father = (Parent) ois.readObject();
            Parent mother = (Parent) ois.readObject();
            print(son.name); // Tom, not ChaCha: the changed son was not serialized again
            print("" + (father.son == mother.son));
        }
    }
}
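
The test above can be condensed into an in-memory sketch that also shows one possible remedy not covered by the text: ObjectOutputStream.reset(), which makes the stream forget the objects it has already written. Whether reset() suits a given program is a design decision; treat this as an illustration under that assumption rather than a recommended pattern. The Son class here is a reduced, hypothetical version with only a name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StaleWriteDemo {
    // Reduced, hypothetical version of Son for this illustration.
    static class Son implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;

        Son(String name) {
            this.name = name;
        }
    }

    // Writes the same object twice, changing it in between, and returns
    // the names read back from the first and the second write.
    public static String[] writeTwice(boolean resetBetweenWrites) {
        ByteArrayOutputStream node = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(node)) {
            Son son = new Son("Tom");
            oos.writeObject(son);
            son.name = "ChaCha";      // changed after the first write
            if (resetBetweenWrites) {
                oos.reset();          // forget previously written objects
            }
            oos.writeObject(son);     // without reset: only the number is written
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(node.toByteArray()))) {
            Son first = (Son) ois.readObject();
            Son second = (Son) ois.readObject();
            return new String[] { first.name, second.name };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] plain = writeTwice(false);
        String[] withReset = writeTwice(true);
        System.out.println(plain[0] + " " + plain[1]);         // prints "Tom Tom"
        System.out.println(withReset[0] + " " + withReset[1]); // prints "Tom ChaCha"
    }
}
```

Note that after reset() the second read yields a separate object, so the shared-reference guarantee described in section 5 no longer holds across the reset point.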


7. Serialization versions:

1) Suppose your class is extended and upgraded later. How do you read old-version objects previously stored on a node? Are they compatible? What should you do if they are not? This is the versioning problem of serialization;

2) Version number of a serializable class:

I. Every class marked Serializable implicitly has a version number: private static final long serialVersionUID;

II. You can define this value yourself, directly in the class definition: private static final long serialVersionUID = 2016L; The serialization version of the class is then 2016;

III. If you do not declare the version number explicitly, the serialization runtime computes a default one (the value has no obvious meaning; a large negative integer is possible);

3) Role of the version number: when an object is serialized, the version number of its class is written as well. On deserialization, the runtime checks whether the class's current version number matches the one written; if not, it refuses to deserialize and throws an InvalidClassException!

!! This leads to an important rule: for robust software, declare the version number explicitly! The default is computed from details of the class that can vary between compiler implementations, so another computer or toolchain might produce a different number, and even the same source code could then fail to deserialize because of mismatched versions!
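
The version number of a class can also be inspected from code. The following sketch uses ObjectStreamClass.lookup from java.io; the Versioned class is hypothetical and reuses the value 2016L from the text.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class VersionDemo {
    // Hypothetical class with an explicitly declared version number.
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 2016L;
        String name;
    }

    // Returns the serialVersionUID the serialization runtime uses for a
    // serializable class (lookup returns null for non-serializable classes).
    public static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static long demoVersion() {
        return versionOf(Versioned.class);
    }

    public static void main(String[] args) {
        System.out.println(demoVersion()); // prints 2016
    }
}
```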

4) The JDK tool for viewing a class's serialization version number: the serialver command in the JDK's bin directory. Usage: serialver ClassName, which prints the serialization version number of the class. The class name must be fully qualified (including the package), so change to the appropriate directory (the root of the class path) before running the command.

!! Alternatively, run serialver -show, which (on JDK releases that still support this option) opens a dialog where you enter the fully qualified class name and click a button to display the version number in the text box below; the class is again resolved relative to the current working directory!

5) Some changes to a class do not affect deserialization, while others break it even if the version number is unchanged:

I. Changing only methods, not data members, has no effect; // inevitable, because only data is serialized, not methods

II. Changing only the class's static variables has no effect; // also obvious, because what is serialized is the object, not the class itself!

III. If the updated class merely has fewer data members than the old one (everything else unchanged), there is no effect; // the extra data in the stream is simply discarded

IV. If the updated class merely has more data members than the old one (everything else unchanged), there is no effect; // on deserialization the extra members are filled with null or zero

!! Other changes (such as changing the type of a field) can make deserialization fail even if the version number is unchanged, because the old data no longer fits the new class definition! Note that with default serialization the fields are matched by name, so reordering field declarations alone does not break compatibility; the order matters in hand-written writeObject/readObject code, where reads must mirror writes.

!! For such changes you must assign a new version number. Old-version data is then read with the old version of the class, completely separate from the new version (in short, the new class cannot accept old-version data from the node; the two are incompatible);
