Advanced understanding of Java serialization

Source: Internet
Author: User

Introduction: If all you know is that an object implementing the Serializable interface can be serialized to a local file, this article is worth a careful read. It takes a closer look at serialization through practical examples and code, covering parent class serialization, static variables, the transient keyword, and the serialization ID. I have run into serialization problems many times in real development work and share them with readers here.

Java serialization, which converts Java objects into a binary form, is an important part of the Java platform. In most cases, developers only need to know that a serializable class must implement the Serializable interface and that objects are written and read with ObjectOutputStream and ObjectInputStream. In some cases, however, that is far from enough. This article lists several real serialization-related situations the author has encountered. By analyzing the cause of each one, readers can more easily remember some of the finer points of Java serialization.

 

This article will introduce several situations one by one, in the following order.

    • Serialization ID Problems
    • Static variable serialization
    • Serialization of parent classes and transient keywords
    • Encrypt sensitive fields
    • Serialization storage rules

Each item in the list describes a separate situation and can be read independently.

 

Serialization ID Problems

Situation: Two clients, A and B, attempt to exchange object data over the network. A serializes object C into binary data and sends it to B, and B deserializes the data to obtain C.

Problem: Assume the fully qualified class name of object C is com.inout.Test. The class file exists on both client A and client B with exactly the same functional code, and it implements the Serializable interface, yet deserialization always fails.

Solution: Whether the virtual machine permits deserialization depends not only on whether the class path and functional code match, but also on whether the serialization IDs of the two classes match (that is, private static final long serialVersionUID = 1L). In Listing 1, the functional code of the two classes is identical, but their serialization IDs differ, so an object serialized with one cannot be deserialized with the other.

 

Important:

Put simply, Java's serialization mechanism verifies version consistency at runtime by checking a class's serialVersionUID. During deserialization, the JVM compares the serialVersionUID in the byte stream with the serialVersionUID of the corresponding local class. If they are the same, the versions are considered consistent and deserialization can proceed. Otherwise, deserialization fails with an exception reporting that the serialization versions are inconsistent.

When a class that implements the java.io.Serializable interface does not explicitly define a variable named serialVersionUID of type long, the Java serialization mechanism automatically generates a serialVersionUID from the compiled class and uses it for the version comparison. In this case, only the same compiled class is guaranteed to produce the same serialVersionUID.

If we do not want recompilation to force a split between software versions, that is, if an entity implementing the serialization interface should stay compatible with earlier, unchanged versions of the class, we need to explicitly define a variable named serialVersionUID of type long. As long as the value of this variable is not modified, the serialized entities can be serialized and deserialized with one another.
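The difference between an explicit and an automatically generated ID can be inspected at runtime. The following is a minimal sketch that uses the standard java.io.ObjectStreamClass API; the class names SuidLookup, Explicit, and Implicit are hypothetical and exist only for this illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SuidLookup {

    // Hypothetical class that declares its serialization ID explicitly.
    public static class Explicit implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Hypothetical class that leaves the ID to be computed by the runtime.
    public static class Implicit implements Serializable {
        String name;
    }

    // Returns the serialVersionUID that the serialization runtime associates with a class.
    public static long suidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // The explicit value is returned unchanged.
        System.out.println("explicit: " + suidOf(Explicit.class));
        // The computed value depends on the structure of the compiled class.
        System.out.println("computed: " + suidOf(Implicit.class));
    }
}
```

The computed value is stable only while the compiled class stays the same, which is why a class that must remain compatible across releases usually declares the ID itself.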

 

Listing 1. Two classes with the same functional code but different serialization IDs

    // First version of the class
    package com.inout;

    import java.io.Serializable;

    public class A implements Serializable {

        private static final long serialVersionUID = 1L;

        private String name;

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    // Second version: identical code, different serialization ID
    package com.inout;

    import java.io.Serializable;

    public class A implements Serializable {

        private static final long serialVersionUID = 2L;

        private String name;

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }
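The failure caused by mismatched serialization IDs can be reproduced inside a single JVM. The sketch below is an illustration under stated assumptions, not the article's original setup: it serializes a hypothetical Message class in memory, then changes one byte of the serialVersionUID stored in the stream to imitate a peer whose class declares a different ID. The byte offset relies on the standard stream layout, in which the ID follows the stream header, two type markers, and the length-prefixed class name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SuidMismatchDemo {

    // Hypothetical payload class used only for this illustration.
    public static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "hello";
    }

    // Serializes a Message into an in-memory byte array.
    public static byte[] serialize() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(new Message());
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns a copy of the stream in which the stored ID is 0L instead of 1L.
    public static byte[] withDifferentSuid(byte[] data) {
        int nameLength = Message.class.getName().getBytes(StandardCharsets.UTF_8).length;
        // 4 header bytes, TC_OBJECT, TC_CLASSDESC, 2-byte name length, the name,
        // then the 8-byte serialVersionUID; index 7 of those 8 is its lowest byte.
        int lowestSuidByte = 4 + 1 + 1 + 2 + nameLength + 7;
        byte[] copy = data.clone();
        copy[lowestSuidByte] ^= 0x01;
        return copy;
    }

    // Returns true if deserialization is rejected with InvalidClassException.
    public static boolean isRejected(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = serialize();
        System.out.println("matching ID rejected:  " + isRejected(original));
        System.out.println("different ID rejected: " + isRejected(withDifferentSuid(original)));
    }
}
```

The untouched stream deserializes normally, while the patched stream is refused even though the class code is the same on both sides, which mirrors the situation between clients A and B.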

 

Eclipse offers two strategies for generating the serialization ID. One is the fixed default 1L, and the other is a generated long value that is unlikely to repeat (it is actually produced using the JDK tools). Unless there is a special requirement, the default 1L is recommended, because it ensures that deserialization succeeds whenever the code is consistent. What, then, is the generated serialization ID good for? Changing the serialization ID can sometimes be used to restrict which users are able to use a program.

Feature Use Cases

Readers have probably heard of the Facade pattern, which provides a unified access interface for an application. The client in the case program uses this pattern. The structure of the case program is shown in Figure 1.

Figure 1. Case program structure

The client interacts with the business logic objects through a Facade object. The client cannot create its Facade object directly. The server must create it, serialize it, and send the binary object data to the client over the network, and the client is responsible for deserializing the data to obtain the Facade object. This pattern means that using the client program requires permission from the server, and the Facade class must be identical on the client and the server. When the server wants to roll out a new version, it only needs to regenerate the serialization ID of the server-side Facade class. Deserialization of the Facade object on the client then fails, which forces the client to obtain the latest program from the server.


Static variable serialization

Situation: View the code in Listing 2.

Listing 2. Code for static variable serialization

    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class Test implements Serializable {

        private static final long serialVersionUID = 1L;

        public static int staticVar = 5;

        public static void main(String[] args) {
            try {
                // Initially staticVar is 5
                ObjectOutputStream out = new ObjectOutputStream(
                        new FileOutputStream("result.obj"));
                out.writeObject(new Test());
                out.close();

                // After serialization, change it to 10
                Test.staticVar = 10;

                ObjectInputStream oin = new ObjectInputStream(
                        new FileInputStream("result.obj"));
                Test t = (Test) oin.readObject();
                oin.close();

                // Read the value through t.staticVar and print it
                System.out.println(t.staticVar);
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }
    }

 

In the main method in Listing 2, the value of the static variable is modified after the object is serialized. The serialized object is then read back, and the static variable's value is obtained through that object and printed. Does the System.out.println(t.staticVar) statement in Listing 2 print 10 or 5?

The final output is 10. Readers who find this puzzling may reason that the printed staticVar is obtained from the object that was read back, so it should reflect the state at the time of saving. It prints 10 because static variables are not saved during serialization. This is easy to understand: serialization saves the state of an object, whereas a static variable belongs to the state of the class. Therefore, serialization does not save static variables.


Serialization of parent classes and transient keywords

Situation: A subclass implements the Serializable interface, but its parent class does not. A subclass object is serialized and then deserialized, and the value of a variable defined in the parent class is printed. That value differs from the value at the time of serialization.

Solution: For the parent class's state to be serialized, the parent class must also implement the Serializable interface. If the parent class does not implement it, the parent class needs an accessible no-argument constructor. When the parent class does not implement the Serializable interface, the virtual machine does not serialize the parent object. Constructing a Java object requires the parent object to exist before the child object, and deserialization is no exception. During deserialization, the only way to construct the parent object is to call the parent class's no-argument constructor to build a default parent object. Consequently, when we read a variable of the parent object, its value is whatever it is after the parent class's no-argument constructor has run. If you anticipate this kind of serialization, initialize the variables in the parent class's no-argument constructor. Otherwise, the parent class variables hold their declared default values, for example 0 for an int and null for a String.
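The parent class rule can be observed with a small in-memory round trip. This is a minimal sketch: the classes Base and Child, their fields, and the value 7 set in the constructor are hypothetical, and a byte array stands in for the file or network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ParentStateDemo {

    // Hypothetical parent that does NOT implement Serializable.
    public static class Base {
        int baseValue;

        public Base() {
            baseValue = 7; // every deserialized instance starts from this value
        }
    }

    // Hypothetical serializable subclass.
    public static class Child extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        int childValue;
    }

    // Round-trips a Child in memory and returns {baseValue, childValue} of the copy.
    public static int[] demo() {
        Child original = new Child();
        original.baseValue = 42;  // parent state: not written to the stream
        original.childValue = 99; // subclass state: written to the stream
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                Child copy = (Child) in.readObject();
                return new int[] { copy.baseValue, copy.childValue };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = demo();
        System.out.println("baseValue = " + values[0]);
        System.out.println("childValue = " + values[1]);
    }
}
```

The copy reports the value assigned in the parent's no-argument constructor rather than 42, while the subclass field keeps the value it had when it was serialized.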

The transient keyword controls whether a variable is serialized. Adding this keyword before a variable declaration prevents the variable from being serialized to the file. After deserialization, a transient variable holds the default value for its type, for example 0 for an int and null for an object type.
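A short sketch of the transient behavior follows. The Account class and its field names are hypothetical, and the round trip is done in memory instead of through a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {

    // Hypothetical class with one ordinary field and two transient fields.
    public static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "alice";
        transient String token = "secret-token";
        transient int attempts = 3;
    }

    // Serializes and deserializes the account in memory.
    public static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Formats the three fields as "user,token,attempts".
    public static String describe(Account account) {
        return account.user + "," + account.token + "," + account.attempts;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Account()));            // before serialization
        System.out.println(describe(roundTrip(new Account()))); // after the round trip
    }
}
```

Note that the field initializers of Account are not run again during deserialization, so the transient fields come back as null and 0 rather than as their initial values.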

Feature Use Cases

We are familiar with using the transient keyword to keep a field from being serialized. Is there another way? Based on the serialization rules for parent classes, we can move the fields that should not be serialized into a parent class. The subclass implements the Serializable interface and the parent class does not, so by the parent class serialization rules the parent's field data is not serialized. This gives the class diagram shown in Figure 2.

Figure 2. Case program class diagram

As the diagram shows, attr1, attr2, attr3, and attr5 are not serialized. The advantage of placing fields in the parent class is that when another subclass is added, attr1, attr2, and attr3 are still not serialized, so there is no need to repeat transient and the code stays concise.


Encrypt sensitive fields

Situation: The server sends serialized object data to the client. Some of the data in the object is sensitive, such as a password string, and we want to encrypt that field during serialization. Only a client that holds the decryption key can then read the password during deserialization, which protects the serialized object's data to a certain extent.

Solution: During serialization, the virtual machine attempts to call the writeObject and readObject methods of the object's class to perform user-defined serialization and deserialization. If these methods do not exist, the defaults are used: the defaultWriteObject method of ObjectOutputStream and the defaultReadObject method of ObjectInputStream. User-defined writeObject and readObject methods let you control the serialization process, for example by changing the serialized values dynamically. This makes it possible to encrypt sensitive fields in practical applications. Listing 3 shows the process.

Listing 3. Code for encrypting a sensitive field

    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectInputStream.GetField;
    import java.io.ObjectOutputStream;
    import java.io.ObjectOutputStream.PutField;
    import java.io.Serializable;

    public class Test implements Serializable {

        private static final long serialVersionUID = 1L;

        private String password = "pass";

        public String getPassword() {
            return password;
        }

        public void setPassword(String password) {
            this.password = password;
        }

        // Called by the serialization mechanism instead of the default write
        private void writeObject(ObjectOutputStream out) {
            try {
                PutField putFields = out.putFields();
                System.out.println("Original password: " + password);
                password = "encryption"; // simulate encryption
                putFields.put("password", password);
                System.out.println("Encrypted password: " + password);
                out.writeFields();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Called by the serialization mechanism instead of the default read
        private void readObject(ObjectInputStream in) {
            try {
                GetField readFields = in.readFields();
                Object object = readFields.get("password", "");
                System.out.println("String to be decrypted: " + object.toString());
                password = "pass"; // simulate decryption using the locally held key
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            try {
                ObjectOutputStream out = new ObjectOutputStream(
                        new FileOutputStream("result.obj"));
                out.writeObject(new Test());
                out.close();

                ObjectInputStream oin = new ObjectInputStream(
                        new FileInputStream("result.obj"));
                Test t = (Test) oin.readObject();
                System.out.println("Decrypted string: " + t.getPassword());
                oin.close();
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }
    }

 

In the writeObject method in Listing 3, the password is encrypted, and in readObject it is decrypted. Only a client that holds the key can recover the password correctly, which protects the data. After Listing 3 runs, the console output is as shown in Figure 3.

Figure 3. Data Encryption demonstration
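Listing 3 only simulates encryption with fixed strings. As a hedged sketch of how the same writeObject and readObject hooks could carry real ciphertext, the example below marks the password transient and encrypts it with AES-GCM from the standard javax.crypto API. The Credentials class, the hard-coded DEMO_KEY, and the helper names are assumptions made for illustration. A real application would obtain the key from secure storage and would not embed it in the code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EncryptedFieldDemo {

    // Demo-only 16-byte key, assumed to be known to both sides (an assumption of this sketch).
    private static final byte[] DEMO_KEY =
            "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

    public static class Credentials implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String user;
        private transient String password; // skipped by default serialization

        public Credentials(String user, String password) {
            this.user = user;
            this.password = password;
        }

        public String getPassword() {
            return password;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient field 'user'
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv); // fresh nonce for every write
            out.writeObject(iv);
            out.writeObject(crypt(Cipher.ENCRYPT_MODE, iv,
                    password.getBytes(StandardCharsets.UTF_8)));
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            byte[] iv = (byte[]) in.readObject();
            byte[] cipherText = (byte[]) in.readObject();
            password = new String(crypt(Cipher.DECRYPT_MODE, iv, cipherText),
                    StandardCharsets.UTF_8);
        }
    }

    // Encrypts or decrypts with AES-GCM, depending on the mode.
    private static byte[] crypt(int mode, byte[] iv, byte[] input) throws IOException {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, new SecretKeySpec(DEMO_KEY, "AES"),
                    new GCMParameterSpec(128, iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IOException("password field could not be processed", e);
        }
    }

    public static byte[] serialize(Credentials credentials) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(credentials);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static Credentials deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Credentials) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, deserialize(serialize(new Credentials("alice", "s3cret-pass"))).getPassword() returns the original password on a side that holds the key, while the serialized bytes no longer contain the password in readable form.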

Feature Use Cases

RMI is built entirely on Java serialization. The parameter objects needed for a server-side interface call come from the client and are transmitted over the network, which raises the issue of secure transmission in RMI. We may want to encrypt sensitive fields such as the user name and password (the user must send the password when logging in). In that case, the method described in this section can be used to encrypt the password on the client and decrypt it on the server, helping to secure the data in transit.


Serialization storage rules

Situation: The problem code is shown in Listing 4.

Listing 4. Storage rule problem code

    ObjectOutputStream out = new ObjectOutputStream(
            new FileOutputStream("result.obj"));
    Test test = new Test();

    // Try to write the same object to the file twice
    out.writeObject(test);
    out.flush();
    System.out.println(new File("result.obj").length());
    out.writeObject(test);
    out.close();
    System.out.println(new File("result.obj").length());

    ObjectInputStream oin = new ObjectInputStream(
            new FileInputStream("result.obj"));

    // Read two objects back from the file
    Test t1 = (Test) oin.readObject();
    Test t2 = (Test) oin.readObject();
    oin.close();

    // Check whether the two references point to the same object
    System.out.println(t1 == t2);

 

In Listing 4, the same object is written to the file twice. The file size after the first write and after the second write is printed, then two objects are deserialized from the file and compared to see whether they are the same object. The intuitive expectation is that writing the object twice doubles the file size, and that deserialization produces two distinct objects because both are read from the file, so the equality comparison should print false. The actual output, however, is shown in Figure 4.

Figure 4. Sample program output

We can see that the second write adds only 5 bytes and that the two objects are equal. Why?

Answer: To save disk space, the Java serialization mechanism follows specific storage rules. When the same object is written to a file again, the object's content is not stored a second time. Only a reference to it is stored. The 5 bytes added above are the space taken by the new reference plus some control information. During deserialization, the reference relationship is restored, so t1 and t2 in Listing 4 point to the same object. The two are equal and the output is true. This storage rule saves a great deal of storage space.

Feature Case Analysis

View the code in listing 5.

Listing 5. Case code

    ObjectOutputStream out = new ObjectOutputStream(
            new FileOutputStream("result.obj"));
    Test test = new Test();

    // First write: i is 1
    test.i = 1;
    out.writeObject(test);
    out.flush();

    // Second write of the same object after changing i to 2
    test.i = 2;
    out.writeObject(test);
    out.close();

    ObjectInputStream oin = new ObjectInputStream(
            new FileInputStream("result.obj"));
    Test t1 = (Test) oin.readObject();
    Test t2 = (Test) oin.readObject();
    System.out.println(t1.i);
    System.out.println(t2.i);

 

The purpose of Listing 5 is to save the test object to the result.obj file twice, modifying the object's attribute value after the first write, and then to read the two objects back from result.obj in sequence and print the i attribute of each. The intent of the case code is to transfer the object's state before and after the modification in a single pass.

Both outputs are 1. After the object has been written once, the virtual machine recognizes on the second write, based on the reference relationship, that the same object has already been written to the file, so it stores only a reference for the second write. On reading, both references therefore yield the object as it was saved the first time. Pay special attention to this issue when calling writeObject multiple times on the same file.
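One way to make the second write carry the updated state is to call the reset method of ObjectOutputStream between the writes, which makes the stream forget the objects it has already written. The sketch below compares both behaviors in memory. The Counter class and the writeTwice helper are hypothetical names for this illustration, and using reset is one option rather than something the original case code does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ResetDemo {

    // Hypothetical class with a single mutable field.
    public static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int i;
    }

    // Writes the same object twice, changing i from 1 to 2 in between,
    // and returns the two values of i that are read back.
    public static int[] writeTwice(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                Counter counter = new Counter();
                counter.i = 1;
                out.writeObject(counter);
                if (resetBetweenWrites) {
                    out.reset(); // forget previously written objects
                }
                counter.i = 2;
                out.writeObject(counter);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                Counter first = (Counter) in.readObject();
                Counter second = (Counter) in.readObject();
                return new int[] { first.i, second.i };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] shared = writeTwice(false);
        int[] separate = writeTwice(true);
        System.out.println(shared[0] + " " + shared[1]);
        System.out.println(separate[0] + " " + separate[1]);
    }
}
```

Without reset, both reads return the state captured by the first write. With reset, the second write stores the full object again, at the cost of the space that the reference rule would otherwise save.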


Summary

Through several concrete scenarios, this article has introduced some advanced aspects of Java serialization. Although they are called advanced, that does not mean readers must already have known them. I hope the scenarios described here leave a stronger impression, help readers use Java serialization more sensibly, and make it possible to resolve serialization problems promptly when they come up in future development. My knowledge is limited, so if there are any mistakes in the article, please contact me with criticism and corrections.
