Crazy Java Learning Notes: 5 things you didn't know about Java object serialization

Source: Internet
Author: User
Tags decrypt object serialization

A few years ago, while working with a software team on an application written in Java, I learned the benefit of knowing a little more about Java object serialization than the average programmer does.

About this series

So you think you know Java programming? In fact, most programmers only scratch the surface of the Java platform, learning just enough to get the job at hand done. In this series, Ted Neward digs beneath the core features of the Java platform to uncover little-known facts that can help you solve even the stickiest programming challenges.

About a year earlier, a developer responsible for managing the application's user-specific settings had decided to store them in a Hashtable and then serialize the Hashtable to disk for persistence. Whenever a user changed a setting, the Hashtable was simply written back to disk.

This was an elegant and open-ended settings system, but it fell apart when the team decided to migrate from Hashtable to the Java Collections library's HashMap.

The on-disk forms of Hashtable and HashMap are different and incompatible. Short of running some kind of data conversion utility over every persisted user setting (a monumental task), it looked as if Hashtable would remain the application's storage format for the rest of its lifetime.

The team felt stuck, but only because they didn't know one important fact about Java serialization: it was built to allow types to evolve over time. Once I showed them how to do automatic serialization replacement, the transition to HashMap went ahead as planned.

This article is the first in a series dedicated to uncovering useful trivia about the Java platform: obscure details that come in handy sooner or later when solving Java programming challenges.

The Java Object Serialization API is a good place to start because it has been around since the beginning: JDK 1.1. The five things this article describes about serialization should persuade you to take a second look at those standard Java APIs.

Introduction to Java serialization

Java object serialization is one of the pioneering features introduced in JDK 1.1. It is a mechanism for converting the state of a Java object into an array of bytes for storage or transmission, and then converting those bytes back into a Java object with the original state.

In essence, the idea behind serialization is to "freeze" the state of an object, move that state around (write it to disk, send it over the network, and so on), and then "thaw" it back into usable Java objects. All of this happens almost like magic, thanks to the ObjectInputStream/ObjectOutputStream classes, full-fidelity metadata, and the programmer's willingness to "opt in" to the process by tagging classes with the Serializable marker interface.

Listing 1 shows a Person class that implements Serializable.

Listing 1. Serializable Person

    package com.tedneward;

    public class Person
        implements java.io.Serializable
    {
        public Person(String fn, String ln, int a)
        {
            this.firstName = fn; this.lastName = ln; this.age = a;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public int getAge() { return age; }
        public Person getSpouse() { return spouse; }

        public void setFirstName(String value) { firstName = value; }
        public void setLastName(String value) { lastName = value; }
        public void setAge(int value) { age = value; }
        public void setSpouse(Person value) { spouse = value; }

        public String toString()
        {
            // Guard against a missing spouse so toString() cannot throw
            return "[Person: firstName=" + firstName +
                " lastName=" + lastName +
                " age=" + age +
                " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
                "]";
        }

        private String firstName;
        private String lastName;
        private int age;
        private Person spouse;
    }

Once Person is serializable, it is easy to write the object's state to disk and read it back, as the following JUnit 4 unit test demonstrates.

Listing 2. Deserializing a Person

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.fail;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;

    import org.junit.Test;

    public class SerTest
    {
        @Test public void serializeToDisk()
        {
            try
            {
                com.tedneward.Person ted = new com.tedneward.Person("Ted", "Neward", 39);
                com.tedneward.Person charl = new com.tedneward.Person("Charlotte", "Neward", 38);

                ted.setSpouse(charl); charl.setSpouse(ted);

                // Write the whole object graph (ted and his spouse) to disk
                FileOutputStream fos = new FileOutputStream("tempdata.ser");
                ObjectOutputStream oos = new ObjectOutputStream(fos);
                oos.writeObject(ted);
                oos.close();
            }
            catch (Exception ex)
            {
                fail("Exception thrown during test: " + ex.toString());
            }

            try
            {
                // Read the graph back and check that its state survived
                FileInputStream fis = new FileInputStream("tempdata.ser");
                ObjectInputStream ois = new ObjectInputStream(fis);
                com.tedneward.Person ted = (com.tedneward.Person) ois.readObject();
                ois.close();

                assertEquals("Ted", ted.getFirstName());
                assertEquals("Charlotte", ted.getSpouse().getFirstName());

                // Clean up the file
                new File("tempdata.ser").delete();
            }
            catch (Exception ex)
            {
                fail("Exception thrown during test: " + ex.toString());
            }
        }
    }

Nothing so far is new or exciting, but it is a good starting point. We'll use Person to discover five things you probably didn't know about Java object serialization.

1. Serialization allows for refactoring

Serialization permits a certain amount of class variation, so that even after refactoring, ObjectInputStream can still read the older data just fine.

The critical changes that the Java Object Serialization specification can manage automatically are:

    • Adding new fields to a class
    • Changing a field from static to non-static
    • Changing a field from transient to non-transient

Going the other way (from non-static to static, or from non-transient to transient) or deleting fields requires additional massaging, depending on the degree of backward compatibility you need.

Refactoring Serialization Classes

Now that we know serialization allows for refactoring, let's see what happens when we add a new field to the Person class.

PersonV2, shown in Listing 3, introduces a field for gender on top of the original Person class.

Listing 3. Adding a new field to the serialized Person

    enum Gender
    {
        MALE, FEMALE
    }

    public class Person
        implements java.io.Serializable
    {
        // To stay stream-compatible with V1, declare here the serialVersionUID
        // that serialver reports for the original Person class.

        public Person(String fn, String ln, int a, Gender g)
        {
            this.firstName = fn; this.lastName = ln; this.age = a; this.gender = g;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public Gender getGender() { return gender; }
        public int getAge() { return age; }
        public Person getSpouse() { return spouse; }

        public void setFirstName(String value) { firstName = value; }
        public void setLastName(String value) { lastName = value; }
        public void setGender(Gender value) { gender = value; }
        public void setAge(int value) { age = value; }
        public void setSpouse(Person value) { spouse = value; }

        public String toString()
        {
            // Guard against a missing spouse so toString() cannot throw
            return "[Person: firstName=" + firstName +
                " lastName=" + lastName +
                " gender=" + gender +
                " age=" + age +
                " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
                "]";
        }

        private String firstName;
        private String lastName;
        private int age;
        private Person spouse;
        private Gender gender;
    }

Serialization uses a hash calculated from just about everything in a given class (method names, field names, field types, access modifiers, and so on) and compares that hash value against the hash value stored in the serialized stream.

To convince the Java runtime that the two types are in fact the same, the second and subsequent versions of Person must have the same serialization version hash (stored as the private static final serialVersionUID field) as the first one. What we need, therefore, is the serialVersionUID field, which is calculated by running the JDK serialver command against the original (or V1) version of the Person class.

Once you have Person's serialVersionUID, not only can you create PersonV2 objects from the original object's serialized data (where new fields appear, they are set to the default value for their type, most often "null"), but the opposite is also true: you can deserialize original Person objects from PersonV2 data, with no unpleasant surprises.
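As a rough sketch of what pinning the version looks like in code, the class below declares an explicit serialVersionUID and reads it back through java.io.ObjectStreamClass, which is the value the runtime compares against the stream. The names SerialVersionDemo and versionOf, the trimmed-down nested Person, and the value 1L are all placeholders for illustration; in a real migration you would paste in whatever number serialver prints for the original class.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {
    // Hypothetical, trimmed-down stand-in for the article's Person class.
    static class Person implements Serializable {
        // Placeholder value: in a real migration, use the number printed by
        // running `serialver` against the original (V1) class.
        private static final long serialVersionUID = 1L;

        private String firstName;
        private int age;
    }

    // Returns the version ID that the serialization runtime writes to the
    // stream and checks on the way back in for the given class.
    static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Because the field is declared explicitly, adding or removing
        // members later does not change this value.
        System.out.println(versionOf(Person.class));
    }
}
```

If the field is left out, the runtime computes a value from the class's shape, which is why an undeclared ID tends to change as soon as the class is refactored.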

2. Serialization is not secure

It often comes as an unpleasant surprise to Java developers that the serialization binary format is fully documented and entirely reversible. In fact, just dumping the contents of a binary serialized stream to the console is enough to figure out what the class looks like and what it contains.

This has some disturbing security implications. For example, when you make a remote method call through RMI, any private fields in the objects sent across the wire appear in the socket stream as almost plain text, which clearly violates even the simplest security concerns.
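To make the point concrete, here is a small sketch that serializes an object with a private field and then searches the raw bytes for its contents. The ClearTextDemo and Account names and the sample value are invented for illustration; only the java.io classes are standard.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ClearTextDemo {
    // Hypothetical class holding a value its author considers private.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String password;

        Account(String password) { this.password = password; }
    }

    // Serializes any object to a byte array using default serialization.
    static byte[] serialize(Object obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes the raw stream one byte per character so it can be searched.
    static String asText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String dump = asText(serialize(new Account("s3cret")));
        // Both the field name and its value are readable in the stream.
        System.out.println("field name visible: " + dump.contains("password"));
        System.out.println("value visible:      " + dump.contains("s3cret"));
    }
}
```

The private modifier restricts access from Java code only; it has no effect on what ends up in the byte stream.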

Fortunately, serialization lets you "hook" the serialization process and secure (or obscure) the field data before serialization and after deserialization. You do this by providing a writeObject method on a Serializable object.

Obscuring serialized data

Suppose the sensitive data in the Person class is the age field. After all, a lady never reveals her age. We can obscure this data by shifting its bits to the left before serialization and then undoing the shift after deserialization. (You can develop a more secure algorithm; this one is only an example.)

To "hook" the serialization process, we implement a writeObject method on Person; to "hook" the deserialization process, we implement a readObject method on the same class. It is important to get the details of both methods right: if the access modifier, parameters, or name differ from those in Listing 4, the code will silently fail, and Person's age will be visible to anyone who looks.

Listing 4. Obscuring serialized data

    public class Person
        implements java.io.Serializable
    {
        public Person(String fn, String ln, int a)
        {
            this.firstName = fn; this.lastName = ln; this.age = a;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public int getAge() { return age; }
        public Person getSpouse() { return spouse; }

        public void setFirstName(String value) { firstName = value; }
        public void setLastName(String value) { lastName = value; }
        public void setAge(int value) { age = value; }
        public void setSpouse(Person value) { spouse = value; }

        private void writeObject(java.io.ObjectOutputStream stream)
            throws java.io.IOException
        {
            // "Encrypt"/obscure the sensitive data
            age = age << 2;
            try
            {
                stream.defaultWriteObject();
            }
            finally
            {
                // Restore the in-memory value so that serializing
                // does not corrupt the live object
                age = age >> 2;
            }
        }

        private void readObject(java.io.ObjectInputStream stream)
            throws java.io.IOException, ClassNotFoundException
        {
            stream.defaultReadObject();

            // "Decrypt"/de-obscure the sensitive data by reversing the shift
            age = age >> 2;
        }

        public String toString()
        {
            return "[Person: firstName=" + firstName +
                " lastName=" + lastName +
                " age=" + age +
                " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
                "]";
        }

        private String firstName;
        private String lastName;
        private int age;
        private Person spouse;
    }

If you need to see the obscured data, you can always just look at the serialized data stream or file. And because the format is fully documented, the contents of the serialized stream can be read even when the class itself is not available.

3. Serialized data can be signed and sealed

The previous tip assumes that you want to obscure serialized data, not encrypt it or make sure it has not been modified. Cryptographic encryption and signature management are certainly possible with writeObject and readObject, but there is a better way.

If you need to encrypt and sign an entire object, the simplest approach is to put it in a javax.crypto.SealedObject and/or java.security.SignedObject wrapper. Both are serializable, so wrapping your object in a SealedObject creates a kind of "gift box" around the original object. You need a symmetric key to do the encryption and decryption, and the key must be managed separately. Likewise, you can use SignedObject for data verification; it signs with a private key and verifies with the matching public key, and those keys must also be managed separately.

Together, these two objects let you seal and sign serialized data without getting bogged down in the details of digital signature verification or encryption. Neat, isn't it?
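The following is a minimal sketch of both wrappers, not a recommendation for production key handling. The class and method names are invented for illustration, the keys are generated on the spot (a real application would load them from wherever it manages keys), and the algorithm choices (AES in CBC mode, SHA256withRSA) are assumptions rather than anything the article prescribes.

```java
import java.io.Serializable;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.security.SignedObject;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

class SealAndSignDemo {
    // Encrypts the payload inside a SealedObject, then opens it again.
    static Object sealRoundTrip(Serializable payload) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();   // throwaway demo key

            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);

            // The SealedObject is itself Serializable and can be written
            // to an ObjectOutputStream in place of the payload.
            SealedObject sealed = new SealedObject(payload, cipher);
            return sealed.getObject(key);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs the payload with a private key and verifies it with the public key.
    static boolean signAndVerify(Serializable payload) {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            KeyPair pair = gen.generateKeyPair();   // throwaway demo key pair

            SignedObject signed = new SignedObject(
                payload, pair.getPrivate(), Signature.getInstance("SHA256withRSA"));

            boolean verified = signed.verify(
                pair.getPublic(), Signature.getInstance("SHA256withRSA"));
            return verified && payload.equals(signed.getObject());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sealRoundTrip("Ted Neward, 39"));
        System.out.println(signAndVerify("Ted Neward, 39"));
    }
}
```

The two wrappers can be nested, for example by signing a SealedObject, when both confidentiality and integrity are required.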

4. Serialization can put a proxy in your stream

Often, a class contains a core element of data from which the rest of the class's fields can be derived or retrieved. In that case, serializing the entire object is unnecessary. You could mark the derived fields transient, but the class would still have to contain explicit code to check whether a field has been initialized every time a method accesses it.
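The bookkeeping that the transient approach demands can be sketched as follows. The TransientDemo class, its nested Person with a derived fullName field, and the helper methods are assumptions made for this illustration and do not appear in the article's listings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Hypothetical class whose fullName can be derived from the other fields.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String firstName;
        private final String lastName;
        // Derived data: skipped by serialization, so it is null after reading.
        private transient String fullName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String getFullName() {
            // Every accessor of the transient field needs this check.
            if (fullName == null) {
                fullName = firstName + " " + lastName;
            }
            return fullName;
        }

        boolean isFullNameCached() { return fullName != null; }
    }

    // Serializes the object to memory and reads it straight back.
    static Person roundTrip(Person original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person ted = new Person("Ted", "Neward");
        ted.getFullName();                            // populate the cache
        Person copy = roundTrip(ted);
        System.out.println(copy.isFullNameCached());  // cache did not survive
        System.out.println(copy.getFullName());       // rebuilt on demand
    }
}
```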

If serialization is your primary concern, it is better to nominate a flyweight or proxy to put into the stream instead. Providing a writeReplace method on the original Person lets a different kind of object be serialized in its place. Similarly, if a readResolve method is found during deserialization, it is called to supply a replacement object to the caller.

Packing and unpacking the proxy

Together, the writeReplace and readResolve methods enable a Person class to pack all of its data (or its core data) into a PersonProxy, put that into the stream, and then unpack it again during deserialization.

Listing 5. You complete me, I take your place

    class PersonProxy
        implements java.io.Serializable
    {
        public PersonProxy(Person orig)
        {
            // Flatten the Person (and the spouse, if any) into one string
            data = orig.getFirstName() + "," + orig.getLastName() + "," + orig.getAge();
            if (orig.getSpouse() != null)
            {
                Person spouse = orig.getSpouse();
                data = data + "," + spouse.getFirstName() + "," + spouse.getLastName() + ","
                  + spouse.getAge();
            }
        }

        public String data;

        // Called during deserialization: rebuild the real Person from the proxy
        private Object readResolve()
            throws java.io.ObjectStreamException
        {
            String[] pieces = data.split(",");
            Person result = new Person(pieces[0], pieces[1], Integer.parseInt(pieces[2]));
            if (pieces.length > 3)
            {
                result.setSpouse(new Person(pieces[3], pieces[4], Integer.parseInt(pieces[5])));
                result.getSpouse().setSpouse(result);
            }
            return result;
        }
    }

    public class Person
        implements java.io.Serializable
    {
        public Person(String fn, String ln, int a)
        {
            this.firstName = fn; this.lastName = ln; this.age = a;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public int getAge() { return age; }
        public Person getSpouse() { return spouse; }

        // Called during serialization: write a PersonProxy instead of this object
        private Object writeReplace()
            throws java.io.ObjectStreamException
        {
            return new PersonProxy(this);
        }

        public void setFirstName(String value) { firstName = value; }
        public void setLastName(String value) { lastName = value; }
        public void setAge(int value) { age = value; }
        public void setSpouse(Person value) { spouse = value; }

        public String toString()
        {
            return "[Person: firstName=" + firstName +
                " lastName=" + lastName +
                " age=" + age +
                " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
                "]";
        }

        private String firstName;
        private String lastName;
        private int age;
        private Person spouse;
    }

Note that PersonProxy has to track all of Person's data. Often this means the proxy needs to be an inner class of Person so that it can access private fields. The proxy will sometimes also need to track down other object references and serialize them manually, such as Person's spouse.

This trick is one of the few that does not have to be read/write balanced. For instance, a version of a class that has been refactored into a different type can provide a readResolve method to silently convert a serialized object to the new type. Similarly, it can use the writeReplace method to take old classes and serialize them as new versions.

5. Trust, but verify

It would be nice to assume that the data in a serialized stream is always the same data that was originally written to it. But, as a former U.S. president once said, "Trust, but verify."

For serialized objects, this means validating the fields to make sure they still hold legitimate values after deserialization, "just in case." To do so, implement the ObjectInputValidation interface and override the validateObject() method, registering the object with the stream by calling ObjectInputStream.registerValidation() from readObject(). If something looks amiss when validateObject() is called, throw an InvalidObjectException.
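A minimal sketch of that validation hook might look like the following. The ValidationDemo class, the rule that an age must not be negative, and the isAccepted helper are assumptions for illustration; the constructor deliberately skips its own checks so the demo can stand in for a stream that was corrupted or tampered with.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ValidationDemo {
    static class Person implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final int age;

        // Deliberately unchecked so the demo can produce a "bad" stream.
        Person(String name, int age) { this.name = name; this.age = age; }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            // Ask the stream to call validateObject() once the whole
            // object graph has been restored.
            in.registerValidation(this, 0);
            in.defaultReadObject();
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (name == null || age < 0) {
                throw new InvalidObjectException(
                    "Invalid Person state: name=" + name + ", age=" + age);
            }
        }
    }

    // Round-trips the object and reports whether validation accepted it.
    static boolean isAccepted(Person person) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(person);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException rejected) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isAccepted(new Person("Ted", 39)));
        System.out.println(isAccepted(new Person("Ted", -1)));
    }
}
```

Registering in readObject rather than validating there directly means the check runs after the entire graph is rebuilt, which matters when validity depends on other objects in the stream.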

Conclusion

Java object serialization is more flexible than most Java developers realize, which gives us plenty of opportunity to work around sticky situations.

Fortunately, coding gems like these are scattered all across the JVM. It is just a matter of knowing about them and keeping them handy for when a problem arises.

Next up in the 5 things series: Java Collections. Until then, have fun bending serialization to your will!

This paper draws on http://www.codeceo.com/article/5-java-serialization.html
