[Reposted from]: http://hi.baidu.com/charmred/blog/item/46d57483be34aab66c8119e7.html
Introduction
Serialization is the process of storing the state of an object instance to a storage medium. During this process, the public and private fields of the object and the name of the class (including the assembly that contains the class) are converted to a stream of bytes, which is then written to a data stream. When the object is later deserialized, an exact copy of the original object is created.
Implementing a serialization mechanism in an object-oriented environment involves trade-offs between ease of use and flexibility. The process can be automated to a large extent, provided you are given sufficient control over it. For example, simple binary serialization may not be sufficient in some situations, or there may be a specific reason to decide which fields of a class are serialized. The following sections describe the serialization mechanism provided by the .NET Framework and highlight the features that let you customize the process to meet your needs.
Persistent Storage
It is often necessary to save the field values of an object to disk and retrieve them later. Although this can be done without serialization, the approach is cumbersome and error-prone, and it becomes progressively more complex when you need to track a hierarchy of objects. Imagine writing a large business application containing a large number of objects and having to write code that saves the fields and properties of every object to disk and restores them again. Serialization provides a convenient way to achieve this with minimal effort.
The Common Language Runtime (CLR) manages how objects are laid out in memory, and the .NET Framework uses reflection to provide an automatic serialization mechanism. When an object is serialized, the class name, the assembly, and all data members of the class instance are written to the storage medium. Objects often store references to other instances in member variables. When the class is serialized, the serialization engine keeps track of all referenced objects that have already been serialized to ensure that the same object is not serialized more than once. The serialization architecture provided by the .NET Framework handles object graphs and circular references correctly and automatically. The only requirement placed on an object graph is that every object referenced by the object being serialized must also be marked as Serializable (see Basic serialization). Otherwise, an exception is thrown when the serializer attempts to serialize the unmarked object.
When a serialized class is deserialized, the class is re-created and the values of all data members are automatically restored.
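The handling of circular references described above can be illustrated with a minimal sketch. The Node class and the GraphDemo helper are hypothetical names introduced only for this illustration, and the sketch assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Node
{
    public string Name;
    public Node Next;   // may point back to an earlier node, forming a cycle
}

public static class GraphDemo
{
    // Serializes the graph into memory and reads it back as a separate copy.
    public static Node RoundTrip(Node root)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, root);
            stream.Position = 0;
            return (Node)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Node a = new Node { Name = "a" };
        Node b = new Node { Name = "b", Next = a };
        a.Next = b;   // a -> b -> a is a circular reference

        Node copy = RoundTrip(a);

        // The cycle survives: two hops from the copy lead back to the copy.
        Console.WriteLine(ReferenceEquals(copy.Next.Next, copy));   // True
        // The copy is a different object from the original.
        Console.WriteLine(ReferenceEquals(copy, a));                // False
    }
}
```

Each node is written once, and the second reference to it is stored as a reference to the already serialized object, so the copy has the same shape as the original graph.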
Marshal by value
An object is valid only in the application domain in which it was created. Any attempt to pass the object as a parameter or return it as a result fails unless the object derives from MarshalByRefObject or is marked as Serializable. If the object is marked as Serializable, it is automatically serialized, transported from one application domain to the other, and then deserialized to produce an exact copy of the object in the second application domain. This process is usually called marshal by value.
If the object derives from MarshalByRefObject, an object reference, not the object itself, is passed from one application domain to the other. You can also mark an object that derives from MarshalByRefObject as Serializable. When such an object is used with remoting, the formatter responsible for serialization, which has been preconfigured with a SurrogateSelector, takes control of the serialization process and replaces all objects derived from MarshalByRefObject with a proxy. If no SurrogateSelector is in place, the serialization architecture follows the standard serialization rules (see Serialization procedure).
Basic serialization
The simplest way to make a class serializable is to mark it with the Serializable attribute, as shown below:
[Serializable]
public class MyObject {
  public int n1 = 0;
  public int n2 = 0;
  public String str = null;
}
The following code snippet illustrates how to serialize an instance of this type into a file:
// Requires: using System.IO; using System.Runtime.Serialization;
//           using System.Runtime.Serialization.Formatters.Binary;
MyObject obj = new MyObject();
obj.n1 = 1;
obj.n2 = 24;
obj.str = "some strings";
IFormatter formatter = new BinaryFormatter();
Stream stream = new FileStream("myfile.bin", FileMode.Create,
  FileAccess.Write, FileShare.None);
formatter.Serialize(stream, obj);
stream.Close();
This example uses the binary formatter for serialization. All you need to do is create an instance of the stream and of the formatter you intend to use, and then call the Serialize method on the formatter. The stream and the object instance to serialize are passed to this call as parameters. Although it is not explicitly shown in this example, all member variables of the class are serialized, even those marked private. In this respect, binary serialization differs from the XmlSerializer, which serializes only public fields and properties.
It is also easy to restore an object to its previous state. First, create a formatter and a stream for reading, and then let the formatter deserialize the object. The following code snippet describes how to perform this operation.
IFormatter formatter = new BinaryFormatter();
Stream stream = new FileStream("myfile.bin", FileMode.Open,
  FileAccess.Read, FileShare.Read);
// Deserialize returns object, so cast back to the original type.
MyObject obj = (MyObject) formatter.Deserialize(stream);
stream.Close();
// The following is proof
Console.WriteLine("n1: {0}", obj.n1);
Console.WriteLine("n2: {0}", obj.n2);
Console.WriteLine("str: {0}", obj.str);
The BinaryFormatter used above is very efficient and produces a compact byte stream. All objects serialized with this formatter can also be deserialized with it, which makes it an ideal tool for serializing objects that will be deserialized on the .NET platform. Note that constructors are not called when an object is deserialized. This constraint is placed on deserialization for performance reasons. However, it violates some of the usual contracts that the runtime makes with the author of an object, so developers should make sure they understand the consequences when marking an object as serializable.
If portability is required, use the SoapFormatter instead. Simply replace the formatter in the code above with SoapFormatter and leave the Serialize and Deserialize calls unchanged. For the example used above, this formatter produces the following output.
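The point about constructors can be checked with a small sketch. The Counter class and the ConstructorDemo helper are hypothetical names used only for this illustration, and the code assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Counter
{
    public static int ConstructorCalls;   // static, so it is never serialized
    public int Value;

    public Counter()
    {
        ConstructorCalls++;
    }
}

public static class ConstructorDemo
{
    // Serializes the object into memory and deserializes a copy of it.
    public static Counter RoundTrip(Counter original)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Counter)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Counter original = new Counter { Value = 7 };   // constructor runs once here
        Counter copy = RoundTrip(original);

        Console.WriteLine(copy.Value);                  // 7
        Console.WriteLine(Counter.ConstructorCalls);    // 1: not called for the copy
    }
}
```

Any invariant that the constructor normally establishes, such as a non-null helper object, therefore has to be restored by other means after deserialization.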
<SOAP-ENV:Envelope
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle=
  "http://schemas.microsoft.com/soap/encoding/clr/1.0
  http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:a1="http://schemas.microsoft.com/clr/assem/ToFile">
  <SOAP-ENV:Body>
    <a1:MyObject>
      <n1>1</n1>
      <n2>24</n2>
      <str>some strings</str>
    </a1:MyObject>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Note that the Serializable attribute is not inherited. If a new class is derived from MyObject, the new class must also be marked with the attribute; otherwise, it cannot be serialized. For example, an attempt to serialize an instance of the following class throws a SerializationException stating that the MyStuff type is not marked as serializable.
public class MyStuff : MyObject
{
  public int n3;
}
The Serializable attribute is convenient, but it has the limitations demonstrated above. Because serialization cannot be added to a class after it has been compiled, refer to the guidelines on when to mark a class for serialization (see Serialization rules below).
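The failure for an unmarked derived class can be reproduced with a short sketch. CanSerialize and InheritanceDemo are hypothetical helper names, and the code assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class MyObject
{
    public int n1;
    public int n2;
    public string str;
}

// Derived class that is deliberately NOT marked [Serializable].
public class MyStuff : MyObject
{
    public int n3;
}

public static class InheritanceDemo
{
    // Returns true when the formatter accepts the object, false when it
    // rejects it with a SerializationException.
    public static bool CanSerialize(object value)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            try
            {
                formatter.Serialize(stream, value);
                return true;
            }
            catch (SerializationException)
            {
                return false;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanSerialize(new MyObject()));   // True
        Console.WriteLine(CanSerialize(new MyStuff()));    // False
    }
}
```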
Selective serialization
A class often contains fields that should not be serialized. For example, assume that a class stores a thread ID in a member variable. When the class is deserialized, the thread whose ID was stored at serialization time may no longer be running, so serializing this value makes no sense. You can prevent member variables from being serialized by marking them with the NonSerialized attribute, as shown below:
[Serializable]
public class MyObject
{
  public int n1;
  [NonSerialized] public int n2;
  public String str;
}
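A minimal sketch of the effect, assuming the .NET Framework (or another runtime where BinaryFormatter is still enabled): after a round trip, the field marked NonSerialized holds its type's default value rather than the value it had before. SelectiveDemo and RoundTrip are hypothetical helper names.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class MyObject
{
    public int n1;
    [NonSerialized] public int n2;   // skipped by the formatter
    public string str;
}

public static class SelectiveDemo
{
    // Serializes the object into memory and deserializes a copy of it.
    public static MyObject RoundTrip(MyObject original)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (MyObject)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        MyObject obj = new MyObject { n1 = 1, n2 = 24, str = "some strings" };
        MyObject copy = RoundTrip(obj);

        Console.WriteLine(copy.n1);    // 1
        Console.WriteLine(copy.n2);    // 0: the value 24 was never written
        Console.WriteLine(copy.str);   // some strings
    }
}
```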
Custom serialization
You can customize the serialization process by implementing the ISerializable interface on an object. This is especially useful when the value of a member variable is invalid after deserialization, but you need to supply a value for the variable in order to reconstruct the full state of the object. Implementing ISerializable involves implementing the GetObjectData method and a special constructor that is used when the object is deserialized. The following code example shows how to implement ISerializable on the MyObject class from the previous section.
[Serializable]
public class MyObject : ISerializable
{
  public int n1;
  public int n2;
  public String str;

  public MyObject() {}

  // Deserialization constructor: reads back the values by name.
  protected MyObject(SerializationInfo info, StreamingContext context)
  {
    n1 = info.GetInt32("i");
    n2 = info.GetInt32("j");
    str = info.GetString("k");
  }

  // Called during serialization: stores each value under a name.
  public virtual void GetObjectData(SerializationInfo info,
    StreamingContext context)
  {
    info.AddValue("i", n1);
    info.AddValue("j", n2);
    info.AddValue("k", str);
  }
}
When GetObjectData is called during serialization, you are responsible for populating the SerializationInfo object passed to the method. Simply add the variables to be serialized as name/value pairs. Any text can be used as the name. You are free to decide which member variables are added to the SerializationInfo, provided that enough data is serialized to restore the object during deserialization. If the base object implements ISerializable, the derived class should call the GetObjectData method of the base object.
It must be emphasized that when you add ISerializable to a class, you must implement both GetObjectData and the special constructor. If GetObjectData is missing, the compiler reports an error, because ISerializable declares that method. A constructor, however, cannot be enforced through an interface, so nothing is reported when the constructor is missing, and an exception is thrown when an attempt is made to deserialize a class that lacks it. This design was preferred over a SetObjectData method because it avoids potential security and versioning problems. For example, a SetObjectData method defined as part of an interface would have to be public, which would force you to write code that guards against the method being called more than once. It is easy to imagine the trouble that could arise if a malicious application called SetObjectData on an object that was in the middle of some operation.
During deserialization, the SerializationInfo is passed to the class through the constructor provided for this purpose. Any visibility constraints on the constructor are ignored when the object is deserialized, so the constructor can be marked public, protected, internal, or private. It is a good idea to mark the constructor as protected unless the class is sealed, in which case it should be marked private. To restore the state of the object, simply retrieve the values of the variables from the SerializationInfo using the names used during serialization. If the base class implements ISerializable, call the base class constructor so that the base object can restore its variables.
If a new class is derived from a class that implements ISerializable, the derived class must implement both the constructor and the GetObjectData method if it has any variables that need to be serialized. The following code snippet shows how this is done with the MyObject class shown above.
[Serializable]
public class ObjectTwo : MyObject
{
  public int num;

  public ObjectTwo() : base() {}

  // Let the base class restore its own fields first.
  protected ObjectTwo(SerializationInfo si, StreamingContext context) :
    base(si, context)
  {
    num = si.GetInt32("num");
  }

  public override void GetObjectData(SerializationInfo si,
    StreamingContext context)
  {
    // Let the base class store its own fields first.
    base.GetObjectData(si, context);
    si.AddValue("num", num);
  }
}
Remember to call the base class in the deserialization constructor. Otherwise, the constructor on the base class will never be called, and the complete object cannot be constructed after deserialization.
Objects are reconstructed from the inside out, and calling methods during deserialization can have undesirable side effects, because the methods called may refer to object references that have not yet been deserialized at the time of the call. If the class being deserialized implements IDeserializationCallback, its OnDeserialization method is called automatically once the entire object graph has been deserialized. At that point, all referenced child objects have been fully restored. A hash table is a typical example of a class that is difficult to deserialize without this callback. It is easy to retrieve the key/value pairs during deserialization, but adding these objects back to the hash table can cause problems, because there is no guarantee that classes derived from the hash table have been deserialized. Calling methods on a hash table at this stage is therefore not advisable.
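As a minimal sketch of the callback, the hypothetical Order class below keeps a derived value in a NonSerialized field and recomputes it in OnDeserialization, once the whole graph has been restored. Order, CallbackDemo, and RoundTrip are illustrative names, and the code assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Order : IDeserializationCallback
{
    public int Quantity;
    public int UnitPrice;

    [NonSerialized] private int total;   // derived value, not written to the stream

    public Order(int quantity, int unitPrice)
    {
        Quantity = quantity;
        UnitPrice = unitPrice;
        total = quantity * unitPrice;
    }

    public int Total { get { return total; } }

    // Called by the formatter after the entire object graph is deserialized.
    public void OnDeserialization(object sender)
    {
        total = Quantity * UnitPrice;
    }
}

public static class CallbackDemo
{
    // Serializes the order into memory and deserializes a copy of it.
    public static Order RoundTrip(Order original)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Order)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Order copy = RoundTrip(new Order(2, 3));
        Console.WriteLine(copy.Total);   // 6, recomputed in OnDeserialization
    }
}
```

Without the callback, Total would be 0 on the copy, because the field is skipped during serialization and the constructor is not run during deserialization.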
Serialization procedure
When the Serialize method is called on a formatter, object serialization proceeds according to the following rules:
A check is made to determine whether the formatter has a surrogate selector. If it does, a check is made to determine whether the surrogate selector handles objects of the given type. If the selector handles the object type, GetObjectData is called on the surrogate that the selector supplies (an ISerializationSurrogate).
If there is no surrogate selector, or if it does not handle the type, a check is made to determine whether the object is marked with the Serializable attribute. If it is not, a SerializationException is thrown.
If the object is marked appropriately, a check is made to determine whether it implements ISerializable. If it does, GetObjectData is called on the object.
If the object does not implement ISerializable, the default serialization policy is used, which serializes all fields not marked as NonSerialized.
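The first rule can be sketched as follows. Because the surrogate selector is consulted before the Serializable check, a surrogate can serialize a type that is not marked Serializable at all. Point, PointSurrogate, and SurrogateDemo are hypothetical names used only for this illustration, and the code assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Deliberately NOT marked [Serializable].
public class Point
{
    public int X;
    public int Y;
}

// Surrogate that reads and writes Point on the formatter's behalf.
public class PointSurrogate : ISerializationSurrogate
{
    public void GetObjectData(object obj, SerializationInfo info,
        StreamingContext context)
    {
        Point p = (Point)obj;
        info.AddValue("x", p.X);
        info.AddValue("y", p.Y);
    }

    public object SetObjectData(object obj, SerializationInfo info,
        StreamingContext context, ISurrogateSelector selector)
    {
        Point p = (Point)obj;   // uninitialized instance created by the formatter
        p.X = info.GetInt32("x");
        p.Y = info.GetInt32("y");
        return p;
    }
}

public static class SurrogateDemo
{
    // Round-trips a Point through a formatter configured with the surrogate.
    public static Point RoundTrip(Point original)
    {
        SurrogateSelector selector = new SurrogateSelector();
        selector.AddSurrogate(typeof(Point),
            new StreamingContext(StreamingContextStates.All),
            new PointSurrogate());

        BinaryFormatter formatter = new BinaryFormatter();
        formatter.SurrogateSelector = selector;

        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Point)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Point copy = RoundTrip(new Point { X = 3, Y = 4 });
        Console.WriteLine("{0},{1}", copy.X, copy.Y);   // 3,4
    }
}
```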
Versioning
The .NET Framework supports versioning and side-by-side execution, and all classes work across versions as long as their interfaces remain the same. Serialization, however, deals with member variables rather than interfaces, so be cautious when adding member variables to, or removing them from, a class that will be serialized across versions. This is especially true for classes that do not implement ISerializable. Any change to the state of the current version, such as adding member variables, changing the types of variables, or changing their names, means that existing objects of the same type cannot be deserialized if they were serialized with an earlier version.
If the state of an object needs to change between versions, the class author has two options:
Implement ISerializable. This gives you precise control over the serialization and deserialization processes, so that future state can be added and interpreted correctly during deserialization.
Mark nonessential member variables with the NonSerialized attribute. Use this option only when you expect minor changes between different versions of a class. For example, when a new variable is added to a later version of a class, the variable can be marked as NonSerialized to ensure that the class remains compatible with earlier versions.
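The first option can be sketched as a deserialization constructor that tolerates data written before a field existed. The Settings class, its field names, and the WriteAsVersion1 switch are hypothetical; the switch exists only to emulate a stream produced by an older version of the class within a single program. The code assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Settings : ISerializable
{
    public const int DefaultTimeout = 30;

    public string Name;
    public int Timeout = DefaultTimeout;   // imagined as added in "version 2"

    // Hypothetical switch: when true, GetObjectData omits the newer field,
    // which emulates data written by "version 1" of the class.
    public static bool WriteAsVersion1;

    public Settings() {}

    protected Settings(SerializationInfo info, StreamingContext context)
    {
        Name = info.GetString("name");
        Timeout = DefaultTimeout;

        // Read the newer value only if the stream actually contains it.
        foreach (SerializationEntry entry in info)
        {
            if (entry.Name == "timeout")
            {
                Timeout = info.GetInt32("timeout");
            }
        }
    }

    public virtual void GetObjectData(SerializationInfo info,
        StreamingContext context)
    {
        info.AddValue("name", Name);
        if (!WriteAsVersion1)
        {
            info.AddValue("timeout", Timeout);
        }
    }
}

public static class VersionDemo
{
    // Serializes the settings into memory and deserializes a copy.
    public static Settings RoundTrip(Settings original)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Settings)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Settings s = new Settings { Name = "job", Timeout = 90 };

        Settings.WriteAsVersion1 = false;
        Console.WriteLine(RoundTrip(s).Timeout);   // 90

        Settings.WriteAsVersion1 = true;           // emulate an old stream
        Console.WriteLine(RoundTrip(s).Timeout);   // 30, the default
        Settings.WriteAsVersion1 = false;
    }
}
```

Checking for the entry before reading it avoids the SerializationException that a getter such as GetInt32 throws when the requested name is not present in the SerializationInfo.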
Serialization rules
Because a class cannot be made serializable after it has been compiled, serialization should be considered when designing new classes. Some questions to ask are: Does this class have to be sent across application domains? Will it ever be used with remoting? What will users do with this class? They may derive a new class from it that needs to be serialized. When in doubt, mark the class as serializable. It is probably better to mark all classes as serializable unless one of the following applies:
The class will never cross an application domain. If a class does not require serialization but must cross application domains, derive it from MarshalByRefObject.
The class stores special pointers that apply only to its current instance. For example, if a class contains unmanaged memory or file handles, make sure these fields are marked as NonSerialized, or do not serialize the class at all.
Some of the data members contain sensitive information. In this case, it is advisable to implement ISerializable and serialize only the required fields.