Serialization (Overview)

Requirements

Here, we use the term "serialization" to mean the reversible deconstruction of an arbitrary set of C++ data structures into a sequence of bytes. Such a system can be used to reconstitute an equivalent structure in another program context. It can therefore serve as the basis for object persistence, remote parameter passing, or other facilities. In this system, we use the term "archive" to refer to a specific rendering of this stream of bytes. An archive can be a file of binary data, text data, XML, or some other user-defined format.
Our goals are:
  1. Code portability. Depend only on ANSI C++ facilities.
  2. Code economy. Exploit C++ features such as RTTI, templates, and multiple inheritance where appropriate to make code shorter and simpler to use.
  3. Independent versioning for each class definition. When the definition of a class changes, archives written by the old version can still be imported into the new version of the class.
  4. Deep pointer save and restore. Saving or restoring a pointer also saves or restores the data it points to.
  5. Proper restoration of multiple pointers that point to the same object.
  6. Direct support for serialization of STL containers and other commonly used templates.
  7. Data portability. A stream of bytes created on one platform should be readable on any other.
  8. Orthogonal specification of class serialization and archive format. Any archive format should be able to store any set of C++ data structures without altering the serialization code of any class.
  9. Non-intrusive implementation. A class does not need to derive from a specific base class or implement specific member functions. This is necessary when we cannot or do not want to modify the definition of a class.
  10. The archive interface must be simple enough to make it easy to create a new type of archive.
  11. The archive interface must be rich enough to permit an archive that presents serialized data as XML.
Other Implementation Solutions

Before starting this work, I looked for existing implementations.

  • MFC. This is the implementation I am most familiar with. I have used it for several years and found it very useful. However, it does not meet all of the requirements. Even so, it is the most useful implementation I have found. In particular, class versioning, which MFC implements only partially, turns out to be indispensable in my programs. Inevitably, version 1.x of a shipping program needs to store more information in its files than was originally provided for. MFC is the only one of these implementations that supports versioning, although only for the most derived class. Still, that is better than nothing. MFC does not support serialization of STL containers; it serves only the MFC containers.
  • CommonC++ libraries [1]. This closely follows the MFC implementation and addresses some of its problems. It is portable and creates portable archives, but it does not support versioning. It handles pointer restoration correctly and supports STL containers. It also addresses compression of archives (although not in the way I would prefer). The library would benefit from better documentation. It does not meet several of the requirements, including 9.
  • Eternity [2]. This is a bare-bones library. The code is well written, but it needs documentation and more examples. It is not obvious how to use it without a time-consuming study of the source code. Recent versions support the XML format. It does not meet several of the requirements, including 7 (possibly), 8, and 9.
  • Holub's implementation [3]. This is the article that first made me think seriously about my own requirements for a serialization implementation. It is interesting and worth reading if you can overlook the arrogant tone of the prose. It does not meet several of the requirements, including 6.
  • s11n [13]. The goals of this library are quite similar to ours, and some aspects of the implementation are similar as well. At the time of writing, it appears to have the following limitations:
    • Code portability (1). Portability is guaranteed only for recent versions of GCC.
    • Class version independence (3). Versioning of class definitions is not directly supported.
    • Shared pointers (5). It does not appear to account automatically for multiple pointers to the same object. I concluded this from the documentation, which also states that graph-like data structures are not supported.

    It differs from our implementation in many ways but also has much in common with it.

Revised 1 November, 2004

Copyright Robert Ramey 2002-2004. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Serialization Guide
A simple example

Non-intrusive version

Serializable members

Derived class

Pointer

Array

STL container

Versioning of Classes

Splitting serialize into save/load

Archives
An output archive is similar to an output data stream. Data can be saved to the archive with either the << or the & operator:

ar << data;
ar & data;

An input archive is similar to an input data stream. Data can be loaded from the archive with either the >> or the & operator:

ar >> data;
ar & data;

When these operators are applied to primitive data types, the data is simply saved to or loaded from the archive. When they are applied to class types, the class's serialize function is called. Each serialize function uses the operators above to save and load its data members. This process continues recursively until all the data contained in the class has been saved or loaded.
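
The recursive dispatch described above can be sketched with a pair of toy archives. This is a minimal illustration under stated assumptions, not the library's implementation: toy_oarchive, toy_iarchive, point, segment, and round_trip are hypothetical names, and serialize takes no version argument here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical toy output archive (not part of Boost), for illustration only.
class toy_oarchive {
    std::ostream & os_;
public:
    explicit toy_oarchive(std::ostream & os) : os_(os) {}

    // Arithmetic types are written directly, separated by spaces.
    template<class T>
    typename std::enable_if<std::is_arithmetic<T>::value, toy_oarchive &>::type
    operator&(T & value) { os_ << value << ' '; return *this; }

    // Class types are handed to their own serialize function, which
    // applies operator& to each member in turn: this is the recursion.
    template<class T>
    typename std::enable_if<std::is_class<T>::value, toy_oarchive &>::type
    operator&(T & value) { value.serialize(*this); return *this; }
};

// Hypothetical toy input archive: the same operator& now reads values.
class toy_iarchive {
    std::istream & is_;
public:
    explicit toy_iarchive(std::istream & is) : is_(is) {}

    template<class T>
    typename std::enable_if<std::is_arithmetic<T>::value, toy_iarchive &>::type
    operator&(T & value) { is_ >> value; return *this; }

    template<class T>
    typename std::enable_if<std::is_class<T>::value, toy_iarchive &>::type
    operator&(T & value) { value.serialize(*this); return *this; }
};

struct point {
    int x = 0;
    int y = 0;
    template<class Archive>
    void serialize(Archive & ar) { ar & x; ar & y; }
};

struct segment {
    point from;
    point to;
    template<class Archive>
    void serialize(Archive & ar) { ar & from; ar & to; }
};

// Saves a segment as text and loads it back into a fresh object.
segment round_trip(segment original, std::string & text) {
    std::ostringstream out;
    toy_oarchive oa(out);
    oa & original;
    text = out.str();

    std::istringstream in(text);
    toy_iarchive ia(in);
    segment restored;
    ia & restored;
    return restored;
}
```

Applying & to a segment reaches the int members only through the serialize functions of segment and point, which is why one serialize function per class is enough for both saving and loading.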

A simple example

The << and & operators are used inside the serialize function to save and load the data members of a class.

The program demo.cpp, included with the library, demonstrates how to use the system. The excerpt below shows the simplest possible use of the library.

#include <fstream>

// include headers that implement an archive in simple text format
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

/////////////////////////////////////////////////////////////
// gps coordinate
//
// illustrates serialization for a simple type
//
class gps_position
{
private:
    friend class boost::serialization::access;
    // When the class Archive corresponds to an output archive, the
    // & operator is defined similar to <<.  Likewise, when the class Archive
    // is a type of input archive the & operator is defined similar to >>.
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & degrees;
        ar & minutes;
        ar & seconds;
    }
    int degrees;
    int minutes;
    float seconds;
public:
    gps_position(){}
    gps_position(int d, int m, float s) :
        degrees(d), minutes(m), seconds(s)
    {}
};

int main() {
    // create and open a character archive for output
    std::ofstream ofs("filename");

    // create class instance
    const gps_position g(35, 59, 24.567f);

    // save data to archive
    {
        boost::archive::text_oarchive oa(ofs);
        // write class instance to archive
        oa << g;
        // the archive is finished when oa is destroyed at the end of this scope
    }
    // close the stream only after the archive has been destroyed
    ofs.close();

    // ... some time later restore the class instance to its original state
    gps_position newg;
    {
        // create and open an archive for input
        std::ifstream ifs("filename");
        boost::archive::text_iarchive ia(ifs);
        // read class state from archive
        ia >> newg;
        // archive and stream are closed when their destructors run
    }
    return 0;
}

For each class to be saved via serialization, there must be a function that saves all the class members that define the state of the class. Likewise, for each class to be loaded, there must be a function that loads those members in the same order in which they were saved. In the example above, both functions are generated from the member function template serialize.

Non-intrusive version

The formulation above is intrusive: the definition of the class must be altered to support serialization. In some cases this is inconvenient. The system permits an equivalent alternative formulation:

#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

class gps_position
{
public:
    int degrees;
    int minutes;
    float seconds;
    gps_position(){}
    gps_position(int d, int m, float s) :
        degrees(d), minutes(m), seconds(s)
    {}
};

namespace boost {
namespace serialization {

template<class Archive>
void serialize(Archive & ar, gps_position & g, const unsigned int version)
{
    ar & g.degrees;
    ar & g.minutes;
    ar & g.seconds;
}

} // namespace serialization
} // namespace boost

In this case, the serialize function is no longer a member of the gps_position class. The two formulations work in the same way.

The main use of non-intrusive serialization is to add serialization support to a class whose definition cannot be changed. For this to be possible, the class must expose enough information to reconstruct its state. In this example we simply assumed that the data members are public, which is of course uncommon. Only classes that expose enough information to save and restore their state can be serialized without changing the class definition.
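
As a rough illustration of what "exposing enough information" means, the sketch below saves and restores a class purely through its public accessors. It deliberately avoids the library and uses plain streams; temperature, save, load, and copy_via_text are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// A class we assume cannot be modified. Its state is reachable only
// through a getter and a setter. (Hypothetical, for illustration.)
class temperature {
    double celsius_;
public:
    explicit temperature(double c = 0.0) : celsius_(c) {}
    double celsius() const { return celsius_; }
    void set_celsius(double c) { celsius_ = c; }
};

// Free functions defined outside the class: saving needs a way to
// read the state, loading needs a way to write it back.
void save(std::ostream & os, const temperature & t) {
    os << t.celsius();
}

void load(std::istream & is, temperature & t) {
    double c = 0.0;
    is >> c;
    t.set_celsius(c);
}

// Saves a value as text and rebuilds an equivalent object from it.
temperature copy_via_text(const temperature & t, std::string & text) {
    std::ostringstream out;
    save(out, t);
    text = out.str();

    std::istringstream in(text);
    temperature restored;
    load(in, restored);
    return restored;
}
```

If temperature offered no setter (or no suitable constructor), load could not be written from outside the class, which is the limitation the paragraph above describes.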

Serializable members

A serializable class with serializable members:

class bus_stop
{
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & latitude;
        ar & longitude;
    }
    gps_position latitude;
    gps_position longitude;
protected:
    bus_stop(const gps_position & lat_, const gps_position & long_) :
    latitude(lat_), longitude(long_)
    {}
public:
    bus_stop(){}
    // See item # 14 in Effective C++ by Scott Meyers.
    // re non-virtual destructors in base classes.
    virtual ~bus_stop(){}
};

As you can see, members of class type are serialized in exactly the same way as members of primitive types.

Note that saving an instance of bus_stop with one of the archive operators invokes its serialize function, which saves latitude and longitude. Each of these is saved in turn by invoking the serialize function defined in gps_position. In this way, the whole data structure is saved by applying an archive operator to just its root item.

Derived class

Derived classes should include the serialization of their base classes.

#include <string>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/string.hpp>

class bus_stop_corner : public bus_stop
{
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        // serialize base class information
        ar & boost::serialization::base_object<bus_stop>(*this);
        ar & street1;
        ar & street2;
    }
    std::string street1;
    std::string street2;
    virtual std::string description() const
    {
        return street1 + " and " + street2;
    }
public:
    bus_stop_corner(){}
    bus_stop_corner(const gps_position & lat_, const gps_position & long_,
        const std::string & s1_, const std::string & s2_
    ) :
        bus_stop(lat_, long_), street1(s1_), street2(s2_)
    {}
};

Note how the derived class serializes its base class. Never call the serialize function of the base class directly. Doing so may seem to work, but it bypasses the code that tracks saved objects to eliminate redundant data, and it bypasses the writing of class version information into the archive. For this reason, we recommend always making serialize a private member function. The declaration friend class boost::serialization::access gives the serialization library access to the private member variables and functions of the class.

Pointer

Suppose we define a bus route as an array of bus stops. Given that:

  1. We may have several different types of bus stop (remember that bus_stop is a base class).
  2. A given bus_stop may appear in more than one route.

it is natural to represent a bus route as an array of pointers to bus_stop.

class bus_route
{
    friend class boost::serialization::access;
    bus_stop * stops[10];
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        int i;
        for(i = 0; i < 10; ++i)
            ar & stops[i];
    }
public:
    bus_route(){}
};

Each element of the stops array is serialized. But each element is a pointer, so what does that really mean? The whole point of serialization is to permit the original data structure to be reconstructed at another place and time, where objects will live at different addresses. To do this with a pointer, it is not enough to save the value of the pointer; the object it points to must be saved as well. When the member is later loaded, a new object has to be created and the pointer member has to be set to point to that new object.

All of this is done automatically by the serialization library. The code above is all that is needed to save and load objects accessed through pointers.
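
To make the idea concrete, here is a minimal sketch of one way pointer tracking could work. It is an assumption-laden illustration, not the library's actual mechanism or archive format: stop, save_stops, load_stops, and the "new"/"ref" tags are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct stop { int id = 0; };

// Saving: the first time an address is seen, write "new <index> <data>".
// Later occurrences of the same address write only "ref <index>".
std::string save_stops(const std::vector<const stop *> & stops) {
    std::map<const stop *, std::size_t> seen;
    std::ostringstream os;
    os << stops.size();
    for (const stop * p : stops) {
        auto it = seen.find(p);
        if (it == seen.end()) {
            std::size_t index = seen.size();
            seen[p] = index;
            os << " new " << index << ' ' << p->id;
        } else {
            os << " ref " << it->second;
        }
    }
    return os.str();
}

// Loading: "new" creates a fresh object owned by `owner` (assumed to be
// empty on entry); "ref" reuses the object already created for that
// index, so pointers that were shared when saved are shared again.
std::vector<stop *> load_stops(const std::string & text,
                               std::vector<std::unique_ptr<stop>> & owner) {
    std::istringstream is(text);
    std::size_t count = 0;
    is >> count;
    std::vector<stop *> stops;
    for (std::size_t i = 0; i < count; ++i) {
        std::string tag;
        std::size_t index = 0;
        is >> tag >> index;
        if (tag == "new") {
            std::unique_ptr<stop> s(new stop);
            is >> s->id;
            owner.push_back(std::move(s));
        }
        stops.push_back(owner[index].get());
    }
    return stops;
}
```

The loaded pointers hold new addresses, but two entries that pointed to the same stop before saving point to the same new stop afterwards.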

Array

The formulation above is in fact more complicated than necessary. The serialization library detects when the object being serialized is an array and generates code equivalent to the loop above. The code can therefore be shortened to:

class bus_route
{
    friend class boost::serialization::access;
    bus_stop * stops[10];
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & stops;
    }
public:
    bus_route(){}
};

STL container

The preceding example uses a member array. More commonly, a program would use an STL container for this purpose. The serialization library includes the code required to serialize the STL classes, so the following reformulation works as you would expect.

#include <boost/serialization/list.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & stops;
    }
public:
    bus_route(){}
};

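
Conceptually, serializing a container means recording how many elements it holds and then each element in order. The sketch below shows that idea with plain streams and a list of int; it is an assumed simplification, not the format the library actually writes, and save_list, load_list, and round_trip_list are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <sstream>
#include <string>

// Writes the element count first so the loader knows how many
// items to read back, then the elements in order.
void save_list(std::ostream & os, const std::list<int> & items) {
    os << items.size();
    for (int item : items)
        os << ' ' << item;
}

// Reads the count, then rebuilds the list element by element.
std::list<int> load_list(std::istream & is) {
    std::size_t count = 0;
    is >> count;
    std::list<int> items;
    for (std::size_t i = 0; i < count; ++i) {
        int item = 0;
        is >> item;
        items.push_back(item);
    }
    return items;
}

// Saves a list as text and loads it back.
std::list<int> round_trip_list(const std::list<int> & items, std::string & text) {
    std::ostringstream out;
    save_list(out, items);
    text = out.str();
    std::istringstream in(text);
    return load_list(in);
}
```
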
Versioning of Classes

Suppose we are satisfied with our bus_route class, build a program that uses it, and ship the product. Some time later, it is decided that the program needs to be enhanced, and bus_route is changed to include the name of the driver of the route. The new version looks like this:

#include <boost/serialization/list.hpp>
#include <boost/serialization/string.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    std::string driver_name;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & driver_name;
        ar & stops;
    }
public:
    bus_route(){}
};

Great, we are done. Except... what about the people who used the previous version of our program? They may have a large number of archive files created by it. How can those files be used with the new version?

In general, the serialization library stores a version number in the archive for each class serialized. By default, this version number is 0. When the archive is loaded, the version number under which each class was saved is read and passed to the serialize function, which can use it to stay compatible with older archives:

#include <boost/serialization/list.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/version.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    std::string driver_name;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        // only save/load driver_name for newer archives
        if(version > 0)
            ar & driver_name;
        ar & stops;
    }
public:
    bus_route(){}
};

BOOST_CLASS_VERSION(bus_route, 1)

Because versioning is applied to each class, there is no need to maintain a version for the file as a whole. A file's version is simply the combination of the versions of all the classes it contains. The library lets a new version of a program remain compatible with archives created by all previous versions, with no more effort than the code above requires.
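
The effect of a stored class version can be sketched without the library. The example below is a simplified illustration under stated assumptions: route_record, save_route, load_route, and the space-separated text layout are hypothetical, and the driver name is assumed to contain no whitespace.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct route_record {
    std::string driver_name;  // added in version 1
    int stop_count = 0;       // present since version 0
};

const unsigned int current_version = 1;

// Saving always writes the current version number first.
std::string save_route(const route_record & r) {
    std::ostringstream os;
    os << current_version << ' ' << r.driver_name << ' ' << r.stop_count;
    return os.str();
}

// Loading reads the stored version and skips any field that the
// older format did not contain.
route_record load_route(const std::string & text) {
    std::istringstream is(text);
    unsigned int version = 0;
    is >> version;
    route_record r;
    if (version > 0)
        is >> r.driver_name;
    is >> r.stop_count;
    return r;
}
```

Data written by the old program (version 0, no driver name) still loads, and the missing field is simply left at its default value.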

Splitting serialize into save/load

The serialize function is simple and concise, and it guarantees that class members are saved and loaded in the same order, which is the key to the serialization system. In some cases, however, the load and save operations are not as similar as in the examples above. This can happen, for example, with a class that has evolved through several versions. The class above can be reformulated as:

#include <boost/serialization/list.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/version.hpp>
#include <boost/serialization/split_member.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    std::string driver_name;
    template<class Archive>
    void save(Archive & ar, const unsigned int version) const
    {
        // note, version is always the latest when saving
        ar  & driver_name;
        ar  & stops;
    }
    template<class Archive>
    void load(Archive & ar, const unsigned int version)
    {
        if(version > 0)
            ar & driver_name;
        ar  & stops;
    }
    BOOST_SERIALIZATION_SPLIT_MEMBER()
public:
    bus_route(){}
};

BOOST_CLASS_VERSION(bus_route, 1)

The macro BOOST_SERIALIZATION_SPLIT_MEMBER() generates code that calls save or load, depending on whether the archive is used for saving or for loading.

Archives

The discussion so far has focused on adding serialization capability to classes. The actual rendering of the data, however, is implemented in the archive class. The stream of serialized data is therefore a product of both the serialization of the class and the archive selected. Keeping these two components independent is a key design decision: it allows any serialization specification to be used with any archive.
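
The independence of class serialization and archive format can be sketched with two toy output archives that render the same data differently. These are not Boost classes: plain_oarchive, tagged_oarchive, gps, render, and the tag format are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Renders each value as text followed by a space.
class plain_oarchive {
    std::ostringstream os_;
public:
    plain_oarchive & operator&(int value) { os_ << value << ' '; return *this; }
    std::string str() const { return os_.str(); }
};

// Renders each value wrapped in a simple tag.
class tagged_oarchive {
    std::ostringstream os_;
public:
    tagged_oarchive & operator&(int value) {
        os_ << "<int>" << value << "</int>";
        return *this;
    }
    std::string str() const { return os_.str(); }
};

struct gps {
    int degrees;
    int minutes;
    // Written once; works with any archive that provides operator&.
    template<class Archive>
    void serialize(Archive & ar) { ar & degrees; ar & minutes; }
};

// Serializes g with the archive type chosen by the caller.
template<class Archive>
std::string render(gps g) {
    Archive ar;
    g.serialize(ar);
    return ar.str();
}
```

The serialize function of gps never mentions a format; choosing a different archive type changes the output without touching the class.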

In this guide, we have used only one kind of archive: text_oarchive for saving and text_iarchive for loading. The other archive classes in the library share the same interface (with one exception). Once serialization has been defined for a class, that class can be serialized to any type of archive.

If the archive classes provided do not meet the needs of a particular program, you can create a new archive class or derive one from an existing archive. This is described later in the manual.

Note that although our examples save and load data within the same program, this is only for convenience of demonstration. In general, an archive may or may not be loaded by the same program that created it.

The complete demo program, demo.cpp, does the following:

  1. Builds a data structure of different bus stops, routes, and schedules.
  2. Displays it.
  3. Serializes it to a file named "testfile.txt".
  4. Restores the data into another structure.
  5. Displays the restored structure.

The output of this program is sufficient to verify that the system meets the requirements stated in the overview. The contents of the archive file can also be displayed, since it is plain ASCII text.
