C++ Serialization Method

Last Update:2014-10-01 Source: Internet

Author: User

Tags normalizer python script


For now I am using Boost.Serialization. My tests basically pass, except that I could not get serialization of a C++11 shared_ptr to work, so I wrote the shared_ptr part by hand. That is, I do not serialize the pointer directly but manage it myself, as follows:
load_(modelFile); // the model itself is serialized directly
string normalizerName = read_file(OBJ_NAME_PATH(_normalizer));
if (!normalizerName.empty())
{ // shared_ptr is not serialized directly, so the concrete type is unknown at load time; save writes the normalizer's type name to a text file, and load uses that name to pick the type
    _normalizer = NormalizerFactory::CreateNormalizer(normalizerName, OBJ_PATH(_normalizer));
}
string calibratorName = read_file(OBJ_NAME_PATH(_calibrator));
if (!calibratorName.empty())
{
    _calibrator = CalibratorFactory::CreateCalibrator(calibratorName, OBJ_PATH(_calibrator));
}
static NormalizerPtr CreateNormalizer(string name)
{
    boost::to_lower(name); // compare case-insensitively, so the literals below are lowercase
    if (name == "minmax" || name == "minmaxnormalizer")
    {
        return make_shared<MinMaxNormalizer>();
    }
    if (name == "gaussian" || name == "gaussiannormalizer")
    {
        return make_shared<GaussianNormalizer>();
    }
    if (name == "bin" || name == "binnormalizer")
    {
        return make_shared<BinNormalizer>();
    }
    LOG(WARNING) << name << " is not supported now, no normalizer will be used, returning nullptr";
    return nullptr;
}

static NormalizerPtr CreateNormalizer(string name, string path)
{
    NormalizerPtr normalizer = CreateNormalizer(name);
    if (normalizer != nullptr)
    {
        normalizer->load(path); // the normalizer itself is serialized directly
    }
    return normalizer;
}

TODO: confirm whether there really is no way to serialize a shared_ptr directly.
In addition, you can try cereal, an open-source library dedicated to serialization whose interface imitates Boost.Serialization. Boost.Serialization supports only three archive formats: binary, text, and XML. Text archives are not very readable, binary is the fastest, and XML is more readable but slower. I usually use only the binary and XML formats. cereal also supports JSON output and is known to support shared_ptr.

Boost.Serialization speed for the same model:

	binary	text
save	1.8	2.29
load	1.9	2.67

If XML output is required, the Boost serialize code has to be written differently from the binary-only form. I recommend always using the notation that supports XML output, because it is compatible with the other formats as well.

friend class boost::serialization::access;
template<class Archive>
void serialize(Archive &ar, const unsigned int version)
{
    /* ar & boost::serialization::base_object<Predictor>(*this);
    ar & _weights;
    ar & _bias; */ // this form has no name-value pairs, so it does not work for XML archives
    ar & BOOST_SERIALIZATION_BASE_OBJECT_NVP(Predictor);
    ar & BOOST_SERIALIZATION_NVP(_weights); // the macro is convenient; to change the stored name, e.g. _weights -> weights, call the underlying function boost::serialization::make_nvp instead
    ar & BOOST_SERIALIZATION_NVP(_bias);
}

The serialization code is generated automatically with a Python script. Boost differs from C# here: in C#, fields are serialized by default and you mark the ones you do not want serialized, whereas Boost serializes nothing by default, so every member that should be serialized has to be written out explicitly.

predictors]$ get-lines.py LINEARPREDICTOR.H 98 99 | gen-boost-seralize-xml.py


friend class boost::serialization::access;
template<class Archive>
void serialize(Archive &ar, const unsigned int version)
{
    ar & BOOST_SERIALIZATION_NVP(_weights);
    ar & BOOST_SERIALIZATION_NVP(_bias);
}
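
The generator script itself is not listed in this article. As a rough sketch of how such a generator could work, the Python 3 function below scans member declarations and emits a serialize method in the same shape as the output above. The name gen_boost_serialize, the regular expression, and the assumption that each member is declared on its own line as "Type _name;" are all hypothetical and are not taken from gen-boost-seralize-xml.py.

```python
import re

# Matches a simple member declaration such as "Vector _weights;" or
# "Float _bias = 0.;". This pattern is an assumption for illustration;
# it does not handle functions, multiple declarators, or arrays.
MEMBER_RE = re.compile(r"^\s*[\w:<>,\s\*&]+?[\s\*&]+(\w+)\s*(?:=[^;]*)?;")


def gen_boost_serialize(decl_lines):
    """Return C++ source for a Boost serialize() method covering decl_lines."""
    names = []
    for line in decl_lines:
        code = line.split("//", 1)[0]  # drop trailing line comments
        match = MEMBER_RE.match(code)
        if match:
            names.append(match.group(1))
    out = [
        "friend class boost::serialization::access;",
        "template<class Archive>",
        "void serialize(Archive &ar, const unsigned int version)",
        "{",
    ]
    # One name-value pair per member, so the result also works for XML.
    out += ["    ar & BOOST_SERIALIZATION_NVP(%s);" % n for n in names]
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    print(gen_boost_serialize(["Vector _weights;", "Float _bias = 0.;"]))
```

Lines that do not look like a member declaration (for example method declarations) are skipped, so the function can be fed a raw slice of a header file.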

4. By default a Predictor is saved as binary. The optional SaveXML mode is supported automatically. The optional SaveText mode is not: a specific Predictor subtype has to write its text output format by hand if it needs one.

The XML output is similar to this

Convert to JSON:

xml2json.py model.xml > model.json
more model.json

Use JSON pretty print to view JSON files:

jpp.py model.json | more

xml2json.py uses xmltodict to convert to JSON (Python 2):

import sys
import json
import xmltodict
doc = xmltodict.parse(open(sys.argv[1]), process_namespaces=True)
print json.dumps(doc)

jpp.py (Python 2):

import sys
import json
# xml2json.py writes the whole document on one line, so one readline() is enough
s = open(sys.argv[1]).readline().decode('gbk')
print json.dumps(json.loads(s), sort_keys=True, indent=4, ensure_ascii=False).encode('gbk')
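
The scripts above are Python 2 and rely on the third-party xmltodict package. As a minimal sketch of what the conversion does, the Python 3 code below performs a simplified version of it with only the standard library. The helper names element_to_dict and xml_to_json are hypothetical and the sample XML is invented for illustration. The code mimics xmltodict's conventions ("@" prefix for attributes, "#text" for text next to attributes or children, lists for repeated child tags), but it is not a replacement for xmltodict.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an Element to a value following xmltodict-like conventions."""
    node = {"@" + k: v for k, v in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag becomes a list of values.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text or None  # leaf element: plain text, or None if empty
    if text:
        node["#text"] = text
    return node


def xml_to_json(xml_text):
    """Parse XML text and return pretty-printed JSON with sorted keys."""
    root = ET.fromstring(xml_text)
    doc = {root.tag: element_to_dict(root)}
    return json.dumps(doc, sort_keys=True, indent=4, ensure_ascii=False)


if __name__ == "__main__":
    sample = (
        '<boost_serialization version="10">'
        "<data><_bias>0.5</_bias>"
        "<_weights><item>1.0</item><item>2.0</item></_weights></data>"
        "</boost_serialization>"
    )
    print(xml_to_json(sample))
```

All values stay strings, as they do with xmltodict, so numeric fields still need an explicit float() when they are used.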

How can the output model be viewed more conveniently?
For a small model, reading the XML text directly is fine. With more data, XML is awkward to process, and JSON works better with Python.
Converting to a JSON dict is still not very convenient, though, because everything has to be accessed by key, and string keys get no auto-completion.

In [6]: import json

In [7]: m = json.loads(open('./model.json').readline())

In [8]: m.keys()
Out[8]: [u'boost_serialization']

In [9]: m['boost_serialization'].keys()
Out[9]: [u'@version', u'@signature', u'data']

In [18]: m['boost_serialization']['data']['_trees']['item'][0].keys()
Out[18]:
[u'_gainpvalue',
 u'@tracking_level',
 u'@class_id',
 u'_ltechild',
 u'_gtchild',
 u'_maxoutput',
 u'_leafvalue',
 u'numleaves',
 u'_splitgain',
 u'_splitfeature',
 u'_previousleafvalue',
 u'_threshold',
 u'@version',
 u'_weight']

In [19]: m['boost_serialization']['data']['_trees']['item'][0]['_splitgain']['item'][10]
Out[19]: u'3.89894126598927926e+00'

Because Python tab completion by default does not suggest names that start with _ (they are treated as private), the name-value macro was modified to strip it:
#include "conf_util.h"
#include <boost/serialization/nvp.hpp>
#define GEZI_SERIALIZATION_NVP(name) \
    boost::serialization::make_nvp(gezi::conf_trim(#name).c_str(), name)

With this change, gainpvalue appears in the output without the leading _.

Using Python's dynamic type creation, the dict parsed from the JSON, whose keys are strings, can be converted into Python objects for easier access, as follows:
# Python 2. Convert nested dicts into objects; lists stay lists.
def h2o(x):
    if isinstance(x, dict):
        return type('jo', (), {k: h2o(v) for k, v in x.iteritems()})
    elif isinstance(x, list):
        l = [h2o(item) for item in x]
        return l
    else:
        return x

# Same as h2o, but a list becomes an object with attributes i0, i1, ...
def h2o2(x):
    if isinstance(x, dict):
        return type('jo', (), {k: h2o2(v) for k, v in x.iteritems()})
    elif isinstance(x, list):
        return type('jo', (), {"i" + str(idx): h2o2(val) for idx, val in enumerate(x)})
    else:
        return x

def xmlfile2obj(path):
    import xmltodict
    doc = xmltodict.parse(open(path), process_namespaces=True)
    return h2o(doc)

def xmlfile2obj2(path):
    import xmltodict
    doc = xmltodict.parse(open(path), process_namespaces=True)
    return h2o2(doc)

With these functions, a serialized XML file can be loaded directly with m = xmlfile2obj('*.xml') or m = xmlfile2obj2('*.xml').
I recommend the first, which is the standard conversion. The second interface exists mainly because Python auto-completion does not work on list items, which can otherwise only be inspected with dir().
The second one turns [3] into .i3; that is, every list is replaced with a dict-like object.

m = xmlfile2obj('./model.xml')
In [14]: m.boost_serialization.data.trees.item[0].splitgain.item[13]
Out[14]: u'3.26213753939964946e+00'

m = xmlfile2obj2('./model.xml')
In [16]: m.boost_serialization.data.trees.item.i0.splitgain.item.i13
Out[16]: u'3.26213753939964946e+00'
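
The conversion functions above are Python 2 code (they call dict.iteritems). For Python 3, the same idea can be sketched with types.SimpleNamespace from the standard library. The function name to_obj, its lists_as_attrs flag, and the sample dict are hypothetical; this is an illustration of the technique under those assumptions, not the author's code.

```python
from types import SimpleNamespace


def to_obj(x, lists_as_attrs=False):
    """Recursively turn dicts into attribute-style objects.

    With lists_as_attrs=True a list becomes an object with attributes
    i0, i1, ... (the second variant described above); otherwise lists
    are kept as lists of converted items.
    """
    if isinstance(x, dict):
        return SimpleNamespace(
            **{k: to_obj(v, lists_as_attrs) for k, v in x.items()}
        )
    if isinstance(x, list):
        items = [to_obj(v, lists_as_attrs) for v in x]
        if lists_as_attrs:
            return SimpleNamespace(
                **{"i%d" % i: v for i, v in enumerate(items)}
            )
        return items
    return x


if __name__ == "__main__":
    # Invented sample data shaped like a parsed model file.
    doc = {"data": {"trees": {"item": [{"splitgain": "3.9"}, {"splitgain": "3.2"}]}}}
    m = to_obj(doc)
    print(m.data.trees.item[1].splitgain)
    m2 = to_obj(doc, lists_as_attrs=True)
    print(m2.data.trees.item.i1.splitgain)
```

Keys that are not valid Python identifiers, such as "@version", are still stored on the object but can only be read with getattr(obj, "@version").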