Recently I needed to port the detection part of faster-rcnn to the Android platform. To make it easier to trim the code and debug, and because the code has to be cross-platform anyway, I ran and debugged it under Windows, using the model definition proto and the trained binary model from Linux. Loading the model kept failing, however. The step-by-step solution is as follows:
(1) Check the protobuf version: both platforms use 2.5.0, so a version incompatibility cannot be the cause;
(2) Check caffe.proto: the file is identical on Linux and Windows, and the generated caffe.pb.h and caffe.pb.cc are almost exactly the same (apart from some zero padding), so a difference in the proto definition cannot be the cause either;
With no other option left, I had to debug the protobuf code itself, stepping through it:
(3) Compile the Windows protobuf code (debug build), making sure protoc is rebuilt at the same time, then debug the Caffe model loading. The problem found is as follows:
Specifically: when the model buffer is read, reading stops after only a very small part of the file, so the parameters of each layer cannot be initialized. The problem lies in the open/read file functions:
actual-read-size = read (file-descriptor, buf-ptr, need-size), where need-size defaults to a block size of 8192, but the returned actual-read-size is 82. That is far from the whole file, so I suspected the read had hit a terminator;
Viewing the binary model in UltraEdit shows the following:
Note the red-circled section: the 83rd byte is 1A, which is the ASCII character SUB. Under Windows, SUB is the end-of-file marker when a file is read in text mode, so reading the model buffer stopped prematurely after 82 bytes and the load failed.
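The same check can be done without a hex editor. The following is a minimal sketch, not part of Caffe or protobuf: the helper name first_sub_offset is hypothetical, and it simply scans a file opened in binary mode for the first 0x1A byte.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the zero-based offset of the first 0x1A (SUB)
// byte in the file, or -1 if the file cannot be opened or has no such byte.
long first_sub_offset(const std::string& path) {
  // Binary mode, so a SUB byte is returned as data instead of ending the read.
  std::ifstream in(path, std::ios::binary);
  if (!in) return -1;
  long offset = 0;
  char c;
  while (in.get(c)) {
    if (static_cast<unsigned char>(c) == 0x1A) return offset;
    ++offset;
  }
  return -1;
}
```

For the model described above, where the 83rd byte is 1A, such a scan would be expected to report offset 82, matching the 82 bytes that the text-mode read returned.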
1A is the decimal integer 26. When protobuf saves and parses a Caffe model, it writes a tag in front of each field to mark where that field begins, and such a tag can have the value 26. This depends on caffe.pb.cc and on the network: the binary model of one network may contain 26 and that of another may not, depending on its complexity. A tag of 26 is exactly the SUB end-of-file marker under Windows;
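To see where a tag of 26 can come from: in the protobuf wire format, the key written before each field is (field_number << 3) | wire_type. The sketch below only illustrates that arithmetic; make_tag and kWireTypeLengthDelimited are hypothetical names chosen here, not protobuf API.

```cpp
#include <cstdint>

// Wire type 2 is "length-delimited": strings, bytes and nested messages.
// (Name chosen for this example; it is not a protobuf identifier.)
constexpr std::uint32_t kWireTypeLengthDelimited = 2;

// Hypothetical helper: computes the key that precedes a field on the wire.
constexpr std::uint32_t make_tag(std::uint32_t field_number,
                                 std::uint32_t wire_type) {
  return (field_number << 3) | wire_type;
}

// Field number 3 with a length-delimited type gives (3 << 3) | 2 = 26 = 0x1A.
static_assert(make_tag(3, kWireTypeLengthDelimited) == 0x1A,
              "field 3, wire type 2 encodes as the SUB byte");
```

So any length-delimited field numbered 3 produces this byte. A 0x1A can also turn up as a length prefix or inside the weight data, so avoiding one field number would not make a model safe to read in text mode.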
(4) Why does a SUB in a binary file matter?
In principle, every file is just a binary byte stream. The problem lies in open/read under Windows, and in how differently Windows and Linux treat binary files and text files.
(a) Windows distinguishes between text files and binary files; Linux makes no such distinction (everything is a byte stream);
(b) open/read (system functions) and fopen/fread (C language API) differ in their buffering mechanism (which is unrelated to our problem) and in how binary mode is requested.
Under Windows, fopen can be given the "rb" mode to open a file for binary reading, whereas open needs the combined flags O_RDONLY | O_BINARY to read the binary model.
Linux has only O_RDONLY and no O_BINARY; O_RDONLY is generally defined in /usr/include/bits/fcntl.h------->fcntl-linux.h. Even where O_BINARY exists, O_RDONLY and O_RDONLY | O_BINARY make no difference when opening and reading a binary model, and fopen (file, "rb") and fopen (file, "r") make no difference either;
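On the fopen/fread side, the portable form is simply to always pass "rb". The following is a minimal sketch under that assumption; read_all_binary is a hypothetical helper, not a Caffe function.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file through the C stdio API.
// The "b" in "rb" turns off text-mode translation on Windows and has no
// effect on Linux, so the same call behaves the same on both platforms.
bool read_all_binary(const std::string& path, std::vector<unsigned char>* out) {
  std::FILE* fp = std::fopen(path.c_str(), "rb");
  if (fp == nullptr) return false;
  out->clear();
  unsigned char buf[8192];
  std::size_t n;
  while ((n = std::fread(buf, 1, sizeof(buf), fp)) > 0) {
    out->insert(out->end(), buf, buf + n);
  }
  const bool ok = (std::ferror(fp) == 0);  // distinguish EOF from a read error
  std::fclose(fp);
  return ok;
}
```

With "r" instead of "rb", the same loop on Windows would be expected to stop at the first 1A byte, which is the symptom described above.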
So there is a small bug in the ReadProtoFromBinaryFile function in Caffe's src/caffe/util/io.cpp, which makes loading a Linux-trained model under Windows platform-dependent (in a narrow sense).
The fix is as follows:
#ifdef _WIN32
int fd = open(filename, O_RDONLY | O_BINARY);
#else
int fd = open(filename, O_RDONLY);
#endif
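An alternative to the #ifdef around the call is to define O_BINARY as 0 where it is missing, so that a single open() line compiles on both platforms. The sketch below shows that idiom together with a read loop; count_bytes_binary is a hypothetical helper for illustration and is not how Caffe itself is written.

```cpp
#include <fcntl.h>
#include <string>
#ifdef _WIN32
#include <io.h>      // open, read, close on Windows
#else
#include <unistd.h>  // read, close on POSIX systems
#endif

// Linux does not define O_BINARY; OR-ing in 0 leaves the flags unchanged.
#ifndef O_BINARY
#define O_BINARY 0
#endif

// Hypothetical helper: counts the bytes that can be read from a file opened
// in binary mode. Returns -1 if the file cannot be opened or a read fails.
long count_bytes_binary(const std::string& path) {
  const int fd = open(path.c_str(), O_RDONLY | O_BINARY);
  if (fd < 0) return -1;
  char buf[8192];  // same block size as mentioned in the debugging session
  long total = 0;
  for (;;) {
    const int n = static_cast<int>(read(fd, buf, sizeof(buf)));
    if (n < 0) {
      close(fd);
      return -1;
    }
    if (n == 0) break;  // real end of file
    total += n;
  }
  close(fd);
  return total;
}
```

Comparing the returned count with the file size on disk is a quick way to confirm that a read is no longer cut short at a 1A byte.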
Of course, O_BINARY does not appear anywhere in the Caffe source code. I only thought of this difference after hitting the error, then made the change above and recompiled. (The Caffe version used here is the Caffe branch that belongs to the Python version of faster-rcnn. It is worth checking whether the latest Caffe branch has already fixed this so-called bug, a self-declared bug ^_^.)