Caffe's project architecture and source code analysis


Caffe is a deep learning framework written in C++/CUDA that lets developers freely build the networks they need. It currently supports convolutional neural networks and fully connected (artificial) neural networks. On Linux it can be driven from the command line, and dedicated MATLAB and Python interfaces are provided; computation runs on either GPU or CPU. The current version supports multiple GPUs on one machine, but a distributed multi-machine version is still under development. A large number of researchers use the Caffe architecture and have produced many effective results with it. Between September and December 2013, Jia Yangqing developed the initial version of Caffe while preparing his dissertation at UC Berkeley. Other contributors later joined the project, and after two years of continuous optimization it became one of the most popular deep learning frameworks. Caffe2 has recently been open-sourced as well, but it is still under development. This article examines Caffe at the source-code level and notes a few points of interest encountered while testing it.


1. How to debug

To be able to debug, first set the DEBUG option to 1 in the Makefile configuration file. Choose this carefully: a DEBUG build prints a large amount of per-stage timing information at run time. A good place to start reading is caffe.cpp, the entry point of the whole project. Once the debuggable version has been compiled, start debugging with the following command:

gdb --args ./build/tools/caffe train --solver=examples/cifar10/cifar10_full_solver.prototxt

During debugging, note that the source code makes heavy use of function pointers, so stepping with "next" easily skips over a call of interest. Use "s" (step) at the right moment to enter the function.

2. Third-party Libraries

Gflags

 

Gflags is a library developed by Google to simplify command-line argument handling: you define a flag's meaning in the C++ code and pass its value on the command line. In the example below, DEFINE_string declares a string-typed flag, and "solver" in the parentheses is the flag name. The value read for this flag from the command line is parsed into a string, stored in FLAGS_solver, and can then be used like an ordinary string. When invoking the program (see the example in the debugging section), pass the actual value with --solver=xxxxx. Besides string, the type can also be int32, int64, or bool.

DEFINE_string(solver, "", "The solver definition protocol buffer text file.");

Note that a flag may be defined only once across the whole program. To use it from other files, you have two choices: declare it directly in each file that needs it, or put the declaration in a header file and include that header from the other files. The declaration looks like this:

DECLARE_bool(solver);

To set a bool flag to false, a simple shortcut is to prefix the flag name with "no", i.e. --nosolver. In addition, a bare -- causes flag parsing to stop. In the command below, f1 is a flag with value 1, but f2 is not a flag with value 2:

foo -f1 1 -- f2 2
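The two parsing rules above can be illustrated with a toy parser. This is a hypothetical, stdlib-only sketch of the semantics, not how gflags is actually implemented (real gflags, for instance, only applies the "no" prefix to flags declared as bool):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy sketch (NOT the real gflags API) of two rules described above:
//   * a bool flag can be set to false via a "no" prefix: --nosolver
//   * a bare "--" stops flag parsing; later tokens are plain arguments.
std::map<std::string, std::string> ParseFlags(
    const std::vector<std::string>& args, std::vector<std::string>* positional) {
  std::map<std::string, std::string> flags;
  std::size_t i = 0;
  for (; i < args.size(); ++i) {
    const std::string& a = args[i];
    if (a == "--") { ++i; break; }  // "--" terminates flag parsing
    bool long_form = a.rfind("--", 0) == 0;
    if (long_form || (a.size() > 1 && a[0] == '-')) {
      std::string body = a.substr(long_form ? 2 : 1);
      std::string::size_type eq = body.find('=');
      if (eq != std::string::npos) {          // --flag=value
        flags[body.substr(0, eq)] = body.substr(eq + 1);
      } else if (body.rfind("no", 0) == 0) {  // --noflag -> flag = "false"
        flags[body.substr(2)] = "false";
      } else if (i + 1 < args.size()) {       // --flag value
        flags[body] = args[++i];
      } else {                                // trailing flag with no value
        flags[body] = "true";
      }
    } else {
      positional->push_back(a);
    }
  }
  // Everything after "--" is passed through untouched.
  for (; i < args.size(); ++i) positional->push_back(args[i]);
  return flags;
}
```

Running this on the example command line above, f1 comes out as a flag with value "1", while f2 and 2 stay plain arguments because the -- stopped parsing.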

Protobuf

 

Google Protocol Buffer (Protobuf for short) is Google's standard for data exchange between mixed-language systems; more than 48,162 message types across more than 12,183 .proto files are reportedly in use internally. It is a lightweight, efficient structured-data storage format that can serialize and deserialize structured data: you define how the data is structured once, and APIs are currently provided for C++, Java, and Python. Compared with XML, it is simpler, smaller, faster to read and process, less ambiguous, and easier to generate programming classes from.

In Caffe, this tool's usefulness lies in generating the parameter classes Caffe needs. These classes parse parameters from files ending in .prototxt and then supply the parameters for Net and Layer objects. The serialization format is defined in a custom .proto file (its contents are shown in Figure 1): the keyword message defines a class, in this case the parameter class of the convolution layer. The members of the class can be of type bool, uint32, and so on, or of a user-defined type. The number after each equals sign is a unique tag used to distinguish the different parameters; the official documentation calls these fields. A single .proto file can define multiple messages, and the comment style is the same as in C/C++. When the .proto file is compiled, the corresponding C++ header (.h) and source (.cc) files are generated.

 

Figure 1 custom proto
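As a rough illustration of what such a definition looks like, here is a fragment in the style of Caffe's convolution parameters. The field names and tag numbers are illustrative, not the exact contents of caffe.proto:

```protobuf
// Hypothetical fragment modeled on Caffe's ConvolutionParameter message.
message ConvolutionParameter {
  optional uint32 num_output = 1;                 // number of filters
  optional bool bias_term = 2 [default = true];   // use a bias term?
  optional uint32 kernel_size = 3;                // filter height and width
  optional uint32 stride = 4 [default = 1];
  optional uint32 pad = 5 [default = 0];
}
```

Each tag number after the equals sign must be unique within the message, which is what lets the binary wire format identify fields without storing their names.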

 

Figure 2 shows the file generated automatically after compilation; the ConvolutionParameter class has been generated.

 

Figure 2 automatically generated C++ class

 

Glog

 

This tool, also developed by Google, prints initialization and run-time information and records unexpected aborts. Before logging anything, you must initialize the Google logging library. The LOG(INFO) << ... and CHECK(xxx) << ... statements that appear throughout Caffe are all executed by it. Figure 3 shows logging calls in the code, and Figure 4 shows the information printed on the screen.

 

Figure 3 C++ code

 

 

Figure 4 print information

 

LMDB

 

Lmdb is a fast, lightweight database that supports concurrent access from multiple threads and processes; data is stored as key-value pairs. Caffe also provides a LevelDB interface, but this article only discusses the Python lmdb interface. The database stores serialized strings. Caffe ships script files for producing data in lmdb format; such a script generates a folder containing two files, a data file and a lock file. The DataLayer of the training network then reads the data in lmdb format. Figure 5 defines the lmdb database type, and Figure 6 serializes the data and stores it in the database.

 

Figure 5 db Definition

 

 

Figure 6 db Storage

 

3. Basic Structure of Caffe

Blob

 

Blob is Caffe's data storage class; it implements all the information about, and operations on, a variable. Its storage can be viewed as an N-dimensional C array occupying contiguous memory. For example, an image batch is stored in 4 dimensions (num, channel, height, width), and element (n, k, h, w) is stored in the array at offset ((n * K + k) * H + h) * W + w, where K, H, and W are the channel, height, and width dimensions. The corresponding four-dimensional convolution weights are stored as (out_channel, in_channel, filter_size, filter_size). A Blob has the following three characteristics:

  • It holds two pieces of data: the raw values (data) and the derivative values (diff).
  • It has two memory allocation modes, one on the CPU and one on the GPU, distinguished by cpu/gpu prefixes.
  • It has two access modes: const access that cannot change the data, and mutable access that can.

The data/diff design is an eye-opener. In a convolutional network, a variable very often carries not only its own value but also the derivative of the cost function with respect to it, and keeping the two in separate variables is far less intuitive than putting them together. Figure 7 shows the definition from the source file blob.hpp.

 

Figure 7 define blob
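The indexing arithmetic above can be checked with a few lines of stdlib-only C++. blob_offset here is a hypothetical helper written for illustration; Caffe's Blob::offset computes the same expression:

```cpp
#include <cassert>

// Offset of element (n, k, h, w) in a contiguous (num, channel, height, width)
// blob, mirroring the formula ((n * K + k) * H + h) * W + w from the text.
// K, H, W are the channel, height, and width extents.
int blob_offset(int n, int k, int h, int w, int K, int H, int W) {
  return ((n * K + k) * H + h) * W + w;
}
```

Note that w is the fastest-varying index: advancing w by 1 moves one element, while advancing n by 1 skips a whole K * H * W block.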

 

Layer

 

Caffe encapsulates different functions into different layers, such as convolution, pooling, nonlinear transformations, and data layers. For the full set of layers and what each does, refer to the official documentation; this article mainly discusses how they are implemented. The implementation has three parts (see also Figure 8):

  • Setup: initialize each layer and its connections.
  • Forward: compute the top blobs from the bottom blobs.
  • Backward: compute the bottom gradients from the top gradients and, for layers with parameters, compute the parameter gradients.

 

Figure 8 caffe layer implementation
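The three-part interface can be sketched as follows. This is a simplified stand-in, not Caffe's actual class: the real Layer is templated on the data type and its Forward/Backward operate on vectors of Blob pointers, whereas here a "layer" just maps a vector of doubles to another:

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for Caffe's Layer interface: Setup / Forward / Backward.
class Layer {
 public:
  virtual ~Layer() {}
  virtual void Setup() {}  // initialize the layer and its connections
  virtual std::vector<double> Forward(const std::vector<double>& bottom) = 0;
  // Given d(cost)/d(top), return d(cost)/d(bottom).
  virtual std::vector<double> Backward(const std::vector<double>& top_diff) = 0;
};

// Example: a layer that scales its input by a fixed factor.
class ScaleLayer : public Layer {
 public:
  explicit ScaleLayer(double s) : scale_(s) {}
  std::vector<double> Forward(const std::vector<double>& bottom) override {
    std::vector<double> top(bottom.size());
    for (std::size_t i = 0; i < bottom.size(); ++i) top[i] = scale_ * bottom[i];
    return top;
  }
  std::vector<double> Backward(const std::vector<double>& top_diff) override {
    // d(top)/d(bottom) = scale_, so gradients are scaled by the same factor.
    std::vector<double> bottom_diff(top_diff.size());
    for (std::size_t i = 0; i < top_diff.size(); ++i)
      bottom_diff[i] = scale_ * top_diff[i];
    return bottom_diff;
  }
 private:
  double scale_;
};
```

The virtual-function design is what lets Net treat every layer uniformly while each layer supplies its own forward and backward math.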

 

The forward- and backward-propagation functions each have two implementations, one GPU-based and one CPU-based. A Forward function takes as parameters two vectors of Blob pointers, bottom and top; using arrays of pointers allows multiple inputs and outputs. It is worth mentioning that Caffe's convolution rearranges the data into a matrix and then implements convolution as a matrix multiplication; cuDNN takes the same approach. In my experiments, this is indeed faster than a directly implemented CUDA kernel. Most of Caffe's low-level computation is implemented with BLAS or cuBLAS.
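The rearrangement is known as im2col. Below is a stdlib-only sketch of the idea for a single-channel input with stride 1 and no padding; the helper names are my own, and Caffe's actual im2col_cpu also handles channels, padding, and stride:

```cpp
#include <cassert>
#include <vector>

// im2col for a single-channel H x W image and a k x k kernel (stride 1, no
// padding): each output row is one flattened k x k patch. The convolution
// then becomes one dot product per patch, i.e. a matrix multiplication
// between the flattened kernel and this patch matrix.
std::vector<std::vector<double>> im2col(const std::vector<double>& img,
                                        int H, int W, int k) {
  std::vector<std::vector<double>> cols;
  for (int y = 0; y + k <= H; ++y) {
    for (int x = 0; x + k <= W; ++x) {
      std::vector<double> patch;
      for (int dy = 0; dy < k; ++dy)
        for (int dx = 0; dx < k; ++dx)
          patch.push_back(img[(y + dy) * W + (x + dx)]);
      cols.push_back(patch);
    }
  }
  return cols;
}

// Convolution via im2col: one dot product per patch row.
std::vector<double> conv_via_im2col(const std::vector<double>& img, int H, int W,
                                    const std::vector<double>& kernel, int k) {
  std::vector<double> out;
  for (const std::vector<double>& patch : im2col(img, H, W, k)) {
    double acc = 0.0;
    for (std::size_t i = 0; i < patch.size(); ++i) acc += patch[i] * kernel[i];
    out.push_back(acc);
  }
  return out;
}
```

The payoff is that the dot products can be handed to a highly tuned GEMM routine in BLAS or cuBLAS, which is exactly where Caffe's speed comes from, at the cost of the extra memory the patch matrix occupies.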

Net

 

A Net connects the different layers correctly; it is the collection of layers and the connections between them. Net::Init() initializes the model: it constructs the blobs and layers and calls each layer's setup function. Net's Forward function internally calls ForwardPrefilled, which calls ForwardFromTo; that function invokes the Forward function of each Layer object from a specified start layer id to an end layer id.
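The chained call can be sketched as follows. This is a toy ForwardFromTo written for illustration, with layers reduced to plain functions; the real loop in net.cpp runs over Layer objects and also accumulates the loss contribution each Forward call returns:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy sketch of Net::ForwardFromTo: run each layer's forward function in
// order from layer id `start` through `end` inclusive, threading the data
// from one layer's top into the next layer's bottom.
using LayerFn = std::function<std::vector<double>(const std::vector<double>&)>;

std::vector<double> ForwardFromTo(const std::vector<LayerFn>& layers,
                                  std::vector<double> data, int start, int end) {
  for (int i = start; i <= end; ++i) data = layers[i](data);
  return data;
}
```

Taking start and end as parameters is what lets the same routine implement both a full forward pass and a partial one through a prefix of the network.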

Solver

 

Solver is what controls the network. Its responsibilities include parsing the prototxt and passing the parameters on, executing training, calling the network's forward pass to compute the output and loss, calling the backward pass to compute the gradients, and updating the parameters according to the chosen optimization method (which may involve more than a learning rate; an update rule can be built from hyperparameters such as alpha and beta). While parsing the .prototxt, a NetParameter object is initialized to hold all the network parameters; when the training network is initialized, the proto file path given by the net field is used to parse the network's layer parameters. Based on the arguments passed on the command line, the Solve function parses and restores a previously saved network file and its weights, including the iteration count and loss from the last run. Once the network parameters are configured and any restore file has been processed, the Forward function in net.cpp is called to run the network; Forward returns the loss of this iteration, which is printed. Next, the ApplyUpdate function is called: it adjusts the current learning rate according to the chosen policy and then updates the weights. Solver also provides snapshot saving.
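The update step can be sketched as plain SGD under the "step" learning-rate policy. This is an illustrative simplification: Caffe's SGDSolver additionally applies momentum and weight decay, and supports several other learning-rate policies:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Learning rate under the "step" policy:
//   lr = base_lr * gamma^(floor(iter / stepsize))
double StepLearningRate(double base_lr, double gamma, int stepsize, int iter) {
  return base_lr * std::pow(gamma, iter / stepsize);  // integer division floors
}

// One plain SGD step: w <- w - lr * diff. Illustrative only; Caffe's
// SGDSolver also folds in momentum and weight decay here.
void SgdUpdate(std::vector<double>* weights, const std::vector<double>& diff,
               double lr) {
  for (std::size_t i = 0; i < weights->size(); ++i) (*weights)[i] -= lr * diff[i];
}
```

The diff values consumed here are exactly the derivative halves of the Blobs that the backward pass filled in, which is why data and diff live side by side in one object.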

4. Running an Example

For example, you can classify your own image data as follows. I work entirely in C++, which makes it easier to modify the source code.

  • Organize the images into two folders, train and test, and save the image names and labels in a txt file.
  • Convert the data to lmdb format with the convert_imageset tool.
  • Generate the mean image with the compute_image_mean tool.
  • Modify the model and run training.

In addition, I have tested one-dimensional data, modifying the convert_imageset.cpp source so that it reads the data into lmdb. The rough code is as follows:

datum.set_channels(num_channels);
datum.set_height(num_height);
datum.set_width(num_width);
datum.clear_data();
datum.set_encoded(false);
datum.set_data(lines[line_id].first);
datum.set_label(lines[line_id].second);

With this modification, one-dimensional data can be read into the network for processing. While running on this one-dimensional data, an error occurred: "Too big key/data, key is empty, or wrong DUPFIXED size". The cause is that lmdb stores key-value pairs and limits the key length, which cannot exceed 512; after I passed in shorter keys, the problem was solved.

5. Summary

Reading the source code shows that Caffe, as an architecture, has clear layering, a clear line of thought, and a clear sense of the problems it solves. Its efficiency shows in many places: not only the fast lmdb reads, but also computation that is essentially all done through highly efficient BLAS libraries. Data, layers, and network composition and execution are controlled separately, which gives great flexibility. The only regret is that installation is cumbersome; there is always some dependency that fails to install. Overall, Caffe is widely used in research: much work builds on Caffe's ImageNet pre-trained networks and has made great progress. That spirit of sharing deserves recognition.
