Caffe is a deep learning framework with C++ and Python interfaces; its core is implemented in C++.
First, the overall layout of the caffe-master source tree.
The key directories are as follows:
-data: used to store the raw input data (images, etc.) needed by the programs in caffe-master
-docs: used to store help documentation
-examples: used to store example code
-include/caffe: used to store the header files (.hpp) (very important)
-matlab: used to store the MATLAB interface files
-python: used to store the Python interface files
-scripts: used to store shell scripts (.sh) for running the programs
-src: used to store the Caffe source code (.cpp/.cu) (very important)
-tools: used to store some directly runnable helper binaries
The most important code lives in the two "very important" directories above (items 4 and 8, include/caffe and src), in these subdirectories:
-layers: the definitions of the various layer types (.cpp/.cu/.hpp files; how to define a new layer is detailed below)
-solvers: the various optimization methods (SGD, Adam, etc.; the concrete update process is analyzed in detail later)
-test: the code that tests Caffe
-proto: the protobuf definitions
Second, the four components of Caffe:
1. Blob: Represents the data in the network
Header file: caffe/include/caffe/blob.hpp
The protected data members are defined there:

protected:
  shared_ptr<SyncedMemory> data_;
  shared_ptr<SyncedMemory> diff_;
  shared_ptr<SyncedMemory> shape_data_;
  vector<int> shape_;
  int count_;
  int capacity_;
The accessors for the blob's data and diff are also declared there:

const Dtype* cpu_data() const;
const int* gpu_shape() const;
const Dtype* cpu_diff() const;
const Dtype* gpu_diff() const;
Dtype* mutable_cpu_data();
Dtype* mutable_gpu_data();
Dtype* mutable_cpu_diff();
Dtype* mutable_gpu_diff();
Source code: caffe/src/caffe/blob.cpp; its main member functions are:
void Blob<Dtype>::Reshape(...)
void Blob<Dtype>::ReshapeLike(const Blob<Dtype>& other)
void Blob<Dtype>::Update()  // applies the weight update: data_ = data_ - diff_
void Blob<Dtype>::ShareData(const Blob& other)
void Blob<Dtype>::ShareDiff(const Blob& other)
2. Net: connects the layers; the representation of the whole network
3. Layer: the abstraction of the various layers of the neural network (defining a new layer is explained in detail later)
4. Solver: defines how the network model is optimized (SGD by default; how Solver performs the weight update is explained later). Source: caffe/src/caffe/solver.cpp
The Solver class's member function Solve() actually calls another member function of Solver, Step():
void Solver<Dtype>::Solve(const char* resume_file) {
  CHECK(Caffe::root_solver());
  ...
  Step(param_.max_iter() - iter_);
  ...
}
Step():

template <typename Dtype>
void Solver<Dtype>::Step(int iters) {
  const int start_iter = iter_;
  const int stop_iter = iter_ + iters;
  int average_loss = this->param_.average_loss();
  losses_.clear();
  smoothed_loss_ = 0;
  iteration_timer_.Start();
  ...
  // accumulate the loss and gradient
  Dtype loss = 0;
  for (int i = 0; i < param_.iter_size(); ++i) {
    loss += net_->ForwardBackward();  // Net's forward + backward pass; returns the loss
  }
  loss /= param_.iter_size();
  // average the loss across iterations for smoothed reporting
  UpdateSmoothedLoss(loss, start_iter, average_loss);
  ...
  ApplyUpdate();  // important: a pure virtual function of Solver, implemented by each derived class in its own *_solver.cpp
}
Next, take SGD as an example and look at ApplyUpdate():
-caffe/include/caffe/solvers/sgd_solver.hpp
-caffe/src/caffe/solvers/sgd_solver.cpp
template <typename Dtype>
void SGDSolver<Dtype>::ApplyUpdate() {
  Dtype rate = GetLearningRate();  // get the learning rate for the current iteration
  if (this->param_.display() && this->iter_ % this->param_.display() == 0) {
    LOG_IF(INFO, Caffe::root_solver()) << "Iteration " << this->iter_
        << ", lr = " << rate;
  }
  ClipGradients();  // clip the gradients to avoid gradient explosion
  // act on every parameter in the network that needs updating:
  for (int param_id = 0; param_id < this->net_->learnable_params().size();
       ++param_id) {
    Normalize(param_id);    // scale the accumulated gradient by 1/iter_size
    Regularize(param_id);   // regularization: add the weight-decay term to avoid overfitting
    ComputeUpdateValue(param_id, rate);  // compute the new update value into diff_
  }
  this->net_->Update();  // complete the update: data_ = data_ - diff_
}
GetLearningRate(), ClipGradients(), Normalize(), and Regularize() are all defined in the same sgd_solver.cpp file; interested readers can study them on their own.