Caffe Basic Introduction

Last Update: 2015-10-31 Source: Internet

Author: User


Caffe stands for Convolutional Architecture for Fast Feature Embedding. It is a clear and efficient open-source deep learning framework whose core is written in C++. It provides command-line, Python, and MATLAB interfaces and can run on either the CPU or the GPU. It is released under the BSD 2-Clause license.

One reason deep learning is popular is that it can learn useful features directly from data. This is especially valuable in domains where it is hard to design features by hand, such as images and speech.

Caffe design: Caffe follows a simple assumption about neural networks: every computation is expressed as a layer. A layer takes some data as input and outputs the result of a computation. A convolution layer, for example, takes an image, convolves it with the layer's parameters (the filters), and outputs the convolution result. Each layer must implement two computations: forward computes the output from the input, and backward takes the gradient arriving from the layer above and computes the gradient with respect to the input. Once these two functions are implemented, many layers can be connected into a network. The network takes our data (images, speech, or anything else) as input and computes the output we need (for example, a predicted label). During training, the loss and gradients are computed from the known labels, and the gradients are then used to update the network's parameters. This is the basic workflow of Caffe.

The simplest way to get started with Caffe is to convert the data into a format Caffe can read, design a network, and then use the solver that Caffe provides to run the optimization and see how well it works. If your data consists of images, you can start from an existing network such as AlexNet or GoogLeNet and fine-tune it. If your data is somewhat different, for example plain float vectors, you may need some custom configuration; Caffe's logistic regression example may be helpful.

Fine-tuning method: The idea behind fine-tuning is that a network trained well on a large dataset such as ImageNet is likely to work well on other tasks too. We can therefore take the pretrained network and retrain only the last few layers. For example, suppose the network originally classified the 1000 ImageNet classes, and now we only want to tell whether an image shows a dog or a cat, or whether it shows a license plate. We can replace the final softmax classifier, a 4096*1000 classifier, with a 4096*2 classifier. This strategy is very useful in practice, so networks are often pretrained on ImageNet first, because training on ImageNet is well understood.
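The weight reuse described above can be sketched in plain Python. This is only an illustration and does not call Caffe: the layer names fc7, fc8, and fc8_catdog and the helper functions are hypothetical, and tiny matrices stand in for the 4096*1000 and 4096*2 classifiers. It assumes weights are matched by layer name, so a renamed final layer starts from a fresh initialization.

```python
import random

def init_weights(rows, cols, rng):
    """Return a rows x cols matrix of small random values (fresh initialization)."""
    return [[rng.gauss(0.0, 0.01) for _ in range(cols)] for _ in range(rows)]

def fine_tune_init(pretrained, new_layer_shapes, rng):
    """Reuse weights for layers whose names match; initialize all others."""
    weights = {}
    for name, (rows, cols) in new_layer_shapes.items():
        if name in pretrained:
            weights[name] = pretrained[name]               # keep pretrained weights
        else:
            weights[name] = init_weights(rows, cols, rng)  # new layer, new weights
    return weights

rng = random.Random(0)
# Hypothetical pretrained model: 8 features and 10 classes stand in for 4096 and 1000.
pretrained = {"fc7": init_weights(8, 8, rng), "fc8": init_weights(8, 10, rng)}
# The new task has 2 classes, so the last layer is renamed and resized.
new_shapes = {"fc7": (8, 8), "fc8_catdog": (8, 2)}
weights = fine_tune_init(pretrained, new_shapes, rng)
```

After this step only fc8_catdog holds untrained weights, which is why retraining can concentrate on the last layer.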

Caffe can be used in vision, speech recognition, robotics, neuroscience and astronomy.

Caffe provides a complete toolkit to train, test, fine-tune, and deploy models.

Highlights of Caffe:

(1), Modular: Caffe was designed from the outset to be as modular as possible, allowing easy extension to new data formats, network layers, and loss functions.

(2), Separation of representation and implementation: Caffe model definitions are written as configuration files in the Protocol Buffer language. Caffe supports network architectures in the form of arbitrary directed acyclic graphs. Caffe reserves only as much memory as the network needs. Switching between the CPU and the GPU takes a single function call.

(3), Test coverage: Every module in Caffe has a corresponding test.

(4), Python and MATLAB interfaces: Both Python and MATLAB interfaces are available.

(5), Pretrained reference models: For vision projects, Caffe provides a number of reference models. These are for academic and non-commercial use only, and their license is not BSD.

Caffe Architecture:

(1), Data storage: Caffe stores and passes data as 4-dimensional arrays called blobs. Blobs provide a unified memory interface for holding batches of images (or other data), parameters, and parameter updates. Models are stored on disk as Google Protocol Buffers. Large-scale data is stored in LevelDB databases.

(2), Layers: A Caffe layer is the essence of a neural network layer: it takes one or more blobs as input and produces one or more blobs as output. As part of a network, a layer has two key responsibilities: the forward pass, which takes the inputs and produces the outputs, and the backward pass, which takes the gradient with respect to the output and computes the gradients with respect to the parameters and the inputs. Caffe provides a complete set of layer types.

(3), Networks and run mode: Caffe does all the bookkeeping for any directed acyclic graph of layers to ensure that the forward and backward passes are correct. A Caffe model is an end-to-end machine learning system. A typical network begins with a data layer and ends with a loss layer. With a single switch, the network runs on the CPU or the GPU, and layers produce the same results on either.

(4), Training a network: Caffe trains models with the fast, standard stochastic gradient descent algorithm.

In Caffe, fine-tuning is a standard method for adapting an existing model to a new architecture or new data. For a new task, Caffe fine-tunes the old model's weights and initializes new weights as needed.

Blobs, Layers, and Nets: A deep network is a compositional model, naturally represented as a collection of interconnected layers that operate on chunks of data. Caffe defines a net layer by layer in its own model schema. The network defines the entire model from bottom to top, from the input data to the loss. As data and derivatives flow through the network in the forward and backward passes, Caffe stores, communicates, and manipulates the information as blobs. The blob is the standard array and unified memory interface of the framework; blobs store data, parameters, and losses. The layer comes next, as the foundation of both model and computation and the basic unit of the network. The net follows as the collection and connection of layers. The details of the blob describe how information is stored and communicated in and across layers and nets. The solver is what optimizes the net.

Blob storage and communication: A blob is a wrapper over the actual data being processed and passed along by Caffe. It also provides synchronization between the CPU and the GPU. Mathematically, a blob is an N-dimensional array stored contiguously.

Caffe stores and communicates data using blobs. Blobs provide a unified memory interface holding data such as batches of images, model parameters, and derivatives for optimization.

Blobs conceal the computational and mental overhead of mixed CPU/GPU operation by synchronizing from the CPU host to the GPU device as needed. Memory on the host and the device is allocated on demand.

For batches of image data, the conventional blob dimensions are number of images N * number of channels K * image height H * image width W. Blob memory is row-major in layout, so the last (rightmost) dimension changes fastest. For example, in a 4D blob, the value at index (n, k, h, w) is physically located at index ((n * K + k) * H + h) * W + w. Blobs are also useful for non-image applications, for example as 2D blobs.
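As a quick check of the row-major layout, the offset formula can be written as a small Python function. The function name blob_offset is made up for this illustration and is not part of Caffe.

```python
def blob_offset(shape, n, k, h, w):
    """Flat position of element (n, k, h, w) in a row-major N x K x H x W blob."""
    N, K, H, W = shape
    if not (0 <= n < N and 0 <= k < K and 0 <= h < H and 0 <= w < W):
        raise IndexError("index out of range for blob shape")
    return ((n * K + k) * H + h) * W + w

shape = (2, 3, 4, 5)                          # 2 images, 3 channels, 4 rows, 5 columns
next_column = blob_offset(shape, 0, 0, 0, 1)  # rightmost index moves by one element
next_channel = blob_offset(shape, 0, 1, 0, 0) # one channel spans H * W elements
last = blob_offset(shape, 1, 2, 3, 4)         # last element of the blob
```

Stepping the rightmost index moves one element, while stepping the channel index moves H * W elements, which is what "the last dimension changes fastest" means.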

The parameter blob size varies depending on the type and configuration of the current layer.

A blob stores two chunks of memory, data and diff. The former is the normal data passed along in the forward pass, and the latter is the gradient computed by the network.

A blob uses the SyncedMem class to synchronize values between the CPU and the GPU, hiding the synchronization details and minimizing data transfer.
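The idea of copying only when needed can be illustrated with a toy class. This is not Caffe's implementation: SyncedBuffer, its head field, and the transfer counter are hypothetical, and a plain list stands in for GPU memory. The accessor names follow the const and mutable accessor pattern that blobs expose.

```python
class SyncedBuffer:
    """Toy model of lazily synchronized host/device memory (hypothetical)."""

    def __init__(self, size):
        self.cpu = [0.0] * size
        self.gpu = [0.0] * size   # a plain list standing in for device memory
        self.head = "SYNCED"      # which side holds the newest values
        self.transfers = 0        # number of simulated copies

    def _sync_to(self, side):
        # Copy only when the other side holds newer values.
        if side == "CPU" and self.head == "GPU":
            self.cpu[:] = self.gpu
        elif side == "GPU" and self.head == "CPU":
            self.gpu[:] = self.cpu
        else:
            return
        self.transfers += 1
        self.head = "SYNCED"

    def cpu_data(self):
        self._sync_to("CPU")
        return tuple(self.cpu)    # read-only snapshot

    def mutable_cpu_data(self):
        self._sync_to("CPU")
        self.head = "CPU"         # caller may write, so the GPU copy goes stale
        return self.cpu

    def gpu_data(self):
        self._sync_to("GPU")
        return tuple(self.gpu)    # read-only snapshot

    def mutable_gpu_data(self):
        self._sync_to("GPU")
        self.head = "GPU"         # caller may write, so the CPU copy goes stale
        return self.gpu

buf = SyncedBuffer(3)
buf.mutable_cpu_data()[0] = 1.5   # write on the host
first = buf.gpu_data()            # triggers one simulated host-to-device copy
second = buf.gpu_data()           # already in sync, no further copy
```

Asking for a read-only view never marks the other side stale, so repeated reads cost no extra transfers.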

Layer computation and connections: The layer is the essence of a model and the fundamental unit of computation. Layers convolve filters, pool, take inner products, apply nonlinearities such as rectified-linear and sigmoid and other element-wise transformations, normalize, load data, and compute losses.

Each layer type defines three critical computations: setup, forward, and backward. (1), Setup: initialize the layer and its connections once, at model initialization; (2), Forward: given input from the bottom, compute the output and send it to the top; (3), Backward: given the gradient with respect to the top output, compute the gradient with respect to the input and send it to the bottom.

Forward and backward each have two implementations, one for the CPU and one for the GPU.

The definition of a Caffe layer consists of two parts: the layer's attributes and the layer's parameters.

Each layer takes some 'bottom' blobs as input and produces some 'top' blobs as output.
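The three computations can be sketched with a toy layer in plain Python. ScaleLayer is a made-up example that multiplies its input by one learnable weight; it only mirrors the setup/forward/backward split and is not Caffe's C++ layer interface.

```python
class ScaleLayer:
    """Toy layer computing top = weight * bottom with one learnable weight."""

    def setup(self, bottom):
        # Called once at model initialization: allocate parameters.
        self.weight = 1.0
        self.weight_diff = 0.0

    def forward(self, bottom):
        # Compute the top output from the bottom input.
        self.bottom = list(bottom)  # remembered for the backward pass
        return [self.weight * x for x in bottom]

    def backward(self, top_diff):
        # Gradient with respect to the parameter: sum of top_diff * input.
        self.weight_diff = sum(g * x for g, x in zip(top_diff, self.bottom))
        # Gradient with respect to the input, passed down to the bottom.
        return [self.weight * g for g in top_diff]

layer = ScaleLayer()
layer.setup([1.0, 2.0])
layer.weight = 3.0
top = layer.forward([1.0, 2.0])
bottom_diff = layer.backward([1.0, 1.0])
```

The backward pass produces two things: the gradient for the layer's own weight, which a solver would use, and the gradient for the input, which the layer below would use.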

Net definition and operation: Through composition and automatic differentiation, the net jointly defines a function and its gradient. The composition of every layer's output computes the function for a given task, and the composition of every layer's backward pass computes the gradient from the loss to learn the task. A Caffe model is an end-to-end machine learning engine.

A net is a directed acyclic graph (DAG) composed of layers. A typical net begins with a data layer, which loads data from disk, and ends with a loss layer, which computes the objective for a task such as classification or reconstruction.

Model initialization is handled by Net::Init(). Initialization mainly does two things: it scaffolds the overall DAG by creating the blobs and layers, and it calls the layers' SetUp() function. It also does a set of other bookkeeping things, such as validating the correctness of the overall network architecture.

Model format: Models are defined in plaintext protocol buffer schema (prototxt), while learned models are serialized as binary protocol buffer (binaryproto) .caffemodel files. The model format is defined by the protobuf schema in caffe.proto.

Forward and backward: The forward pass performs inference; the backward pass performs learning.

The solver optimizes a model by first calling forward to produce the output and loss, then calling backward to generate the gradient of the model, and then incorporating the gradient into a weight update that attempts to minimize the loss. The division of labor between solver, net, and layer keeps Caffe modular and open to development.
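One of the checks made during initialization can be sketched as follows: walking the layers in order, every bottom blob must already have been produced as a top by an earlier layer. The function init_net and the dictionary layout are hypothetical simplifications written for this illustration, not the actual Net::Init() code.

```python
def init_net(layer_defs):
    """Register one blob per top and check that every bottom already exists."""
    blobs = {}
    for layer in layer_defs:
        for bottom in layer["bottom"]:
            if bottom not in blobs:
                raise ValueError(
                    f"layer {layer['name']!r}: unknown bottom blob {bottom!r}"
                )
        for top in layer["top"]:
            blobs[top] = None  # placeholder for the blob's memory
    return list(blobs)

layers = [
    {"name": "data", "bottom": [], "top": ["data", "label"]},
    {"name": "ip", "bottom": ["data"], "top": ["ip"]},
    {"name": "loss", "bottom": ["ip", "label"], "top": ["loss"]},
]
blob_names = init_net(layers)
```

Because each layer may only consume blobs created earlier in the list, any definition that passes this check forms a directed acyclic graph.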

Loss: In Caffe, as in most of machine learning, learning is driven by a loss function (also known as an error, cost, or objective function). A loss function specifies the goal of learning by mapping parameter settings (that is, the current network weights) to a scalar value. The goal of learning is therefore to find a setting of the weights that minimizes the loss function.

In Caffe, the loss is computed by the forward pass of the network. Each layer takes a set of input blobs (bottom) and produces a set of output blobs (top). Some of these layers' outputs may be used in the loss function. For classification tasks, a typical choice of loss function is SoftmaxWithLoss.

Loss weights: For nets in which several layers produce a loss, loss weights can be used to specify their relative importance.

By convention, Caffe layer types with the suffix "Loss" contribute to the loss function, while other layers are assumed to be used purely for intermediate computations. However, any layer can be used as a loss by adding a "loss_weight" field to the layer definition.

In Caffe, the final loss is computed by summing the total weighted loss over the network.
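The weighted sum can be sketched in a few lines of Python. The dictionary layout and function names are hypothetical. The sketch assumes that a layer whose type ends in "Loss" defaults to a weight of 1 and every other layer to a weight of 0 unless a loss_weight is given.

```python
def default_loss_weight(layer_type):
    """Assumed default: 1 for '...Loss' layer types, 0 for everything else."""
    return 1.0 if layer_type.endswith("Loss") else 0.0

def total_loss(tops):
    """Sum each top blob's values, scaled by that top's loss weight."""
    total = 0.0
    for top in tops:
        weight = top.get("loss_weight", default_loss_weight(top["type"]))
        total += weight * sum(top["values"])
    return total

tops = [
    {"type": "SoftmaxWithLoss", "values": [0.5]},                   # weight 1
    {"type": "EuclideanLoss", "values": [2.0], "loss_weight": 0.25},
    {"type": "InnerProduct", "values": [3.0, 4.0]},                 # weight 0
]
loss = total_loss(tops)
```

Here the inner product output contributes nothing, the softmax loss counts in full, and the Euclidean loss is scaled down to a quarter.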

Solver: The solver orchestrates model optimization by coordinating the network's forward inference and backward gradients to form parameter updates that attempt to improve the loss. The responsibilities of learning are divided between the solver, which oversees the optimization and generates parameter updates, and the net, which yields the loss and gradients.

Caffe solver methods: Stochastic Gradient Descent (type: "SGD"), AdaDelta (type: "AdaDelta"), Adaptive Gradient (type: "AdaGrad"), Adam (type: "Adam"), Nesterov's Accelerated Gradient (type: "Nesterov"), and RMSprop (type: "RMSProp").

Solver role: The solver is what optimizes the net. It (1), scaffolds the optimization bookkeeping and creates the training network for learning and the test networks for evaluation; (2), iteratively optimizes by calling forward/backward and updating parameters; (3), periodically evaluates the test networks; (4), snapshots the model and solver state throughout the optimization.

Each solver iteration: (1), calls network forward to compute the output and loss; (2), calls network backward to compute the gradients; (3), incorporates the gradients into parameter updates according to the solver method; (4), updates the solver state according to the learning rate, history, and method. Together these steps take the weights all the way from initialization to the learned model.

Like Caffe models, Caffe solvers can run in CPU or GPU mode.
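The four steps of an iteration can be sketched on a one-parameter problem. This is an illustration only: the loss 0.5 * (weight - target)^2, the function name solve, and the default values are assumptions chosen for the example, and the update rule is plain SGD with momentum.

```python
def solve(target, iterations=200, base_lr=0.1, momentum=0.9):
    """Minimize 0.5 * (weight - target)**2 with SGD plus momentum."""
    weight = 0.0    # initial parameter value
    history = 0.0   # solver state: the previous update
    losses = []
    for _ in range(iterations):
        loss = 0.5 * (weight - target) ** 2            # (1) forward: compute the loss
        grad = weight - target                         # (2) backward: compute the gradient
        history = momentum * history - base_lr * grad  # (3) form the update, (4) keep it as state
        weight += history                              # apply the update to the parameter
        losses.append(loss)
    return weight, losses

weight, losses = solve(target=3.0)
```

With these settings the weight approaches the target and the recorded loss shrinks towards zero, which is the behaviour the iteration is meant to produce.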

The Solver method deals with the overall optimization problem of minimizing loss.

The actual weight update is generated by the solver and then applied to the net parameter.

Layer catalogue: To create a Caffe model, you define the model architecture in a protocol buffer definition file (prototxt). Caffe layers and their parameters are defined in the project's protocol buffer definitions, in caffe.proto.

Vision layers: Vision layers usually take images as input and produce other images as output:

(1), Convolution (Convolution): the convolution layer convolves the input image with a set of learnable filters, each producing one feature map in the output image; (2), Pooling (Pooling); (3), Local Response Normalization (LRN); (4), im2col.
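To make the convolution entry concrete, here is a minimal single-channel sketch in plain Python with stride 1 and no padding. The function conv2d is written for this illustration and is far simpler than a real convolution layer, which handles many channels, many filters, strides, and padding. Like most deep learning frameworks, it slides the filter without flipping it.

```python
def conv2d(image, kernel):
    """Slide a kernel over a 2D image (stride 1, no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]   # adds each pixel to its lower-right neighbour
feature_map = conv2d(image, kernel)
```

Each filter yields one such feature map, so a layer with several filters produces several output channels.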

Loss layers: Loss drives learning by comparing an output to a target and assigning a cost to minimize. The loss itself is computed by the forward pass, and the gradient with respect to the loss is computed by the backward pass:

(1), Softmax (SoftmaxWithLoss); (2), Sum-of-Squares/Euclidean (EuclideanLoss); (3), Hinge/Margin (HingeLoss); (4), Sigmoid Cross-Entropy (SigmoidCrossEntropyLoss); (5), Infogain (InfogainLoss); (6), Accuracy and Top-k.
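Three of these losses can be written out for a single sample as a rough sketch. The function names are made up for this illustration, averaging over a batch is left out, and the hinge loss shown is the one-vs-all L1 form.

```python
import math

def softmax_with_loss(scores, label):
    """Negative log of the softmax probability assigned to the true label."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[label]

def euclidean_loss(pred, target):
    """Half the sum of squared differences."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))

def hinge_loss(scores, label):
    """One-vs-all L1 hinge: true class wants score >= 1, the others <= -1."""
    return sum(
        max(0.0, 1.0 - s if i == label else 1.0 + s)
        for i, s in enumerate(scores)
    )

softmax_value = softmax_with_loss([0.0, 0.0], 0)          # two equally likely classes
euclidean_value = euclidean_loss([1.0, 2.0], [1.0, 4.0])
hinge_value = hinge_loss([2.0, -2.0], 0)                  # both margins satisfied
```

All three return a single scalar, which is what allows a solver to treat any of them as the quantity to minimize.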

Activation/neuron layers: In general, activation/neuron layers are element-wise operators that take one bottom blob and produce one top blob of the same size:

(1), ReLU/Rectified-Linear and Leaky-ReLU (ReLU); (2), Sigmoid (Sigmoid); (3), TanH/Hyperbolic Tangent (TanH); (4), Absolute Value (AbsVal); (5), Power (Power); (6), BNLL (BNLL).

Data layers: Data enters Caffe through data layers, which sit at the bottom of nets. Data can come from efficient databases (LevelDB or LMDB), directly from memory, or, when efficiency is not critical, from files on disk in HDF5 or common image formats:

(1), Database (Data); (2), In-Memory (MemoryData); (3), HDF5 Input (HDF5Data); (4), HDF5 Output (HDF5Output); (5), Images (ImageData); (6), Windows (WindowData); (7), Dummy (DummyData).

Common layers: (1), Inner Product (InnerProduct); (2), Splitting (Split); (3), Flattening (Flatten); (4), Reshape (Reshape); (5), Concatenation (Concat); (6), Slicing (Slice); (7), Elementwise Operations (Eltwise); (8), Argmax (ArgMax); (9), Softmax (Softmax); (10), Mean-Variance Normalization (MVN).

Data: In Caffe, data is stored in blobs. Data layers load input and save output by converting between blobs and other formats. Common transformations such as mean subtraction and feature scaling are done by configuring the data layer. New input types are supported by developing a new data layer.
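The element-wise functions behind these layers can be sketched in plain Python. The helper names are chosen for this illustration. The sketch assumes that the leaky variant is ReLU with a nonzero negative slope, that Power computes (shift + scale * x) ^ power, and that BNLL computes log(1 + exp(x)).

```python
import math

def relu(x, negative_slope=0.0):
    """Rectified linear; a nonzero negative_slope gives the leaky variant."""
    return x if x > 0 else negative_slope * x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def absval(x):
    return abs(x)

def power(x, power=1.0, scale=1.0, shift=0.0):
    return (shift + scale * x) ** power

def bnll(x):
    """Binomial normal log likelihood: log(1 + exp(x))."""
    return math.log(1.0 + math.exp(x))

def apply_elementwise(fn, bottom):
    """Produce a top blob with the same number of elements as the bottom."""
    return [fn(x) for x in bottom]

top = apply_elementwise(relu, [-1.0, 2.0])
leaky = relu(-2.0, negative_slope=0.1)
```

Because each function acts on one value at a time, the top blob always has the same size as the bottom blob.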
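Mean subtraction followed by feature scaling amounts to a simple per-value transformation, sketched below. The function name transform is hypothetical, and the sketch assumes that the mean is subtracted before the scale is applied.

```python
def transform(pixels, mean=0.0, scale=1.0):
    """Subtract the mean from every value, then multiply by the scale."""
    return [(p - mean) * scale for p in pixels]

# Map 8-bit pixel values into roughly the range [-0.5, 0.5).
scaled = transform([0, 128, 255], mean=128, scale=1.0 / 256)
```

Centering and scaling the inputs in this way keeps them in a small range around zero before they reach the first learned layer.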

The above content is compiled from translations of the official Caffe website and from several blog posts. Main references:

1. "Caffe: Convolutional Architecture for Fast Feature Embedding"

2. http://caffe.berkeleyvision.org/tutorial/

3. HTTP://SUANFAZU.COM/T/CAFFE/281/3

4. http://mp.weixin.qq.com/s?__biz=MzAxNTE2MjcxNw==&mid=206508839&idx=1&sn= 4dea40d781716da2f56d93fe23c158ab#rd

5. https://yufeigan.github.io/


This article is an English version of an article originally published in Chinese on aliyun.com.
