Caffe Learning Series (2): Data layer and parameters


To run Caffe, you first need to create a model, such as the commonly used LeNet or AlexNet. A model consists of multiple layers, and each layer is configured by many parameters. All of these parameters are defined in the file caffe.proto. To use Caffe proficiently, the most important skill is learning to write the configuration file (prototxt).

There are many types of layers, such as Data, Convolution, and Pooling, and data flows between layers in the form of blobs.

Today, let's introduce the data layer first.

The data layer is the bottom layer of every model and serves as the model's entry point. It not only feeds input data into the network, but also converts data from blobs to other formats for saving output. The usual data preprocessing (mean subtraction, scaling, cropping, mirroring, etc.) is also configured through the parameters of this layer.

The data source can be an efficient database (such as LevelDB or LMDB), or it can be memory directly. If efficiency is not a concern, the data can also come from HDF5 files or image files on disk.

Common parameters for all data layers. First, look at an example:

```protobuf
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
```

name: the name of this layer; it can be chosen arbitrarily.

type: the type of the layer. If the type is Data, the data comes from LevelDB or LMDB. The data layer's type differs depending on the data source (explained in detail below). In practice, LevelDB or LMDB data is used most often, so the layer type is usually set to Data.

top or bottom: each layer uses bottom to receive input data and top to output data. If a layer has only top and no bottom, it has only outputs and no inputs, and vice versa. Multiple top or bottom entries indicate multiple input or output blobs.

data and label: in a data layer, at least one top is conventionally named data. If there is a second top, it is generally named label. This (data, label) pairing is required for classification models.

include: in general, the layers used for training and testing differ. Whether a layer belongs to the training phase or the test phase is specified with an include block. If a layer has no include parameter, it is present in both the training and the test model.
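For example, a data layer restricted to the test phase would carry an include block like this (a minimal sketch; the paths follow Caffe's bundled MNIST example):

```protobuf
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST   # this layer exists only in the test network
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
```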

transform_param: data preprocessing, which transforms the data into a defined range. For example, setting scale to 0.00390625 (exactly 1/256) scales input data from the range [0, 255] down to roughly [0, 1).
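As a quick check on that arithmetic (plain NumPy, independent of Caffe): 0.00390625 is exactly 1/256, so the maximum pixel value 255 maps to just under 1.0.

```python
import numpy as np

# The scale factor from transform_param
scale = 0.00390625            # exactly 1/256
pixels = np.array([0, 128, 255], dtype=np.float32)

normalized = pixels * scale   # -> [0.0, 0.5, 0.99609375]
```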

Other data preprocessing is also configured here:

```protobuf
transform_param {
  scale: 0.00390625
  mean_file: "examples/cifar10/mean.binaryproto"  # use a file for the mean operation
  mirror: 1     # 1 turns mirroring on, 0 turns it off; true and false also work
  # crop a 227*227 patch: randomly during training, from the middle during testing
  crop_size: 227
}
```
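The train/test cropping policy described in the comment above can be sketched in plain NumPy (an illustration of the policy, not Caffe's implementation; the 256-pixel input size is an assumption):

```python
import random
import numpy as np

def crop(img, size, train):
    """Take a size x size patch: random offset in training, center in testing."""
    h, w = img.shape[:2]
    if train:
        y = random.randint(0, h - size)
        x = random.randint(0, w - size)
    else:  # test phase: crop from the middle
        y = (h - size) // 2
        x = (w - size) // 2
    return img[y:y + size, x:x + size]

img = np.zeros((256, 256, 3), dtype=np.float32)  # hypothetical input image
patch = crop(img, 227, train=False)              # center 227x227 patch
```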

The data_param section below has different settings depending on the data source.

1. Data from a database (LevelDB or LMDB)

Layer type: Data

Parameters that must be set:

source: the directory containing the database, e.g. examples/mnist/mnist_train_lmdb

batch_size: the number of items processed per batch, e.g. 64

Optional parameters:

rand_skip: skip this many inputs at the beginning; often useful for asynchronous SGD

backend: choose either LEVELDB or LMDB; the default is LEVELDB

Example:

```protobuf
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
```

2. Data from memory

Layer type: MemoryData

Parameters that must be set:

batch_size: the number of items processed per batch, e.g. 2

channels: the number of channels

height: the image height

width: the image width

Example:

```protobuf
layer {
  top: "data"
  top: "label"
  name: "memory_data"
  type: "MemoryData"
  memory_data_param {
    batch_size: 2
    height: 100
    width: 100
    channels: 1
  }
  transform_param {
    scale: 0.0078125
    mean_file: "mean.proto"
    mirror: false
  }
}
```
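A MemoryData layer consumes arrays that are already laid out in Caffe's NCHW order (batch, channels, height, width). A NumPy sketch of a batch matching the dimensions in the example above:

```python
import numpy as np

batch_size, channels, height, width = 2, 1, 100, 100

# Two single-channel 100x100 images in NCHW layout, float32 as Caffe expects
batch = np.random.rand(batch_size, channels, height, width).astype(np.float32)
labels = np.zeros(batch_size, dtype=np.float32)  # one label per image
```

In pycaffe, a batch like this is typically handed to the network with `net.set_input_arrays(batch, labels)`; the exact call depends on your Caffe build's Python interface.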

3. Data from HDF5

Layer type: HDF5Data

Parameters that must be set:

source: the name of a text file listing the paths of the HDF5 data files to read (not an .h5 file itself)

batch_size: the number of items processed per batch

Example:

```protobuf
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
```

4. Data from images

Layer type: ImageData

Parameters that must be set:

source: the name of a text file; each line gives an image file name and its label

batch_size: the number of images processed per batch

Optional parameters:

rand_skip: skip this many inputs at the beginning; often useful for asynchronous SGD

shuffle: randomly shuffle the order; the default is false

new_height, new_width: if set, images are resized to these dimensions

Example:

```protobuf
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  image_data_param {
    source: "examples/_temp/file_list.txt"
    batch_size: 50
    new_height: 256
    new_width: 256
  }
}
```
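The source file consumed by an ImageData layer is plain text with one `path label` pair per line. A minimal sketch of writing and reading such a list (the file names are hypothetical):

```python
entries = [("images/cat_001.jpg", 0), ("images/dog_042.jpg", 1)]

# Write the list in the "path label" format ImageData expects
with open("file_list.txt", "w") as f:
    for path, label in entries:
        f.write(f"{path} {label}\n")

# Read it back: one (path, label) pair per line
with open("file_list.txt") as f:
    parsed = [(p, int(l)) for p, l in (line.split() for line in f)]
```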

5. Data from windows (image regions)

Layer type: WindowData

Parameters that must be set:

source: the name of a text file (the window data file)

batch_size: the number of items processed per batch

Example:

```protobuf
layer {
  name: "data"
  type: "WindowData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  window_data_param {
    source: "examples/finetune_pascal_detection/window_file_2007_trainval.txt"
    batch_size: 128
    fg_threshold: 0.5
    bg_threshold: 0.5
    fg_fraction: 0.25
    context_pad: 16
    crop_mode: "warp"
  }
}
```
