Caffe Code Guide (4): Dataset Preparation
Caffe ships with two simple examples, MNIST and CIFAR-10: the former is used for handwritten digit recognition, the latter for small-image classification. Both datasets can be downloaded with scripts included in the Caffe source tree (caffe_root/data/mnist/get_mnist.sh and caffe_root/data/cifar10/get_cifar10.sh), as shown below:
- $ ./get_cifar10.sh
- Downloading ...
- --2014-12-02 01:20:12--http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
- Resolving www.cs.toronto.edu ... 128.100.3.30
- Connecting to www.cs.toronto.edu|128.100.3.30|:80 ... Connected.
- HTTP request sent, awaiting response ... OK
- Length: 170052171 (162M) [application/x-gzip]
- Saving to: "cifar-10-binary.tar.gz"
- 100%[========================================================================================================== =================================================>] 170,052,171 859k/s in 2m 16s
- 2014-12-02 01:22:28 (1.20 MB/s)-"cifar-10-binary.tar.gz" saved [170052171/170052171]
- Unzipping ...
- Done.
- $ ls
- batches.meta.txt data_batch_1.bin data_batch_2.bin data_batch_3.bin data_batch_4.bin data_batch_5.bin get_cifar10.sh readme.html test_batch.bin
The MNIST script runs the same way:
- $ ./get_mnist.sh
- Downloading ...
- --2014-12-02 01:24:25--http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
- Resolving yann.lecun.com ... 128.122.47.89
- Connecting to yann.lecun.com|128.122.47.89|:80 ... Connected.
- HTTP request sent, awaiting response ... OK
- Length: 9912422 (9.5M) [application/x-gzip]
- Saving to: "train-images-idx3-ubyte.gz"
- 100%[========================================================================================================== =================================================>] 9,912,422 2.09m/s in 6.7s
- 2014-12-02 01:24:33 (1.42 MB/s)-"train-images-idx3-ubyte.gz" saved [9912422/9912422]
- --2014-12-02 01:24:33--http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
- Resolving yann.lecun.com ... 128.122.47.89
- Connecting to yann.lecun.com|128.122.47.89|:80 ... Connected.
- HTTP request sent, awaiting response ... OK
- Length: 28881 (28K) [application/x-gzip]
- Saving to: "train-labels-idx1-ubyte.gz"
- 100%[========================================================================================================== =================================================>] 28,881 42.0k/s in 0.7s
- 2014-12-02 01:24:34 (42.0 KB/s)-"train-labels-idx1-ubyte.gz" saved [28881/28881]
- --2014-12-02 01:24:34--http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
- Resolving yann.lecun.com ... 128.122.47.89
- Connecting to yann.lecun.com|128.122.47.89|:80 ... Connected.
- HTTP request sent, awaiting response ... OK
- Length: 1648877 (1.6M) [application/x-gzip]
- Saving to: "t10k-images-idx3-ubyte.gz"
- 100%[========================================================================================================== =================================================>] 1,648,877 552k/s in 2.9s
- 2014-12-02 01:24:39 (552 KB/s)-"t10k-images-idx3-ubyte.gz" saved [1648877/1648877]
- --2014-12-02 01:24:39--http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
- Resolving yann.lecun.com ... 128.122.47.89
- Connecting to yann.lecun.com|128.122.47.89|:80 ... Connected.
- HTTP request sent, awaiting response ... OK
- Length: 4542 (4.4K) [application/x-gzip]
- Saving to: "t10k-labels-idx1-ubyte.gz"
- 100%[========================================================================================================== =================================================>] 4,542 19.8k/s in 0.2s
- 2014-12-02 01:24:40 (19.8 KB/s)-"t10k-labels-idx1-ubyte.gz" saved [4542/4542]
- Unzipping ...
- Done.
- $ ls
- get_mnist.sh t10k-images-idx3-ubyte t10k-labels-idx1-ubyte train-images-idx3-ubyte train-labels-idx1-ubyte
If the download fails, the datasets can also be obtained from my uploaded resources: http://download.csdn.net/detail/kkk584520/8213463.
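Whichever way the files are obtained, it can be worth checking that the unzipped MNIST files are intact before converting them. The following is a minimal Python sketch, not part of Caffe, that reads the IDX header at the start of each file. It assumes the standard IDX layout (a big-endian 32-bit magic number followed by one 32-bit size per dimension); the helper names `read_idx_header` and `check_mnist_file` are made up for illustration.

```python
import struct

# Magic numbers of the IDX format used by the MNIST files.
IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data, 3 dimensions
LABEL_MAGIC = 2049  # 0x00000801: unsigned-byte data, 1 dimension


def read_idx_header(data):
    """Parse the big-endian IDX header from the first bytes of a file.

    Returns (magic, dims), where dims is a tuple of dimension sizes.
    """
    (magic,) = struct.unpack(">i", data[:4])
    ndim = magic & 0xFF  # the low byte of the magic number is the dimension count
    dims = struct.unpack(">" + "i" * ndim, data[4:4 + 4 * ndim])
    return magic, dims


def check_mnist_file(path, expected_magic):
    """Hypothetical helper: verify a file's magic number and return its dimensions."""
    with open(path, "rb") as f:
        magic, dims = read_idx_header(f.read(16))
    if magic != expected_magic:
        raise ValueError("%s: unexpected magic number %d" % (path, magic))
    return dims
```

For example, `check_mnist_file("train-images-idx3-ubyte", IMAGE_MAGIC)` should report 60000 images of 28 x 28 pixels for the standard training set, and a `ValueError` suggests a corrupted or mismatched file.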
The original datasets are raw binary files and must be converted to LevelDB or LMDB before Caffe can read them. The conversion tools are already included in the Caffe source; see caffe_root/examples/mnist/convert_mnist_data.cpp and caffe_root/examples/cifar10/convert_cifar_data.cpp. If you are unfamiliar with LevelDB or LMDB operations, these two source files are a good place to learn. We only need to run two commands from the caffe_root directory:
./examples/mnist/create_mnist.sh
./examples/cifar10/create_cifar10.sh
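To get a feel for what the conversion step reads, here is a small Python sketch that walks the records of a CIFAR-10 binary batch. It is not the Caffe converter and does not write LevelDB or LMDB; it only assumes the documented layout of the binary version of CIFAR-10, in which each record is one label byte followed by 3072 pixel bytes (32 x 32 red, then green, then blue). The names `iter_cifar_records` and `count_labels` are hypothetical.

```python
CIFAR_SIZE = 32
IMAGE_BYTES = 3 * CIFAR_SIZE * CIFAR_SIZE  # 3072 bytes: R, G and B planes
RECORD_BYTES = 1 + IMAGE_BYTES             # one label byte, then the pixels


def iter_cifar_records(data):
    """Yield (label, pixels) pairs from the raw bytes of one batch file."""
    if len(data) % RECORD_BYTES != 0:
        raise ValueError("data length is not a multiple of the record size")
    for offset in range(0, len(data), RECORD_BYTES):
        label = data[offset]  # an int in the range 0..9
        pixels = data[offset + 1:offset + RECORD_BYTES]
        yield label, pixels


def count_labels(path):
    """Hypothetical helper: histogram of the ten class labels in one batch file."""
    counts = [0] * 10
    with open(path, "rb") as f:
        for label, _ in iter_cifar_records(f.read()):
            counts[label] += 1
    return counts
```

Running `count_labels("data_batch_1.bin")` in caffe_root/data/cifar10 should give ten counts that sum to 10000, since each batch file holds 10000 records.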