Handling Multi-Label Data with the Caffe LMDB Interface

Source: Internet
Author: User

Caffe's main data interfaces are raw images (ImageData), HDF5, and LMDB/LevelDB. Because the Caffe LMDB interface supports only a single label per sample, multi-label tasks often have to fall back on HDF5.

With HDF5 data, however, Caffe has to read the entire H5 file into memory up front. For small datasets this is not a problem, and loading everything in a single read even saves I/O overhead during training. For large datasets the whole H5 file may not fit in memory, so it has to be split into several smaller H5 files. That is inelegant, and training then has to keep cycling through the H5 files, reading them in turn. One possible workaround is to put the image data in LMDB and the label data in an H5 file, with the prototxt taking the label and the data from two separate data layers. Personally I find that unappealing as well, since the code then has to handle both HDF5 and LMDB storage.
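
To make the memory concern concrete, here is a rough sketch that estimates how many H5 files a dataset would have to be split into. The image shape, label count, and 2 GiB per-file budget are assumptions picked for illustration, as is float32 storage for both data and labels; none of these figures come from the original text.

```python
import math


def h5_shards_needed(num_samples, image_shape, num_labels,
                     bytes_per_value=4, budget_bytes=2 * 1024 ** 3):
    """Estimate how many H5 files are needed so each stays under budget_bytes.

    All arguments are hypothetical illustration values; bytes_per_value=4
    assumes float32 storage for both images and labels.
    """
    pixels = 1
    for dim in image_shape:
        pixels *= dim
    bytes_per_sample = (pixels + num_labels) * bytes_per_value
    # At least one sample per file, even if a single sample exceeds the budget.
    samples_per_shard = max(1, budget_bytes // bytes_per_sample)
    shards = int(math.ceil(num_samples / float(samples_per_shard)))
    return bytes_per_sample, samples_per_shard, shards


# Example: one million 1x96x96 images with 10 labels each (made-up numbers).
bytes_per_sample, per_shard, shards = h5_shards_needed(1000000, (1, 96, 96), 10)
print(bytes_per_sample, per_shard, shards)
```

The estimate ignores HDF5 file overhead and any compression, so it is only a starting point for choosing a shard size.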

I recently came across a more direct approach online: use Python's lmdb library together with caffe.io.array_to_datum from Caffe's Python interface to store the image data and the labels in two separate LMDB files. The remaining question is how to write the DataLayer in the prototxt to read the stored LMDBs. With LMDB as the backend, Caffe's current DataLayer by default outputs the Datum's data as its first top and the Datum's label as its second top. The writing code does not set the Datum label, so write one DataLayer for the data LMDB and one for the label LMDB; the first top of each layer is the content of the corresponding LMDB. The top blob names can be chosen freely.
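
As a sketch of what the two layers might look like, the snippet below generates the prototxt text for a pair of LMDB-backed Data layers, each with a single top. The layer names, LMDB paths, and batch size are hypothetical; only the fields `type: "Data"`, `top`, and `data_param { source, batch_size, backend }` are standard Caffe layer fields. The helper `data_layer_prototxt` is made up for this illustration and is not part of Caffe.

```python
def data_layer_prototxt(name, top, source, batch_size):
    """Return the prototxt text of one LMDB-backed Data layer with one top."""
    return (
        'layer {{\n'
        '  name: "{name}"\n'
        '  type: "Data"\n'
        '  top: "{top}"\n'
        '  data_param {{\n'
        '    source: "{source}"\n'
        '    batch_size: {batch_size}\n'
        '    backend: LMDB\n'
        '  }}\n'
        '}}\n'
    ).format(name=name, top=top, source=source, batch_size=batch_size)


# Hypothetical paths and batch size. Both layers use the same batch_size so
# that the i-th image and the i-th label arrive in the same batch.
net_text = (
    data_layer_prototxt("data", "data", "train_img_lmdb", 64)
    + data_layer_prototxt("label", "label", "train_label_lmdb", 64)
)
print(net_text)
```

Since each layer walks its own LMDB sequentially, the two stay aligned only if both files were written in the same order and both layers use the same batch size.
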
The code for writing the two LMDB files is as follows:

import os
import shutil

import numpy as np
import lmdb
import caffe
import skimage.io
import skimage.transform


def write_lmdb(image_name_list, label_array, lmdb_img_name, lmdb_label_name,
               resize_image=False):
    # Remove any existing databases so each run starts from scratch.
    for lmdb_name in [lmdb_img_name, lmdb_label_name]:
        db_path = os.path.abspath(lmdb_name)
        if os.path.exists(db_path):
            shutil.rmtree(db_path)

    counter_img = 0
    counter_label = 0
    batchsz = 1000  # placeholder: the original value was lost from the source
    fail_cnt = 0    # never incremented here; a failed read raises instead

    print("Processing {:d} images and labels...".format(len(image_name_list)))

    # Open each environment once; map_size must be an integer.
    db_imgs = lmdb.open(lmdb_img_name, map_size=int(1e12))
    db_labels = lmdb.open(lmdb_label_name, map_size=int(1e12))

    num_batches = int(np.ceil(len(image_name_list) / float(batchsz)))
    for i in range(num_batches):
        image_name_batch = image_name_list[batchsz * i:batchsz * (i + 1)]
        label_batch = label_array[batchsz * i:batchsz * (i + 1), :]

        imgs = []
        for image_name in image_name_batch:
            img = skimage.io.imread(image_name)
            if resize_image:
                img = skimage.transform.resize(img, (96, 96))
            imgs.append(img)

        with db_imgs.begin(write=True) as txn_img:
            for img in imgs:
                # expand_dims yields 1 x H x W, so images are assumed grayscale.
                datum = caffe.io.array_to_datum(np.expand_dims(img, axis=0))
                txn_img.put("{:0>10d}".format(counter_img).encode("ascii"),
                            datum.SerializeToString())
                counter_img += 1
        print("Processed {:d} images".format(counter_img))

        with db_labels.begin(write=True) as txn_label:
            for idx in range(label_batch.shape[0]):
                # Each label vector is stored as a 1 x 1 x num_labels Datum.
                datum = caffe.io.array_to_datum(
                    label_batch[np.newaxis, np.newaxis, idx])
                txn_label.put("{:0>10d}".format(counter_label).encode("ascii"),
                              datum.SerializeToString())
                counter_label += 1
        print("Processed {:d} labels".format(counter_label))

    print("{:d} images failed to read".format(fail_cnt))
    db_imgs.close()
    db_labels.close()
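
The zero-padded, 10-digit key format used for both databases matters because LMDB iterates keys in lexicographic byte order by default, and the image and label files only stay aligned if both enumerate in insertion order. The sketch below illustrates this without LMDB, using `sorted()` on byte strings as a stand-in for LMDB's key ordering; the helper `make_key` and the sample counters are made up for the illustration.

```python
def make_key(counter):
    # Right-aligned, zero-filled to 10 digits, then encoded to bytes.
    return "{:0>10d}".format(counter).encode("ascii")


counters = [0, 2, 10, 100, 9]

# Byte-wise sorting of padded keys matches numeric order.
padded = sorted(make_key(c) for c in counters)

# Without padding, "10" and "100" sort before "2".
unpadded = sorted(str(c).encode("ascii") for c in counters)

print(padded)
print(unpadded)
```

With 10 digits the ordering holds for up to 10^10 entries per database.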
