When following this tutorial, I ran into two main problems:
1. The data would not download.
python examples/finetune_flickr_style/assemble_data.py --workers=-1 --images=2000 --seed 831486
Running the command above, the program inexplicably stalls: it downloads no files, yet it does not exit either, as if it were deadlocked.
Looking at the source of assemble_data.py, you can see that it downloads with multiple threads and processes. My solution is to modify the script so it does not use worker processes to download, and to add a download timeout: any transfer taking longer than 6 s is considered timed out and is skipped.
====================================================================================================
assemble_data.py downloads with a multiprocessing pool; the relevant source is:

pool = multiprocessing.Pool(processes=num_workers)
map_args = zip(df['image_url'], df['image_filename'])
results = pool.map(download_image, map_args)
===================================================================================================
My revised source code is as follows:
#!/usr/bin/env python3
"""
Form a subset of the Flickr Style data, download images to dirname, and
write Caffe ImageData layer training files.
"""
import os
import socket
import hashlib
import argparse
import urllib.request
import multiprocessing
import numpy as np
import pandas as pd
from skimage import io

# Flickr returns a special image if the request is unavailable.
MISSING_IMAGE_SHA1 = '6a92790b1c2a301c6e7ddef645dca1f53ea97ac2'

example_dirname = os.path.abspath(os.path.dirname(__file__))
caffe_dirname = os.path.abspath(os.path.join(example_dirname, '../..'))
training_dirname = os.path.join(caffe_dirname, 'data/flickr_style')


def download_image(args_tuple):
    "For use with multiprocessing map. Returns True on success, False on fail."
    try:
        url, filename = args_tuple
        if not os.path.exists(filename):
            urllib.request.urlretrieve(url, filename)
        with open(filename, 'rb') as f:
            assert hashlib.sha1(f.read()).hexdigest() != MISSING_IMAGE_SHA1
        test_read_image = io.imread(filename)
        return True
    except KeyboardInterrupt:
        raise Exception()  # multiprocessing doesn't catch keyboard exceptions
    except Exception:
        return False


def mydownload_image(args_tuple):
    "Single-process replacement for download_image. Returns True on success."
    try:
        url, filename = args_tuple
        if not os.path.exists(filename):
            urllib.request.urlretrieve(url, filename)
        with open(filename, 'rb') as f:
            assert hashlib.sha1(f.read()).hexdigest() != MISSING_IMAGE_SHA1
        test_read_image = io.imread(filename)
        return True
    except KeyboardInterrupt:
        raise Exception()
    except Exception:
        return False


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Download a subset of Flickr Style to a directory')
    parser.add_argument(
        '-s', '--seed', type=int, default=0,
        help="random seed")
    parser.add_argument(
        '-i', '--images', type=int, default=-1,
        help="number of images to use (-1 for all [default])")
    parser.add_argument(
        '-w', '--workers', type=int, default=-1,
        help="num workers used to download images. -x uses (all - x) cores [-1 default].")
    parser.add_argument(
        '-l', '--labels', type=int, default=0,
        help="if set to a positive value, only sample images from the first number of labels.")
    args = parser.parse_args()
    np.random.seed(args.seed)

    # Read data, shuffle order, and subsample.
    csv_filename = os.path.join(example_dirname, 'flickr_style.csv.gz')
    df = pd.read_csv(csv_filename, index_col=0, compression='gzip')
    df = df.iloc[np.random.permutation(df.shape[0])]
    if args.labels > 0:
        df = df.loc[df['label'] < args.labels]
    if args.images > 0 and args.images < df.shape[0]:
        df = df.iloc[:args.images]

    # Make directory for images and get local filenames.
    if training_dirname is None:
        training_dirname = os.path.join(caffe_dirname, 'data/flickr_style')
    images_dirname = os.path.join(training_dirname, 'images')
    if not os.path.exists(images_dirname):
        os.makedirs(images_dirname)
    df['image_filename'] = [
        os.path.join(images_dirname, _.split('/')[-1]) for _ in df['image_url']
    ]

    # Download images serially, with a 6 s connection timeout.
    num_workers = args.workers
    if num_workers <= 0:
        num_workers = multiprocessing.cpu_count() + num_workers
    print('Downloading {} images with {} workers...'.format(
        df.shape[0], num_workers))
    # pool = multiprocessing.Pool(processes=num_workers)
    map_args = zip(df['image_url'], df['image_filename'])
    # results = pool.map(download_image, map_args)
    socket.setdefaulttimeout(6)
    results = []
    for item in map_args:
        value = mydownload_image(item)
        results.append(value)
        if value == False:
            print('false')
        else:
            print('1')

    # Only keep rows with valid images, and write out training file lists.
    print(len(results))
    df = df[results]
    for split in ['train', 'test']:
        split_df = df[df['_split'] == split]
        filename = os.path.join(training_dirname, '{}.txt'.format(split))
        split_df[['image_filename', 'label']].to_csv(
            filename, sep=' ', header=None, index=None)
    print('Writing train/val for {} successfully downloaded images.'.format(
        df.shape[0]))
The main changes are as follows:
1. #!/usr/bin/env python3: the script now runs under Python 3.
2. Comment out the multiprocessing pool and download in a single-threaded loop instead:

# pool = multiprocessing.Pool(processes=num_workers)
map_args = zip(df['image_url'], df['image_filename'])
# results = pool.map(download_image, map_args)
socket.setdefaulttimeout(6)
results = []
for item in map_args:
    value = mydownload_image(item)
    results.append(value)
    if value == False:
        print('false')
    else:
        print('1')

# Only keep rows with valid images, and write out training file lists.
print(len(results))
Downloads are now single-threaded, with no multithreading or multiprocessing. In addition, the connection timeout is set to 6 s with socket.setdefaulttimeout(6).
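A minimal, runnable sketch of that timeout setup follows; fetch and the sample URL list are placeholders for illustration, not names from assemble_data.py. A malformed URL fails immediately, and a stalled connection now raises socket.timeout after 6 s instead of hanging forever.

```python
import socket
import urllib.request

# Any socket operation that stalls longer than 6 s now raises
# socket.timeout instead of blocking indefinitely.
socket.setdefaulttimeout(6)


def fetch(url, filename):
    """Serial replacement for the pool download: True on success, False on any failure."""
    try:
        urllib.request.urlretrieve(url, filename)
        return True
    except Exception:
        return False


# A malformed URL raises immediately inside urlretrieve and is simply
# recorded as a failed download.
results = [fetch(u, '/tmp/img.jpg') for u in ['not-a-valid-url']]
print(results)  # [False]
```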
With these changes, the data downloads successfully.
===================================================================================================
2. Running the command:
./build/tools/caffe train -solver models/finetune_flickr_style/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
An error was encountered:
Failed to parse NetParameter file: models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
The error means that the bvlc_reference_caffenet.caffemodel file we passed in is not valid binary data.
Cause: downloading with wget directly on the server was too slow, so I downloaded bvlc_reference_caffenet.caffemodel under Windows 7 and transferred it to the server with WinSCP. During the transfer, WinSCP altered the file's bytes, so bvlc_reference_caffenet.caffemodel was no longer binary.
Solution: set the WinSCP transfer mode to binary, and the problem is solved.
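To confirm the transfer was clean, you can compare a checksum of the server-side file against one computed on the original download; any mismatch means the text-mode transfer altered the bytes. This is a generic sketch (sha1_of is a helper name I made up, and the .caffemodel path is the one from this post):

```python
import hashlib


def sha1_of(path):
    # Hash the file in chunks so large .caffemodel files need not fit in memory.
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()


# Compare this against the checksum computed on the machine you
# downloaded to; a mismatch means the transfer corrupted the file.
# print(sha1_of('models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'))
```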
For more details, see this blog post: http://blog.chinaunix.net/uid-20332519-id-5585964.html