When following this tutorial, I ran into two main problems:
1. The data would not download.
python examples/finetune_flickr_style/assemble_data.py --workers=-1 --images=2000 --seed 831486
Running the command above, the program inexplicably stalls: it downloads no files, yet it does not exit either, as if it were deadlocked.
Looking at the source of assemble_data.py, you can see that it downloads with multiple threads and processes. My solution is to modify the script so it does not use worker processes to download, and to add a download timeout: any transfer taking longer than 6 s is considered timed out and is skipped.
====================================================================================================
assemble_data.py downloads with a multiprocessing pool; the relevant source is:

pool = multiprocessing.Pool(processes=num_workers)
map_args = zip(df['image_url'], df['image_filename'])
results = pool.map(download_image, map_args)
===================================================================================================
My revised source code is as follows:
#!/usr/bin/env python3
"""
Form a subset of the Flickr Style data, download images to dirname, and
write Caffe ImageData layer training files.
"""
import os
import socket
import hashlib
import argparse
import urllib.request
import multiprocessing
import numpy as np
import pandas as pd
from skimage import io

# Flickr returns a special image if the request is unavailable.
MISSING_IMAGE_SHA1 = '6a92790b1c2a301c6e7ddef645dca1f53ea97ac2'

example_dirname = os.path.abspath(os.path.dirname(__file__))
caffe_dirname = os.path.abspath(os.path.join(example_dirname, '../..'))
training_dirname = os.path.join(caffe_dirname, 'data/flickr_style')


def download_image(args_tuple):
    "For use with multiprocessing map. Returns True on success, False on fail."
    try:
        url, filename = args_tuple
        if not os.path.exists(filename):
            urllib.request.urlretrieve(url, filename)
        with open(filename, 'rb') as f:
            assert hashlib.sha1(f.read()).hexdigest() != MISSING_IMAGE_SHA1
        test_read_image = io.imread(filename)
        return True
    except KeyboardInterrupt:
        raise Exception()  # multiprocessing doesn't catch keyboard exceptions
    except Exception:
        return False


def mydownload_image(args_tuple):
    "Single-process replacement for download_image. Returns True on success."
    try:
        url, filename = args_tuple
        if not os.path.exists(filename):
            urllib.request.urlretrieve(url, filename)
        with open(filename, 'rb') as f:
            assert hashlib.sha1(f.read()).hexdigest() != MISSING_IMAGE_SHA1
        test_read_image = io.imread(filename)
        return True
    except KeyboardInterrupt:
        raise Exception()
    except Exception:
        return False


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Download a subset of Flickr Style to a directory')
    parser.add_argument(
        '-s', '--seed', type=int, default=0,
        help="random seed")
    parser.add_argument(
        '-i', '--images', type=int, default=-1,
        help="number of images to use (-1 for all [default])")
    parser.add_argument(
        '-w', '--workers', type=int, default=-1,
        help="num workers used to download images. -x uses (all - x) cores [-1 default].")
    parser.add_argument(
        '-l', '--labels', type=int, default=0,
        help="if set to a positive value, only sample images from the first number of labels.")
    args = parser.parse_args()
    np.random.seed(args.seed)

    # Read data, shuffle order, and subsample.
    csv_filename = os.path.join(example_dirname, 'flickr_style.csv.gz')
    df = pd.read_csv(csv_filename, index_col=0, compression='gzip')
    df = df.iloc[np.random.permutation(df.shape[0])]
    if args.labels > 0:
        df = df.loc[df['label'] < args.labels]
    if args.images > 0 and args.images < df.shape[0]:
        df = df.iloc[:args.images]

    # Make directory for images and get local filenames.
    if training_dirname is None:
        training_dirname = os.path.join(caffe_dirname, 'data/flickr_style')
    images_dirname = os.path.join(training_dirname, 'images')
    if not os.path.exists(images_dirname):
        os.makedirs(images_dirname)
    df['image_filename'] = [
        os.path.join(images_dirname, _.split('/')[-1]) for _ in df['image_url']
    ]

    # Download images serially, with a 6 s connection timeout.
    num_workers = args.workers
    if num_workers <= 0:
        num_workers = multiprocessing.cpu_count() + num_workers
    print('Downloading {} images with {} workers...'.format(
        df.shape[0], num_workers))
    # pool = multiprocessing.Pool(processes=num_workers)
    map_args = zip(df['image_url'], df['image_filename'])
    # results = pool.map(download_image, map_args)
    socket.setdefaulttimeout(6)
    results = []
    for item in map_args:
        value = mydownload_image(item)
        results.append(value)
        if value == False:
            print('false')
        else:
            print('1')

    # Only keep rows with valid images, and write out training file lists.
    print(len(results))
    df = df[results]
    for split in ['train', 'test']:
        split_df = df[df['_split'] == split]
        filename = os.path.join(training_dirname, '{}.txt'.format(split))
        split_df[['image_filename', 'label']].to_csv(
            filename, sep=' ', header=None, index=None)
    print('Writing train/val for {} successfully downloaded images.'.format(
        df.shape[0]))
The main changes are as follows:
1. #!/usr/bin/env python3: the script now runs under Python 3.
2. Comment out the multiprocessing pool and download in a single-threaded loop instead:

# pool = multiprocessing.Pool(processes=num_workers)
map_args = zip(df['image_url'], df['image_filename'])
# results = pool.map(download_image, map_args)
socket.setdefaulttimeout(6)
results = []
for item in map_args:
    value = mydownload_image(item)
    results.append(value)
    if value == False:
        print('false')
    else:
        print('1')

# Only keep rows with valid images, and write out training file lists.
print(len(results))
Downloads are now single-threaded, with no multithreading or multiprocessing. In addition, the connection timeout is set to 6 s with socket.setdefaulttimeout(6).
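A minimal, runnable sketch of that timeout setup follows; fetch and the sample URL list are placeholders for illustration, not names from assemble_data.py. A malformed URL fails immediately, and a stalled connection now raises socket.timeout after 6 s instead of hanging forever.

```python
import socket
import urllib.request

# Any socket operation that stalls longer than 6 s now raises
# socket.timeout instead of blocking indefinitely.
socket.setdefaulttimeout(6)


def fetch(url, filename):
    """Serial replacement for the pool download: True on success, False on any failure."""
    try:
        urllib.request.urlretrieve(url, filename)
        return True
    except Exception:
        return False


# A malformed URL raises immediately inside urlretrieve and is simply
# recorded as a failed download.
results = [fetch(u, '/tmp/img.jpg') for u in ['not-a-valid-url']]
print(results)  # [False]
```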
With these changes, the data downloads successfully.
===================================================================================================
2. Running the command:
./build/tools/caffe train -solver models/finetune_flickr_style/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
An error was encountered:
Failed to parse NetParameter file: models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
The error means that the bvlc_reference_caffenet.caffemodel file we passed in is not valid binary data.
Cause: downloading with wget directly on the server was too slow, so I downloaded bvlc_reference_caffenet.caffemodel under Windows 7 and transferred it to the server with WinSCP. During the transfer, WinSCP altered the file's bytes, so bvlc_reference_caffenet.caffemodel was no longer binary.
Solution: set the WinSCP transfer mode to binary, and the problem is solved.
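To confirm the transfer was clean, you can compare a checksum of the server-side file against one computed on the original download; any mismatch means the text-mode transfer altered the bytes. This is a generic sketch (sha1_of is a helper name I made up, and the .caffemodel path is the one from this post):

```python
import hashlib


def sha1_of(path):
    # Hash the file in chunks so large .caffemodel files need not fit in memory.
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()


# Compare this against the checksum computed on the machine you
# downloaded to; a mismatch means the transfer corrupted the file.
# print(sha1_of('models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'))
```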
For more details, see this blog post: http://blog.chinaunix.net/uid-20332519-id-5585964.html