Impressions
Today I tested the model I trained myself and compared it against YOLOv2. The detections are correct; the YOLOv2 version's accuracy is not high, but SSD also misses quite a few objects, so its recall is not high either. Note that the SSD environment is Python 3; running the code under Python 2 will cause problems. For installing tensorflow-gpu and OpenCV, see my blog post: SSD environment installation.
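As a quick sanity check of the environment (a minimal sketch of my own; the exact version numbers depend on your installation), the following confirms that Python 3, tensorflow-gpu and OpenCV are all importable:

import sys
import tensorflow as tf
import cv2

# Should report a 3.x interpreter; the SSD code is not Python 2 friendly.
print(sys.version)
# tensorflow-gpu and OpenCV builds actually in use.
print(tf.__version__)
print(cv2.__version__)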
1 Making the dataset
The most troublesome part is producing the VOC dataset. I used my company's dataset generator to produce a large number of images, about 25000 in total. Following the VOC format, put the images in the JPEGImages directory and the XML annotations in the Annotations directory, then use a script to generate the four files train.txt, test.txt, trainval.txt and val.txt. The code that generates these txt files is as follows:
import os
import random

# Paths to the Annotations directory and the dataset root; adjust to your own setup.
xmlfilepath = r'/home/whsyxt/downloads/ssd-tensorflow/voc2007/Annotations'
savebasepath = r'/home/whsyxt/downloads/ssd-tensorflow'

trainval_percent = 0.8   # fraction of samples used for trainval (the rest go to test)
train_percent = 0.7      # fraction of trainval used for train (the rest go to val)

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

print("train and val size", tv)
print("train size", tr)

ftrainval = open(os.path.join(savebasepath, 'voc2007/ImageSets/Main/trainval.txt'), 'w')
ftest = open(os.path.join(savebasepath, 'voc2007/ImageSets/Main/test.txt'), 'w')
ftrain = open(os.path.join(savebasepath, 'voc2007/ImageSets/Main/train.txt'), 'w')
fval = open(os.path.join(savebasepath, 'voc2007/ImageSets/Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the ".xml" extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
You can of course adapt this script to your own setup.
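Optionally, a small sanity check can verify the splits before conversion (my own sketch, assuming the VOC2007-style layout described above with .jpg images; adjust the base path and extensions to your data):

import os

base = '/home/whsyxt/downloads/ssd-tensorflow/voc2007'   # assumed dataset root; change to yours
for split in ['trainval.txt', 'train.txt', 'val.txt', 'test.txt']:
    with open(os.path.join(base, 'ImageSets/Main', split)) as f:
        names = [line.strip() for line in f if line.strip()]
    # Every listed name should have both an annotation and an image file.
    missing = [n for n in names
               if not os.path.exists(os.path.join(base, 'Annotations', n + '.xml'))
               or not os.path.exists(os.path.join(base, 'JPEGImages', n + '.jpg'))]
    print(split, len(names), 'entries,', len(missing), 'with missing files')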
2 Converting VOC to tfrecords
With the dataset in VOC format, we need to convert it to tfrecords so the training program can run. First modify the source file datasets/pascalvoc_common.py. The change is very simple: fill in your own categories and leave the rest alone. Here is my example, where I turned the original 21 classes into 3:

"""
VOC_LABELS = {
    'none': (0, 'Background'),
    'aeroplane': (1, 'Vehicle'),
    'bicycle': (2, 'Vehicle'),
    'bird': (3, 'Animal'),
    'boat': (4, 'Vehicle'),
    'bottle': (5, 'Indoor'),
    'bus': (6, 'Vehicle'),
    'car': (7, 'Vehicle'),
    'cat': (8, 'Animal'),
    'chair': (9, 'Indoor'),
    'cow': (10, 'Animal'),
    'diningtable': (11, 'Indoor'),
    'dog': (12, 'Animal'),
    'horse': (13, 'Animal'),
    'motorbike': (14, 'Vehicle'),
    'person': (15, 'Person'),
    'pottedplant': (16, 'Indoor'),
    'sheep': (17, 'Animal'),
    'sofa': (18, 'Indoor'),
    'train': (19, 'Vehicle'),
    'tvmonitor': (20, 'Indoor'),
}
"""

VOC_LABELS = {
    'none': (0, 'Background'),
    'person': (1, 'person'),
    'car': (2, 'car'),
}
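For reference, this is roughly how those (index, category) pairs get used when the XML annotations are parsed during conversion. The snippet is a simplified sketch of the idea, not the repository's actual code, so treat the helper below as illustrative only; the practical point is that every object name appearing in your XML files must be a key of VOC_LABELS.

import xml.etree.ElementTree as ET

VOC_LABELS = {
    'none': (0, 'Background'),
    'person': (1, 'person'),
    'car': (2, 'car'),
}

def labels_from_xml(xml_path):
    # Collect the integer label of every annotated object in one VOC XML file.
    root = ET.parse(xml_path).getroot()
    labels = []
    for obj in root.findall('object'):
        name = obj.find('name').text
        labels.append(int(VOC_LABELS[name][0]))   # KeyError here means a class is missing from VOC_LABELS
    return labels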
That is the only modification needed in pascalvoc_common.py. Then jump to the SSD-Tensorflow directory for the tfrecords conversion; I run the following command:
dataset_dir=voc2007/
output_dir=tfrecords/
python3 tf_convert_data.py \
--dataset_name=pascalvoc \
--dataset_dir=${dataset_dir} \
--output_name=voc_2007_train \
--output_dir=${output_dir}
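Optionally, the number of converted examples can be counted to make sure all images were picked up (my own sketch; it assumes the output files follow the voc_2007_train_*.tfrecord naming and uses the TF 1.x API):

import glob
import tensorflow as tf

total = 0
# Assumed output naming: voc_2007_train_000.tfrecord, voc_2007_train_001.tfrecord, ...
for path in glob.glob('tfrecords/voc_2007_train_*.tfrecord'):
    total += sum(1 for _ in tf.python_io.tf_record_iterator(path))
print('converted records:', total)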
3 Training
Now training can start. The command to run is:
dataset_dir=tfrecords
train_dir=logs/
checkpoint_path=./checkpoints/ssd_300_vgg.ckpt
python3 train_ssd_network.py \
--train_dir=${train_dir} \
--dataset_dir=${dataset_dir} \
--dataset_name=pascalvoc_2007 \
--dataset_split_name=train \
--model_name=ssd_300_vgg \
--checkpoint_path=${checkpoint_path} \
--save_summaries_secs=60 \
--save_interval_secs=600 \
--weight_decay=0.0005 \
--optimizer=adam \
--learning_rate=0.001 \
--batch_size=16
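Training progress (the losses and the summaries saved every 60 seconds) can be watched with TensorBoard pointed at the log directory; this assumes TensorBoard was installed together with TensorFlow:

tensorboard --logdir=logs/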
4 Prediction
I mainly run prediction on video; the prediction code I use for video is provided below for reference.
Code:
# coding=utf-8
import os
import math
import random
import sys

import numpy as np
import tensorflow as tf
import cv2

slim = tf.contrib.slim

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

sys.path.append('../')
from nets import ssd_vgg_300, ssd_common, np_methods
from preprocessing import ssd_vgg_preprocessing
from notebooks import visualization

# TensorFlow session: grow memory when needed. TF, do not take all my GPU memory!!!
gpu_options = tf.GPUOptions(allow_growth=True)
config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
isess = tf.InteractiveSession(config=config)

# Input placeholder.
net_shape = (300, 300)
data_format = 'NHWC'
img_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
# Evaluation pre-processing: resize to SSD net shape.
image_pre, labels_pre, bboxes_pre, bbox_img = ssd_vgg_preprocessing.preprocess_for_eval(
    img_input, None, None, net_shape, data_format,
    resize=ssd_vgg_preprocessing.Resize.WARP_RESIZE)
image_4d = tf.expand_dims(image_pre, 0)

# Define the SSD model.
reuse = True if 'ssd_net' in locals() else None
ssd_net = ssd_vgg_300.SSDNet()
with slim.arg_scope(ssd_net.arg_scope(data_format=data_format)):
    predictions, localisations, _, _ = ssd_net.net(image_4d, is_training=False, reuse=reuse)

# Restore SSD model.
ckpt_filename = 'finetune_log/model.ckpt-41278'   # change to your own model path
# ckpt_filename = 'checkpoints/ssd_300_vgg.ckpt'
isess.run(tf.global_variables_initializer())
saver = tf.train.Saver()
saver.restore(isess, ckpt_filename)

# SSD default anchor boxes.
ssd_anchors = ssd_net.anchors(net_shape)


# Main image processing routine.
def process_image(img, select_threshold=0.5, nms_threshold=.45, net_shape=(300, 300)):
    # Run SSD network.
    rimg, rpredictions, rlocalisations, rbbox_img = isess.run(
        [image_4d, predictions, localisations, bbox_img],
        feed_dict={img_input: img})

    # Get classes and bboxes from the net outputs.
    rclasses, rscores, rbboxes = np_methods.ssd_bboxes_select(
        rpredictions, rlocalisations, ssd_anchors,
        select_threshold=select_threshold, img_shape=net_shape,
        num_classes=21, decode=True)

    rbboxes = np_methods.bboxes_clip(rbbox_img, rbboxes)
    rclasses, rscores, rbboxes = np_methods.bboxes_sort(rclasses, rscores, rbboxes, top_k=400)
    rclasses, rscores, rbboxes = np_methods.bboxes_nms(rclasses, rscores, rbboxes, nms_threshold=nms_threshold)
    # Resize bboxes to original image shape. Note: useless for Resize.WARP!
    rbboxes = np_methods.bboxes_resize(rbbox_img, rbboxes)
    return rclasses, rscores, rbboxes


def bboxes_draw_on_img(img, classes, scores, bboxes, color=[255, 0, 0], thickness=2):
    shape = img.shape
    for i in range(bboxes.shape[0]):
        bbox = bboxes[i]
        # color = colors[classes[i]]
        # Draw bounding box...
        p1 = (int(bbox[0] * shape[0]), int(bbox[1] * shape[1]))
        p2 = (int(bbox[2] * shape[0]), int(bbox[3] * shape[1]))
        cv2.rectangle(img, p1[::-1], p2[::-1], color, thickness)
        # Draw text...
        s = '%s/%.3f' % (classes[i], scores[i])
        p1 = (p1[0] - 5, p1[1])
        cv2.putText(img, s, p1[::-1], cv2.FONT_HERSHEY_DUPLEX, 0.4, color, 1)


cap = cv2.VideoCapture("DJI_0008.MOV")   # change to your own video path
# cap = cv2.VideoCapture(0)
# Define the codec and create a VideoWriter object.
# fourcc = cv2.cv.FOURCC(*'XVID')
fourcc = cv2.VideoWriter_fourcc(*'XVID')
# 20.0 fps is an assumed value; set it to match your source video, and make sure
# the (1280, 720) frame size matches the frames being written.
out = cv2.VideoWriter('output1.avi', fourcc, 20.0, (1280, 720))
num = 0
while cap.isOpened():
    # Get a frame.
    rval, frame = cap.read()
    # Save a frame.
    if rval == True:
        # frame = cv2.flip(frame, 0)
        rclasses, rscores, rbboxes = process_image(frame)
        bboxes_draw_on_img(frame, rclasses, rscores, rbboxes)
        print(rclasses)
        out.write(frame)
        num = num + 1
        print(num)
    else:
        break
    # Show a frame.
    cv2.imshow("capture", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()
Reference documents
[1] SSD-Tensorflow. https://github.com/balancap/SSD-Tensorflow