Author: Mu Ling
Date: November 2016.
Blog link: http://blog.csdn.net/u014540717
In the previous article, Training the VOC dataset with the YOLOv2 model, we trained YOLOv2 on the VOC dataset. But I wanted to train on my own dataset, so this post covers how to fine-tune YOLOv2, one step at a time.
1 Preparing the data
1.1 Building the hierarchy
First create a folder fddb2016 under the darknet/data folder; the file hierarchy is as follows:
--fddb2016
--annotations
2002_07_19_big_img_130.xml
2002_07_25_big_img_84.xml
2002_08_01_big_img_1445.xml
2002_08_08_big_img_277.xml
2002_08_16_big_img_637.xml
2002_08_25_big_img_199.xml
2003_01_01_big_img_698.xml
...
--imagesets
--main
test.txt
trainval.txt
--jpegimages
2002_07_19_big_img_130.jpg
2002_07_25_big_img_84.jpg
2002_08_01_big_img_1445.jpg
2002_08_08_big_img_277.jpg
2002_08_16_big_img_637.jpg
2002_08_25_big_img_199.jpg
2003_01_01_big_img_698.jpg
...
--labels

1.2 Xml2txt
Because YOLO reads labels from TXT files, we need to convert the XML annotations to TXT format, as in the following script:

import xml.etree.ElementTree as ET
import os
from os import getcwd
import cv2

# sets = [('fddb2016', 'train'), ('fddb2016', 'val')]
# classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car",
#            "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
#            "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
classes = ["face"]

def convert(size, box):
    # size = (image width, image height); box = (xmin, xmax, ymin, ymax)
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)

def convert_annotation(w, h, image_id):
    in_file = open('fddb2016/annotations/%s.xml' % image_id)
    out_file = open('fddb2016/labels/%s.txt' % image_id, 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()
if not os.path.exists('fddb2016/labels/'):
    os.makedirs('fddb2016/labels/')
image_ids = open('fddb2016/imagesets/main/trainval.txt').read().strip().split()
list_file = open('fddb2016_train.txt', 'w')
for image_id in image_ids:
    list_file.write('%s/fddb2016/jpegimages/%s.jpg\n' % (wd, image_id))
    image = cv2.imread('%s/fddb2016/jpegimages/%s.jpg' % (wd, image_id))
    h, w, c = image.shape
    convert_annotation(w, h, image_id)
list_file.close()
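As a quick sanity check of the convert() helper, the snippet below runs it on made-up numbers (not from the post): a 100x200 box centred at (250, 200) in a 500x400 image should map to normalized center coordinates and sizes of (0.5, 0.5, 0.2, 0.5).

```python
def convert(size, box):
    # size = (image width, image height); box = (xmin, xmax, ymin, ymax)
    # Returns (x_center, y_center, width, height), all normalized to [0, 1].
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0 * dw
    y = (box[2] + box[3]) / 2.0 * dh
    w = (box[1] - box[0]) * dw
    h = (box[3] - box[2]) * dh
    return (x, y, w, h)

# 500x400 image, box with xmin=200, xmax=300, ymin=100, ymax=300
result = tuple(round(v, 6) for v in convert((500, 400), (200, 300, 100, 300)))
print(result)  # -> (0.5, 0.5, 0.2, 0.5)
```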
2 Fine-tuning
2.1 Modifying the .cfg file
If you want to use the 22-layer model, modify cfg/yolo-voc.cfg; if you want to use the 9-layer model, modify cfg/tiny-yolo-voc.cfg in the same way. We take yolo-voc.cfg as an example:
Copy the cfg file:
$ cp cfg/yolo-voc.cfg cfg/yolo-fddb.cfg
Open the yolo-fddb.cfg file and make the following changes:
a. Change learning_rate=0.0001 to learning_rate=0.0005
b. Change max_batches = 45000 to max_batches = 200000
c. Change classes=20 to classes=1
d. Change filters=125 in the last [convolutional] layer to filters=30. The filters value is computed as follows, so adjust it according to the number of classes in your own data:
filters = num * (classes + coords + 1) = 5 * (1 + 4 + 1) = 30
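The filters rule can be checked with a one-liner. The values below are this post's (classes=1, coords=4, num=5 anchors) plus VOC's original 20 classes as a cross-check:

```python
def region_filters(classes, coords=4, num=5):
    # Last-convolutional-layer width before the [region] layer: one predictor
    # per anchor, each with (classes + coords + 1 objectness) outputs.
    return num * (classes + coords + 1)

print(region_filters(1))   # face only -> 30
print(region_filters(20))  # VOC's 20 classes -> 125 (the original yolo-voc.cfg value)
```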
The final results are as follows:
[net]
batch=64
subdivisions=8
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.0005
max_batches = 200000
policy=steps
steps=100,25000,35000
scales=10,.1,.1
...
[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear
[region]
anchors = 1.08, 1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=0
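The four edits from section 2.1 can also be applied programmatically. The helper below is a hypothetical sketch (not part of the post or of darknet); it assumes yolo-voc.cfg's layout, where the last filters= line belongs to the final [convolutional] layer just before [region]:

```python
def patch_cfg(text, classes=1, lr='0.0005', max_batches='200000', num=5):
    """Rewrite learning_rate, max_batches, classes, and the last filters= line."""
    lines = text.splitlines()
    # The last "filters=" line is the final [convolutional] layer's width.
    last_filters = max(i for i, l in enumerate(lines)
                       if l.strip().startswith('filters='))
    lines[last_filters] = 'filters=%d' % (num * (classes + 4 + 1))
    for i, l in enumerate(lines):
        key = l.split('=')[0].strip()
        if key == 'learning_rate':
            lines[i] = 'learning_rate=' + lr
        elif key == 'max_batches':
            lines[i] = 'max_batches = ' + max_batches
        elif key == 'classes':
            lines[i] = 'classes=%d' % classes
    return '\n'.join(lines)

# Tiny stand-in for the relevant yolo-voc.cfg lines, just to demonstrate.
demo = "learning_rate=0.0001\nmax_batches = 45000\nfilters=125\n[region]\nclasses=20"
print(patch_cfg(demo))
```

In practice you would read cfg/yolo-fddb.cfg, pass its contents through patch_cfg, and write the result back.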
2.2 Modifying the voc.names file
Copy the voc.names file:
$ cp data/voc.names data/fddb.names
Modify the fddb.names file so that it contains only:
face
2.3 Modifying the voc.data file
Copy the voc.data file:
$ cp cfg/voc.data cfg/fddb.data
Modify the fddb.data file so that it reads:
classes = 1
train = /home/usrname/darknet-v2/data/fddb2016_train.txt
valid = /home/pjreddie/data/voc/2007_test.txt
names = data/fddb.names
backup = /home/guoyana/my_files/local_install/darknet-v2/backup
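For reference, the same file can be written with a short script rather than edited by hand. This is a convenience sketch, not from the original post; the paths are copied verbatim from the listing above and should be replaced with your own:

```python
# Hypothetical helper: write the fddb.data file described in section 2.3.
entries = [
    ('classes', '1'),
    ('train', '/home/usrname/darknet-v2/data/fddb2016_train.txt'),
    ('valid', '/home/pjreddie/data/voc/2007_test.txt'),
    ('names', 'data/fddb.names'),
    ('backup', '/home/guoyana/my_files/local_install/darknet-v2/backup'),
]
with open('fddb.data', 'w') as f:
    for key, value in entries:
        f.write('%s = %s\n' % (key, value))
```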
3 Start Training
YOLOv2 already supports multi-GPU training. Starting from the weights obtained on the VOC dataset, run the following command to begin:
./darknet detector train ./cfg/fddb.data ./cfg/yolo-fddb.cfg backup/yolo-voc_6000.weights -gpus 0,1,2,3
4 Results
There is a problem: pre-trained models are normally classification models, not weights taken from detection training. So the method above is still flawed; the loss stopped decreasing at around 0.1. In the end I trained the network without the pre-trained weights, and the result after 18,000 iterations is as follows: