"Tensorflowxmxnet" SSD Project replication Experience


To deepen my understanding, I reproduced the SSD project, making some changes to the original according to my own understanding.

Project on GitHub: ssd_realization_tensorflow, ssd_realization_mxnet

The write-up follows the sequence of steps in the training main function (both versions are attached at the end of the article). Below we follow that order and briefly introduce the key points of each step. To learn more, it is best to read the corresponding parts of the source code (the TF version), or watch Dr. Mu Li's SSD lecture (not very detailed, but the ideas are very clear when combined with the handouts; see the "MXNet" lecture 10: Object detection with SSD).

Key notes

The SSD architecture has four main parts: network design, search box design, learning-target processing, and loss-function implementation.

Network design

The key point is that, on each of the pre-selected feature layers in the network, two convolutional exits are added: a classification exit and a regression exit. For each of the search boxes generated later, they output a score for every category and 4 coordinate values, respectively.
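
For intuition, here is a minimal Gluon sketch of the two convolutional exits attached to one feature layer. This is not the project's actual code; num_anchors, num_classes and the feature-map shape are made-up values for illustration.

import mxnet as mx
from mxnet.gluon import nn

num_anchors, num_classes = 4, 20
# classification exit: one score per anchor per class (+1 for background)
cls_head = nn.Conv2D(num_anchors * (num_classes + 1), kernel_size=3, padding=1)
# regression exit: 4 coordinate values per anchor
box_head = nn.Conv2D(num_anchors * 4, kernel_size=3, padding=1)

cls_head.initialize()
box_head.initialize()
feat = mx.nd.random.uniform(shape=(1, 256, 38, 38))   # a fake feature map
print(cls_head(feat).shape)   # (1, num_anchors*(num_classes+1), 38, 38)
print(box_head(feat).shape)   # (1, num_anchors*4, 38, 38)

Every spatial position of the feature map therefore predicts num_anchors boxes, which is exactly how the search boxes below are indexed.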

Search box design

Each feature layer corresponds to a set of search boxes, and we need to record the position and shape information of every box. In the TF version I save each box's center point plus its height and width; in the MX version I save the top-left and bottom-right corners, i.e. 4 coordinate values. The MX format is more intuitive, while the TF format saves space, because a whole group of boxes at one location shares the same center point. Since the search boxes themselves carry no learned information, either choice is harmless.
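
As a small illustration of the two storage formats, here are hypothetical conversion helpers (not code from either repository), with the TF-style form as (cx, cy, w, h) and the MX-style form as (xmin, ymin, xmax, ymax):

def center_to_corner(cx, cy, w, h):
    # TF-style (center, size) -> MX-style (corner) coordinates
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def corner_to_center(xmin, ymin, xmax, ymax):
    # MX-style (corner) -> TF-style (center, size) coordinates
    return (xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin

corners = center_to_corner(0.5, 0.5, 0.4, 0.2)
print(corners)                     # (0.3, 0.4, 0.7, 0.6)
print(corner_to_center(*corners))  # back to (0.5, 0.5, 0.4, 0.2)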

Learning target processing

Personally I find this the most tedious part. The information we have at this point: the set of search boxes (i.e. the N x 4 coordinates of all anchor boxes), the image's class labels, and the image's real (ground-truth) box coordinates (number of labels x 4). What we need is to connect the search boxes with the image's labels and real boxes,
to obtain:
the class of each search box (assign it the class of the ground-truth box with which it has the largest IoU; as a result a large number of search boxes end up with class 0, background);
the regression target for each search box's coordinates (matched the same way as above, with 0 filling the unmatched entries);
a negative-class mask. Although each picture usually contains only a few labeled boxes, SSD generates a very large number of anchor boxes, and it is easy to imagine that most of them frame no object of interest, meaning their IoU with every object of interest is below a certain threshold. This leaves a large number of negative-class (label-0) anchor boxes, and two points must be considered for them:
1. The box-regression loss should not include the negative anchor boxes, because they have no corresponding ground-truth box.
2. Because the negative anchors can vastly outnumber the others, we keep only some of them, namely the ones whose predictions are least certain to be negative: sort the predicted class-0 scores and keep the hard negative anchor boxes with the smallest values.
So we need masks to suppress part of the computed loss; a small sketch of the whole matching step follows.
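
To make the matching concrete, here is a minimal numpy sketch under the assumptions above (corner-format boxes, class 0 reserved for background). Function names such as iou, match_anchors and negative_mask are illustrative only, not the project's training_targets implementation.

import numpy as np

def iou(anchors, gt):
    """IoU between every anchor (N,4) and every ground-truth box (M,4) -> (N,M)."""
    lt = np.maximum(anchors[:, None, :2], gt[None, :, :2])
    rb = np.minimum(anchors[:, None, 2:], gt[None, :, 2:])
    inter = np.prod(np.clip(rb - lt, 0, None), axis=2)
    area_a = np.prod(anchors[:, 2:] - anchors[:, :2], axis=1)
    area_g = np.prod(gt[:, 2:] - gt[:, :2], axis=1)
    return inter / (area_a[:, None] + area_g[None, :] - inter)

def match_anchors(anchors, gt_boxes, gt_labels, iou_thresh=0.5):
    """Per-anchor class targets, box targets and a positive mask."""
    overlaps = iou(anchors, gt_boxes)            # (N, M)
    best_gt = overlaps.argmax(axis=1)            # best ground-truth box per anchor
    best_iou = overlaps.max(axis=1)
    pos = best_iou >= iou_thresh                 # positive-anchor mask
    cls_target = np.where(pos, gt_labels[best_gt] + 1, 0)       # 0 = background
    box_target = np.where(pos[:, None], gt_boxes[best_gt], 0.0) # 0 for vacancies
    return cls_target, box_target, pos.astype(np.float32)

def negative_mask(cls0_probs, pos_mask, negative_ratio=3):
    """Keep only the hardest negatives: smallest predicted class-0 probability."""
    neg_candidates = (pos_mask == 0)
    num_keep = int(negative_ratio * pos_mask.sum())
    # sort negatives by how unsure the model is that they are background
    order = np.argsort(np.where(neg_candidates, cls0_probs, np.inf))
    keep = np.zeros_like(pos_mask)
    keep[order[:num_keep]] = 1.0
    return keep * neg_candidates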

Loss function

There is not much to say here: implement it according to the formula. The key point is, once again, to use the masks computed in the previous step when evaluating the loss values.
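
As a sketch of how the mask enters the box loss, here is an assumed plain-numpy smooth-L1, not the util_mx implementation:

import numpy as np

def masked_smooth_l1(pred, target, mask, sigma=1.0):
    """Smooth-L1 on the box outputs, with negative anchors masked out."""
    diff = np.abs(pred - target) * mask
    loss = np.where(diff < 1.0 / sigma ** 2,
                    0.5 * (sigma * diff) ** 2,
                    diff - 0.5 / sigma ** 2)
    return loss.sum() / np.maximum(mask.sum(), 1.0)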

MXNet training main function
import time

import mxnet as mx
from mxnet import nd

import ssd_mx    # project module: network definition and target construction
import util_mx   # project module: loss implementations
# get_iterators is defined elsewhere in the project.

if __name__ == '__main__':
    batch_size = 4
    ctx = mx.cpu(0)
    # ctx = mx.gpu(0)
    # box_metric = mx.metric.MAE()
    cls_metric = mx.metric.Accuracy()
    ssd = ssd_mx.SSDNet()
    ssd.initialize(ctx=ctx)  # mx.init.Xavier(magnitude=2)

    cls_loss = util_mx.FocalLoss()
    box_loss = util_mx.SmoothL1Loss()

    trainer = mx.gluon.Trainer(ssd.collect_params(),
                               'sgd', {'learning_rate': 0.01, 'wd': 5e-4})

    data = get_iterators(data_shape=304, batch_size=batch_size)

    for epoch in range(30):  # placeholder epoch count; the original value is unclear
        # reset the data iterator and the metrics
        data.reset()
        cls_metric.reset()
        # box_metric.reset()
        tic = time.time()
        for i, batch in enumerate(data):
            start_time = time.time()
            x = batch.data[0].as_in_context(ctx)
            y = batch.label[0].as_in_context(ctx)
            # change the -1 placeholders to the background label 0;
            # the corresponding coordinate boxes are recorded as [0, 0, 0, 0]
            y = nd.where(y < 0, nd.zeros_like(y), y)
            with mx.autograd.record():
                # anchors:     anchor box coordinates, [1, n, 4]
                # class_preds: per-image box classification, [bs, n, num_cls + 1]
                # box_preds:   per-image box coordinate predictions, [bs, n * 4]
                anchors, class_preds, box_preds = ssd(x, True)
                # box_target: box regression targets, [bs, n * 4]
                # box_mask:   masks out the unwanted background class, [bs, n * 4]
                # cls_target: true class of every box, [bs, n]
                box_target, box_mask, cls_target = ssd_mx.training_targets(
                    anchors, class_preds, y)
                loss1 = cls_loss(class_preds, cls_target)
                loss2 = box_loss(box_preds, box_target, box_mask)
                loss = loss1 + loss2
            loss.backward()
            trainer.step(batch_size)
            if i % 1 == 0:
                duration = time.time() - start_time
                examples_per_sec = batch_size / duration
                sec_per_batch = float(duration)
                format_str = "[*] step %d, loss=%.2f (%.1f examples/sec; %.3f sec/batch)"
                print(format_str % (i, nd.sum(loss).asscalar(),
                                    examples_per_sec, sec_per_batch))
            if i % 100 == 0:  # placeholder save interval; the original value is unclear
                ssd.model.save_parameters('model_mx_{}.params'.format(epoch))
TensorFlow training main function
import os
import time

import tensorflow as tf

slim = tf.contrib.slim

import util_tf              # project module: reshape_list and other helpers
import tfr_data_process     # project module: TFRecord reading
import preprocess_img_tf    # project module: image preprocessing
# The SSD network class (SSDNet: anchors, bboxes_encode, net, losses)
# also comes from the project's own code.


def main():
    max_steps = 10000   # placeholder; the original value is unclear
    batch_size = 32     # placeholder; the original value is unclear
    adam_beta1 = 0.9
    adam_beta2 = 0.999
    opt_epsilon = 1.0
    num_epochs_per_decay = 2.0
    num_samples_per_epoch = 17125
    moving_average_decay = None

    tf.logging.set_verbosity(tf.logging.DEBUG)
    with tf.Graph().as_default():
        # Create global_step.
        with tf.device("/device:CPU:0"):
            global_step = tf.train.create_global_step()

        ssd = SSDNet()
        ssd_anchors = ssd.anchors

        # TFRecord parsing is accelerated on the GPU, but the effect is not stable
        dataset = tfr_data_process.get_split('./tfr_data',
                                             'voc2012_*.tfrecord',
                                             num_classes=21,
                                             num_samples=num_samples_per_epoch)

        with tf.device("/device:CPU:0"):  # only the CPU supports the queue operations
            image, glabels, gbboxes = tfr_data_process.tfr_read(dataset)
            image, glabels, gbboxes = preprocess_img_tf.preprocess_image(
                image, glabels, gbboxes,
                out_shape=(300, 300))  # placeholder shape; the original value is unclear
            gclasses, glocalisations, gscores = ssd.bboxes_encode(
                glabels, gbboxes, ssd_anchors)

            batch_shape = [1] + [len(ssd_anchors)] * 3  # (1, f-layers, f-layers, f-layers)

            # Training batches and queue.
            r = tf.train.batch(  # image, per-anchor classes, real box coordinates, scores
                util_tf.reshape_list([image, gclasses, glocalisations, gscores]),
                batch_size=batch_size,
                num_threads=4,
                capacity=5 * batch_size)
            batch_queue = slim.prefetch_queue.prefetch_queue(
                r,  # <----- the input format does not actually need to be adjusted
                capacity=2 * 1)

        # Dequeue batch.
        b_image, b_gclasses, b_glocalisations, b_gscores = util_tf.reshape_list(
            batch_queue.dequeue(), batch_shape)  # reorganize the list

        predictions, localisations, logits, end_points = ssd.net(
            b_image, is_training=True, weight_decay=0.00004)

        ssd.losses(logits, localisations,
                   b_gclasses, b_glocalisations, b_gscores,
                   match_threshold=.5,
                   negative_ratio=3,
                   alpha=1,
                   label_smoothing=.0)

        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

        # =================================================================== #
        # Configure the moving averages.
        # =================================================================== #
        if moving_average_decay:
            moving_average_variables = slim.get_model_variables()
            variable_averages = tf.train.ExponentialMovingAverage(
                moving_average_decay, global_step)
        else:
            moving_average_variables, variable_averages = None, None

        # =================================================================== #
        # Configure the optimization procedure.
        # =================================================================== #
        with tf.device("/device:CPU:0"):  # the learning_rate node is placed on the CPU (reason unknown)
            decay_steps = int(num_samples_per_epoch / batch_size * num_epochs_per_decay)
            learning_rate = tf.train.exponential_decay(
                0.01,
                global_step,
                decay_steps,
                0.94,  # learning_rate_decay_factor
                staircase=True,
                name='exponential_decay_learning_rate')
            optimizer = tf.train.AdamOptimizer(
                learning_rate,
                beta1=adam_beta1,
                beta2=adam_beta2,
                epsilon=opt_epsilon)
            tf.summary.scalar('learning_rate', learning_rate)

        if moving_average_decay:
            # Update ops executed locally by trainer.
            update_ops.append(variable_averages.apply(moving_average_variables))

        # Variables to train.
        trainable_scopes = None
        if trainable_scopes is None:
            variables_to_train = tf.trainable_variables()
        else:
            scopes = [scope.strip() for scope in trainable_scopes.split(',')]
            variables_to_train = []
            for scope in scopes:
                variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
                variables_to_train.extend(variables)

        losses = tf.get_collection(tf.GraphKeys.LOSSES)
        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
        regularization_loss = tf.add_n(regularization_losses)
        loss = tf.add_n(losses)
        tf.summary.scalar("loss", loss)
        tf.summary.scalar("regularization_loss", regularization_loss)

        grad = optimizer.compute_gradients(loss, var_list=variables_to_train)
        grad_updates = optimizer.apply_gradients(grad, global_step=global_step)
        update_ops.append(grad_updates)
        # update_op = tf.group(*update_ops)

        with tf.control_dependencies(update_ops):
            total_loss = tf.add_n([loss, regularization_loss])
        tf.summary.scalar("total_loss", total_loss)

        # =================================================================== #
        # Kicks off the training.
        # =================================================================== #
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.8)
        config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
        saver = tf.train.Saver(max_to_keep=5,
                               keep_checkpoint_every_n_hours=1.0,
                               write_version=2,
                               pad_step_number=False)

        if True:
            print('start...')
            model_path = './logs'
            with tf.Session(config=config) as sess:
                summary = tf.summary.merge_all()
                coord = tf.train.Coordinator()
                threads = tf.train.start_queue_runners(sess=sess, coord=coord)
                writer = tf.summary.FileWriter(model_path, sess.graph)

                init_op = tf.group(tf.global_variables_initializer(),
                                   tf.local_variables_initializer())
                init_op.run()
                for step in range(max_steps):
                    start_time = time.time()
                    loss_value = sess.run(total_loss)
                    # loss_value, summary_str = sess.run([train_tensor, summary_op])
                    # writer.add_summary(summary_str, step)
                    duration = time.time() - start_time
                    if step % 10 == 0:  # placeholder interval; the original value is unclear
                        summary_str = sess.run(summary)
                        writer.add_summary(summary_str, step)

                        examples_per_sec = batch_size / duration
                        sec_per_batch = float(duration)
                        format_str = "[*] step %d, loss=%.2f (%.1f examples/sec; %.3f sec/batch)"
                        print(format_str % (step, loss_value,
                                            examples_per_sec, sec_per_batch))
                    # if step % ... == 0:
                    #     accuracy_step = test_cifar10(sess, training=False)
                    #     acc.append('{:.3f}'.format(accuracy_step))
                    #     print(acc)
                    if step % 500 == 0 and step != 0:
                        saver.save(sess, os.path.join(model_path, "ssd_tf.model"),
                                   global_step=step)

                coord.request_stop()
                coord.join(threads)

"Tensorflowxmxnet" SSD Project replication Experience

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.