"Tensorflowxmxnet" SSD Project replication Experience


To deepen my understanding, I reproduced the SSD project, making some changes to the original according to my own understanding.

Project on GitHub: ssd_realization_tensorflow, ssd_realization_mxnet

The write-up follows the sequence of steps in the training main function (both versions are attached at the end of the article). Below we follow that order and briefly introduce the key points of each step. To learn more, it is best to read the corresponding parts of the source code (the TF version), or watch Dr. Mu Li's SSD lecture (not very detailed, but the ideas are very clear when combined with the handouts; see the "MXNet" lecture 10: Object detection with SSD).

Key notes

The SSD architecture has four main parts: network design, search box design, learning-target processing, and loss-function implementation.

Network design

The key point is that, on each of the pre-selected feature layers in the network, two convolutional exits are added: a classification exit and a regression exit. For each of the search boxes generated later, they output a score for every category and 4 coordinate values, respectively.
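
For intuition, here is a minimal Gluon sketch of the two convolutional exits attached to one feature layer. This is not the project's actual code; num_anchors, num_classes and the feature-map shape are made-up values for illustration.

import mxnet as mx
from mxnet.gluon import nn

num_anchors, num_classes = 4, 20
# classification exit: one score per anchor per class (+1 for background)
cls_head = nn.Conv2D(num_anchors * (num_classes + 1), kernel_size=3, padding=1)
# regression exit: 4 coordinate values per anchor
box_head = nn.Conv2D(num_anchors * 4, kernel_size=3, padding=1)

cls_head.initialize()
box_head.initialize()
feat = mx.nd.random.uniform(shape=(1, 256, 38, 38))   # a fake feature map
print(cls_head(feat).shape)   # (1, num_anchors*(num_classes+1), 38, 38)
print(box_head(feat).shape)   # (1, num_anchors*4, 38, 38)

Every spatial position of the feature map therefore predicts num_anchors boxes, which is exactly how the search boxes below are indexed.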

Search box design

Each feature layer corresponds to a set of search boxes, and we need to record the position and shape information of every box. In the TF version I save each box's center point plus its height and width; in the MX version I save the top-left and bottom-right corners, i.e. 4 coordinate values. The MX format is more intuitive, while the TF format saves space, because a whole group of boxes at one location shares the same center point. Since the search boxes themselves carry no learned information, either choice is harmless.
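
As a small illustration of the two storage formats, here are hypothetical conversion helpers (not code from either repository), with the TF-style form as (cx, cy, w, h) and the MX-style form as (xmin, ymin, xmax, ymax):

def center_to_corner(cx, cy, w, h):
    # TF-style (center, size) -> MX-style (corner) coordinates
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def corner_to_center(xmin, ymin, xmax, ymax):
    # MX-style (corner) -> TF-style (center, size) coordinates
    return (xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin

corners = center_to_corner(0.5, 0.5, 0.4, 0.2)
print(corners)                     # (0.3, 0.4, 0.7, 0.6)
print(corner_to_center(*corners))  # back to (0.5, 0.5, 0.4, 0.2)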

Learning target processing

Personally I find this the most tedious part. The information we have at this point: the set of search boxes (i.e. the N x 4 coordinates of all anchor boxes), the image's class labels, and the image's real (ground-truth) box coordinates (number of labels x 4). What we need is to connect the search boxes with the image's labels and real boxes,
to obtain:
the class of each search box (assign it the class of the ground-truth box with which it has the largest IoU; as a result a large number of search boxes end up with class 0, background);
the regression target for each search box's coordinates (matched the same way as above, with 0 filling the unmatched entries);
a negative-class mask. Although each picture usually contains only a few labeled boxes, SSD generates a very large number of anchor boxes, and it is easy to imagine that most of them frame no object of interest, meaning their IoU with every object of interest is below a certain threshold. This leaves a large number of negative-class (label-0) anchor boxes, and two points must be considered for them:
1. The box-regression loss should not include the negative anchor boxes, because they have no corresponding ground-truth box.
2. Because the negative anchors can vastly outnumber the others, we keep only some of them, namely the ones whose predictions are least certain to be negative: sort the predicted class-0 scores and keep the hard negative anchor boxes with the smallest values.
So we need masks to suppress part of the computed loss; a small sketch of the whole matching step follows.
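
To make the matching concrete, here is a minimal numpy sketch under the assumptions above (corner-format boxes, class 0 reserved for background). Function names such as iou, match_anchors and negative_mask are illustrative only, not the project's training_targets implementation.

import numpy as np

def iou(anchors, gt):
    """IoU between every anchor (N,4) and every ground-truth box (M,4) -> (N,M)."""
    lt = np.maximum(anchors[:, None, :2], gt[None, :, :2])
    rb = np.minimum(anchors[:, None, 2:], gt[None, :, 2:])
    inter = np.prod(np.clip(rb - lt, 0, None), axis=2)
    area_a = np.prod(anchors[:, 2:] - anchors[:, :2], axis=1)
    area_g = np.prod(gt[:, 2:] - gt[:, :2], axis=1)
    return inter / (area_a[:, None] + area_g[None, :] - inter)

def match_anchors(anchors, gt_boxes, gt_labels, iou_thresh=0.5):
    """Per-anchor class targets, box targets and a positive mask."""
    overlaps = iou(anchors, gt_boxes)            # (N, M)
    best_gt = overlaps.argmax(axis=1)            # best ground-truth box per anchor
    best_iou = overlaps.max(axis=1)
    pos = best_iou >= iou_thresh                 # positive-anchor mask
    cls_target = np.where(pos, gt_labels[best_gt] + 1, 0)       # 0 = background
    box_target = np.where(pos[:, None], gt_boxes[best_gt], 0.0) # 0 for vacancies
    return cls_target, box_target, pos.astype(np.float32)

def negative_mask(cls0_probs, pos_mask, negative_ratio=3):
    """Keep only the hardest negatives: smallest predicted class-0 probability."""
    neg_candidates = (pos_mask == 0)
    num_keep = int(negative_ratio * pos_mask.sum())
    # sort negatives by how unsure the model is that they are background
    order = np.argsort(np.where(neg_candidates, cls0_probs, np.inf))
    keep = np.zeros_like(pos_mask)
    keep[order[:num_keep]] = 1.0
    return keep * neg_candidates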

Loss function

There is not much to say here: implement it according to the formula. The key point is, once again, to use the masks computed in the previous step when evaluating the loss values.
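
As a sketch of how the mask enters the box loss, here is an assumed plain-numpy smooth-L1, not the util_mx implementation:

import numpy as np

def masked_smooth_l1(pred, target, mask, sigma=1.0):
    """Smooth-L1 on the box outputs, with negative anchors masked out."""
    diff = np.abs(pred - target) * mask
    loss = np.where(diff < 1.0 / sigma ** 2,
                    0.5 * (sigma * diff) ** 2,
                    diff - 0.5 / sigma ** 2)
    return loss.sum() / np.maximum(mask.sum(), 1.0)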

MXNet training main function
import time

import mxnet as mx
from mxnet import nd

import ssd_mx    # project module: network definition and target construction
import util_mx   # project module: loss implementations
# get_iterators is defined elsewhere in the project.

if __name__ == '__main__':
    batch_size = 4
    ctx = mx.cpu(0)
    # ctx = mx.gpu(0)
    # box_metric = mx.metric.MAE()
    cls_metric = mx.metric.Accuracy()
    ssd = ssd_mx.SSDNet()
    ssd.initialize(ctx=ctx)  # mx.init.Xavier(magnitude=2)

    cls_loss = util_mx.FocalLoss()
    box_loss = util_mx.SmoothL1Loss()

    trainer = mx.gluon.Trainer(ssd.collect_params(),
                               'sgd', {'learning_rate': 0.01, 'wd': 5e-4})

    data = get_iterators(data_shape=304, batch_size=batch_size)

    for epoch in range(30):  # placeholder epoch count; the original value is unclear
        # reset the data iterator and the metrics
        data.reset()
        cls_metric.reset()
        # box_metric.reset()
        tic = time.time()
        for i, batch in enumerate(data):
            start_time = time.time()
            x = batch.data[0].as_in_context(ctx)
            y = batch.label[0].as_in_context(ctx)
            # change the -1 placeholders to the background label 0;
            # the corresponding coordinate boxes are recorded as [0, 0, 0, 0]
            y = nd.where(y < 0, nd.zeros_like(y), y)
            with mx.autograd.record():
                # anchors:     anchor box coordinates, [1, n, 4]
                # class_preds: per-image box classification, [bs, n, num_cls + 1]
                # box_preds:   per-image box coordinate predictions, [bs, n * 4]
                anchors, class_preds, box_preds = ssd(x, True)
                # box_target: box regression targets, [bs, n * 4]
                # box_mask:   masks out the unwanted background class, [bs, n * 4]
                # cls_target: true class of every box, [bs, n]
                box_target, box_mask, cls_target = ssd_mx.training_targets(
                    anchors, class_preds, y)
                loss1 = cls_loss(class_preds, cls_target)
                loss2 = box_loss(box_preds, box_target, box_mask)
                loss = loss1 + loss2
            loss.backward()
            trainer.step(batch_size)
            if i % 1 == 0:
                duration = time.time() - start_time
                examples_per_sec = batch_size / duration
                sec_per_batch = float(duration)
                format_str = "[*] step %d, loss=%.2f (%.1f examples/sec; %.3f sec/batch)"
                print(format_str % (i, nd.sum(loss).asscalar(),
                                    examples_per_sec, sec_per_batch))
            if i % 100 == 0:  # placeholder save interval; the original value is unclear
                ssd.model.save_parameters('model_mx_{}.params'.format(epoch))
TensorFlow training main function
import os
import time

import tensorflow as tf

slim = tf.contrib.slim

import util_tf              # project module: reshape_list and other helpers
import tfr_data_process     # project module: TFRecord reading
import preprocess_img_tf    # project module: image preprocessing
# The SSD network class (SSDNet: anchors, bboxes_encode, net, losses)
# also comes from the project's own code.


def main():
    max_steps = 10000   # placeholder; the original value is unclear
    batch_size = 32     # placeholder; the original value is unclear
    adam_beta1 = 0.9
    adam_beta2 = 0.999
    opt_epsilon = 1.0
    num_epochs_per_decay = 2.0
    num_samples_per_epoch = 17125
    moving_average_decay = None

    tf.logging.set_verbosity(tf.logging.DEBUG)
    with tf.Graph().as_default():
        # Create global_step.
        with tf.device("/device:CPU:0"):
            global_step = tf.train.create_global_step()

        ssd = SSDNet()
        ssd_anchors = ssd.anchors

        # TFRecord parsing is accelerated on the GPU, but the effect is not stable
        dataset = tfr_data_process.get_split('./tfr_data',
                                             'voc2012_*.tfrecord',
                                             num_classes=21,
                                             num_samples=num_samples_per_epoch)

        with tf.device("/device:CPU:0"):  # only the CPU supports the queue operations
            image, glabels, gbboxes = tfr_data_process.tfr_read(dataset)
            image, glabels, gbboxes = preprocess_img_tf.preprocess_image(
                image, glabels, gbboxes,
                out_shape=(300, 300))  # placeholder shape; the original value is unclear
            gclasses, glocalisations, gscores = ssd.bboxes_encode(
                glabels, gbboxes, ssd_anchors)

            batch_shape = [1] + [len(ssd_anchors)] * 3  # (1, f-layers, f-layers, f-layers)

            # Training batches and queue.
            r = tf.train.batch(  # image, per-anchor classes, real box coordinates, scores
                util_tf.reshape_list([image, gclasses, glocalisations, gscores]),
                batch_size=batch_size,
                num_threads=4,
                capacity=5 * batch_size)
            batch_queue = slim.prefetch_queue.prefetch_queue(
                r,  # <----- the input format does not actually need to be adjusted
                capacity=2 * 1)

        # Dequeue batch.
        b_image, b_gclasses, b_glocalisations, b_gscores = util_tf.reshape_list(
            batch_queue.dequeue(), batch_shape)  # reorganize the list

        predictions, localisations, logits, end_points = ssd.net(
            b_image, is_training=True, weight_decay=0.00004)

        ssd.losses(logits, localisations,
                   b_gclasses, b_glocalisations, b_gscores,
                   match_threshold=.5,
                   negative_ratio=3,
                   alpha=1,
                   label_smoothing=.0)

        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

        # =================================================================== #
        # Configure the moving averages.
        # =================================================================== #
        if moving_average_decay:
            moving_average_variables = slim.get_model_variables()
            variable_averages = tf.train.ExponentialMovingAverage(
                moving_average_decay, global_step)
        else:
            moving_average_variables, variable_averages = None, None

        # =================================================================== #
        # Configure the optimization procedure.
        # =================================================================== #
        with tf.device("/device:CPU:0"):  # the learning_rate node is placed on the CPU (reason unknown)
            decay_steps = int(num_samples_per_epoch / batch_size * num_epochs_per_decay)
            learning_rate = tf.train.exponential_decay(
                0.01,
                global_step,
                decay_steps,
                0.94,  # learning_rate_decay_factor
                staircase=True,
                name='exponential_decay_learning_rate')
            optimizer = tf.train.AdamOptimizer(
                learning_rate,
                beta1=adam_beta1,
                beta2=adam_beta2,
                epsilon=opt_epsilon)
            tf.summary.scalar('learning_rate', learning_rate)

        if moving_average_decay:
            # Update ops executed locally by trainer.
            update_ops.append(variable_averages.apply(moving_average_variables))

        # Variables to train.
        trainable_scopes = None
        if trainable_scopes is None:
            variables_to_train = tf.trainable_variables()
        else:
            scopes = [scope.strip() for scope in trainable_scopes.split(',')]
            variables_to_train = []
            for scope in scopes:
                variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
                variables_to_train.extend(variables)

        losses = tf.get_collection(tf.GraphKeys.LOSSES)
        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
        regularization_loss = tf.add_n(regularization_losses)
        loss = tf.add_n(losses)
        tf.summary.scalar("loss", loss)
        tf.summary.scalar("regularization_loss", regularization_loss)

        grad = optimizer.compute_gradients(loss, var_list=variables_to_train)
        grad_updates = optimizer.apply_gradients(grad, global_step=global_step)
        update_ops.append(grad_updates)
        # update_op = tf.group(*update_ops)

        with tf.control_dependencies(update_ops):
            total_loss = tf.add_n([loss, regularization_loss])
        tf.summary.scalar("total_loss", total_loss)

        # =================================================================== #
        # Kicks off the training.
        # =================================================================== #
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.8)
        config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
        saver = tf.train.Saver(max_to_keep=5,
                               keep_checkpoint_every_n_hours=1.0,
                               write_version=2,
                               pad_step_number=False)

        if True:
            print('start...')
            model_path = './logs'
            with tf.Session(config=config) as sess:
                summary = tf.summary.merge_all()
                coord = tf.train.Coordinator()
                threads = tf.train.start_queue_runners(sess=sess, coord=coord)
                writer = tf.summary.FileWriter(model_path, sess.graph)

                init_op = tf.group(tf.global_variables_initializer(),
                                   tf.local_variables_initializer())
                init_op.run()
                for step in range(max_steps):
                    start_time = time.time()
                    loss_value = sess.run(total_loss)
                    # loss_value, summary_str = sess.run([train_tensor, summary_op])
                    # writer.add_summary(summary_str, step)
                    duration = time.time() - start_time
                    if step % 10 == 0:  # placeholder interval; the original value is unclear
                        summary_str = sess.run(summary)
                        writer.add_summary(summary_str, step)

                        examples_per_sec = batch_size / duration
                        sec_per_batch = float(duration)
                        format_str = "[*] step %d, loss=%.2f (%.1f examples/sec; %.3f sec/batch)"
                        print(format_str % (step, loss_value,
                                            examples_per_sec, sec_per_batch))
                    # if step % ... == 0:
                    #     accuracy_step = test_cifar10(sess, training=False)
                    #     acc.append('{:.3f}'.format(accuracy_step))
                    #     print(acc)
                    if step % 500 == 0 and step != 0:
                        saver.save(sess, os.path.join(model_path, "ssd_tf.model"),
                                   global_step=step)

                coord.request_stop()
                coord.join(threads)

"Tensorflowxmxnet" SSD Project replication Experience

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.