YOLO Source Code Study (I)

Last Update:2018-07-28 Source: Internet

Author: User

I recently started learning YOLO and am using this blog to record my progress. Installation, background, and so on are not covered here; we start reading the source code directly:

1. First find the main function in darknet.c and look at how the arguments are handled. If the first argument is "yolo", the run_yolo function is executed:

int main(int argc, char **argv)
{
    //test_resize("data/bad.jpg");
    //test_box();
    //test_convolutional_layer();
    if (argc < 2) {
        fprintf(stderr, "usage: %s <function>\n", argv[0]);
        return 0;
    }
    /* GPU selection: -i <index> picks a device, -nogpu disables it */
    gpu_index = find_int_arg(argc, argv, "-i", 0);
    if (find_arg(argc, argv, "-nogpu")) gpu_index = -1;
#ifndef GPU
    gpu_index = -1;
#else
    if (gpu_index >= 0) cuda_set_device(gpu_index);
#endif
    /* argv[1] selects the sub-command */
    if (0 == strcmp(argv[1], "average")) {
        average(argc, argv);
    } else if (0 == strcmp(argv[1], "yolo")) {
        run_yolo(argc, argv);
    } else if (0 == strcmp(argv[1], "voxel")) {
        run_voxel(argc, argv);
    } else if (0 == strcmp(argv[1], "super")) {
        run_super(argc, argv);
    } else if (0 == strcmp(argv[1], "detector")) {
        run_detector(argc, argv);
    } else if (0 == strcmp(argv[1], "detect")) {
        float thresh = find_float_arg(argc, argv, "-thresh", .24);
        char *filename = (argc > 4) ? argv[4] : 0;
        test_detector("cfg/coco.data", argv[2], argv[3], filename, thresh, .5);
    } else if (0 == strcmp(argv[1], "cifar")) {
        run_cifar(argc, argv);
    }
    /* ... the remaining sub-commands are omitted in this excerpt ... */
    return 0;
}
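
The dispatch pattern used in main can be tried in isolation. The following is only a minimal sketch, not darknet's code: sketch_find_int_arg and sketch_dispatch are hypothetical names, the integer codes returned for each sub-command are invented for illustration, and darknet's own find_int_arg may behave differently in detail.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the integer that follows `flag` in argv,
 * or `def` when the flag is absent. This is an assumption about the
 * general idea, not a copy of darknet's find_int_arg. */
int sketch_find_int_arg(int argc, char **argv, const char *flag, int def)
{
    int i;
    for (i = 1; i < argc - 1; ++i) {
        if (0 == strcmp(argv[i], flag)) {
            return atoi(argv[i + 1]);
        }
    }
    return def;
}

/* Hypothetical dispatcher: map the sub-command in argv[1] to a code.
 * The real main() calls run_yolo(), run_detector(), ... instead. */
int sketch_dispatch(int argc, char **argv)
{
    if (argc < 2) return -1;                 /* no sub-command given */
    if (0 == strcmp(argv[1], "yolo"))     return 1;
    if (0 == strcmp(argv[1], "detector")) return 2;
    if (0 == strcmp(argv[1], "detect"))   return 3;
    return 0;                                /* unknown sub-command */
}
```

With the argument vector {"darknet", "yolo", "test", "-i", "2"}, sketch_dispatch selects the "yolo" branch and sketch_find_int_arg finds 2 for "-i".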

2. Go into run_yolo, which calls different functions depending on the second argument. Start with test, and look at the test_yolo function:

void test_yolo(char *cfgfile, char *weightfile, char *filename, float thresh)
{
    image **alphabet = load_alphabet();
    network net = parse_network_cfg(cfgfile);   /* build the network from the cfg file */
    if (weightfile) {
        load_weights(&net, weightfile);
    }
    detection_layer l = net.layers[net.n-1];    /* the last layer is the detection layer */
    set_batch_network(&net, 1);
    srand(2222222);
    clock_t time;
    char buff[256];
    char *input = buff;
    int j;
    float nms = .4;
    /* one box and one row of class probabilities per prediction */
    box *boxes = calloc(l.side*l.side*l.n, sizeof(box));
    float **probs = calloc(l.side*l.side*l.n, sizeof(float *));
    /* sizeof(float *) is as quoted; sizeof(float) would be enough here */
    for (j = 0; j < l.side*l.side*l.n; ++j) probs[j] = calloc(l.classes, sizeof(float *));
    while (1) {
        if (filename) {
            strncpy(input, filename, 256);
        } else {
            printf("Enter Image Path: ");
            fflush(stdout);
            input = fgets(input, 256, stdin);
            if (!input) return;
            strtok(input, "\n");                /* cut off the trailing newline */
        }
        image im = load_image_color(input, 0, 0);
        image sized = resize_image(im, net.w, net.h);
        float *X = sized.data;
        time = clock();
        network_predict(net, X);
        printf("%s: Predicted in %f seconds.\n", input, sec(clock()-time));
        get_detection_boxes(l, 1, 1, thresh, probs, boxes, 0);
        if (nms) do_nms_sort(boxes, probs, l.side*l.side*l.n, l.classes, nms);
        draw_detections(im, l.side*l.side*l.n, thresh, boxes, probs, voc_names, alphabet, 20);
        save_image(im, "predictions");
        show_image(im, "predictions");

        free_image(im);
        free_image(sized);
#ifdef OPENCV
        cvWaitKey(0);
        cvDestroyAllWindows();
#endif
        if (filename) break;
    }
}
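
The two calloc calls in test_yolo reserve one box and one row of class probabilities for each of the side*side*n predictions. Here is a minimal sketch of that layout. The names sketch_box, sketch_total_boxes, sketch_alloc_probs and sketch_free_probs are hypothetical, and the values used in the usage note (side 7, n 2, 20 classes) are assumptions chosen for illustration, not values taken from the code above.

```c
#include <stdlib.h>

/* Hypothetical stand-in for darknet's box type. */
typedef struct {
    float x, y, w, h;
} sketch_box;

/* Number of candidate boxes: n boxes for each cell of a side x side grid. */
int sketch_total_boxes(int side, int n)
{
    return side * side * n;
}

/* Allocate a zeroed table with `total` rows of `classes` probabilities.
 * Returns NULL if any allocation fails. */
float **sketch_alloc_probs(int total, int classes)
{
    int j;
    float **probs = calloc(total, sizeof(float *));
    if (!probs) return NULL;
    for (j = 0; j < total; ++j) {
        probs[j] = calloc(classes, sizeof(float));
        if (!probs[j]) {
            while (j > 0) free(probs[--j]);  /* undo the rows already made */
            free(probs);
            return NULL;
        }
    }
    return probs;
}

/* Release a table created by sketch_alloc_probs. */
void sketch_free_probs(float **probs, int total)
{
    int j;
    if (!probs) return;
    for (j = 0; j < total; ++j) free(probs[j]);
    free(probs);
}
```

Under the assumed values, sketch_total_boxes(7, 2) gives 98 boxes, so the table has 98 rows of 20 floats each. Unlike the quoted test_yolo, the sketch also frees the table when it is no longer needed.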

The first line calls load_alphabet, a function that loads images; set it aside for now. The second function, parse_network_cfg, judging by its name, builds a network from a cfg file and returns a network variable. So what is the network struct? Look at its declaration:

typedef struct network{
    float *workspace;
    int n;                  // number of layers
    int batch;              // samples per batch; used together with subdivisions
    int *seen;              // number of samples processed so far
    float epoch;
    int subdivisions;
    float momentum;
    float decay;
    layer *layers;          // the layers themselves
    int outputs;
    float *output;
    learning_rate_policy policy;    // learning-rate policy

    float learning_rate;    // learning rate
    float gamma;
    float scale;
    float power;
    int time_steps;
    int step;
    int max_batches;
    float *scales;
    int   *steps;
    int num_steps;
    int burn_in;

    int adam;
    float B1;
    float B2;
    float eps;

    int inputs;
    int h, w, c;
    int max_crop;
    int min_crop;
    float angle;
    float aspect;
    float exposure;
    float saturation;
    float hue;

    int gpu_index;
    tree *hierarchy;
    /* ... the remaining fields are omitted in this excerpt ... */
} network;

(I do not yet fully understand the meaning of many of these variables; I will come back to them later.)
Next, go into the parse_network_cfg function in parser.c:

The first function it calls is read_cfg. First, let's look at the contents of a cfg file:

The cfg file is organized in sections: the first section is net, and each section header is followed by rows of parameters. In the code, read_cfg first creates a list variable (a basic linked-list structure) and then declares a section:

typedef struct{
    char *type;
    list *options;
}section;

A section consists of a string and a linked list, and corresponds to the data of one section of the cfg file. The data is read line by line, and the strip function removes whitespace such as spaces and line breaks from each line that is read. In other words, the cfg contents are read into a list. Each element of that list is a section, that is, a type string plus another list whose elements are kvp structures (the name and value of a network setting, plus a used variable that records whether it has been used). Each section of the cfg describes one layer of the network and holds several attributes (kvp). Note that the first section of the cfg file must be the net section; this is checked by is_network().
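
The reading process described above can be sketched as follows. This is a simplified illustration under stated assumptions, not darknet's read_cfg: it parses text held in memory rather than a file, uses fixed-size arrays instead of linked lists, and all sketch_ names and size limits are hypothetical. Skipping lines that start with '#' or ';' as comments, and accepting "[network]" as well as "[net]", are also assumptions of this sketch.

```c
#include <string.h>

#define SKETCH_MAX_SECTIONS 16
#define SKETCH_MAX_OPTIONS  16
#define SKETCH_MAX_TEXT     64

/* Hypothetical simplified types: arrays instead of darknet's lists. */
typedef struct {
    char key[SKETCH_MAX_TEXT];
    char val[SKETCH_MAX_TEXT];
    int used;                        /* set once the option has been read */
} sketch_kvp;

typedef struct {
    char type[SKETCH_MAX_TEXT];      /* section header, e.g. "[net]" */
    sketch_kvp options[SKETCH_MAX_OPTIONS];
    int n_options;
} sketch_section;

/* Remove spaces, tabs, and line breaks from s in place. */
void sketch_strip(char *s)
{
    char *dst = s;
    for (; *s; ++s) {
        if (*s != ' ' && *s != '\t' && *s != '\n' && *s != '\r') *dst++ = *s;
    }
    *dst = '\0';
}

/* Copy src into a SKETCH_MAX_TEXT buffer, always terminating it. */
static void sketch_copy(char *dst, const char *src)
{
    strncpy(dst, src, SKETCH_MAX_TEXT - 1);
    dst[SKETCH_MAX_TEXT - 1] = '\0';
}

/* Parse cfg text into sections and return how many were found.
 * Blank lines, comment lines, lines before the first header, and
 * lines without '=' are skipped. */
int sketch_read_cfg(const char *text, sketch_section *out, int max_sections)
{
    int count = 0;
    char line[2 * SKETCH_MAX_TEXT];
    while (*text) {
        size_t len = strcspn(text, "\n");
        size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
        memcpy(line, text, copy);
        line[copy] = '\0';
        text += len;
        if (*text == '\n') ++text;
        sketch_strip(line);
        if (line[0] == '\0' || line[0] == '#' || line[0] == ';') continue;
        if (line[0] == '[') {                    /* a new section starts */
            if (count == max_sections) break;
            sketch_copy(out[count].type, line);
            out[count].n_options = 0;
            ++count;
        } else if (count > 0) {                  /* key=value in current section */
            sketch_section *s = &out[count - 1];
            sketch_kvp *o;
            char *eq = strchr(line, '=');
            if (!eq || s->n_options == SKETCH_MAX_OPTIONS) continue;
            *eq = '\0';
            o = &s->options[s->n_options++];
            sketch_copy(o->key, line);
            sketch_copy(o->val, eq + 1);
            o->used = 0;
        }
    }
    return count;
}

/* The first section must be the net section. */
int sketch_is_network(const sketch_section *s)
{
    return strcmp(s->type, "[net]") == 0 || strcmp(s->type, "[network]") == 0;
}
```

Given the text "[net]\nbatch = 64\nsubdivisions=8\n[convolutional]\nfilters=16\n" (sample values invented here), the sketch produces two sections, the first of which passes sketch_is_network and holds the pairs batch=64 and subdivisions=8.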
The network is then configured from the option list of the net section (the top-level properties). Note that the batch value stored in the network equals batch / subdivisions. The net section appears to declare the attributes that are used throughout the whole network structure.
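
The batch arithmetic can be written out as a one-line helper. This is a sketch only: sketch_effective_batch is a hypothetical name, the guard against a non-positive subdivisions value is an addition of this sketch, and the use of integer division is an assumption based on both fields being int.

```c
/* Hypothetical helper: batch size kept in the network when the cfg
 * specifies `batch` and `subdivisions`. Integer division is assumed,
 * so any remainder is discarded. */
int sketch_effective_batch(int batch, int subdivisions)
{
    if (subdivisions <= 0) return batch;   /* guard added for this sketch */
    return batch / subdivisions;
}
```

For example, with batch=64 and subdivisions=8 in the cfg (illustrative values), the network would work on 8 samples at a time.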
3. After that, the parameters of each kind of layer are set. For a convolutional layer, the parse_convolutional() function is called.
Next comes the structure of the network as a whole. YOLO uses a convolutional neural network, so understanding the code further (how a convolutional layer is set up from its parameters) requires some knowledge of convolutional neural network models.
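
As one small piece of that background, the output size of a convolutional layer along one axis follows from its parameters by the standard formula (in + 2*pad - size) / stride + 1. The helper below is a sketch with a hypothetical name, sketch_conv_out; it shows the general formula and is not taken from parse_convolutional.

```c
/* Output width or height of a convolution along one axis, for input
 * length `in`, kernel `size`, `stride`, and `pad` pixels of zero
 * padding on each side. Integer division rounds down. */
int sketch_conv_out(int in, int size, int stride, int pad)
{
    return (in + 2 * pad - size) / stride + 1;
}
```

With illustrative values, a 448-pixel input, a 7x7 kernel, stride 2 and padding 3 give (448 + 6 - 7) / 2 + 1 = 224, while a 3x3 kernel with stride 1 and padding 1 leaves the size unchanged.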

This article is an English version of an article originally published in Chinese on aliyun.com.
