A Survey of the Class Imbalance Problem in Convolutional Neural Networks

Taking two typical types of imbalance as examples, this paper systematically studies and compares various methods for addressing the class imbalance problem in CNNs. The authors run experiments on three common datasets (MNIST, CIFAR-10, and ImageNet) and obtain comprehensive results that offer practical reference value and guidance.


Paper link: https://arxiv.org/abs/1710.05381


Abstract: In this paper we systematically study the effect of class imbalance on the classification performance of convolutional neural networks and compare several methods commonly used to address it. Class imbalance is a pervasive problem; although it has been studied extensively in classical machine learning, few systematic studies exist in the field of deep learning. In our study, we examine the effect of class imbalance using three benchmark datasets of increasing complexity (MNIST, CIFAR-10, and ImageNet) and compare four common families of solutions: oversampling, undersampling, two-phase training, and thresholding, which compensates for the prior class probabilities. Because overall accuracy is a misleading measure on imbalanced data, our main evaluation metric is the area under the ROC curve (ROC AUC). From our experiments we draw the following conclusions: (i) class imbalance harms classification performance; (ii) among the methods for addressing it, the dominant one, effective in almost all analyzed scenarios, is oversampling; (iii) oversampling should be applied when the imbalance must be eliminated entirely, while undersampling may perform better when the imbalance only needs to be reduced to some degree; (iv) unlike in some traditional machine learning models, oversampling does not necessarily cause a convolutional neural network to overfit; (v) when the quantity of interest is the total number of correctly classified examples, thresholding should be used to compensate for the prior class probabilities.
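To make the paper's main metric concrete, here is a minimal Python sketch of multi-class ROC AUC using scikit-learn; the classifier scores and labels below are made up purely for illustration and are not from the paper.

```python
# Illustrative sketch of multi-class ROC AUC (the paper's main metric).
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical probabilities from a 3-class classifier on 6 test samples.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_score = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
])

# One-vs-rest ROC AUC averaged over classes; unlike plain accuracy,
# it is not dominated by the majority classes of the test set.
auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(f"multi-class ROC AUC: {auc:.3f}")
```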

1 Introduction

Convolutional neural networks (CNNs) have attracted increasing attention in many machine learning applications and have recently delivered state-of-the-art results in computer vision, including object detection, image classification, and image segmentation. CNNs are also widely used in natural language processing and speech recognition, where they either replace traditional techniques or help improve existing machine learning models [1]. The biggest difference between CNNs and traditional machine learning techniques is that the feature extractor and the classifier are learned jointly within a single model, which allows the network to learn hierarchical representations [2]. A standard CNN consists of several modules, each containing a convolutional layer, an activation layer, and a max-pooling layer, followed by one or more fully connected layers [3,4,5]. CNNs are inherently complex, so training and testing them requires substantial computation, which is usually handled with modern GPUs.
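As an illustration of this standard structure, here is a minimal PyTorch sketch (the framework choice is ours, not the paper's) of a small LeNet-style network: two convolution/activation/max-pooling modules followed by fully connected layers.

```python
# Minimal sketch of the standard CNN structure described above.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Two conv / activation / max-pool modules (the feature extractor).
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Fully connected layers (the classifier).
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, 120), nn.ReLU(),
            nn.Linear(120, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# A 28x28 grayscale input (MNIST-sized) shrinks 28 -> 24 -> 12 -> 8 -> 4
# through the two modules, hence the 16 * 4 * 4 flattened size.
logits = SmallCNN()(torch.randn(1, 1, 28, 28))
```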


In real-world applications of deep learning, a common problem is that some classes in the training set have many more samples than others. This difference is called class imbalance. Examples abound in areas such as computer vision [6,7,8,9,10], medical diagnosis [11,12], fraud detection [13], and others [14,15,16], where the problem matters greatly: an important class, such as cancer patients, may occur 1000 times less frequently than another class, such as healthy patients. It is well established that class imbalance can seriously hurt the performance of traditional classifiers [17], including multilayer perceptrons [18]. It affects both the convergence of the model during training and its generalization to the test set. Although the problem affects deep learning as well, it has not been studied there systematically.


Methods for coping with the imbalance problem have been studied for traditional machine learning models [19,17,20,18]. The most direct and most common approach is sampling, which operates on the data itself (rather than the model) to improve its balance. One of the most widely used methods, and one proven to be robust, is oversampling [21]. The other is undersampling (downsampling); its simplest version just randomly removes samples from the majority classes [17] and is called random majority undersampling. Class imbalance can also be handled at the classifier level. In that case, the learning algorithm itself is modified, for example by introducing different weights for misclassified samples of different classes [22], or by explicitly adjusting the prior class probabilities [23].
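A minimal NumPy sketch of these two data-level methods, with made-up labels and counts:

```python
# Random minority oversampling and random majority undersampling,
# operating on sample indices of an imbalanced label array.
import numpy as np

rng = np.random.default_rng(0)
labels = np.array([0] * 1000 + [1] * 50)          # imbalance ratio of 20
indices_by_class = {c: np.where(labels == c)[0] for c in np.unique(labels)}

# Oversampling: duplicate minority examples (sampling with replacement)
# until every class matches the largest class.
target = max(len(idx) for idx in indices_by_class.values())
oversampled = np.concatenate([
    rng.choice(idx, size=target, replace=True) if len(idx) < target else idx
    for idx in indices_by_class.values()
])

# Undersampling: randomly discard majority examples until every class
# matches the smallest class.
target = min(len(idx) for idx in indices_by_class.values())
undersampled = np.concatenate([
    rng.choice(idx, size=target, replace=False)
    for idx in indices_by_class.values()
])
```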


Previous studies have shown some results on cost-sensitive learning in deep neural networks [24,25,26], and new loss functions have been developed for training neural networks under imbalance [27]. Recently, a two-phase training procedure was proposed for convolutional neural networks: first train the network on balanced data, then fine-tune the output layer on the original data [28]. Although there has been no systematic analysis of imbalance in deep learning and no comparison of the methods available to handle it, some of the methods researchers adopt probably do address the problem, guided by intuition, intermediate experimental results, and the systematic results available for traditional machine learning. According to our review of the literature, the most widely used method in deep learning is oversampling.
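A hedged PyTorch sketch of that two-phase schedule follows. The tiny model and the synthetic data are stand-ins of our own, not the setup from [28]; the point is only the training order: all layers on balanced data first, then only the output layer on the original imbalanced data.

```python
# Sketch of two-phase training: phase 1 trains the whole network on
# balanced data, phase 2 fine-tunes only the output layer on the
# original, imbalanced data.
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def make_loader(n_per_class):
    """Synthetic 2-class data with a crude class-dependent shift."""
    xs = [torch.randn(n, 784) + c for c, n in enumerate(n_per_class)]
    ys = [torch.full((n,), c, dtype=torch.long)
          for c, n in enumerate(n_per_class)]
    return DataLoader(TensorDataset(torch.cat(xs), torch.cat(ys)),
                      batch_size=32, shuffle=True)

balanced_loader = make_loader([200, 200])     # e.g. after oversampling
imbalanced_loader = make_loader([200, 20])    # the original distribution

features = nn.Sequential(nn.Linear(784, 128), nn.ReLU())
output_layer = nn.Linear(128, 2)
model = nn.Sequential(features, output_layer)

def run_phase(params, loader, epochs=1):
    opt = optim.SGD(params, lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

run_phase(model.parameters(), balanced_loader)    # phase 1: all layers
for p in features.parameters():
    p.requires_grad = False                       # freeze the features
run_phase(output_layer.parameters(), imbalanced_loader)  # phase 2
```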


The rest of this article is organized as follows: Section 2 summarizes methods for addressing the imbalance problem; Section 3 describes our experimental setup, with details of the compared methods, datasets, and models used; Section 4 presents the experimental results and the comparison of methods; finally, Section 5 summarizes the work of the paper.

2 Methods for addressing class imbalance

Methods for addressing class imbalance fall into two broad groups [29]. The first is data-level methods, which process the training data to change its class distribution; the goal is to alter the dataset so that a standard training algorithm works well on it. The second group operates at the classifier (algorithm) level: these methods leave the training data unchanged and adjust only the training (or inference) algorithm. Combinations of the two types can also be used. In this section we outline the methods from both groups that are commonly used, both with classical machine learning models and with deep neural networks.


Figure 1: Example class distributions of imbalanced datasets with the corresponding parameter values. (a, b): step imbalance, parameters ρ and μ; (c): linear imbalance, parameter ρ.
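To make the parameterization in Figure 1 concrete, the sketch below generates per-class sample counts under our reading of the paper's definitions: ρ is the ratio between the largest and smallest class sizes, and μ (step imbalance only) is the fraction of classes that are minority. The function names and example values are ours.

```python
# Per-class sample counts for the two imbalance types of Figure 1.
import numpy as np

def step_imbalance(n_classes, n_max, rho, mu):
    """mu * n_classes minority classes of size n_max / rho; the rest n_max."""
    n_minority = int(round(mu * n_classes))
    return np.array([n_max // rho] * n_minority
                    + [n_max] * (n_classes - n_minority))

def linear_imbalance(n_classes, n_max, rho):
    """Class sizes interpolate linearly between n_max / rho and n_max."""
    return np.linspace(n_max / rho, n_max, n_classes).round().astype(int)

print(step_imbalance(10, 5000, rho=10, mu=0.5))   # 5 classes of 500, 5 of 5000
print(linear_imbalance(10, 5000, rho=10))          # 500, 1000, ..., 5000
```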


Table 1: Summary of the datasets used in the experiments; the number of images per class refers to the perfectly balanced version of each dataset. The ImageNet image dimensions are after rescaling.

3 Experiments

3.2 Methods compared in this paper for addressing imbalance

We experimented with seven methods for addressing class imbalance in convolutional neural networks, covering most of the methods used in deep learning (method 5 is sketched in code after this list):

1. Random minority oversampling;
2. Random majority undersampling;
3. Two-phase training with pre-training on a randomly oversampled dataset;
4. Two-phase training with pre-training on a randomly undersampled dataset;
5. Thresholding with prior class probabilities;
6. Oversampling combined with thresholding;
7. Undersampling combined with thresholding.
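A minimal NumPy sketch of method 5, thresholding with prior class probabilities: each class score is divided by the class frequency estimated from the training set before taking the argmax. The counts and probabilities below are illustrative, not from the paper.

```python
# Thresholding: compensate for the prior class probabilities at decision time.
import numpy as np

train_counts = np.array([900, 90, 10])       # imbalanced training set
priors = train_counts / train_counts.sum()   # estimated p(class)

probs = np.array([0.80, 0.15, 0.05])         # softmax output for one sample
print(probs.argmax())                        # plain decision: class 0

corrected = probs / priors                   # rescale each score by 1 / prior
print(corrected.argmax())                    # thresholded decision: class 2
```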

3.3 Datasets and models

Our study used three benchmark datasets in total: MNIST [52], CIFAR-10 [53], and the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012 dataset [54]. All datasets are split into labeled training and test sets. For each dataset we selected a different model, with a set of hyperparameters that has been used in the related literature and performs well. The complexity of the model grows with the complexity of the dataset. This lets us draw conclusions on simple tasks first and then verify how those conclusions extend to more complex tasks.


Table 2: Architecture of the LeNet-5 convolutional neural network used in the MNIST experiments.


Table 3: The All-CNN network architecture used in the CIFAR-10 experiments.


Figure 2: A single residual module of the ResNet architecture used in the ILSVRC-2012 experiments.

4 Results

4.1 Effect of class imbalance on classification performance and comparison of the methods


Figure 3: Multi-class ROC AUC comparison of the methods on (a-c) MNIST and (d-f) CIFAR-10, for step imbalance with a fixed number of minority classes.


Figure 4: Multi-class ROC AUC comparison of the methods on (a-c) MNIST and (d-f) CIFAR-10, for step imbalance with a fixed fraction of minority classes.


Figure 5: Multi-class ROC AUC comparison of the methods under linear imbalance.

4.2 Results on the ImageNet dataset


Table 4: Multi-class ROC AUC comparison of the methods on ImageNet.

4.4 Using thresholding to improve accuracy


Figure 6: Accuracy comparison of the methods on (a-c) MNIST and (d-f) CIFAR-10, for step imbalance with a fixed number of minority classes.

4.5 Reducing the imbalance ratio with undersampling and oversampling


Figure 7: ROC AUC comparison on the MNIST dataset with an original imbalance ratio of 1000 (relative to the smallest class), after the imbalance is reduced to varying degrees by oversampling and undersampling.

4.6 Generalization of the oversampling method


Figure 8: Convergence of the model with the baseline versus with oversampling, on the CIFAR-10 dataset under step imbalance with 5 minority classes and an imbalance ratio of 50.

5 Conclusions

In this paper we studied the effect of class imbalance on the classification performance of convolutional neural networks and compared the effectiveness of different methods for addressing the problem. We defined and parameterized two distinct types of imbalance, step imbalance and linear imbalance, and then created artificially imbalanced versions of the MNIST, CIFAR-10, and ImageNet (ILSVRC-2012) datasets. We compared the common sampling methods, a basic thresholding method, and the two-phase training method.


Our experimental conclusions regarding class imbalance itself are as follows: class imbalance can harm classification performance; its effect grows with the scale of the task; and the effect cannot be explained simply by an insufficient number of training samples, since it also depends on how the samples are distributed across classes.


Regarding the choice of method when training a convolutional neural network on a class-imbalanced dataset, we conclude the following. When ROC AUC is the evaluation metric, the method that stands out in most cases is oversampling. For extreme imbalance ratios and a large fraction of minority classes, undersampling works better than oversampling. To achieve the best accuracy, thresholding should be used to compensate for the prior class probabilities; the most desirable combination is thresholding with oversampling, whereas thresholding should not be combined with undersampling. Oversampling should be applied when the imbalance needs to be eliminated entirely, while undersampling is more suitable when the imbalance only needs to be reduced to some extent. Unlike in some classical machine learning methods, oversampling does not necessarily lead to overfitting in convolutional neural networks.

