The problem of medical image recognition
When CNNs are applied to medical images, the primary problem is the lack of training data, because CNN training data must carry category labels, which usually have to be annotated by experts by hand. Labeling millions of training images at the scale of ImageNet would be unthinkable.
The principle of transfer learning is that some features are universal across different training datasets. In a CNN, the first layer extracts local features; subsequent layers enlarge the receptive field through downsampling, so that deeper layers perceive larger regions and produce increasingly abstract features. The features in the first few layers are usually not tied to a particular classification task; they resemble Gabor filters, edges, and orientation-sensitive detectors. Because these features are generic, a network can be trained on one dataset and applied to a similar one. Of course, if the learned features are specific to one training dataset or recognition task, transferring them may not be effective.
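As a concrete illustration, here is a minimal PyTorch sketch of this idea (the framework, the choice of AlexNet, and the two-class head are illustrative assumptions, not the paper's setup): keep the generic early layers of an ImageNet-pretrained network fixed and retrain only the task-specific head.

```python
import torch.nn as nn
from torchvision import models

# Load a CNN pre-trained on ImageNet; its early layers already encode
# generic Gabor-/edge-like filters.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Freeze the early convolutional layers: these generic features are
# assumed to transfer, so they are not retrained.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the classifier head with one sized for the new task
# (2 classes here is a hypothetical choice, not from the paper).
model.classifier[6] = nn.Linear(4096, 2)
```

During fine-tuning, only the new head (and any layers left unfrozen) receives gradient updates, so the ImageNet-learned filters are carried over intact.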
For medical images, large-scale training data is hard to obtain, so can transfer learning from the ready-made ImageNet images help medical image recognition? ImageNet contains no medical images; its two-dimensional, color images cover objects such as birds, cats, dogs, and helicopters, which are very different from medical images (two- or three-dimensional, non-color). If the answer is yes, that would be a very exciting result.
The effect of transfer learning using ImageNet
Hoo-Chang Shin, Holger R. Roth, and others at the NIH have studied this question in a recent article (download link). Its full title is: Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning.
In addition to the question above, the article compares the performance of CifarNet (2009), AlexNet (2012), and GoogLeNet (2014) under varying amounts of training data and varying network complexity. (The paper includes a figure comparing the structures of the three networks.)
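To give a rough sense of scale, the two larger networks can be compared with a few lines of torchvision (a sketch under my own assumptions; CifarNet is not shipped with torchvision, and the counts noted in the comments are approximate):

```python
from torchvision import models

# Compare raw parameter counts of the two ImageNet-era architectures.
for name, net in [("AlexNet", models.alexnet()),
                  ("GoogLeNet", models.googlenet(init_weights=True))]:
    n_params = sum(p.numel() for p in net.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")

# AlexNet packs roughly 61M parameters into 8 learned layers, mostly in
# its fully connected head; GoogLeNet is 22 layers deep but uses far
# fewer parameters (~13M including its auxiliary classifiers).
```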
The medical images studied in this paper are CT images, used for the detection of thoracic and abdominal lymph nodes (three-dimensional) and the classification of lung disease (two-dimensional). How can such images be made compatible with ImageNet's two-dimensional color images? The paper uses two tricks:
First, for three-dimensional CT images, the three two-dimensional slices through a candidate point, one each in the coronal, sagittal, and axial planes, are stacked as the three RGB channels, making the input compatible with color images. Second, for two-dimensional CT images, three different CT gray windows are applied to obtain three images, which are stacked into one color image.
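A minimal NumPy sketch of the two conversions might look like this (the function names and the window ranges are my own illustrative assumptions, not the paper's exact settings):

```python
import numpy as np

def planes_to_rgb(volume, x, y, z):
    """Trick 1: stack the axial, coronal, and sagittal slices through a
    candidate point (x, y, z) as the three RGB channels. `volume` is
    assumed to be a cubic patch already cropped around the candidate,
    so the three slices share one shape."""
    axial    = volume[z, :, :]
    coronal  = volume[:, y, :]
    sagittal = volume[:, :, x]
    return np.stack([axial, coronal, sagittal], axis=-1)

def windows_to_rgb(slice_hu,
                   windows=((-1000, 400), (-160, 240), (-1400, 200))):
    """Trick 2: apply three different CT gray windows (hypothetical
    Hounsfield ranges) to one 2-D slice and stack the results as RGB."""
    channels = []
    for lo, hi in windows:
        ch = np.clip(slice_hu, lo, hi).astype(np.float32)
        ch = (ch - lo) / (hi - lo)   # rescale each window to [0, 1]
        channels.append(ch)
    return np.stack(channels, axis=-1)
```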
The experimental results show that without transfer learning (random initialization, RI), AlexNet, although simpler than GoogLeNet, actually performs better: GoogLeNet's far greater depth and complexity cause it to overfit the limited training data, so its generalization ability, and with it its classification accuracy, drops. With transfer learning (TL), GoogLeNet's performance improves considerably and surpasses AlexNet's.
Comparing random initialization and transfer learning over the course of training shows that transfer learning reduces the error on the test data and improves the classification accuracy.
Now let us look at what features transfer learning actually learns:
The figure shows the features learned in the first layer of the CNN. Without transfer learning, the filters learned from the CT images alone appear blurred, whereas the first-layer filters of the CNN trained with transfer learning include edge-related features. These were actually learned from ImageNet, yet they help classify and identify CT images.
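Such first-layer filters are easy to inspect; here is a short PyTorch/matplotlib sketch that plots them for an ImageNet-pretrained AlexNet (used as a stand-in for the networks in the paper):

```python
import matplotlib.pyplot as plt
from torchvision import models

# First conv layer of an ImageNet-pretrained AlexNet: 64 filters of
# shape 3x11x11, the edge/color-blob detectors that transfer carries over.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
filters = model.features[0].weight.data.clone()   # (64, 3, 11, 11)

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    f = (f - f.min()) / (f.max() - f.min())       # normalize for display
    ax.imshow(f.permute(1, 2, 0).numpy())         # CHW -> HWC
    ax.axis("off")
plt.show()
```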