Abstract: The morphology of glands is used routinely by pathologists to assess the malignancy degree of adenocarcinomas, and accurate segmentation of glands from histology images is a crucial step toward reliable morphological statistics for quantitative diagnosis. In this paper, we propose an efficient deep contour-aware network (DCAN) to solve this challenging problem under a unified multi-task learning framework. In the proposed network, multi-level contextual features from the hierarchical architecture are explored with auxiliary supervision for accurate gland segmentation. When multi-task regularization is incorporated during training, the discriminative capability of the intermediate features can be further improved. Moreover, our network not only outputs accurate probability maps of glands, but also depicts clear contours simultaneously for separating clustered objects, which further boosts gland segmentation performance. This unified framework is efficient when applied to large-scale histopathological data and requires no additional steps to generate contours. Our method won the 2015 MICCAI Gland Segmentation Challenge, surpassing all other methods from the 13 competing teams by a significant margin.
1 Introduction
The most closely related work is the U-shaped deep convolutional network (U-Net) designed for biomedical image segmentation, which has won several recent grand challenges [30]. In this paper, we propose a novel deep contour-aware network to solve this challenging problem. Our approach addresses three key issues of gland segmentation. First, it leverages multi-level contextual features to perform effective gland segmentation in an end-to-end manner. Taking full advantage of a fully convolutional architecture, it accepts a whole image as input and directly outputs the probability map in a single forward pass, making it highly efficient for large-scale histopathological image analysis. Second, because our approach makes no prior assumptions about the structure of glands, it generalizes easily to biopsy samples with different histopathological grades, including both benign and malignant cases. Third, instead of treating segmentation as an isolated task, our approach explores complementary information, i.e., gland objects and contours, under a multi-task learning framework. It can therefore segment the glands and separate clustered objects into individual ones simultaneously, which is especially valuable in benign cases containing touching glands. Extensive experiments on the 2015 MICCAI Gland Segmentation Challenge benchmark dataset confirm the effectiveness of our approach, which achieves better performance than other state-of-the-art methods.
2 Method
In this section, we describe in detail the proposed deep contour-aware network for accurate gland segmentation. We first introduce the fully convolutional network (FCN) for end-to-end training. We then propose to harness multi-level contextual features with auxiliary supervision to generate good likelihood maps of glands. Next, by fusing the complementary information of objects and contours, the deep contour-aware network is developed from the FCN to perform effective gland segmentation. Finally, to mitigate the challenge of insufficient training data, we employ transfer learning, leveraging knowledge learned from other data domains to further improve performance in our domain.
2.1 FCN with multi-level contextual features
Fully convolutional networks achieve state-of-the-art performance on image segmentation tasks [7,27]. This success is largely attributable to their capability of producing dense, per-pixel classifications. The whole network can be trained end-to-end (image-to-image), taking an image as input and directly outputting a probability map. The architecture basically contains two modules: a downsampling path and an upsampling path. The downsampling path consists of convolutional and max-pooling layers, as widely used in convolutional neural networks for image classification [8,25]. The upsampling path contains convolutional and deconvolutional layers (backwards strided convolution [27]), which upsample the feature maps and output the score masks. The motivation behind this design is that the downsampling path extracts high-level abstract information, while the upsampling path predicts pixel-wise score maps.
The classification scores of the FCN are inferred from the intensity information within a given receptive field. However, a network with a single receptive-field size cannot handle the large variation of gland shapes properly. For example, as illustrated in Figure 1, a relatively small receptive field (e.g., 150x150) is suitable for normal glands in benign cases, whereas malignant cases usually demand a larger receptive field, because the gland morphology of adenocarcinomas is degenerated and elongated; the larger context can remove ambiguity, suppress the interior tubular structure, and improve recognition performance. Therefore, building on the FCN, we go one step further by exploiting multi-level contextual features, i.e., intensity appearance within receptive fields of different sizes, which encode different levels of contextual information. A schematic illustration of the FCN with multi-level contextual features is shown in Figure 2. Specifically, the architecture consists of a series of convolutional layers, 5 max-pooling layers for downsampling, and 3 deconvolutional layers for upsampling. As the network goes deeper, the receptive field grows larger. Upsampling layers are therefore connected at different depths, designed to account for different receptive-field sizes: each one upsamples its feature maps and makes predictions based on the contextual cues within its receptive field. These predictions are then fused by a summation operation, and the final segmentation result based on multi-level contextual features is generated after a softmax.
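To make the architecture concrete, the following is a minimal PyTorch sketch of such an FCN: a downsampling path with five max-pooling layers and three upsampling branches tapped at the deepest stages, whose score maps are fused by summation. This is a sketch under stated assumptions, not the authors' implementation: the channel widths are illustrative, and bilinear upsampling stands in for the paper's learned deconvolutions.

```python
# Minimal sketch of an FCN with multi-level contextual features (PyTorch).
# Assumptions: illustrative channel widths; bilinear upsampling instead of
# the learned deconvolutions used in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class MultiLevelFCN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # Downsampling path; a max-pooling precedes every stage after the
        # first, giving 5 pooling layers in total.
        self.stages = nn.ModuleList([
            conv_block(3, 64), conv_block(64, 128), conv_block(128, 256),
            conv_block(256, 512), conv_block(512, 512), conv_block(512, 512)])
        self.pool = nn.MaxPool2d(2)
        # Three upsampling branches tapped at the three deepest stages,
        # each seeing a different receptive-field size.
        self.heads = nn.ModuleList(
            [nn.Conv2d(512, num_classes, 1) for _ in range(3)])

    def forward(self, x):
        size = x.shape[2:]
        taps = []
        for i, stage in enumerate(self.stages):
            x = stage(x) if i == 0 else stage(self.pool(x))
            if i >= 3:                       # keep the three deepest stages
                taps.append(x)
        # Each branch predicts a score map from its own contextual level.
        scores = [F.interpolate(h(t), size=size, mode='bilinear',
                                align_corners=False)
                  for h, t in zip(self.heads, taps)]
        fused = torch.stack(scores).sum(dim=0)   # fuse by summation
        # Softmax over `fused` yields the final probability map; the raw
        # per-branch scores can serve as auxiliary classifier outputs.
        return fused, scores
```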
Directly training such a deep network may get trapped in poor local minima. Inspired by previous studies on training deep neural networks with deep supervision [26,39,5], weighted auxiliary classifiers C1-C3 are added to the network to further strengthen the training process, as shown in Figure 2. This alleviates the problem of vanishing gradients, since the auxiliary supervision encourages the backward flow of gradient signals. Finally, the FCN with multi-level contextual features is trained on an input image $I$ by minimizing the overall loss $\mathcal{L}$, composed of the auxiliary losses $\mathcal{L}_a(I;W)$ with corresponding discount weights $w_a$ and the data-error loss $\mathcal{L}_e(I;W)$ between the predicted results and the ground-truth labels. The overall loss function can be expressed as Equation 1:

$$\mathcal{L}(I;W) = \lambda\,\psi(W) + \sum_{a} w_a\,\mathcal{L}_a(I;W) + \mathcal{L}_e(I;W) \tag{1}$$

where the first term is the regularization term $\psi(W)$ on the network parameters, and the hyperparameter $\lambda$ balances it against the other terms.
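A minimal sketch of this loss, assuming the network above returns the fused score map together with the per-branch auxiliary score maps; the discount weights $w_a$ chosen here are illustrative assumptions, and the regularization term $\lambda\psi(W)$ is realized through the optimizer's weight decay rather than computed explicitly.

```python
# Sketch of the overall loss in Eq. (1). Assumptions: illustrative
# auxiliary weights w_a; lambda*psi(W) handled via weight_decay.
import torch
import torch.nn.functional as F

def total_loss(main_logits, aux_logits, target, aux_weights=(0.3, 0.3, 0.3)):
    # Data-error loss L_e between the fused prediction and the labels.
    loss = F.cross_entropy(main_logits, target)
    # Deeply supervised auxiliary losses: sum over w_a * L_a.
    for w_a, logits in zip(aux_weights, aux_logits):
        loss = loss + w_a * F.cross_entropy(logits, target)
    return loss

# Usage with the sketch above; weight_decay supplies lambda*psi(W):
# fused, scores = net(image)
# loss = total_loss(fused, scores, target)
# optimizer = torch.optim.SGD(net.parameters(), lr=1e-3,
#                             momentum=0.9, weight_decay=5e-4)
```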
2.2 Deep contour-aware network
By harnessing multi-level contextual features with auxiliary supervision, the network can produce good probability maps of glands. However, it remains difficult to separate touching glands relying on the gland-object likelihood alone, which is rooted in the loss of spatial information and the abstraction of features caused by downsampling. The boundary information formed by the epithelial cell nuclei provides a good complementary cue for this separation. We therefore propose a deep contour-aware network that segments the glands and separates clustered objects into individual ones.
An overview of the proposed deep contour-aware network is shown in Figure 3. Instead of treating gland segmentation as a single, independent problem, the network infers the results of gland objects and contours jointly under a multi-task learning framework, exploring their complementary information. Specifically, the feature maps are upsampled in two different branches (shown as green and blue arrows in the figure) to output the segmentation masks of gland objects and contours, respectively. In each branch, the mask is predicted by the FCN with multi-level contextual features, as described in Section 2.1. During training, the parameters of the downsampling path $W_s$ are shared between and jointly optimized for the two tasks, while the parameters of the upsampling layers of the two branches (denoted as $W_o$ and $W_c$) are updated independently to infer the probabilities of gland objects and contours, respectively. The hierarchical structure thus encodes the information of both segmented objects and contours in its feature representations. Note that the multi-task network is optimized jointly in an end-to-end manner. This joint multi-task learning has several advantages. First, it increases the discriminative capability of the intermediate feature representations and thereby improves the robustness of the segmentation. Second, in the application of gland segmentation, the multi-task learning framework provides complementary contour information for separating clustered objects, which can significantly improve the segmentation performance at the object level, especially in benign histology images where touching glands are common. This unified framework is also efficient when dealing with large-scale histopathological data: a single forward propagation yields the results of both gland objects and contours, rather than generating contours from low-level cues in an additional separation step as in [20,38].
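The two-branch design could be sketched as follows: a shared downsampling path (here a deliberately tiny stand-in for $W_s$) feeding two independent upsampling branches for gland objects ($W_o$) and contours ($W_c$). In the full model, each branch would itself use the multi-level features of Section 2.1; this compact version only illustrates the parameter sharing.

```python
# Compact sketch of the two-branch, multi-task design. Assumptions:
# a tiny stand-in encoder; single-level heads instead of the full
# multi-level branches of Section 2.1.
import torch.nn as nn
import torch.nn.functional as F

class DeepContourAwareNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # Shared downsampling path W_s, jointly optimized for both tasks.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2))
        self.object_head = nn.Conv2d(128, num_classes, 1)    # branch W_o
        self.contour_head = nn.Conv2d(128, num_classes, 1)   # branch W_c

    def forward(self, x):
        size = x.shape[2:]
        feats = self.encoder(x)          # shared features for both tasks
        up = lambda s: F.interpolate(s, size=size, mode='bilinear',
                                     align_corners=False)
        p_o = F.softmax(up(self.object_head(feats)), dim=1)   # gland object
        p_c = F.softmax(up(self.contour_head(feats)), dim=1)  # contour
        return p_o, p_c
```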
During training, the discount weights $w_a$ of the auxiliary classifiers decay towards marginal values as the number of iterations increases, so we simply drop these terms from the final loss. The network is then trained for per-pixel classification by minimizing the loss in Equation 2:

$$\mathcal{L}(I;W) = \lambda\,\psi(W) + \mathcal{L}_o(I; W_s, W_o) + \mathcal{L}_c(I; W_s, W_c) \tag{2}$$

The first term in Equation 2 is the L2 regularization term, the second is the loss for the gland objects, and the third is the loss for the gland contours. Finally, to produce the result map, the gland-object and contour predictions $p_o$ and $p_c$ are fused according to Equation 3:

$$m(x) = \begin{cases} 1, & \text{if } p_o(x) \ge t_o \text{ and } p_c(x) < t_c \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where $t_o$ and $t_c$ are thresholds on the object and contour probabilities. The final segmentation result is then obtained after post-processing, mainly a morphological closing to fill holes and the removal of small spurious clusters.
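A minimal sketch of this fusion rule and post-processing, using NumPy and SciPy; the threshold values $t_o$ and $t_c$, the structuring element, and the minimum object size are assumptions rather than values from the paper.

```python
# Sketch of the fusion rule in Eq. (3) plus post-processing. Assumptions:
# thresholds t_o, t_c, the 5x5 closing element, and min_size are illustrative.
import numpy as np
from scipy import ndimage

def fuse(p_object, p_contour, t_o=0.5, t_c=0.5, min_size=500):
    # m(x) = 1 where the object probability is high AND the contour
    # probability is low, so touching glands are split along contours.
    mask = (p_object >= t_o) & (p_contour < t_c)
    # Morphological closing fills small holes inside glands.
    mask = ndimage.binary_closing(mask, structure=np.ones((5, 5)))
    # Remove small spurious clusters.
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    for i, s in enumerate(sizes, start=1):
        if s < min_size:
            mask[labels == i] = False
    return mask
```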
2.3 Transfer learning through a rich hierarchy of features
The weights of the layers in the downsampling path are initialized from a pre-trained DeepLab model, while the remaining layers are randomly initialized from a Gaussian distribution. We then fine-tune the whole network end-to-end on our medical task with stochastic gradient descent.
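A hedged sketch of this initialization strategy: Gaussian-initialize everything first, then overwrite the downsampling-path layers whose names and shapes match a pre-trained state dict. Here `pretrained_state` is assumed to hold compatible weights (a DeepLab checkpoint in the paper's setting); the standard deviation is an illustrative assumption.

```python
# Sketch of the initialization strategy. Assumptions: `pretrained_state`
# holds compatible downsampling-path weights; std=0.01 is illustrative.
import torch
import torch.nn as nn

def init_network(net, pretrained_state):
    # Gaussian initialization for every conv/deconv layer first...
    for m in net.modules():
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
            nn.init.normal_(m.weight, mean=0.0, std=0.01)
            if m.bias is not None:
                nn.init.zeros_(m.bias)
    # ...then overwrite layers whose names and shapes match the
    # pre-trained model (the downsampling path).
    own_state = net.state_dict()
    with torch.no_grad():
        for name, param in pretrained_state.items():
            if name in own_state and own_state[name].shape == param.shape:
                own_state[name].copy_(param)

# The whole network is then fine-tuned end-to-end, e.g.:
# optimizer = torch.optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)
```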
3 Experiments and results
3.1 Data and preprocessing
The training dataset consists of 85 images (benign/malignant = 37/48) with ground-truth annotations provided by expert pathologists. The test data consist of two parts: Part A (60 images) for off-site evaluation and Part B (20 images) for on-site evaluation. To improve robustness and reduce overfitting, we enlarged the training dataset with data augmentation. The augmentation transformations include translation, rotation, and elastic deformation (e.g., pincushion and barrel distortions).
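As an illustration of the elastic deformation, the following sketch warps an image with a smoothed random displacement field; the strength `alpha` and smoothness `sigma` are illustrative assumptions, and in practice the same warp must be applied to the image and its label masks.

```python
# Sketch of elastic-deformation augmentation via a smoothed random
# displacement field. Assumptions: alpha and sigma values are illustrative.
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, alpha=300.0, sigma=20.0, rng=None):
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    # Random displacements, smoothed so the warp is locally coherent.
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    y, x = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    coords = np.array([y + dy, x + dx])
    if image.ndim == 2:
        return map_coordinates(image, coords, order=1, mode='reflect')
    # Apply the same warp to every channel of an RGB histology image.
    return np.stack([map_coordinates(image[..., c], coords, order=1,
                                     mode='reflect')
                     for c in range(image.shape[2])], axis=-1)
```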
3.2 Implementation details
The input to the network is a 480x480 region randomly cropped from the original image, and the output is the predicted masks of gland objects and contours. For the contour labels, we extract the boundaries of the connected components based on the pathologists' gland annotations and then dilate them with a structuring element of size 3.
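A sketch of how the contour labels could be derived with scikit-image, consistent with the description above; treating the annotation as an instance-labeled mask is an assumption, as is the choice of `find_boundaries` and a disk-shaped structuring element.

```python
# Sketch of contour-label generation. Assumptions: instance-labeled input
# mask; find_boundaries and a disk footprint stand in for the authors' code.
from skimage.segmentation import find_boundaries
from skimage.morphology import binary_dilation, disk

def contour_label(gland_annotation):
    # `gland_annotation` is assumed to be an instance-labeled mask (one
    # integer id per gland), so boundaries between touching glands are kept.
    boundaries = find_boundaries(gland_annotation, mode='inner')
    # Thicken the contour class with a structuring element of size 3.
    return binary_dilation(boundaries, disk(3))
```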
3.3 Results and evaluation
Evaluation Criteria for segmentation
Criteria for evaluating boundary shapes
Segmentation results