3D U-net Paper Analysis

The 3D U-Net was born mainly to deal with volumetric images. Its basic principle is essentially the same as U-Net's: 3D U-Net simply replaces the 2D convolution operations with 3D ones. In this post I mainly describe the overall principle and structure of the network as presented in the paper. In the original paper, to demonstrate that the framework is workable and to report final results, the authors set up two kinds of experiments: a semi-automated setup, in which a human annotates only some slices of a 3D volume with the help of the algorithm and the model then produces a dense segmentation, and a fully automated setup, in which a representative, sparsely annotated training set is assumed to exist and is fed directly into the model for end-to-end training. I will not describe that part in detail; interested readers can download and read the paper at the link below. This post focuses on the structure and characteristics of the 3D U-Net.

Paper Address: https://arxiv.org/abs/1606.06650

1. Introduction

Biomedical images are often volumetric, meaning that an entire image is made up of many slices. Processing such 3D data with a 2D model is not impossible, but it creates a problem: the biomedical image (both the training data and its labels) has to be fed into the model slice by slice, which is inefficient. Handling volumetric images this way is awkward, and the required data preprocessing is relatively tedious.

Therefore, the authors of the paper proposed the 3D U-Net model. It not only solves the efficiency problem, but for a given volume it also only requires that some of the slices be annotated (see the paper for details).

2. Model Structure (Network Architecture)

The 3D U-Net is based on the earlier (2D) U-Net. It likewise contains an encoder part and a decoder part: the encoder (analysis path) analyzes the whole image and extracts features, while the corresponding decoder (synthesis path) produces the segmented volume. The input size used in the paper is 132 × 132 × 116. The first half of the network (the analysis path) uses the following operations:

A. Each layer contains two 3 × 3 × 3 convolutions

B. Batch normalization (so that the network converges better)

C. ReLU

D. Downsampling: 2 × 2 × 2 max pooling with stride 2

The corresponding synthesis path performs the following operations:

A. Up-convolution: 2 × 2 × 2 with stride 2

B. Two ordinary 3 × 3 × 3 convolutions

C. Batch Normalization

D. ReLU

E. In addition, the output of the corresponding layer in the analysis path is fed into the decoder as part of its input (a skip connection), exactly as in the 2D U-Net, so that the high-resolution feature information preserved in the analysis path can be reused and the image can be synthesized better (see the code sketch below).
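To make the two lists above concrete, here is a minimal PyTorch sketch of one analysis-path stage and one synthesis-path stage with a skip connection. This is not the authors' original Caffe implementation; the class name, channel counts, and the use of padded convolutions (the paper uses unpadded convolutions, so its output is smaller than its input) are assumptions made for illustration.

```python
# Minimal PyTorch sketch of the building blocks described above. This is not
# the authors' Caffe implementation: channel counts, class names, and padded
# convolutions (the paper uses unpadded ones) are illustrative assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3x3 convolutions, each followed by batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch),
        nn.ReLU(inplace=True),
    )

class TinyUNet3D(nn.Module):
    """One analysis (encoder) stage and one synthesis (decoder) stage."""
    def __init__(self, in_ch=1, base_ch=32, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(in_ch, base_ch)
        self.pool = nn.MaxPool3d(kernel_size=2, stride=2)       # 2x2x2 max pooling, stride 2
        self.enc2 = conv_block(base_ch, base_ch * 2)
        self.up = nn.ConvTranspose3d(base_ch * 2, base_ch * 2,  # 2x2x2 up-convolution, stride 2
                                     kernel_size=2, stride=2)
        self.dec1 = conv_block(base_ch * 2 + base_ch, base_ch)  # takes the concatenated skip
        self.out = nn.Conv3d(base_ch, n_classes, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                # high-resolution features, kept for the skip connection
        e2 = self.enc2(self.pool(e1))    # downsampled, deeper features
        d1 = self.up(e2)                 # upsample back to e1's resolution
        d1 = torch.cat([e1, d1], dim=1)  # skip connection from the analysis path
        return self.out(self.dec1(d1))

# Example: a single-channel volume shaped (batch, channel, depth, height, width).
y = TinyUNet3D()(torch.randn(1, 1, 32, 64, 64))
print(y.shape)  # torch.Size([1, 2, 32, 64, 64])
```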

The overall network structure is shown in the paper's architecture figure; it is basically the same as the 2D U-Net, the only difference being that all 2D operations are replaced by their 3D counterparts. After this change, a volumetric image no longer needs to be fed in slice by slice for training; instead, the whole volume can be given to the model as a single input. (PS: when the image is too large, a random-crop technique is needed to cut it into fixed-size sub-volumes before feeding them to the model, but that is a topic for another article.) In addition, one of the highlights of the paper is that the 3D U-Net uses a weighted softmax loss function that sets the weight of unlabeled voxels to zero, so the network learns only from the labeled voxels and can therefore generalize from sparse annotations to the whole volume.
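As a rough illustration of this idea, here is a minimal sketch (in PyTorch, under the same caveats as above) of a per-voxel weighted cross-entropy in which unlabeled voxels get weight zero. The function name and the way the weight map is constructed are assumptions, not the paper's exact recipe.

```python
# Sketch of a per-voxel weighted cross-entropy (softmax) loss in which
# unlabeled voxels receive weight 0, so only annotated voxels contribute to
# the gradient. The weight-map convention used here is an assumption.
import torch
import torch.nn.functional as F

def sparse_weighted_loss(logits, labels, weight_map):
    """
    logits:     (N, C, D, H, W) raw network outputs
    labels:     (N, D, H, W) integer class labels; unlabeled voxels may hold
                any valid class index (e.g. 0), since their weight is zero
    weight_map: (N, D, H, W) per-voxel weights, 0 where the voxel is unlabeled
    """
    per_voxel = F.cross_entropy(logits, labels, reduction="none")  # (N, D, H, W)
    weighted = per_voxel * weight_map
    # Normalize by the total weight so sparsely annotated volumes stay comparable.
    return weighted.sum() / weight_map.sum().clamp(min=1e-8)

# Example with random data: pretend roughly half the voxels are annotated.
logits = torch.randn(1, 2, 8, 16, 16, requires_grad=True)
labels = torch.randint(0, 2, (1, 8, 16, 16))
weights = (torch.rand(1, 8, 16, 16) > 0.5).float()
sparse_weighted_loss(logits, labels, weights).backward()
```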

3. Training Details (Training)

The 3D U-Net also uses data augmentation, mainly rotation, scaling, and gray-value augmentation. In addition, a smooth dense deformation field is applied to both the training data and the ground-truth labels: random vectors are sampled from a normal distribution with standard deviation 4 on a grid with 32-voxel spacing in each direction, and B-spline interpolation is then applied. (If you are not familiar with B-spline interpolation you can look it up separately; for building deep learning models it is usually enough to know that it constructs a smooth shape that approximates the original one.) The network is then trained with a weighted cross-entropy loss that reduces the weight of the frequently seen background and increases the weight of the labeled foreground, in order to balance the influence of tubule and background voxels on the loss.
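Here is a minimal NumPy/SciPy sketch of such a deformation-field augmentation: random displacements with standard deviation 4 are sampled on a coarse grid with 32-voxel spacing and upsampled to a dense field with cubic (B-spline) interpolation. The specific pipeline (scipy's zoom plus map_coordinates) and the function name are assumptions, not the authors' implementation.

```python
# Sketch of a smooth dense deformation field: random displacement vectors with
# standard deviation 4 are sampled on a coarse control grid with 32-voxel
# spacing, upsampled to a dense field with cubic (B-spline) interpolation, and
# used to warp both the image and its label volume. The scipy-based pipeline
# below is an illustrative assumption, not the authors' implementation.
import numpy as np
from scipy.ndimage import map_coordinates, zoom

def random_elastic_deform(image, labels, grid_spacing=32, sigma=4.0, seed=None):
    rng = np.random.default_rng(seed)
    shape = image.shape                                            # (D, H, W)
    coarse_shape = [int(np.ceil(s / grid_spacing)) + 1 for s in shape]

    # Identity sampling grid, one coordinate array per axis.
    coords = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")

    warped_coords = []
    for grid in coords:
        # Random displacements on the coarse control grid.
        coarse_disp = rng.normal(0.0, sigma, size=coarse_shape)
        # Upsample to a dense per-voxel field with cubic spline interpolation.
        factors = [s / c for s, c in zip(shape, coarse_shape)]
        dense_disp = zoom(coarse_disp, factors, order=3)
        warped_coords.append(grid + dense_disp)

    # Smooth interpolation for the image, nearest neighbour for the labels so
    # that class indices stay valid.
    warped_image = map_coordinates(image, warped_coords, order=3, mode="reflect")
    warped_labels = map_coordinates(labels, warped_coords, order=0, mode="nearest")
    return warped_image, warped_labels

# Example on a small random volume and label map.
img = np.random.rand(64, 64, 64).astype(np.float32)
lab = np.random.randint(0, 2, size=(64, 64, 64))
warped_img, warped_lab = random_elastic_deform(img, lab, seed=0)
```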

The experimental results are measured with IoU (Intersection over Union), which compares the overlap between the generated segmentation and the ground-truth labels. The IoU formula used for the 3D U-Net is as follows:

IoU = true positives / (true positives + false negatives + false positives)
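A minimal sketch of this metric for a binary segmentation, assuming NumPy arrays for the predicted mask and the ground-truth mask:

```python
# Minimal sketch of the IoU metric above for a binary segmentation, assuming
# NumPy arrays for the predicted mask and the ground-truth mask.
import numpy as np

def iou(pred, target):
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()    # true positives
    fp = np.logical_and(pred, ~target).sum()   # false positives
    fn = np.logical_and(~pred, target).sum()   # false negatives
    denom = tp + fn + fp
    return tp / denom if denom > 0 else 1.0

# Example: two small overlapping masks, IoU = 2 / 4 = 0.5.
a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(iou(a, b))  # 0.5
```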

4. Summary

In this paper, the segmentation result on kidney (renal) biomedical images is IoU = 86.3%. The arrival of the 3D U-Net is very helpful for medical image segmentation, especially for volumetric images, because it largely removes the awkwardness of feeding 3D images into the model slice by slice, greatly improves training efficiency, and retains the outstanding characteristics of FCN and U-Net. A good third-party code implementation (not by the original authors, whose implementation used Caffe): https://github.com/ellisdg/3DUnetCNN
