CNN
CV Tasks
Classification, Classification + Localization
Classification: C classes
Input: image
Output: class label
Evaluation metric: accuracy
Localization
Input: image
Output: box in the image (x, y, w, h)
Evaluation metric: intersection over union (IoU)
Method one: localization as a regression problem
The label y is the ground-truth box position, the neural network output is the predicted box position, and the L2 distance between them is the loss function
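The evaluation metric and the loss mentioned above can be sketched in a few lines. This is a minimal illustration, assuming boxes are (x, y, w, h) tuples with (x, y) as the top-left corner (the notes do not say which corner is meant); the function names iou and l2_loss are made up for this example.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h).

    Assumption: (x, y) is the top-left corner of the box.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def l2_loss(pred, target):
    """Squared L2 distance between predicted and ground-truth box coordinates."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sum(diff ** 2))
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so their IoU is 1/7, while their L2 loss is 2.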
Simple method:
1. Download an existing classification network such as AlexNet or VGG.
2. Alongside the network's original classification head (fully connected layers and softmax), attach a new regression head.
3. Train only the regression head, using SGD and L2 loss.
4. At test time, use both the classification head and the regression head.
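The four steps above can be sketched with NumPy alone. This is a toy illustration rather than a real training setup: the "pretrained" backbone is replaced by a fixed random projection, the data is synthetic, and all names (backbone, W_cls, W_reg, and so on) are hypothetical. What it does show is the structure: one frozen feature extractor, two heads, and SGD on an L2 loss that updates only the regression head.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained backbone (AlexNet or VGG in the notes): a fixed
# random projection followed by ReLU. It is frozen and never updated below.
W_backbone = rng.normal(size=(64, 32)) * 0.1

def backbone(images):
    return np.maximum(images @ W_backbone, 0.0)

# Two heads that read the same features.
W_cls = rng.normal(size=(32, 10)) * 0.1   # classification head (pretend it is pretrained)
W_reg = np.zeros((32, 4))                 # new regression head, outputs (x, y, w, h)

def classify(features):
    scores = features @ W_cls
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum(axis=1, keepdims=True)

def regress(features):
    return features @ W_reg

def l2(pred, target):
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

# Synthetic data: flattened "images" and boxes that a linear head can fit.
images = rng.normal(size=(128, 64))
features = backbone(images)
boxes = features @ rng.normal(size=(32, 4))

loss_before = l2(regress(features), boxes)

# Step 3: SGD on the L2 loss, updating only the regression head.
learning_rate = 0.05
for step in range(200):
    idx = rng.choice(len(images), size=32, replace=False)
    f, y = features[idx], boxes[idx]
    grad = 2.0 * f.T @ (f @ W_reg - y) / len(idx)
    W_reg -= learning_rate * grad

loss_after = l2(regress(features), boxes)

# Step 4: at test time both heads are used on the same features.
labels = classify(features).argmax(axis=1)
pred_boxes = regress(features)
```

Because the backbone and the classification head are never touched in the loop, the class predictions are identical before and after training the regression head.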
When localizing multiple objects, output K*4 numbers: one (x, y, w, h) box for each of the K objects
Method two: sliding window
Run the classification + regression network at multiple positions on a high-resolution image
Convert the fully connected layers into convolutional layers for efficient computation
Combine the localization results across all sizes to get the final prediction
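The conversion of a fully connected layer into a convolutional layer can be checked numerically. The sketch below assumes a hypothetical classifier trained on 3x5x5 windows with 4 outputs; the sizes and names are illustrative only. Reshaping the fully connected weights into convolution kernels and sliding them over a larger image gives, at every position, exactly what the fully connected layer would output on that window. In a real network the savings come mainly from the convolutional layers below the head, whose computation is shared between overlapping windows; this example only demonstrates the equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)

C, H, W = 3, 5, 5   # size of the window the classifier was trained on (assumed)
num_out = 4         # e.g. class scores or box coordinates

# Fully connected layer acting on a flattened C*H*W window.
fc_weight = rng.normal(size=(num_out, C * H * W))
fc_bias = rng.normal(size=num_out)

# The same parameters viewed as num_out convolution kernels of shape (C, H, W).
conv_kernel = fc_weight.reshape(num_out, C, H, W)

def conv_valid(image, kernel, bias):
    """Stride-1 'valid' cross-correlation: (C, ih, iw) -> (num_out, oh, ow)."""
    n_out, _, kh, kw = kernel.shape
    _, ih, iw = image.shape
    out = np.empty((n_out, ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            window = image[:, i:i + kh, j:j + kw]
            # Contract the (C, kh, kw) axes of every kernel with the window.
            out[:, i, j] = np.tensordot(kernel, window, axes=3) + bias
    return out

# A larger image produces a whole map of outputs in one pass.
image = rng.normal(size=(C, 8, 9))
score_map = conv_valid(image, conv_kernel, fc_bias)

# The explicit sliding-window result at one position, for comparison.
i, j = 2, 3
window = image[:, i:i + H, j:j + W].reshape(-1)
fc_out = fc_weight @ window + fc_bias
```

Here score_map has one output vector per window position, and score_map[:, i, j] agrees with fc_out up to floating-point error.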
3. Object Detection
Problem with treating detection as classification: many different positions and sizes need to be tested
R-CNN (Regions with Convolutional Neural Network Features)
Training methods
1. Download an existing classification network such as AlexNet or VGG.
2. Fine-tune the model for detection: remove the final fully connected layer and reinitialize it
3. Extract the region proposals; warp each region to the CNN input size, run a CNN forward pass on it, and save the pool5 features
4. Train a binary (0/1) SVM for each class to be detected
5. Bbox regression: for each class, train a linear regression model that maps the cached features to the ground-truth boxes
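Step 5 can be sketched as one linear regressor per class, fitted on the cached features. The example below is a minimal illustration and makes several assumptions that are not in the notes: it uses ridge regression with a closed-form solution, the features and box targets are synthetic, and the class names "cat" and "dog" and the helper names are hypothetical.

```python
import numpy as np

def fit_box_regressor(features, targets, reg=1.0):
    """Ridge regression from cached features (N, D) to box targets (N, 4)."""
    n, d = features.shape
    X = np.hstack([features, np.ones((n, 1))])   # append a bias column
    penalty = reg * np.eye(d + 1)
    penalty[-1, -1] = 0.0                        # do not regularize the bias
    return np.linalg.solve(X.T @ X + penalty, X.T @ targets)

def predict_boxes(weights, features):
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ weights

rng = np.random.default_rng(0)

# Synthetic stand-ins for cached pool5 features and ground-truth boxes,
# stored per class (hypothetical class names).
cached = {name: rng.normal(size=(50, 16)) for name in ("cat", "dog")}
gt_boxes = {name: f @ rng.normal(size=(16, 4)) + 1.0 for name, f in cached.items()}

# One independent regressor per class.
regressors = {
    name: fit_box_regressor(cached[name], gt_boxes[name], reg=1e-3)
    for name in cached
}
```

Because the CNN is not involved in this fit, the features can be computed once and reused, which is also why the regressors cannot influence the features.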
Existing problems
1. Slow at test time: a CNN forward pass has to be run for every region proposal
2. The SVMs and regressors are fitted after the fact: the CNN features are not updated in response to the SVMs and regressors
3. The training pipeline is complex
Fast R-CNN
Solution: compute the convolutional features once for the whole image and share them across all region proposals; train the whole network in a single pass. (The Region Proposal Network, which shares these features with the detection network, is the step taken by Faster R-CNN.)
1. Project the region proposal onto the convolutional feature map of the whole image
2. Divide the projected region into an h*w grid
3. Max-pool within each grid cell
4. Backpropagate through the max pooling to the previous layer
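Steps 2 to 4 above (region of interest pooling) can be sketched as follows. This is a simplified illustration under stated assumptions: the region is already given in feature-map coordinates as (x0, y0, x1, y1) with exclusive end, it is at least as large as the output grid, and cell boundaries are obtained by integer division. The function names are made up for this example.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one region of a (C, H, W) feature map into a (C, out_h, out_w) grid.

    Returns the pooled values and, for each output cell, the (y, x) location
    of the maximum, which the backward pass needs.
    """
    x0, y0, x1, y1 = roi
    C = feature_map.shape[0]
    # Integer cell boundaries that split the region into out_h x out_w cells.
    ys = y0 + (np.arange(out_h + 1) * (y1 - y0)) // out_h
    xs = x0 + (np.arange(out_w + 1) * (x1 - x0)) // out_w
    pooled = np.empty((C, out_h, out_w))
    argmax = np.empty((C, out_h, out_w, 2), dtype=int)
    for i in range(out_h):
        for j in range(out_w):
            ya, yb, xa, xb = ys[i], ys[i + 1], xs[j], xs[j + 1]
            cell = feature_map[:, ya:yb, xa:xb].reshape(C, -1)
            flat = cell.argmax(axis=1)
            pooled[:, i, j] = cell[np.arange(C), flat]
            argmax[:, i, j, 0] = ya + flat // (xb - xa)
            argmax[:, i, j, 1] = xa + flat % (xb - xa)
    return pooled, argmax

def roi_max_pool_backward(grad_pooled, argmax, feature_shape):
    """Route each output gradient back to the input location that held the max."""
    grad = np.zeros(feature_shape)
    C, out_h, out_w = grad_pooled.shape
    for c in range(C):
        for i in range(out_h):
            for j in range(out_w):
                y, x = argmax[c, i, j]
                grad[c, y, x] += grad_pooled[c, i, j]
    return grad
```

The backward pass accumulates with += because, when several regions of the same image overlap, one feature-map location can be the maximum for more than one output cell.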
Faster R-CNN