Image Classification | Deep Learning vs. Traditional Machine Learning

Original: Image Classification in 5 Methods
Author: Shiyu Mou
Translation: He Bing Center

Image classification, as the name suggests, is the problem of taking an image as input and outputting the category that describes its content. It is a core problem in computer vision and is widely used in practice.

The traditional approach to image classification relies on hand-crafted feature description and detection. Such methods can work for simple classification tasks, but they are quickly overwhelmed as the problem becomes more complex. Rather than trying to describe every image category with hand-written code, we therefore turn to machine learning to handle the classification problem.

At present, many researchers use CNNs and other deep learning models for image classification; in addition, classical algorithms such as KNN and SVM also achieve good results. However, it seems impossible to say in advance which method works best for a given image classification problem.

In this project, we did some interesting things: we compared the CNN and transfer learning approaches that are commonly used in industry for image classification against KNN, SVM, and BP neural networks, gained hands-on deep learning experience, and explored Google's machine learning framework, TensorFlow.

The detailed implementation is described below.

System Design

In this project, the five algorithms we experiment with are KNN, SVM, BP neural network, CNN, and transfer learning, and we use the following three approaches. KNN, SVM, and BP neural networks are methods we learned in school: powerful and easy to deploy. So in the first approach, we mainly use scikit-learn to implement KNN, SVM, and a BP neural network. Although the traditional multilayer perceptron model works reasonably well for image recognition, its full connectivity between nodes means the recognition rate is not ideal for high-resolution images. So in the second approach, we build a CNN with Google's TensorFlow framework. In the third approach, we use Inception V3, a deep neural network that has already been trained. Inception V3 is provided by TensorFlow and was trained on data from the 2012 ImageNet challenge. ImageNet is a classic challenge in the field of computer vision, in which contestants try to classify images into 1,000 categories with their models. To retrain this pre-trained model, we must make sure that our own dataset was not part of its training data.

Implementation

The first approach: use scikit-learn to preprocess the data and implement KNN, SVM, and a BP neural network.
Step 1: using the OpenCV package, define two preprocessing functions: one that turns an image into a raw-pixel feature vector (resizing the image and flattening it into a row of pixel values), and one that extracts a color histogram (computing a 3D color histogram in the HSV color space and normalizing it with cv2.normalize).
Step 2: construct the parameters. Since we want to test performance both on the entire dataset and on sub-datasets with different numbers of classes, we treat each dataset as a parameter of the experiment. In addition, we treat the number of neighbors in KNN as a parameter.
Step 3: extract the image features and write them to arrays. We read images with cv2.imread and assign labels based on the normalized image file names, then run the two functions from Step 1 and write the two kinds of features into separate arrays.
Step 4: split the dataset with train_test_split, using 85% of the data as the training set and 15% as the test set.
Step 5: evaluate the data with KNN, SVM, and the BP neural network, using KNeighborsClassifier for KNN, SVC for SVM, and MLPClassifier for the BP neural network.
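The article does not include the code for these helpers. Below is a minimal sketch of what the two OpenCV functions described in Step 1 might look like, using the names image_to_feature_vector and extract_color_histogram mentioned later in the article; the default size and the histogram ranges here are assumptions, not the authors' exact code.

import cv2

def image_to_feature_vector(image, size=(128, 128)):
    # Resize the image to a fixed size, then flatten it into a row of raw pixel values.
    return cv2.resize(image, size).flatten()

def extract_color_histogram(image, bins=(32, 32, 32)):
    # Convert to the HSV color space, compute a 3D color histogram,
    # then normalize it and flatten it into a feature vector.
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, bins, [0, 180, 0, 256, 0, 256])
    cv2.normalize(hist, hist)
    return hist.flatten()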

The second approach: build a CNN based on TensorFlow. TensorFlow constructs a computation graph and executes it in C++, which is more efficient than performing the same computation in Python.

Several concepts are used in TensorFlow: placeholders, variables, mathematical operations, a cost measure, an optimization method, and the CNN architecture itself.
Step 1: place the images in a placeholder as the first layer.
Step 2: build three convolutional layers, each with 2x2 max-pooling and ReLU. The input is a 4-dimensional tensor: [image number, y-coordinate, x-coordinate, channel]. The output is another processed 4-dimensional tensor: [image number (unchanged), y-coordinate, x-coordinate, channel].
Step 3: build two fully-connected layers. The input is a 2-dimensional tensor: [image number, input number]. The output is a 2-dimensional tensor: [image number, output number].
Step 4: use a flatten layer to link the convolutional layers to the fully-connected layers.
Step 5: normalize the output with a softmax layer.
Step 6: optimize the training results. We use cross entropy as the cost measure and take its mean over the batch; the optimizer is tf.train.AdamOptimizer().
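The article lists the steps but not the code. The following is a condensed sketch of such a network written against the TensorFlow 1.x API; it follows the two-convolutional-layer, two-fully-connected-layer configuration the authors eventually settled on (see below), while the filter counts (truncated in the parameter listings) and other details are assumptions, not the authors' exact code.

import tensorflow as tf

# Placeholders for a batch of 64x64 RGB images and their one-hot labels (10 cat classes).
x = tf.placeholder(tf.float32, shape=[None, 64, 64, 3], name='x')
y_true = tf.placeholder(tf.float32, shape=[None, 10], name='y_true')

# Convolutional layers, each followed by ReLU and 2x2 max-pooling.
conv1 = tf.layers.conv2d(x, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=3, padding='same', activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)

# Flatten layer links the convolutional part to the fully-connected part.
flat = tf.layers.flatten(pool2)

# Two fully-connected layers, then a final layer producing one logit per class.
fc1 = tf.layers.dense(flat, units=128, activation=tf.nn.relu)
fc2 = tf.layers.dense(fc1, units=128, activation=tf.nn.relu)
logits = tf.layers.dense(fc2, units=10)
y_pred = tf.nn.softmax(logits)

# Cross entropy as the cost measure, averaged over the batch, minimized with Adam.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_true, logits=logits)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)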

The third approach: retrain Inception V3. We retrain Inception V3 and use transfer learning to reduce the workload.

We take the pre-trained model, remove its original top layer, and train a new top layer on our own classes. The script first analyzes all images on disk and computes their bottleneck values. It then runs 4,000 training steps; each step randomly picks 10 images from the training set, looks up their bottleneck values, and feeds them into the final layer to get predictions. During backpropagation, the weights of the final layer are updated according to the difference between the predictions and the actual labels.
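The authors used TensorFlow's retraining script for this. Purely to illustrate the idea of training a new final layer on top of frozen Inception V3 features, here is a minimal sketch using the tf.keras API; this is not the script used in the project, and the optimizer, input pipeline, and training settings are assumptions.

import tensorflow as tf

# Load Inception V3 pre-trained on ImageNet, with its original top (classification) layer removed.
base = tf.keras.applications.InceptionV3(weights='imagenet', include_top=False, pooling='avg')
base.trainable = False  # keep the pre-trained weights fixed; only the new final layer is trained

# New final layer: one softmax output per cat breed (10 classes).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# train_images: float array of shape (N, 299, 299, 3) preprocessed with
# tf.keras.applications.inception_v3.preprocess_input; train_labels: one-hot (N, 10).
# model.fit(train_images, train_labels, batch_size=10, validation_split=0.1)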

Experiment

The dataset used in the experiments is the Oxford-IIIT Pet dataset. It contains 25 dog breeds and 12 cat breeds, with 200 images per class. We use the 10 cat classes in the dataset: ['Sphynx', 'Siamese', 'Ragdoll', 'Persian', 'Maine-coon', 'British-shorthair', 'Bombay', 'Birman', 'Bengal', 'Abyssinian']. That is, there are about 2,000 images. Because the images come in different sizes, we resize them to a fixed size of 64x64 or 128x128.

In this project, we mainly use OpenCV to preprocess the images. The training set is typically augmented by random deformation, cropping, or brightness changes; a sketch of such an augmentation is shown below.
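As an illustration (this code is not from the original project), a simple random flip / crop / brightness augmentation with OpenCV and NumPy might look like this; the crop ratio and brightness range are assumptions.

import cv2
import numpy as np

def augment(image, out_size=(64, 64)):
    # Random horizontal flip.
    if np.random.rand() < 0.5:
        image = cv2.flip(image, 1)
    # Random crop covering 90% of the height and width, then resize back.
    h, w = image.shape[:2]
    ch, cw = int(h * 0.9), int(w * 0.9)
    top = np.random.randint(0, h - ch + 1)
    left = np.random.randint(0, w - cw + 1)
    image = cv2.resize(image[top:top + ch, left:left + cw], out_size)
    # Random brightness shift.
    beta = int(np.random.randint(-30, 31))
    return cv2.convertScaleAbs(image, alpha=1.0, beta=beta)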

The first method: KNN, SVM, and BP neural network

Part 1: use scikit-learn to preprocess the data and implement KNN, SVM, and the BP neural network. In the image_to_feature_vector function, we set the size to 128x128. The larger the image, the more accurate the result, but also the heavier the computational burden; in the end we settled on 128x128. In the extract_color_histogram function, we set the number of bins per channel to 32, 32, 32. For the data, we use three sets: the first is a sub-dataset with 400 images and 2 labels; the second is a sub-dataset with 1,000 images and 5 labels; the third is the entire dataset with 1,997 images and 10 labels. In KNeighborsClassifier, we only vary the number of neighbors and record the result for the best k value on each dataset; the other parameters are left at their defaults. In MLPClassifier, we use 50 neurons per hidden layer. In SVC, the maximum number of iterations is 1,000 and the class weight is "balanced".
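Putting these settings together, the scikit-learn training step might look roughly like this sketch, where features, labels, and the tuned neighbor count k are assumed to come from the preprocessing described above (the random_state and the number of hidden layers are also assumptions).

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# 85% of the data for training, 15% for testing (Step 4 above).
(train_x, test_x, train_y, test_y) = train_test_split(
    features, labels, test_size=0.15, random_state=42)

models = {
    'KNN': KNeighborsClassifier(n_neighbors=k),                    # k is tuned per dataset
    'BP neural network': MLPClassifier(hidden_layer_sizes=(50,)),  # 50 neurons per hidden layer
    'SVM': SVC(max_iter=1000, class_weight='balanced'),
}
for name, model in models.items():
    model.fit(train_x, train_y)
    print(name, 'accuracy:', model.score(test_x, test_y))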

Depending on the dataset (from 2 labels up to 10), the running time is approximately 3 to 5 minutes.

The second approach: building a CNN based on TensorFlow

Because running on the entire dataset takes a long time, we process the data in batches at each iteration, typically 32 or 64 images per batch. The dataset is divided into a training set of 1,600 images, a validation set of 400 images, and a test set of 300 images; a sketch of the random batch selection is shown below.
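A minimal sketch of this random mini-batch selection (the array and placeholder names are assumptions carried over from the CNN sketch above):

import numpy as np

def random_batch(images, labels, batch_size=32):
    # Pick batch_size random examples from the training set for one iteration.
    idx = np.random.choice(len(images), size=batch_size, replace=False)
    return images[idx], labels[idx]

# Example: one training step feeds a random batch into the TensorFlow graph.
# x_batch, y_batch = random_batch(train_images, train_labels, batch_size=64)
# session.run(optimizer, feed_dict={x: x_batch, y_true: y_batch})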

This method has a large number of tunable parameters: the learning rate is set to 1x10^-4, the image size to 64x64 or 128x128, and then there are the numbers of layers and their shapes. There are too many parameters to tune exhaustively, so we experimented to find the best configuration.

To find the best layer configuration, we ran experiments. At first, the parameters were as follows:

# Convolutional layer 1.
filter_size1 = 5
num_filters1 =

# Convolutional layer 2.
filter_size2 = 5
num_filters2 =

# Convolutional layer 3.
filter_size3 = 5
num_filters3 = 128

# Fully-connected layer 1.
fc1_size = 256

# Fully-connected layer 2.
fc2_size = 256

We used three convolutional layers and two fully-connected layers, but the result was severe overfitting. We found that, for this structure, our dataset is too small and the network is too complex.

Finally, we use the following parameters:

# Convolutional layer 1.
filter_size1 = 5
num_filters1 =

# Convolutional layer 2.
filter_size2 = 3
num_filters2 =

# Fully-connected layer 1.
fc1_size = 128    # Number of neurons in fully-connected layer.

# Fully-connected layer 2.
fc2_size = 128    # Number of neurons in fully-connected layer.

# Number of color channels for the images: 3 for RGB.
num_channels = 3

We use only two convolutional layers and two fully-connected layers. The results are still unsatisfactory: after 4,000 iterations the model still overfits, although the test accuracy is about 10% better than before. Finally, after 5,000 iterations, we reach 43% accuracy, with a running time of more than half an hour.

PS: We experimented with another dataset, CIFAR-10.

The dataset contains 60,000 32x32 color images in 10 categories, with 6,000 images per category: a training set of 50,000 images and a test set of 10,000 images. Using the same network structure, after 10 hours of training we reach a final accuracy of 78%.

The third method: Retrain Inception V3

Similar to the method above, the number of training steps is 4,000, adjusted according to the results, and the learning rate is adjusted according to the number of images per batch. 80% of the data is used for training, 10% for validation, and 10% for testing.

Experimental Results

The first method: KNN, SVM, and BP neural network

For KNN, the accuracy with raw pixels (knn_raw_pixel) and with color histograms (knn_histo) is relatively close. With 5 labels, the former is lower than the latter, but overall the raw-pixel features perform better.

In the MLP classifier, the raw-pixel accuracy is clearly below the histogram accuracy. On the entire dataset (10 labels), the raw-pixel accuracy is even lower than random guessing.

Neither of these scikit-learn approaches achieves good performance; on the entire dataset, the accuracy is only about 24%. The experimental results show that these methods cannot classify the images effectively; to do so and to improve accuracy, deep learning is needed.

The second approach: building a CNN based on TensorFlow

Because of overfitting, we cannot obtain good experimental results. The running time is generally around half an hour, but because of the overfitting we do not consider it a reliable measure. Compared with Method 1, however, we can conclude that even an overfitted CNN still produces better results than Method 1.

The third method: Retrain Inception V3

The whole training process takes no more than 10 minutes, and we get very good results. This shows just how powerful deep learning and transfer learning can be.

Demo:

Conclusions

Based on the above experiments, we conclude that KNN, SVM, and BP neural networks are not effective for image classification. Even when the CNN overfits, its results are still better than those of the traditional algorithms. Transfer learning is very effective for the image classification problem: the running time is short, the results are accurate, and it handles overfitting and small datasets well.

Through this project, we gained a lot of valuable experience:

  • Resize the images to make them smaller.
  • For each training iteration, randomly select a small batch of data.
  • Randomly select a small batch of data as the validation set, and report validation scores during training.
  • Use image augmentation to turn the set of input images into a larger, more varied dataset.
  • An image dataset of around 200 images per class and 10 classes is on the small side; a complex network structure needs a larger training set.
  • Be careful about overfitting.


Note: the work was done by Ji Tong.
https://github.com/JI-tong
Originally published at gist.github.com.
