Transfer Learning with TensorFlow

Last Update:2018-08-02 Source: Internet

Author: User



Address of the Inception-v3 model provided by Google: https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip

Data Set Address: http://download.tensorflow.org/example_images/flower_photos.tgz
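Once the archive has been downloaded, it has to be unpacked so that every category sits in its own sub-folder. The following is a minimal sketch using only the standard library; extract_dataset and list_categories are hypothetical helper names, not part of the article's code, and it assumes the archive unpacks into a flower_photos folder.

```python
import os
import tarfile


def extract_dataset(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive into dest_dir and return dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, 'r:gz') as archive:
        archive.extractall(dest_dir)
    return dest_dir


def list_categories(data_root):
    """Return the sorted sub-folder names of data_root (one per category)."""
    return sorted(
        name for name in os.listdir(data_root)
        if os.path.isdir(os.path.join(data_root, name))
    )


# Assumed file and folder names; adjust them to wherever the archive was saved.
if os.path.exists('flower_photos.tgz'):
    extract_dataset('flower_photos.tgz', '.')
    print(list_categories('flower_photos'))
```

The training script below reads its images from flower_data/, so either rename the extracted folder or change that path to match.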

Code implementation:

# -*- coding: utf-8 -*-
"""
Created on Wed Nov 8 09:56:12

@author: Hnu
"""

# Note: this script uses the TensorFlow 1.x API
# (tf.placeholder, tf.Session, tf.app.run).
import glob
import os.path
import random

import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile

# Number of nodes in the bottleneck layer of the Inception-v3 model.
BOTTLENECK_TENSOR_SIZE = 2048

# Name of the tensor that holds the bottleneck layer output.
# In the Inception-v3 model provided by Google this tensor is called
# 'pool_3/_reshape:0'. When building a model yourself, a tensor's name
# is available as tensor.name.
BOTTLENECK_TENSOR_NAME = 'pool_3/_reshape:0'

# Name of the tensor that receives the image input.
JPEG_DATA_TENSOR_NAME = 'DecodeJpeg/contents:0'

# Directory that holds the downloaded, pre-trained Inception-v3 model.
MODEL_DIR = 'model/'

# File name of the downloaded, pre-trained Inception-v3 model.
MODEL_FILE = 'tensorflow_inception_graph.pb'

# Each training image is used many times, so the feature vector that
# Inception-v3 computes for an image is cached in a file to avoid
# recomputing it. This is the directory where those files are stored.
CACHE_DIR = 'bottleneck/'

# Image data folder. Each sub-folder is one category to be classified
# and contains the images of that category.
INPUT_DATA = 'flower_data/'

# Percentage of the data used for validation.
VALIDATION_PERCENTAGE = 10
# Percentage of the data used for testing.
TEST_PERCENTAGE = 10

# Settings of the neural network.
LEARNING_RATE = 0.01
STEPS = 4000
BATCH = 100


# Reads the list of all images in the data folder and splits it into
# training, validation and test data. testing_percentage and
# validation_percentage set the sizes of the test and validation sets.
def create_image_lists(testing_percentage, validation_percentage):
    # All images are stored in the result dictionary. Each key is a
    # category name; each value is a dictionary holding the image file
    # names of that category.
    result = {}
    # Get all sub-directories of the data folder.
    sub_dirs = [x[0] for x in os.walk(INPUT_DATA)]
    # The first entry is the data folder itself and is skipped.
    is_root_dir = True
    for sub_dir in sub_dirs:
        if is_root_dir:
            is_root_dir = False
            continue

        # Collect all valid image files in the current directory.
        extensions = ['jpg', 'jpeg', 'JPG', 'JPEG']
        file_list = []
        dir_name = os.path.basename(sub_dir)
        for extension in extensions:
            file_glob = os.path.join(INPUT_DATA, dir_name, '*.' + extension)
            file_list.extend(glob.glob(file_glob))
        if not file_list:
            continue

        # The category name is derived from the directory name.
        label_name = dir_name.lower()
        # Initialize the training, test and validation sets of this category.
        training_images = []
        testing_images = []
        validation_images = []
        for file_name in file_list:
            base_name = os.path.basename(file_name)
            # Randomly assign the image to the training, test or
            # validation set.
            chance = np.random.randint(100)
            if chance < validation_percentage:
                validation_images.append(base_name)
            elif chance < (testing_percentage + validation_percentage):
                testing_images.append(base_name)
            else:
                training_images.append(base_name)

        # Store the data of the current category in the result dictionary.
        result[label_name] = {
            'dir': dir_name,
            'training': training_images,
            'testing': testing_images,
            'validation': validation_images,
        }
    # Return the complete, split data.
    return result


# Returns the path of one image, given its category name, the data set it
# belongs to and its index.
# image_lists holds the information about all images.
# image_dir is the root directory. The raw images and the cached feature
# vectors are stored under different root directories.
# label_name is the category name.
# index is the number of the image to fetch.
# category selects the training, testing or validation set.
def get_image_path(image_lists, image_dir, label_name, index, category):
    # Get the information about all images of the given category.
    label_lists = image_lists[label_name]
    # Get the images of the requested data set.
    category_list = label_lists[category]
    mod_index = index % len(category_list)
    # Get the file name of the image.
    base_name = category_list[mod_index]
    sub_dir = label_lists['dir']
    # Final path: root directory + category folder + image file name.
    full_path = os.path.join(image_dir, sub_dir, base_name)
    return full_path


# Returns the path of the cached Inception-v3 feature vector file for an
# image, given its category name, data set and index.
def get_bottlenect_path(image_lists, label_name, index, category):
    return get_image_path(
        image_lists, CACHE_DIR, label_name, index, category) + '.txt'


# Runs one image through the loaded Inception-v3 model and returns its
# feature vector.
def run_bottleneck_on_image(sess, image_data, image_data_tensor,
                            bottleneck_tensor):
    # Feed the image as input and evaluate the bottleneck tensor. Its
    # value is the new feature vector of the image.
    bottleneck_values = sess.run(bottleneck_tensor,
                                 {image_data_tensor: image_data})
    # The convolutional network returns a four-dimensional array, which is
    # squeezed into a one-dimensional feature vector.
    bottleneck_values = np.squeeze(bottleneck_values)
    return bottleneck_values


# Returns the Inception-v3 feature vector of an image. The function first
# looks for a cached vector; if none exists, it computes the vector and
# saves it to a file.
def get_or_create_bottleneck(sess, image_lists, label_name, index, category,
                             jpeg_data_tensor, bottleneck_tensor):
    # Get the path of the feature vector file that belongs to the image.
    label_lists = image_lists[label_name]
    sub_dir = label_lists['dir']
    sub_dir_path = os.path.join(CACHE_DIR, sub_dir)
    if not os.path.exists(sub_dir_path):
        os.makedirs(sub_dir_path)
    bottleneck_path = get_bottlenect_path(
        image_lists, label_name, index, category)

    # If the file does not exist, compute the feature vector with the
    # Inception-v3 model and store the result in the file.
    if not os.path.exists(bottleneck_path):
        # Get the path of the original image.
        image_path = get_image_path(
            image_lists, INPUT_DATA, label_name, index, category)
        # Read the image content.
        image_data = gfile.FastGFile(image_path, 'rb').read()
        # print(len(image_data))
        # The input images differ in size, so image_data differs in length
        # as well (verified), yet the loaded Inception-v3 model still
        # produces a 2048-value feature vector. The author notes that the
        # exact mechanism was not investigated.
        # Compute the feature vector with the Inception-v3 model.
        bottleneck_values = run_bottleneck_on_image(
            sess, image_data, jpeg_data_tensor, bottleneck_tensor)
        # Write the computed feature vector to the file.
        bottleneck_string = ','.join(str(x) for x in bottleneck_values)
        with open(bottleneck_path, 'w') as bottleneck_file:
            bottleneck_file.write(bottleneck_string)
    else:
        # Read the feature vector of the image directly from the file.
        with open(bottleneck_path, 'r') as bottleneck_file:
            bottleneck_string = bottleneck_file.read()
        bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
    # Return the feature vector.
    return bottleneck_values


# Randomly fetches one batch of images as training data.
def get_random_cached_bottlenecks(sess, n_classes, image_lists, how_many,
                                  category, jpeg_data_tensor,
                                  bottleneck_tensor):
    bottlenecks = []
    ground_truths = []
    for _ in range(how_many):
        # Randomly pick a category and an image index, and add the image
        # to the current batch.
        label_index = random.randrange(n_classes)
        label_name = list(image_lists.keys())[label_index]
        image_index = random.randrange(65536)
        bottleneck = get_or_create_bottleneck(
            sess, image_lists, label_name, image_index, category,
            jpeg_data_tensor, bottleneck_tensor)
        ground_truth = np.zeros(n_classes, dtype=np.float32)
        ground_truth[label_index] = 1.0
        bottlenecks.append(bottleneck)
        ground_truths.append(ground_truth)
    return bottlenecks, ground_truths


# Fetches all test data. The final test computes the accuracy on the
# complete test set.
def get_test_bottlenecks(sess, image_lists, n_classes, jpeg_data_tensor,
                         bottleneck_tensor):
    bottlenecks = []
    ground_truths = []
    label_name_list = list(image_lists.keys())
    # Enumerate all categories and the test images of each category.
    for label_index, label_name in enumerate(label_name_list):
        category = 'testing'
        for index, unused_base_name in enumerate(
                image_lists[label_name][category]):
            # Compute the feature vector of the image with the Inception-v3
            # model and add it to the final data list.
            bottleneck = get_or_create_bottleneck(
                sess, image_lists, label_name, index, category,
                jpeg_data_tensor, bottleneck_tensor)
            ground_truth = np.zeros(n_classes, dtype=np.float32)
            ground_truth[label_index] = 1.0
            bottlenecks.append(bottleneck)
            ground_truths.append(ground_truth)
    return bottlenecks, ground_truths


def main(_):
    # Read all images.
    image_lists = create_image_lists(TEST_PERCENTAGE, VALIDATION_PERCENTAGE)
    n_classes = len(image_lists.keys())

    # Read the pre-trained Inception-v3 model. Google stores the trained
    # model as a GraphDef protocol buffer, which holds how each node is
    # computed together with the variable values.
    # Model persistence in TensorFlow is covered in detail in Chapter 5.
    with gfile.FastGFile(os.path.join(MODEL_DIR, MODEL_FILE), 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Load the Inception-v3 model and get the tensor for the data input and
    # the tensor for the bottleneck layer output.
    bottleneck_tensor, jpeg_data_tensor = tf.import_graph_def(
        graph_def,
        return_elements=[BOTTLENECK_TENSOR_NAME, JPEG_DATA_TENSOR_NAME])

    # Define the input of the new network: the node values obtained when a
    # new image is forward-propagated through Inception-v3 up to the
    # bottleneck layer. This step can be viewed as feature extraction.
    bottleneck_input = tf.placeholder(
        tf.float32, [None, BOTTLENECK_TENSOR_SIZE],
        name='BottleneckInputPlaceholder')

    # Define the input for the new ground-truth labels.
    ground_truth_input = tf.placeholder(
        tf.float32, [None, n_classes], name='GroundTruthInput')

    # Define a single fully connected layer to solve the new image
    # classification problem. The pre-trained Inception-v3 model has already
    # abstracted the raw images into feature vectors that are easier to
    # classify, so no complex network has to be trained for the new task.
    with tf.name_scope('final_training_ops'):
        weights = tf.Variable(tf.truncated_normal(
            [BOTTLENECK_TENSOR_SIZE, n_classes], stddev=0.001))
        biases = tf.Variable(tf.zeros([n_classes]))
        logits = tf.matmul(bottleneck_input, weights) + biases
        final_tensor = tf.nn.softmax(logits)

    # Define the cross-entropy loss function.
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=ground_truth_input)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    train_step = tf.train.GradientDescentOptimizer(
        LEARNING_RATE).minimize(cross_entropy_mean)

    # Compute the accuracy.
    with tf.name_scope('evaluation'):
        correct_prediction = tf.equal(
            tf.argmax(final_tensor, 1), tf.argmax(ground_truth_input, 1))
        evaluation_step = tf.reduce_mean(
            tf.cast(correct_prediction, tf.float32))

    with tf.Session() as sess:
        tf.global_variables_initializer().run()

        # Training process.
        for i in range(STEPS):
            # Fetch one batch of training data per step.
            train_bottlenecks, train_ground_truth = \
                get_random_cached_bottlenecks(
                    sess, n_classes, image_lists, BATCH, 'training',
                    jpeg_data_tensor, bottleneck_tensor)
            sess.run(train_step,
                     feed_dict={bottleneck_input: train_bottlenecks,
                                ground_truth_input: train_ground_truth})

            # Measure the accuracy on the validation data.
            if i % 100 == 0 or i + 1 == STEPS:
                validation_bottlenecks, validation_ground_truth = \
                    get_random_cached_bottlenecks(
                        sess, n_classes, image_lists, BATCH, 'validation',
                        jpeg_data_tensor, bottleneck_tensor)
                validation_accuracy = sess.run(
                    evaluation_step,
                    feed_dict={bottleneck_input: validation_bottlenecks,
                               ground_truth_input: validation_ground_truth})
                print('Step %d: Validation accuracy on random sampled '
                      '%d examples = %.1f%%' %
                      (i, BATCH, validation_accuracy * 100))

        # Measure the accuracy on the final test data.
        test_bottlenecks, test_ground_truth = get_test_bottlenecks(
            sess, image_lists, n_classes, jpeg_data_tensor,
            bottleneck_tensor)
        test_accuracy = sess.run(
            evaluation_step,
            feed_dict={bottleneck_input: test_bottlenecks,
                       ground_truth_input: test_ground_truth})
        print('Final test accuracy = %.1f%%' % (test_accuracy * 100))


if __name__ == '__main__':
    tf.app.run()
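The random three-way split performed in create_image_lists can be looked at in isolation. The following is a minimal sketch using only the standard library, not the article's code: assign_split and split_counts are hypothetical helper names, and a seeded random.Random stands in for np.random.randint so that the run is repeatable.

```python
import random
from collections import Counter


def assign_split(chance, testing_percentage, validation_percentage):
    """Map a draw in [0, 100) to a data set name, using the same thresholds
    as create_image_lists."""
    if chance < validation_percentage:
        return 'validation'
    if chance < testing_percentage + validation_percentage:
        return 'testing'
    return 'training'


def split_counts(n_files, testing_percentage=10, validation_percentage=10,
                 seed=0):
    """Count how many of n_files simulated images land in each data set.

    The seed is an assumption made for repeatability; the script above
    draws without a fixed seed.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_files):
        chance = rng.randrange(100)
        counts[assign_split(chance, testing_percentage,
                            validation_percentage)] += 1
    return counts


counts = split_counts(10000)
print(counts)
```

With the default percentages, each image has roughly a 10% chance of landing in the validation set, 10% in the test set and 80% in the training set. Because the script draws these numbers without a fixed seed, the three sets are re-drawn on every run.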

The final run results are:

