How to train a CNN image classifier on your own dataset with TensorFlow

Last Update:2018-02-11 Source: Internet

Author: User


Training image data using convolutional neural networks involves the following steps:

1. Read image files
2. Generate a batch for training
3. Define the training model (parameter initialization, convolution and pooling layers, and the network itself)
4. Training

1. Read image files

import os
import numpy as np

def get_files(filename):
    # each sub-directory of `filename` is one class, named by its integer label
    class_train = []
    label_train = []
    for train_class in os.listdir(filename):
        for pic in os.listdir(os.path.join(filename, train_class)):
            class_train.append(os.path.join(filename, train_class, pic))
            label_train.append(train_class)
    temp = np.array([class_train, label_train])
    temp = temp.transpose()
    # shuffle the samples
    np.random.shuffle(temp)
    # after the transpose, image paths are in column 0 and labels in column 1
    image_list = list(temp[:, 0])
    label_list = list(temp[:, 1])
    label_list = [int(i) for i in label_list]
    return image_list, label_list

Here the name of each sub-directory is used as the label, that is, the category. The directory names must be integers, because the labels are converted to int and later to an integer tensor.

The image paths and labels are returned as Python lists, because the TensorFlow functions used later take lists as input.
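To illustrate the directory layout this step expects, the sketch below builds a small temporary dataset and lists it. The helper names list_labeled_files and make_demo_dataset, the file names, and the use of random.shuffle on (path, label) pairs are assumptions made for this example, not the article's code. It uses only the standard library.

```python
import os
import random
import tempfile

def list_labeled_files(root):
    """Return shuffled (paths, labels); each sub-directory name is an integer label."""
    samples = []
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        for file_name in sorted(os.listdir(class_dir)):
            samples.append((os.path.join(class_dir, file_name), int(class_name)))
    random.shuffle(samples)  # shuffling pairs keeps each path with its own label
    paths = [path for path, _ in samples]
    labels = [label for _, label in samples]
    return paths, labels

def make_demo_dataset():
    """Create <tmp>/0 and <tmp>/1 with two empty placeholder files each (hypothetical names)."""
    root = tempfile.mkdtemp()
    for class_name in ("0", "1"):
        os.makedirs(os.path.join(root, class_name))
        for index in range(2):
            open(os.path.join(root, class_name, "img%d.jpg" % index), "w").close()
    return root

demo_root = make_demo_dataset()
demo_paths, demo_labels = list_labeled_files(demo_root)
```

Shuffling the pairs rather than two separate lists is what guarantees that every image keeps its label.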

2. Generate a batch for training

import tensorflow as tf  # TensorFlow 1.x queue-based input API

def get_batches(image, label, resize_w, resize_h, batch_size, capacity):
    # convert the lists of image paths and labels to tensors
    image = tf.cast(image, tf.string)
    label = tf.cast(label, tf.int64)
    queue = tf.train.slice_input_producer([image, label])
    label = queue[1]
    image_c = tf.read_file(queue[0])
    image = tf.image.decode_jpeg(image_c, channels=3)
    # crop or pad to the target size (the arguments are height, then width)
    image = tf.image.resize_image_with_crop_or_pad(image, resize_h, resize_w)
    # (x - mean) / adjusted_stddev
    image = tf.image.per_image_standardization(image)

    image_batch, label_batch = tf.train.batch([image, label],
                                              batch_size=batch_size,
                                              num_threads=64,
                                              capacity=capacity)
    images_batch = tf.cast(image_batch, tf.float32)
    labels_batch = tf.reshape(label_batch, [batch_size])
    return images_batch, labels_batch

First, tf.cast converts the lists to TensorFlow tensors, and tf.train.slice_input_producer builds an input queue from them.

The label needs no further processing. The image entry only holds a file path, so the file must be read and decoded into an image before it can be used for training.

A CNN needs input images of a fixed size, so tf.image.resize_image_with_crop_or_pad crops or pads every image to the requested size. tf.image.per_image_standardization then standardizes each image by subtracting that image's own mean and dividing by its adjusted standard deviation, which makes training easier.
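The standardization formula (x - mean) / adjusted_stddev can be sketched in plain Python on a flat list of pixel values. The function name standardize and the sample values are illustrative assumptions. The adjusted standard deviation is max(stddev, 1/sqrt(N)), where N is the number of values, which avoids dividing by zero on uniform images.

```python
import math

def standardize(pixels):
    """Scale a flat list of pixel values to zero mean and (roughly) unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # the lower bound guards against division by zero on uniform images
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]

demo = standardize([0.0, 64.0, 128.0, 255.0])   # an ordinary image patch
flat = standardize([7.0, 7.0, 7.0, 7.0])        # a uniform patch maps to all zeros
```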

Next, the tf.train.batch function groups the samples into training batches.

Finally, the image batch is cast to float32 and the label batch is reshaped to [batch_size], which gives the batches used for training.
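The idea behind the input queue and tf.train.batch can be approximated without TensorFlow: cycle through the samples in a fresh random order on every pass and hand out fixed-size groups. This is only a conceptual sketch. The name batch_generator and its seed parameter are assumptions, and it does not reproduce the threading or the capacity limit of the real queue.

```python
import random

def batch_generator(images, labels, batch_size, seed=0):
    """Yield (image_batch, label_batch) forever, reshuffling on every pass."""
    if not images:
        raise ValueError("no samples to batch")
    rng = random.Random(seed)
    indices = list(range(len(images)))
    pending = []
    while True:
        rng.shuffle(indices)
        pending.extend(indices)
        while len(pending) >= batch_size:
            chosen, pending = pending[:batch_size], pending[batch_size:]
            yield [images[i] for i in chosen], [labels[i] for i in chosen]

images = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]   # hypothetical file names
labels = [0, 1, 0, 1, 0]
gen = batch_generator(images, labels, batch_size=2)
first_batches = [next(gen) for _ in range(4)]
```

Every batch has exactly batch_size entries, which is why the label batch can later be reshaped to [batch_size].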

3. Define the Training Model

(1) Definition and initialization of training parameters

def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# init weights
weights = {
    "w1": init_weights([3, 3, 3, 16]),
    "w2": init_weights([3, 3, 16, 128]),
    "w3": init_weights([3, 3, 128, 256]),
    "w4": init_weights([4096, 4096]),
    "wo": init_weights([4096, 2])
}

# init biases
biases = {
    "b1": init_weights([16]),
    "b2": init_weights([128]),
    "b3": init_weights([256]),
    "b4": init_weights([4096]),
    "bo": init_weights([2])
}

Each layer of a CNN computes y = wx + b. The convolution layers produce feature maps, which are then used as x in the next layer. Each layer therefore needs its own weights and biases, and these must be defined and initialized first. The shapes of the weights are explained later.

(2) Define operations at different layers

def conv2d(x, w, b):
    x = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def pooling(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

def norm(x, lsize=4):
    return tf.nn.lrn(x, depth_radius=lsize, bias=1, alpha=0.001 / 9.0, beta=0.75)

Only three kinds of layers are defined: the convolution layer, the pooling layer, and the local response normalization (LRN) layer.
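What the pooling layer does to a single channel can be shown with a small pure-Python sketch of 2x2 max pooling with stride 2. The function name max_pool_2x2 and the sample values are assumptions. Windows that run past the edge are simply clipped, which this sketch assumes is equivalent to "SAME" padding for max pooling.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling, stride 2, on a 2-D list; edge windows are clipped."""
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for top in range(0, height, 2):
        row = []
        for left in range(0, width, 2):
            window = [
                feature_map[r][c]
                for r in range(top, min(top + 2, height))
                for c in range(left, min(left + 2, width))
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

demo_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(demo_map)  # [[4, 2], [2, 8]]
```

Each pooling step halves the height and width, so three of them reduce a 32x32 input to 4x4.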

(3) Define the training model

def mmodel(images):
    l1 = conv2d(images, weights["w1"], biases["b1"])
    l2 = pooling(l1)
    l2 = norm(l2)
    l3 = conv2d(l2, weights["w2"], biases["b2"])
    l4 = pooling(l3)
    l4 = norm(l4)
    l5 = conv2d(l4, weights["w3"], biases["b3"])
    l6 = pooling(l5)
    # flatten; the first dimension comes out as the batch size
    l6 = tf.reshape(l6, [-1, weights["w4"].get_shape().as_list()[0]])
    l7 = tf.nn.relu(tf.matmul(l6, weights["w4"]) + biases["b4"])
    soft_max = tf.add(tf.matmul(l7, weights["wo"]), biases["bo"])
    return soft_max

The model is relatively simple: three convolution layers followed by a fully connected layer. Before the fully connected layer the feature maps must be flattened, so l6 is reshaped to [-1, first dimension of "w4"]. For the -1 dimension to come out as batch_size, the first dimension of "w4" must equal the number of features per image after the last pooling layer. Multiplying by "wo" then gives a final output of shape [batch_size, class_num].
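The value 4096 in the shape of "w4" can be checked with a little arithmetic, assuming the 32x32 input used in the training step. With "SAME" padding the output size is ceil(input / stride), so a stride-1 convolution keeps the size and a stride-2 pooling halves it. The helper names below are illustrative.

```python
import math

def same_output_size(size, stride):
    """Spatial size after a "SAME"-padded convolution or pooling."""
    return math.ceil(size / stride)

def flattened_features(height, width, conv_channels):
    """Features per image after conv (stride 1) + pool (stride 2) for each entry."""
    for _ in conv_channels:
        height = same_output_size(same_output_size(height, 1), 2)
        width = same_output_size(same_output_size(width, 1), 2)
    return height * width * conv_channels[-1]

# 32 -> 16 -> 8 -> 4 in each direction, with 256 channels at the end
features = flattened_features(32, 32, [16, 128, 256])  # 4 * 4 * 256 = 4096
```

With a different input size the first dimension of "w4" would have to change accordingly.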

(4) Define the evaluation metrics

def loss(logits, label_batches):
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label_batches)
    cost = tf.reduce_mean(cross_entropy)
    return cost

The loss function is defined first, because training works by minimizing the loss.

def get_accuracy(logits, labels):
    acc = tf.nn.in_top_k(logits, labels, 1)
    acc = tf.cast(acc, tf.float32)
    acc = tf.reduce_mean(acc)
    return acc

The second function evaluates the classification accuracy. During training the loss should decrease and the accuracy should increase; when both level off, training has converged.
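Both metrics can be worked through by hand on a tiny batch. The sketch below is a plain-Python approximation, not the TensorFlow implementation: the function names and the sample logits are assumptions, and ties between equal logits are not treated the way tf.nn.in_top_k treats them.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """Cross entropy of one sample: -log(softmax(logits)[label])."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]

def mean_loss(batch_logits, labels):
    """Average the per-sample cross entropy over the batch."""
    losses = [sparse_softmax_cross_entropy(l, y) for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

def top1_accuracy(batch_logits, labels):
    """Fraction of samples whose largest logit sits at the true label."""
    hits = [max(range(len(l)), key=l.__getitem__) == y for l, y in zip(batch_logits, labels)]
    return sum(hits) / len(hits)

batch_logits = [[2.0, 0.0], [0.5, 1.5], [-1.0, 3.0]]
labels = [0, 1, 0]  # the third sample is misclassified
batch_loss = mean_loss(batch_logits, labels)
batch_acc = top1_accuracy(batch_logits, labels)  # 2 of 3 correct
```

The misclassified third sample dominates the loss, which is the signal the optimizer uses to correct it.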

(5) Define the training method

def training(loss, lr):
    # the second argument (0.9) is the decay of the running average
    train_op = tf.train.RMSPropOptimizer(lr, 0.9).minimize(loss)
    return train_op

TensorFlow provides many optimizers, all documented on the official website. Their constructors take different parameters, so each one must be configured accordingly; otherwise an error may be raised.
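As a rough picture of what the RMSProp optimizer does with its learning rate and decay, here is a minimal scalar sketch without momentum. The toy objective f(w) = w^2, the starting values, and the function name rmsprop_step are assumptions for illustration; the running average here starts at zero, which may differ from the library's initialization.

```python
import math

def rmsprop_step(weight, grad, mean_square, lr=0.001, decay=0.9, epsilon=1e-10):
    """One RMSProp update without momentum; returns (new_weight, new_mean_square)."""
    mean_square = decay * mean_square + (1.0 - decay) * grad * grad
    weight = weight - lr * grad / math.sqrt(mean_square + epsilon)
    return weight, mean_square

# Minimize f(w) = w^2, whose gradient is 2w (a toy problem chosen for this sketch).
w, ms = 1.0, 0.0
for _ in range(500):
    w, ms = rmsprop_step(w, 2.0 * w, ms, lr=0.01)
```

Dividing by the root of the running average makes the step size roughly independent of the gradient's scale, which is why the learning rate and decay are the parameters that matter here.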

4. Training

import numpy as np
import tensorflow as tf
import inputData  # module holding get_files and get_batches
import model      # module holding mmodel, loss, get_accuracy and training

def run_training():
    data_dir = 'C:/Users/wk/Desktop/bky/dataSet/'
    image, label = inputData.get_files(data_dir)
    image_batches, label_batches = inputData.get_batches(image, label, 32, 32, 16, 20)
    p = model.mmodel(image_batches)
    cost = model.loss(p, label_batches)
    train_op = model.training(cost, 0.001)
    acc = model.get_accuracy(p, label_batches)

    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)

    # start the threads that fill the input queue
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
        for step in np.arange(1000):
            print(step)
            if coord.should_stop():
                break
            _, train_acc, train_loss = sess.run([train_op, acc, cost])
            print("loss:{} accuracy:{}".format(train_loss, train_acc))
    except tf.errors.OutOfRangeError:
        print("Done!!!")
    finally:
        coord.request_stop()
    coord.join(threads)
    sess.close()

During neural network training, we need to save the model so that we can continue training later or use the trained model for testing. Therefore, we need to create a saver to save the model.

import os

def run_training():
    data_dir = 'C:/Users/wk/Desktop/bky/dataSet/'
    log_dir = 'C:/Users/wk/Desktop/bky/log/'
    image, label = inputData.get_files(data_dir)
    image_batches, label_batches = inputData.get_batches(image, label, 32, 32, 16, 20)
    print(image_batches.shape)
    p = model.mmodel(image_batches, 16)
    cost = model.loss(p, label_batches)
    train_op = model.training(cost, 0.001)
    acc = model.get_accuracy(p, label_batches)

    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    saver = tf.train.Saver()
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
        for step in np.arange(1000):
            print(step)
            if coord.should_stop():
                break
            _, train_acc, train_loss = sess.run([train_op, acc, cost])
            print("loss:{} accuracy:{}".format(train_loss, train_acc))
            # save a checkpoint every 100 steps
            if step % 100 == 0:
                check = os.path.join(log_dir, "model.ckpt")
                saver.save(sess, check, global_step=step)
    except tf.errors.OutOfRangeError:
        print("Done!!!")
    finally:
        coord.request_stop()
    coord.join(threads)
    sess.close()

The trained model information is recorded in the checkpoint file, which is roughly as follows:

model_checkpoint_path: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-0"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"

Other files are generated alongside it to store the model parameters and related information. During later testing, the program reads the checkpoint file to locate and load these actual data files.
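To make the role of the checkpoint file concrete, the sketch below parses text in that format and extracts the step number from the latest path. In practice tf.train.get_checkpoint_state does this lookup. The function names are hypothetical, and the key names model_checkpoint_path and all_model_checkpoint_paths are assumed to be written in lowercase.

```python
def parse_checkpoint_state(text):
    """Parse lines of the form key: "value" into the latest path and all paths."""
    latest = None
    all_paths = []
    for line in text.splitlines():
        # split at the first colon only, since Windows paths contain one too
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip().strip('"')
        if key.strip() == "model_checkpoint_path":
            latest = value
        elif key.strip() == "all_model_checkpoint_paths":
            all_paths.append(value)
    return latest, all_paths

def global_step_of(path):
    """Extract the step suffix that saving with a global step appends to the name."""
    return int(path.split("/")[-1].split("-")[-1])

sample = '''model_checkpoint_path: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-0"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"
'''
latest, all_paths = parse_checkpoint_state(sample)
latest_step = global_step_of(latest)
```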

After the network is built and trained, testing it directly with the code above fails with a shape error, roughly because the input of the convolution layer does not match the shape of the image. The cause is that the code above defines the weights and biases outside the model function, so a ValueError is raised when the model is called.

The parameters therefore need to be defined inside the model, so that the trained parameters actually initialize the model when they are loaded. The rewritten model function is as follows:

def mmodel(images, batch_size):
    with tf.variable_scope('conv1') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 3, 16],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[16],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(images, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(pre_activation, name=scope.name)

    with tf.variable_scope('pooling1_lrn') as scope:
        pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                               padding='SAME', name='pooling1')
        norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm1')

    with tf.variable_scope('conv2') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 16, 128],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[128],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(norm1, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name='conv2')

    with tf.variable_scope('pooling2_lrn') as scope:
        norm2 = tf.nn.lrn(conv2, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm2')
        pool2 = tf.nn.max_pool(norm2, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1],
                               padding='SAME', name='pooling2')

    with tf.variable_scope('local3') as scope:
        # flatten each image; the feature count is read from the static shape
        reshape = tf.reshape(pool2, shape=[batch_size, -1])
        dim = reshape.get_shape()[1].value
        weights = tf.get_variable('weights',
                                  shape=[dim, 4096],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[4096],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)

    with tf.variable_scope('softmax_linear') as scope:
        weights = tf.get_variable('softmax_linear',
                                  shape=[4096, 2],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[2],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        softmax_linear = tf.add(tf.matmul(local3, weights), biases, name='softmax_linear')

    return softmax_linear

Test the trained model

First, obtain a test image.

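from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

def get_one_image(img_dir):
    image = Image.open(img_dir)
    plt.imshow(image)
    # resize to the 32x32 input size used in training
    image = image.resize([32, 32])
    image_arr = np.array(image)
    return image_arr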

Load the model and compute the test result

def test(test_file):
    log_dir = 'C:/Users/wk/Desktop/bky/log/'
    image_arr = get_one_image(test_file)

    with tf.Graph().as_default():
        # the placeholder is the graph input; the image is fed in at run time
        x = tf.placeholder(tf.float32, shape=[32, 32, 3])
        image = tf.image.per_image_standardization(x)
        image = tf.reshape(image, [1, 32, 32, 3])
        print(image.shape)
        p = model.mmodel(image, 1)
        logits = tf.nn.softmax(p)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            ckpt = tf.train.get_checkpoint_state(log_dir)
            if ckpt and ckpt.model_checkpoint_path:
                global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
                saver.restore(sess, ckpt.model_checkpoint_path)
                print('Loading success')
            else:
                print('No checkpoint')
                return  # nothing to restore, so the variables are uninitialized
            prediction = sess.run(logits, feed_dict={x: image_arr})
            max_index = np.argmax(prediction)
            print(max_index)

The first part of the function standardizes the test image so that it can be used as the network input. The part inside the session loads the model from the checkpoint and then feeds the image into the model.

That is all for this article. I hope it helps with your learning.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
