TensorFlow TFRecords file generation and reading methods



TensorFlow provides the TFRecords format to store data in a unified manner. Theoretically, TFRecords can store any form of data.

Data in a TFRecords file is stored as tf.train.Example protocol buffers. The following definition shows the structure of tf.train.Example.

    message Example {
      Features features = 1;
    };

    message Features {
      map<string, Feature> feature = 1;
    };

    message Feature {
      oneof kind {
        BytesList bytes_list = 1;
        FloatList float_list = 2;
        Int64List int64_list = 3;
      }
    };
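
To show how this definition maps onto the Python API, the sketch below builds one tf.train.Example that holds all three list types, serializes it, and parses it back. The feature names 'label', 'score' and 'raw' are made up for this illustration and do not come from the original code.

```python
import tensorflow as tf

# Hypothetical feature names, chosen only for this illustration.
example = tf.train.Example(features=tf.train.Features(feature={
    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    'score': tf.train.Feature(float_list=tf.train.FloatList(value=[0.5])),
    'raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'abc'])),
}))

# An Example is an ordinary protocol buffer message, so it can be
# serialized to bytes and parsed back without a session or a file.
serialized = example.SerializeToString()
restored = tf.train.Example.FromString(serialized)

label = restored.features.feature['label'].int64_list.value[0]
score = restored.features.feature['score'].float_list.value[0]
raw = restored.features.feature['raw'].bytes_list.value[0]
print(label, score, raw)
```

Each Feature holds exactly one of the three list types, which is why the value is always passed as a list, even for a single label.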

The following describes how to generate and read TFRecords files.

First, a TFRecords file can be generated with the following code:

    from random import shuffle
    import glob
    import os
    import sys

    # The CPU build of TensorFlow prints warnings at run time; this hides them.
    # It is set before TensorFlow is imported so that it is sure to take effect.
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

    import cv2
    import numpy as np
    import tensorflow as tf

    shuffle_data = True
    image_path = '/path/to/image/*.jpg'

    # Get the paths of all images that match the pattern; type(addrs) == list
    addrs = glob.glob(image_path)

    # How the labels are obtained depends on your data; type(labels) == list
    labels = ...

    # Shuffle the order of the data
    if shuffle_data:
        c = list(zip(addrs, labels))
        shuffle(c)
        addrs, labels = zip(*c)

    # Split the dataset: 70% training, 20% validation, 10% test
    train_addrs = addrs[0:int(0.7 * len(addrs))]
    train_labels = labels[0:int(0.7 * len(labels))]
    val_addrs = addrs[int(0.7 * len(addrs)):int(0.9 * len(addrs))]
    val_labels = labels[int(0.7 * len(labels)):int(0.9 * len(labels))]
    test_addrs = addrs[int(0.9 * len(addrs)):]
    test_labels = labels[int(0.9 * len(labels)):]


    # The code above only collected the image paths;
    # this function loads the image stored at a given path.
    def load_image(addr):
        img = cv2.imread(addr)
        img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        # Dividing by 255 normalizes the pixel values to [0, 1]
        img = img / 255.
        img = img.astype(np.float32)
        return img


    # Helpers that wrap a value in the matching Feature type
    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


    def _float_feature(value):
        return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


    # Write the data to the TFRecords file
    train_filename = '/path/to/train.tfrecords'  # output file path

    # Create a writer for the TFRecords file
    # (TensorFlow 1.x API; it is tf.io.TFRecordWriter in TensorFlow 2.x)
    writer = tf.python_io.TFRecordWriter(train_filename)
    for i in range(len(train_addrs)):
        # Print progress every 1000 images
        if not i % 1000:
            print('train data: {}/{}'.format(i, len(train_addrs)))
            sys.stdout.flush()
        # Load the image
        img = load_image(train_addrs[i])
        label = train_labels[i]
        # Create the features; tobytes() replaces the deprecated tostring()
        feature = {'train/label': _int64_feature(label),
                   'train/image': _bytes_feature(tf.compat.as_bytes(img.tobytes()))}
        # Create an Example protocol buffer
        example = tf.train.Example(features=tf.train.Features(feature=feature))
        # Serialize the Example and write it to the file
        writer.write(example.SerializeToString())

    writer.close()
    sys.stdout.flush()
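
Because the image is stored as a raw byte string, the file records neither its dtype nor its shape, and the reader has to supply both. The sketch below shows that round trip with NumPy only; the random array is a stand-in for the output of load_image, not real image data.

```python
import numpy as np

# Stand-in for the output of load_image: a 224x224 RGB float32 array.
img = np.random.rand(224, 224, 3).astype(np.float32)

# Writing side: flatten the array into a raw byte string.
raw = img.tobytes()

# Reading side: the dtype and shape must be supplied again by hand,
# because the byte string carries neither.
restored = np.frombuffer(raw, dtype=np.float32).reshape(224, 224, 3)

print(len(raw))  # 224 * 224 * 3 values, 4 bytes each
print(np.array_equal(img, restored))
```

Decoding with a different dtype than the one used for writing would not raise an error by itself; it would silently yield a different number of meaningless values, so the two sides need to be kept in sync.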

The code above only generates the train.tfrecords file; the validation and test files are generated in the same way.
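
Since the three subsets are handled identically, the slicing used for the split can be factored into one helper. This is only a sketch: split_dataset and its parameter names are hypothetical and not part of the original code, and the file names are placeholders.

```python
def split_dataset(items, train_end=0.7, val_end=0.9):
    """Split a sequence into train, validation and test parts.

    Items before train_end (as a fraction of the length) go to training,
    items between train_end and val_end to validation, the rest to testing.
    """
    n = len(items)
    a = int(train_end * n)
    b = int(val_end * n)
    return items[:a], items[a:b], items[b:]


# Placeholder file names, used only to demonstrate the helper.
addrs = ['img_{}.jpg'.format(i) for i in range(10)]
train, val, test = split_dataset(addrs)
print(len(train), len(val), len(test))
```

Applying the same helper to the label list keeps paths and labels aligned, provided both lists were shuffled together beforehand as in the code above.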

Next, the TFRecords file can be read back as follows:

    import os

    # Hide TensorFlow warnings; set before TensorFlow is imported
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf

    data_path = 'train.tfrecords'  # path of the TFRecords file

    with tf.Session() as sess:
        # Define the features first; they must match the ones used for writing
        feature = {'train/image': tf.FixedLenFeature([], tf.string),
                   'train/label': tf.FixedLenFeature([], tf.int64)}
        # Create a queue that maintains the list of input files
        filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
        # Define a reader and read the next record
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        # Parse a single record
        features = tf.parse_single_example(serialized_example, features=feature)
        # Decode the byte string into the pixel array of the image
        image = tf.decode_raw(features['train/image'], tf.float32)
        # Cast the label to int32
        label = tf.cast(features['train/label'], tf.int32)
        # Restore the image to its original shape
        image = tf.reshape(image, [224, 224, 3])
        # Other preprocessing operations can be performed here
        # ...
        # Create randomly shuffled batches
        images, labels = tf.train.shuffle_batch(
            [image, label], batch_size=10, capacity=30, min_after_dequeue=10)
        # Initialize global and local variables
        init_op = tf.group(tf.global_variables_initializer(),
                           tf.local_variables_initializer())
        sess.run(init_op)
        # Start the threads that fill the input queues
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        # ...
        # Stop the threads; the with block closes the session on exit
        coord.request_stop()
        coord.join(threads)
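
The reading code above relies on the queue-based TensorFlow 1.x API. In TensorFlow 2.x, where eager execution is the default, such files are usually read through tf.data instead. The following is a minimal sketch under that assumption: it first writes a tiny file so that it is self-contained, and the temporary path, the 4x4 image size and the parse function name are placeholders chosen for this example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Write a tiny file first so the sketch is self-contained.
# The path and the 4x4 image size are placeholders for this example.
path = os.path.join(tempfile.mkdtemp(), 'demo.tfrecords')
with tf.io.TFRecordWriter(path) as writer:
    for i in range(4):
        img = np.full((4, 4, 3), i, dtype=np.float32)
        feature = {
            'train/label': tf.train.Feature(
                int64_list=tf.train.Int64List(value=[i])),
            'train/image': tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[img.tobytes()])),
        }
        example = tf.train.Example(
            features=tf.train.Features(feature=feature))
        writer.write(example.SerializeToString())


def parse(serialized_example):
    # The feature spec must match the keys and types used when writing.
    spec = {
        'train/image': tf.io.FixedLenFeature([], tf.string),
        'train/label': tf.io.FixedLenFeature([], tf.int64),
    }
    features = tf.io.parse_single_example(serialized_example, spec)
    image = tf.reshape(
        tf.io.decode_raw(features['train/image'], tf.float32), [4, 4, 3])
    label = tf.cast(features['train/label'], tf.int32)
    return image, label


# Records are read in file order here; no shuffling is applied.
dataset = tf.data.TFRecordDataset(path).map(parse).batch(2)
batches = [(images.numpy(), labels.numpy()) for images, labels in dataset]
print(len(batches), batches[0][0].shape, batches[0][1])
```

Adding a dataset.shuffle(buffer_size) step before batch would play roughly the role that shuffle_batch has in the queue-based version.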

That concludes this introduction. If you have any questions, please leave a message. I hope it is helpful for your learning.
