[TFRecord format data] Using TFRecords to store and read labeled images

Source: Internet
Author: User
This is an original article; please credit the source when reposting. If you find it useful, you are welcome to discuss it and learn together.

A TFRecords file is a binary file. It is harder to read than other formats, but it makes better use of memory, is easier to copy and move, and does not need a separate label file.

A TFRecords file contains tf.train.Example protocol buffers (each of which contains a Features field). You can write a piece of code that takes your data, fills it into an Example protocol buffer, serializes the protocol buffer to a string, and writes it to a TFRecords file with tf.python_io.TFRecordWriter.
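To make "binary file" a little more concrete, the sketch below imitates how serialized records are laid out on disk. It assumes the record framing described in TensorFlow's documentation of the TFRecord format: an 8-byte little-endian length, a 4-byte masked CRC-32C of the length, the payload, and a 4-byte masked CRC-32C of the payload. The function names (crc32c, masked_crc, write_record, read_records) are hypothetical, and the constants should be checked against the official documentation before relying on them. In real code, tf.python_io.TFRecordWriter does this work for you.

```python
import io
import struct


def _build_crc32c_table():
    """Precompute the byte-wise lookup table for CRC-32C (Castagnoli)."""
    poly = 0x82F63B78  # reflected form of the Castagnoli polynomial
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _build_crc32c_table()


def crc32c(data):
    """Return the CRC-32C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """Rotate the checksum right by 15 bits and add a constant (assumed mask)."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def write_record(stream, payload):
    """Append one framed record (for example a serialized Example) to a stream."""
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def read_records(stream):
    """Yield the payload of every record in the stream, verifying checksums."""
    while True:
        header = stream.read(8)
        if not header:
            return
        (length,) = struct.unpack("<Q", header)
        (length_crc,) = struct.unpack("<I", stream.read(4))
        if length_crc != masked_crc(header):
            raise ValueError("corrupted record length")
        payload = stream.read(length)
        (payload_crc,) = struct.unpack("<I", stream.read(4))
        if payload_crc != masked_crc(payload):
            raise ValueError("corrupted record payload")
        yield payload


# Two records stored back to back in one in-memory "file".
buffer = io.BytesIO()
write_record(buffer, b"first")
write_record(buffer, b"second example")
buffer.seek(0)
records = list(read_records(buffer))
```

Each record costs 16 bytes of framing on top of its payload, and because every record carries its own length, one file can hold many serialized examples one after another.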

To read data from a TFRecords file, use tf.TFRecordReader together with the tf.parse_single_example parser. This operation parses an Example protocol buffer into tensors.

We use tf.train.Example to define the format of the data we want to fill in, and then use tf.python_io.TFRecordWriter to write it.

Basically, an Example contains a Features message, and Features holds a dictionary that maps names to Feature values (Feature without the "s"). Finally, each Feature contains a FloatList, a BytesList, or an Int64List.
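The nesting described above can be pictured with plain Python dictionaries. This is only a model of the structure, not the TensorFlow API: the helper names make_feature and make_example are hypothetical, and the dictionary keys simply mirror the field names of the Example, Features and Feature messages.

```python
def make_feature(value):
    """Wrap one Python value in the list kind that matches its type."""
    if isinstance(value, bytes):
        return {"bytes_list": {"value": [value]}}
    if isinstance(value, int):
        return {"int64_list": {"value": [value]}}
    if isinstance(value, float):
        return {"float_list": {"value": [value]}}
    raise TypeError("unsupported feature type: " + type(value).__name__)


def make_example(fields):
    """Build Example -> Features -> {name: Feature} from a plain dict."""
    return {
        "features": {
            "feature": {name: make_feature(value) for name, value in fields.items()}
        }
    }


# A label and some stand-in image bytes, both stored as bytes features.
example = make_example({"label": b"\x01", "image": b"\x00\x7f\xff"})
label_values = example["features"]["feature"]["label"]["bytes_list"]["value"]
```

Note that each Feature holds exactly one of the three list kinds, and that every list can carry more than one value.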

Sample code:

    # Reuse the image from earlier and give it a fake label.
    import tensorflow as tf

    image_filename = "./images/chapter-05-object-recognition-and-classification/working-with-images/test-input-image.jpg"

    # Build a queue of file names.
    filename_queue = tf.train.string_input_producer(
        tf.train.match_filenames_once(image_filename))

    image_reader = tf.WholeFileReader()
    # The reader returns a key-value pair; the value is the content of the image file.
    _, image_file = image_reader.read(filename_queue)
    # Decode the file content into an image with tf.image.decode_jpeg.
    image = tf.image.decode_jpeg(image_file)

    sess = tf.Session()
    init_op = tf.group(tf.global_variables_initializer(),
                       tf.local_variables_initializer())
    sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)

    print('The image is:', sess.run(image))
    filename_queue.close(cancel_pending_enqueues=True)
    coord.request_stop()
    coord.join(threads)

    # Assume the label data is in a one-hot representation (00000001),
    # stored here as the single 8-bit byte '\x01'.
    image_label = b'\x01'

    # Convert the tensor into bytes; notice that this loads the entire image file.
    image_loaded = sess.run(image)
    image_bytes = image_loaded.tobytes()
    image_height, image_width, image_channels = image_loaded.shape

    # Export the TFRecord.
    writer = tf.python_io.TFRecordWriter("./output/training-image.tfrecord")

    # Don't store the width, height or image channels in this Example file:
    # leaving them out saves space, and storing them is not required.
    example = tf.train.Example(features=tf.train.Features(feature={
        'label': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_label])),
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
    }))

    # Serialize the example to a string and write it to the TFRecord file.
    writer.write(example.SerializeToString())
    writer.close()

    # The stored values can be read back from the example like this:
    # image = example.features.feature['image'].bytes_list.value
    # label = example.features.feature['label'].bytes_list.value
The label format used here is called one-hot encoding, a common representation of labeled data for multi-class classification.
The Stanford Dogs Dataset is treated as multi-class data because each dog is classified as a single breed rather than a mix of breeds. In the real world, a multi-label solution is often more effective for predicting dog breeds, because it can match dogs that belong to several breeds.
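The difference between a one-hot (multi-class) label and a multi-label encoding can be sketched in a few lines of plain Python. The helper names one_hot, multi_hot and pack_bits are hypothetical, and the eight-class setup is only an assumption chosen to match the single-byte label 00000001 used in the sample code.

```python
def one_hot(index, num_classes):
    """Return a vector with a single 1 at the position of the class."""
    if not 0 <= index < num_classes:
        raise ValueError("class index out of range")
    return [1 if i == index else 0 for i in range(num_classes)]


def multi_hot(indices, num_classes):
    """Return a vector with a 1 for every class the sample belongs to."""
    chosen = set(indices)
    return [1 if i in chosen else 0 for i in range(num_classes)]


def pack_bits(vector):
    """Pack an 8-element 0/1 vector into one byte, first element first."""
    if len(vector) != 8:
        raise ValueError("expected exactly 8 bits")
    value = 0
    for bit in vector:
        value = (value << 1) | bit
    return bytes([value])


single_breed = one_hot(7, 8)          # exactly one class is set
mixed_breed = multi_hot([2, 7], 8)    # two classes are set at once
label_byte = pack_bits(single_breed)  # the byte form of 00000001
```

With one-hot labels a sample can only ever belong to one class, which is why a mixed-breed dog needs the multi-label form instead.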
In this code, the image is loaded into memory and converted to a byte array:

    image_bytes = image_loaded.tobytes()

The image bytes and the label are then placed into the example through the value argument of each feature:

    example = tf.train.Example(features=tf.train.Features(feature={
        'label': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_label])),
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
    }))
The example must be serialized to a binary string with the SerializeToString() method before it is written to disk.
Serialization converts an in-memory object into a format that can be safely transferred to a file. The serialized example above is now saved in a format that can be loaded and deserialized back into the example format. Because the image is saved in a TFRecord file, it can be loaded again from that file, which saves some time compared with loading the image and its label separately.

    # Load the TFRecord file and build a file name queue.
    tf_record_filename_queue = tf.train.string_input_producer(
        ["./output/training-image.tfrecord"])

    # Notice the different record reader: this one is designed to work with
    # TFRecord files, which may have more than one example in them.
    tf_record_reader = tf.TFRecordReader()
    # Read the value from the reader and keep it as tf_record_serialized.
    _, tf_record_serialized = tf_record_reader.read(tf_record_filename_queue)

    # The label and image are stored as bytes but could be stored as int64 or
    # float32 values in a serialized tf.Example protobuf.
    # This call is boilerplate; most code writes it in the same way.
    tf_record_features = tf.parse_single_example(
        tf_record_serialized,
        features={
            'label': tf.FixedLenFeature([], tf.string),
            'image': tf.FixedLenFeature([], tf.string),
        })

    """
    class FixedLenFeature(collections.namedtuple(
        "FixedLenFeature", ["shape", "dtype", "default_value"])):

      Configuration for parsing a fixed-length input feature.

      To treat sparse input as dense, provide a `default_value`; otherwise,
      the parse functions will fail on any examples missing this feature.

      Fields:
        shape: Shape of input data.
        dtype: Data type of input.
        default_value: Value to be used if an example is missing this feature.
          It must be compatible with `dtype` and of the specified `shape`.
    """

    # In actual use, the entries in features are chosen to match the names
    # and data types that the data was originally saved with.

    # Use tf.uint8 because all of the channel information is between 0 and 255.
    # tf.decode_raw() interprets the bytes of a string as a vector of numbers.
    tf_record_image = tf.decode_raw(tf_record_features['image'], tf.uint8)

    # Reshape the image to look like the saved image; this is not required.
    # Use the real values for the height, width and channels of the image,
    # because they are needed to reshape the input.
    tf_record_image = tf.reshape(
        tf_record_image, [image_height, image_width, image_channels])

    tf_record_label = tf.cast(tf_record_features['label'], tf.string)

    sess.close()
    sess = tf.InteractiveSession()
    init_op = tf.group(tf.global_variables_initializer(),
                       tf.local_variables_initializer())
    sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)

    # Check that the original image and the loaded image are the same.
    print("Equal the image before and now", sess.run(tf.equal(image, tf_record_image)))

    """
    First, the file is loaded in the same way as any other file; the main
    difference is that it is read by a TFRecordReader object.
    tf.parse_single_example parses the TFRecord, and the image is then read
    as raw bytes (tf.decode_raw).
    """

    # Output the label of the image.
    print("The label of the image:", sess.run(tf_record_label))
    tf_record_filename_queue.close(cancel_pending_enqueues=True)
    coord.request_stop()
    coord.join(threads)
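The decode-and-reshape step of the listing can be illustrated without TensorFlow. The sketch below uses only plain Python lists and bytes; the helpers flatten, decode_raw_uint8 and reshape are hypothetical stand-ins for image_loaded.tobytes(), tf.decode_raw(..., tf.uint8) and tf.reshape, and the tiny 2 x 2 x 3 "image" is made up for the example.

```python
def flatten(image):
    """Turn a height x width x channels nested list of 0-255 values into bytes."""
    return bytes(value for row in image for pixel in row for value in pixel)


def decode_raw_uint8(raw):
    """Interpret every byte as one unsigned 8-bit number."""
    return list(raw)


def reshape(flat, height, width, channels):
    """Rebuild the nested list; the shape has to be supplied by the caller."""
    if len(flat) != height * width * channels:
        raise ValueError("shape does not match the number of values")
    values = iter(flat)
    return [[[next(values) for _ in range(channels)]
             for _ in range(width)]
            for _ in range(height)]


# A made-up 2 x 2 image with 3 channels.
image = [[[0, 1, 2], [3, 4, 5]],
         [[6, 7, 8], [9, 10, 11]]]

raw = flatten(image)                     # 12 bytes, no shape information
flat = decode_raw_uint8(raw)
restored = reshape(flat, 2, 2, 3)        # matches the original image
other_shape = reshape(flat, 1, 4, 3)     # equally valid for the same bytes
```

The same 12 bytes fit more than one shape, which is why the height, width and channel count must be known when the image is read back, either from the original image as in the listing or from extra features stored in the example.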
Notice

If you want to reuse this code, change the file path arguments of image_filename, tf.python_io.TFRecordWriter, tf.train.string_input_producer, and so on to the location of your own image.

Download address of the test-input-image picture

The full-size picture looks like this; to run the code, please download the small version.

Resources
TensorFlow Practice for Machine intelligence
