TensorFlow and tensorflow

Last Update:2018-02-06 Source: Internet

Author: User


Overview

The newly uploaded mcnn code contains complete data read/write examples; refer to it for details.

The official website describes three ways for TensorFlow to read data:

Feeding: Python code supplies the data at each step of TensorFlow execution.
Reading from files: an input pipeline at the beginning of the TensorFlow graph reads the data from files.
Preloaded data: constants or variables in the TensorFlow graph hold all the data (only suitable when the data volume is small).

If the amount of data is small, you can load it all into memory and then feed it to the network batch by batch for training (tip: this becomes more concise when combined with yield; try it yourself, I won't go into details). If the data set is large, however, this approach uses too much memory. In that case it is best to use the queues provided by TensorFlow, that is, the second method: reading data from files. For some specific formats, such as CSV files, the official website already has documentation. Here I introduce a general and efficient way of reading data that the official website covers only briefly: TFRecords.
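For the small-data case, a minimal sketch of the yield-based batching mentioned in the tip might look like this. The helper name `batch_generator` and the toy data are my own assumptions for illustration; in a real training loop each yielded batch would be passed to `sess.run` through `feed_dict`.

```python
import random


def batch_generator(samples, labels, batch_size, shuffle=True):
    """Yield (sample_batch, label_batch) pairs that cover the data once.

    Hypothetical helper: all data is assumed to be in memory already.
    """
    indices = list(range(len(samples)))
    if shuffle:
        random.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        yield [samples[i] for i in chunk], [labels[i] for i in chunk]


# Toy data standing in for images and class ids.
toy_samples = [[float(i)] * 4 for i in range(10)]
toy_labels = [i % 2 for i in range(10)]

batches = list(batch_generator(toy_samples, toy_labels, batch_size=4))
for x_batch, y_batch in batches:
    print(len(x_batch), y_batch)
```

Because the generator only builds one batch at a time, the training loop stays short, but the full data set still has to fit in memory.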

Too long to read? Go straight to the source code on my GitHub, and remember to add a star.

TFRecords

TFRecords is actually a binary file format. Although it is not as easy to inspect as other formats, it makes better use of memory, is easier to copy and move, and does not need a separate label file (you will see why later). All in all, the format has many advantages, so let's use it.

A TFRecords file contains tf.train.Example protocol buffers (each of which contains a Features field). We can write a piece of code that gets your data, fills it into an Example protocol buffer, serializes the protocol buffer to a string, and writes it to a TFRecords file with tf.python_io.TFRecordWriter.

To read data from a TFRecords file, you can use tf.TFRecordReader together with the tf.parse_single_example parser. This operation parses an Example protocol buffer into tensors.
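To make the "binary file" description more concrete: as far as I know, a TFRecords file is a sequence of length-prefixed records, each laid out as a little-endian uint64 length, a uint32 checksum of the length, the payload bytes, and a uint32 checksum of the payload. The sketch below walks that framing with the standard library only. The function names are hypothetical, the checksums are skipped rather than verified, and the demo bytes use zeroed checksum fields, so they are not a valid file for TensorFlow itself.

```python
import io
import struct


def iter_tfrecord_payloads(stream):
    """Yield the raw payload of each record in a TFRecord-style byte stream.

    Sketch only: the two checksum fields are skipped, not verified.
    """
    while True:
        header = stream.read(12)   # uint64 length + uint32 checksum of the length
        if len(header) < 12:
            return                 # end of stream (or a truncated header)
        (length,) = struct.unpack("<Q", header[:8])
        payload = stream.read(length)
        stream.read(4)             # uint32 checksum of the payload, ignored here
        yield payload


def frame_without_crc(payload):
    """Frame a payload with zeroed checksum fields (demo only)."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


demo_stream = io.BytesIO(frame_without_crc(b"first") + frame_without_crc(b"second record"))
payloads = list(iter_tfrecord_payloads(demo_stream))
print(payloads)
```

In a real file, each payload would be one serialized tf.train.Example rather than a short byte string.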

Next, let's start reading data ~

Generate a TFRecords File

We use tf.train.Example to define the data format we want to fill in, and then use tf.python_io.TFRecordWriter to write the data.

    import os

    import tensorflow as tf
    from PIL import Image

    cwd = os.getcwd()

    '''
    The data directory I load here is laid out as follows:
    0 -- img1.jpg
         img2.jpg
         img3.jpg
         ...
    1 -- img1.jpg
         img2.jpg
         ...
    2 -- ...
    Here 0, 1, 2... are the categories, that is, the classes used below.
    classes is a list I define according to my own data; adapt it
    flexibly to your own data.
    '''
    classes = ["0", "1", "2"]  # placeholder folder names; replace with your own

    writer = tf.python_io.TFRecordWriter("train.tfrecords")
    for index, name in enumerate(classes):
        class_path = os.path.join(cwd, name)
        for img_name in os.listdir(class_path):
            img_path = os.path.join(class_path, img_name)
            # Assumption: force 3 channels so each image stores 224 * 224 * 3 bytes
            img = Image.open(img_path).convert("RGB")
            img = img.resize((224, 224))
            img_raw = img.tobytes()  # convert the image to raw bytes
            example = tf.train.Example(features=tf.train.Features(feature={
                "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
                "img_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw])),
            }))
            writer.write(example.SerializeToString())  # serialize to a string
    writer.close()

For the definition and details of Example and Feature, I recommend checking the related API documentation on the official website.

Basically, an Example contains a Features object, Features contains a dictionary of Feature objects (no s here), and each Feature contains a FloatList, a BytesList, or an Int64List.
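The nesting is easier to see in a tiny round trip. This is only a sketch: the feature names "weight" and the toy values are made up for illustration, and it assumes a TensorFlow installation where tf.train.Example is available.

```python
import tensorflow as tf

# Each Feature wraps exactly one of the three list types.
feature_map = {
    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    "weight": tf.train.Feature(float_list=tf.train.FloatList(value=[0.25])),
    "img_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"\x01\x02\x03"])),
}

# Features (plural) holds the dictionary; Example holds the Features.
example = tf.train.Example(features=tf.train.Features(feature=feature_map))

# Round-trip through the serialized string that would be written to the file.
serialized = example.SerializeToString()
restored = tf.train.Example.FromString(serialized)

print(restored.features.feature["label"].int64_list.value[0])
print(restored.features.feature["img_raw"].bytes_list.value[0])
```

Note that every value is a list, even when it holds a single label, which is why the values are read back with an index.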

In this way, we store all the related information in one file, so we do not need a separate label file, and reading it back is also easy.

The following is a simple example for reading a small amount of data:

    for serialized_example in tf.python_io.tf_record_iterator("train.tfrecords"):
        example = tf.train.Example()
        example.ParseFromString(serialized_example)

        # The keys must match the ones used when writing the file
        image = example.features.feature['img_raw'].bytes_list.value
        label = example.features.feature['label'].int64_list.value
        # print(image, label)

Read from queue

Once the TFRecords file has been generated, TensorFlow uses queues to read the data efficiently.

    def read_and_decode(filename):
        # Generate a filename queue from the list of file names
        filename_queue = tf.train.string_input_producer([filename])

        reader = tf.TFRecordReader()
        # read() returns a key and the serialized record
        _, serialized_example = reader.read(filename_queue)
        features = tf.parse_single_example(
            serialized_example,
            features={
                'label': tf.FixedLenFeature([], tf.int64),
                'img_raw': tf.FixedLenFeature([], tf.string),
            })

        img = tf.decode_raw(features['img_raw'], tf.uint8)
        img = tf.reshape(img, [224, 224, 3])
        img = tf.cast(img, tf.float32) * (1. / 255) - 0.5
        label = tf.cast(features['label'], tf.int32)
        return img, label
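The last few lines of read_and_decode can be checked outside the graph. The NumPy sketch below mirrors the decode, reshape, and rescale steps on random bytes; the random pixels are only a stand-in for img.tobytes(), and NumPy is my choice for illustration, not what the pipeline itself runs.

```python
import numpy as np

HEIGHT, WIDTH, CHANNELS = 224, 224, 3

# Stand-in for img.tobytes(): random uint8 pixels flattened to raw bytes.
rng = np.random.RandomState(0)
img_raw = rng.randint(0, 256, size=(HEIGHT, WIDTH, CHANNELS), dtype=np.uint8).tobytes()

# Raw bytes carry no shape information, so the size must match exactly.
assert len(img_raw) == HEIGHT * WIDTH * CHANNELS

pixels = np.frombuffer(img_raw, dtype=np.uint8)         # like tf.decode_raw
pixels = pixels.reshape(HEIGHT, WIDTH, CHANNELS)        # like tf.reshape
scaled = pixels.astype(np.float32) * (1.0 / 255) - 0.5  # like tf.cast plus rescale

print(scaled.shape, scaled.dtype)
```

This is also why every image is resized to the same size before writing: the reader has to know the shape in advance, because the file stores only a flat run of bytes.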

Then we can use it during training.

    img, label = read_and_decode("train.tfrecords")

    # Use shuffle_batch to shuffle the input randomly
    img_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                    batch_size=30, capacity=2000,
                                                    min_after_dequeue=1000)
    init = tf.initialize_all_variables()

    with tf.Session() as sess:
        sess.run(init)
        threads = tf.train.start_queue_runners(sess=sess)
        for i in range(3):
            val, l = sess.run([img_batch, label_batch])
            # We can also process val and l as needed
            # l = to_categorical(l, 12)
            print(val.shape, l)
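To get a feel for what min_after_dequeue does in shuffle_batch, here is a simplified, single-threaded model in plain Python. It is not TensorFlow's implementation; the function name `shuffle_buffer` and the seed parameter are assumptions for illustration. The idea it shows is that an element may only leave the buffer while at least min_after_dequeue elements remain behind, which guarantees a minimum level of mixing.

```python
import random


def shuffle_buffer(items, min_after_dequeue, seed=None):
    """Yield items in a locally shuffled order.

    An item is dequeued only once the buffer holds more than
    min_after_dequeue items, so at least that many remain afterwards.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) > min_after_dequeue:
            yield buffer.pop(rng.randrange(len(buffer)))
    # Input exhausted: drain whatever is left, still in random order.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


ordered = list(range(20))
mixed = list(shuffle_buffer(ordered, min_after_dequeue=5, seed=0))
print(mixed)
```

A larger min_after_dequeue mixes the data better but needs more memory and a longer warm-up before the first batch, which is why capacity has to be larger than min_after_dequeue.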

At this point, efficient reading of data from files in TensorFlow is almost complete.

Hm? Wait, did we forget something? Right, there are a few things to note:

First, a graph in TensorFlow can hold state, which is what lets TFRecordReader remember its position in the tfrecord file and always return the next record. This requires the whole graph to be initialized before use; here we use the tf.initialize_all_variables() function for that.

Second, queues in TensorFlow behave like ordinary queues, except that operations and tensors in TensorFlow are symbolic and are executed only when sess.run() is called.

Third, TFRecordReader keeps popping file names off the filename queue until the queue is empty.
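The queue behaviour in these notes can be pictured with an ordinary Python queue and a background thread. This is only an analogy under my own assumptions, not how TensorFlow implements its queues: the file names are made up, and the None sentinel stands in for the queue being closed.

```python
import queue
import threading


def producer(file_names, out_queue, num_epochs):
    """Push file names into the queue, roughly like a queue runner thread."""
    for _ in range(num_epochs):
        for name in file_names:
            out_queue.put(name)  # blocks while the queue is full
    out_queue.put(None)          # sentinel: nothing more will arrive


def consume_all(in_queue):
    """Dequeue names until the sentinel shows up, like the reader does."""
    seen = []
    while True:
        name = in_queue.get()    # blocks while the queue is empty
        if name is None:
            return seen
        seen.append(name)


filename_queue = queue.Queue(maxsize=2)  # small capacity forces the producer to wait
worker = threading.Thread(
    target=producer, args=(["a.tfrecords", "b.tfrecords"], filename_queue, 2))
worker.start()                           # comparable to tf.train.start_queue_runners
consumed = consume_all(filename_queue)
worker.join()
print(consumed)
```

If the worker thread were never started, the first get() would block forever, which is the same symptom you see when start_queue_runners is forgotten.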

Summary

Generate a TFRecords file
Define a record reader to parse the TFRecords file
Construct a batch generator (batcher)
Build the other operations
Initialize all operations
Start the QueueRunner

The sample code is on my GitHub. If you find it helpful, please add a star.

That is all for this article. I hope it helps with your learning.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
