Queues in TensorFlow


In the previous article, although the results were correct, an error appeared at the end of the run:

input_producer: Skipping cancelled enqueue attempt with queue not closed

This happens because the main thread has already shut down while the data-reading queue threads are still trying to enqueue. This article is adapted from "Understanding TensorFlow Queues", which covers TF queues in great detail and is well worth reprinting. A few wordings have been changed from the original, for instance "blocking" in place of "stuck".

This article is about the concept and usage of queues in TensorFlow. There are really only three concepts:

  • Queue is TF's implementation of the queue and caching mechanism
  • QueueRunner is TF's encapsulation of the threads that operate on a queue
  • Coordinator is TF's tool for coordinating the running of threads

Although the three often appear together, each can be used on its own in TensorFlow, so it is worth looking at them separately.

1. Queue

Depending on the implementation, queues come in several concrete types, such as:

  • tf.FIFOQueue: dequeues in first-in-first-out order
  • tf.RandomShuffleQueue: dequeues in random order
  • tf.PaddingFIFOQueue: a FIFO queue that pads batches to a fixed shape
  • tf.PriorityQueue: a queue with priorities
  • ...

These queue types differ in behavior, but they are created and used in essentially the same way.

The creation function takes arguments like:

tf.FIFOQueue(capacity, dtypes, shapes=None, names=None, ...)

A queue supports two main operations: enqueue and dequeue. An enqueue call returns an operation node in the computation graph, while a dequeue call returns a tensor. Like any tensor, it is only a definition (or "declaration") when created; it must be run in a session to produce an actual value. The following example uses a queue on its own:

import tensorflow as tf
tf.InteractiveSession()

q = tf.FIFOQueue(2, "float")
init = q.enqueue_many(([0, 0],))

x = q.dequeue()
y = x + 1
q_inc = q.enqueue([y])

init.run()
q_inc.run()
q_inc.run()
q_inc.run()
x.eval()  # returns 1
x.eval()  # returns 2
x.eval()  # blocks

Note that if you enqueue more data at once than the queue's capacity allows, the enqueue operation blocks until another thread removes data from the queue. Likewise, a dequeue on an empty queue blocks until new data is written (by another thread).

2. QueueRunner
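These blocking semantics are the same as those of any bounded queue. As a minimal illustration that does not require TensorFlow, Python's standard-library queue.Queue behaves analogously: put() on a full queue and get() on an empty queue both block until another thread intervenes.

```python
import queue
import threading
import time

q = queue.Queue(maxsize=2)  # bounded queue, analogous to tf.FIFOQueue(2, ...)
q.put(0)
q.put(0)                    # queue is now full

results = []

def consumer():
    # Remove one item after a short delay, unblocking the producer.
    time.sleep(0.1)
    results.append(q.get())

t = threading.Thread(target=consumer)
t.start()

start = time.time()
q.put(1)                    # blocks until the consumer removes an item
elapsed = time.time() - start
t.join()

print(results)              # [0] -- FIFO order, the oldest item came out
print(elapsed >= 0.05)      # True -- the put() had to wait for the consumer
```

The same deadlock risk applies in both worlds: a blocked put() or get() only returns when some other thread makes room or supplies data.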

TensorFlow's computation mainly uses the CPU/GPU and memory, while reading data involves disk I/O, which is much slower. It is therefore common to read data with multiple threads and consume it with a single thread. A QueueRunner manages the threads that write to these queues.
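The many-readers/one-consumer pattern can be sketched with the standard library alone. The sketch below is an analogy, not TensorFlow code; slow_read is a hypothetical stand-in for a slow disk read.

```python
import queue
import threading
import time

buf = queue.Queue(maxsize=8)  # small buffer between readers and the consumer

def slow_read(i):
    # Hypothetical stand-in for slow disk I/O.
    time.sleep(0.01)
    return i * i

def reader(ids):
    # Producer thread: reads records and enqueues them.
    for i in ids:
        buf.put(slow_read(i))

# Four reader threads, each responsible for a disjoint slice of the records.
threads = [threading.Thread(target=reader, args=(range(k, 20, 4),))
           for k in range(4)]
for t in threads:
    t.start()

# One consumer: the "training" thread, which just drains the buffer.
consumed = [buf.get() for _ in range(20)]
for t in threads:
    t.join()

print(sorted(consumed))
```

Because the reads overlap in the four threads, the consumer is fed far faster than a single serial reader could manage; this is exactly the gap QueueRunner is meant to close.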

A QueueRunner must be used together with a queue (the name gives that away), but it does not necessarily require a Coordinator. Consider the following example:

import tensorflow as tf

q = tf.FIFOQueue(10, "float")
counter = tf.Variable(0.0)  # counter
# Op that increments the counter
increment_op = tf.assign_add(counter, 1.0)
# Op that enqueues the counter's value
enqueue_op = q.enqueue(counter)

# Create a QueueRunner
# that adds data to the queue with multiple threads.
# This actually creates 4 threads: two increment the counter, two enqueue.
qr = tf.train.QueueRunner(q, enqueue_ops=[increment_op, enqueue_op] * 2)

# main thread
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
# start the queue threads
qr.create_threads(sess, start=True)
for i in range(20):
    print(sess.run(q.dequeue()))

The counting threads keep running in the background, and the enqueue threads execute 10 times (because the queue capacity is only 10); then the main thread begins consuming data, and once part of it has been consumed, the enqueue threads run again. The main thread stops after consuming 20 values, but the other threads keep running and the program never exits.

3. Coordinator

A Coordinator is a coordinating object that tracks the running state of a group of threads. It is not tied to TensorFlow queues and can be used on its own with plain Python threads. For example:

import tensorflow as tf
import threading, time

# child thread function
def loop(coord, id):
    t = 0
    while not coord.should_stop():
        print(id)
        time.sleep(1)
        t += 1
        # only thread 1 calls the request_stop method
        if (t >= 2 and id == 1):
            coord.request_stop()

# main thread
coord = tf.train.Coordinator()
# create 10 threads using the Python threading API
threads = [threading.Thread(target=loop, args=(coord, i)) for i in range(10)]

# start all threads and wait for them to finish
for t in threads:
    t.start()
coord.join(threads)

When this program runs, all child threads stop after two loop iterations, and the main thread waits until every child thread has stopped, ending the whole program. So as soon as any one thread calls the Coordinator's request_stop method, every thread can detect it through the should_stop method and terminate itself.
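At its core, the should_stop/request_stop pair is a shared stop flag. A stripped-down analogue using threading.Event (illustrative only, not tf.train.Coordinator itself) behaves the same way:

```python
import threading
import time

stop = threading.Event()  # plays the role of the Coordinator
counts = {}

def loop(stop, id):
    t = 0
    while not stop.is_set():       # analogue of coord.should_stop()
        t += 1
        time.sleep(0.01)
        if t >= 2 and id == 1:
            stop.set()             # analogue of coord.request_stop()
    counts[id] = t                 # record how many iterations this thread ran

threads = [threading.Thread(target=loop, args=(stop, i)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:                  # analogue of coord.join(threads)
    t.join()

print(stop.is_set())  # True: every thread observed the stop request and exited
```

One thread sets the flag, all ten observe it on their next loop check, and the main thread's joins return; that is the whole coordination contract.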

Using QueueRunner together with Coordinator essentially wraps this check, so that an unexpected exception in any thread can still end the whole program cleanly, and the main thread can also call the request_stop method directly to stop all child threads.

4. Using them together
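The "any exception ends the whole program cleanly" behavior can be imitated by wrapping each worker so that a failure is recorded and the stop flag is set, with the main thread surfacing the error afterwards. This is a sketch of the idea, not tf.train.Coordinator itself; protected and worker are hypothetical names.

```python
import threading
import time

stop = threading.Event()
errors = []

def protected(fn, *args):
    # Run fn; on failure, record the error and ask all threads to stop.
    try:
        fn(*args)
    except Exception as e:
        errors.append(e)
        stop.set()

def worker(id):
    while not stop.is_set():
        time.sleep(0.01)
        if id == 3:
            # Simulated failure in one worker thread.
            raise RuntimeError("reader %d failed" % id)

threads = [threading.Thread(target=protected, args=(worker, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Main thread: inspect the recorded error, the way coord.join() re-raises it.
print(type(errors[0]).__name__)  # RuntimeError
```

One worker's exception stops all four threads and is then visible to the main thread, rather than dying silently inside a daemon thread.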

In TensorFlow there are two classic patterns for using queues, both built on QueueRunner and Coordinator.

The first is to create the QueueRunner explicitly and then call its create_threads method to start the threads, as in the following code:

import tensorflow as tf
import numpy as np

# 1000 4-dimensional input vectors, each entry a random number centered around 1
data = 10 * np.random.randn(1000, 4) + 1
# 1000 random target values of 0 or 1
target = np.random.randint(0, 2, size=1000)

# Create a queue; each item holds one input vector and its target value
queue = tf.FIFOQueue(capacity=50, dtypes=[tf.float32, tf.int32],
                     shapes=[[4], []])
# Batch enqueue (this is an operation)
enqueue_op = queue.enqueue_many([data, target])
# Dequeue (this defines tensors)
data_sample, label_sample = queue.dequeue()

# Create a QueueRunner containing 4 threads
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

with tf.Session() as sess:
    # Create a Coordinator
    coord = tf.train.Coordinator()
    # Start the threads managed by the QueueRunner
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
    # Main thread: consume 100 items
    for step in range(100):
        if coord.should_stop():
            break
        data_batch, label_batch = sess.run([data_sample, label_sample])
    # Main thread is done computing; stop all data-producing threads
    coord.request_stop()
    coord.join(enqueue_threads)

The second is to start the threads with the global tf.train.start_queue_runners method.

import tensorflow as tf

# Open several files at once: explicitly create the queue,
# while implicitly creating a QueueRunner
filename_queue = tf.train.string_input_producer(["data1.csv", "data2.csv"])
reader = tf.TextLineReader(skip_header_lines=1)
# TensorFlow reader objects accept a queue directly as input
key, value = reader.read(filename_queue)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    # Start all queue threads in the graph
    threads = tf.train.start_queue_runners(coord=coord)
    # Main thread: consume 100 items
    for _ in range(100):
        features, labels = sess.run([key, value])
    # Main thread is done computing; stop all data-producing threads
    coord.request_stop()
    coord.join(threads)

In this example, tf.train.string_input_producer adds an implicit QueueRunner to the global graph (ops such as tf.train.shuffle_batch behave similarly).

Since no QueueRunner is returned explicitly for starting threads with create_threads, the tf.train.start_queue_runners method is used to start all queue threads registered in the tf.GraphKeys.QUEUE_RUNNERS collection.

Both of these methods are equivalent in effect.
