To use TensorFlow, the first step is getting data into the model. The official documentation describes several approaches: 1. read all the data into memory at once and feed it in directly; 2. read the input from files, for which TensorFlow provides a multithreaded read/write model; 3. convert data from the network or from memory into TensorFlow's special TFRecord format, save it to a file, and then read it back.
For reading input from files, the official documentation's examples use CSV-format files. I found some code online and modified it a little; since the original was rather brief, I will add notes on the problems I ran into.
Here is the code:
# coding=utf-8
import tensorflow as tf
import numpy as np

def read_my_file_format(filename_queue):
    reader = tf.TextLineReader()
    key, value = reader.read(filename_queue)
    record_defaults = [[1], [1], [1]]
    col1, col2, col3 = tf.decode_csv(value, record_defaults=record_defaults)
    features = tf.pack([col1, col2])
    label = col3
    return features, label

def input_pipeline(filenames=["1.csv", "2.csv"], batch_size=4, num_epochs=None):
    filename_queue = tf.train.string_input_producer(filenames, num_epochs=num_epochs)
    example, label = read_my_file_format(filename_queue)
    min_after_dequeue = 8
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch(
        [example, label], batch_size=batch_size, num_threads=3,
        capacity=capacity, min_after_dequeue=min_after_dequeue)
    return example_batch, label_batch

feature_batch, label_batch = input_pipeline(["1.csv", "2.csv"], batch_size=4)

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    # Retrieve batches until the queue runs out:
    try:
        # while not coord.should_stop():
        while True:
            example, label = sess.run([feature_batch, label_batch])
            print example
    except tf.errors.OutOfRangeError:
        print 'Done reading'
    finally:
        coord.request_stop()
        coord.join(threads)
        sess.close()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
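An aside on the queue sizes in input_pipeline: the line capacity = min_after_dequeue + 3 * batch_size follows the rule of thumb from the TensorFlow documentation that the queue's capacity should exceed min_after_dequeue by a few batches, so the prefetch threads always have room to enqueue. A minimal sketch of the arithmetic (plain Python; the helper name is my own):

```python
def shuffle_batch_capacity(batch_size, min_after_dequeue, safety_batches=3):
    # Rule of thumb: keep at least min_after_dequeue elements queued
    # (larger values shuffle better) plus a few batches of headroom so
    # the enqueue threads are never starved.
    return min_after_dequeue + safety_batches * batch_size

print(shuffle_batch_capacity(batch_size=4, min_after_dequeue=8))  # 8 + 3*4 = 20
```

With the values used above (batch_size=4, min_after_dequeue=8) this gives a capacity of 20.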
About record_defaults = [[1], [1], [1]]: this specifies both the number of columns and the data type of each. If the matrix in the CSV file is n x m, then record_defaults describes one 1 x m row, and the 1 in each [1] fixes that column's data type. For example, if the matrix contains decimals, the column should be float, and [1] must become [1.0].
In col1, col2, col3 = tf.decode_csv(value, record_defaults=record_defaults), there must be one output variable for every column in the matrix. If there are 5 columns, you have to write out through col5, whether you use them all or not; otherwise it raises an error.
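To make both behaviors of record_defaults concrete, here is a small pure-Python stand-in (my own sketch, not TensorFlow's implementation): each entry fixes a column's type and fills in when a field is empty, and a line with the wrong number of columns is an error:

```python
def decode_csv_line(line, record_defaults):
    """Toy model of tf.decode_csv for a single line.

    Each record_defaults entry is a one-element list: its value is used
    when the CSV field is empty, and its type (int vs. float) decides
    how non-empty fields are parsed.
    """
    fields = line.split(',')
    if len(fields) != len(record_defaults):
        raise ValueError('expected %d columns, got %d'
                         % (len(record_defaults), len(fields)))
    out = []
    for raw, default in zip(fields, record_defaults):
        d = default[0]
        # Empty field -> use the default; otherwise parse as d's type.
        out.append(type(d)(raw) if raw.strip() else d)
    return out

print(decode_csv_line('1,,3', [[1], [1], [1]]))               # [1, 1, 3]
print(decode_csv_line('-0.76,15.67,1', [[1.0], [1.0], [1]]))  # [-0.76, 15.67, 1]
```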
tf.pack([col1, col2]) seems to require col1 and col2 to have the same data type; otherwise it raises an error.
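One cheap way to catch this before building the graph is to check that the record_defaults entries for the columns you intend to pack agree on type, since each decode_csv output takes its dtype from its default. A plain-Python sketch (helper name is made up):

```python
def packable(record_defaults, cols):
    # tf.pack (tf.stack in later TF versions) refuses to stack tensors
    # of different dtypes, and the dtype of each decode_csv output comes
    # from its default, so comparing the defaults' Python types suffices.
    types = {type(record_defaults[i][0]) for i in cols}
    return len(types) == 1

print(packable([[1], [1], [1]], cols=[0, 1]))    # True
print(packable([[1], [1.0], [1]], cols=[0, 1]))  # False: int vs. float
```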
My test data:
-0.76 15.67 -0.12 15.67
-0.48 12.52 -0.06 12.51
1.33 9.11 0.12 9.1
-0.88 20.35 -0.18 20.36
-0.25 3.99 -0.01 3.99
-0.87 26.25 -0.23 26.25
-1.03 2.87 -0.03 2.87
-0.51 7.81 -0.04 7.81
-1.57 14.46 -0.23 14.46
-0.1 10.02 -0.01 10.02
-0.56 8.92 -0.05 8.92
-1.2 4.1 -0.05 4.1
-0.77 5.15 -0.04 5.15
-0.88 4.48 -0.04 4.48
-2.7 10.82 -0.3 10.82
-1.23 2.4 -0.03 2.4
-0.77 5.16 -0.04 5.15
-0.81 6.15 -0.05 6.15
-0.6 5.01 -0.03 5
-1.25 4.75 -0.06 4.75
-2.53 7.31 -0.19 7.3
-1.15 16.39 -0.19 16.39
-1.7 5.19 -0.09 5.18
-0.62 3.23 -0.02 3.22
-0.74 17.43 -0.13 17.41
-0.77 15.41 -0.12 15.41
0 47 0 47.01
0.25 3.98 0.01 3.98
-1.1 9.01 -0.1 9.01
-1.02 3.87 -0.04 3.87
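Note that this data is whitespace-separated with what appear to be four float columns per row, so to feed it through the pipeline above it would need to be saved comma-separated (as 1.csv and 2.csv), with record_defaults changed to four float entries, e.g. [[1.0]] * 4. A quick conversion sketch (my own helper, using the first two rows shown above as an assumed layout):

```python
import csv

# First two rows of the test data above (assumed: 4 float columns per row).
rows = [
    [-0.76, 15.67, -0.12, 15.67],
    [-0.48, 12.52, -0.06, 12.51],
]

def write_rows(path, rows):
    # Write the rows comma-separated so tf.TextLineReader + tf.decode_csv
    # can parse the file line by line.
    with open(path, 'w') as f:
        csv.writer(f).writerows(rows)

write_rows('1.csv', rows)
print(open('1.csv').read())
```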