Study of CIFAR10 in TensorFlow


Today I worked through the CIFAR10 section of the official TensorFlow tutorials and came across some APIs I had not seen before, so I am tidying them up here.
CIFAR10 Tutorial Address

1. First, the initialization of some parameters

FLAGS = tf.app.flags.FLAGS

# Basic model parameters.
tf.app.flags.DEFINE_integer('batch_size', 128,
                            """Number of images to process in a batch.""")
tf.app.flags.DEFINE_string('data_dir', '/tmp/cifar10_data',
                           """Path to the CIFAR-10 data directory.""")
tf.app.flags.DEFINE_boolean('use_fp16', False,
                            """Train the model using fp16.""")

I did not find the relevant documentation in the TensorFlow API reference; after some digging, it turns out that tf.app.flags is mainly used to parse parameters passed on the command line.

import tensorflow as tf

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('param1', 'default', """string""")
tf.app.flags.DEFINE_bool('param2', False, """bool""")

def main(_):
    print(FLAGS.param1)
    print(FLAGS.param2)

if __name__ == "__main__":
    tf.app.run()

If you run this file with python test.py, the output is the default parameters:

default
False

If you run python test.py --param1 test --param2 True, the output is:

test
True

2. Use of collection containers
There is a function in the code, tf.add_to_collection('losses', weight_decay), that I had not seen before.

def _variable_with_weight_decay(name, shape, stddev, wd):
    """Helper to create an initialized variable with weight decay.

    A weight decay is added only if one is specified.

    Args:
      name: name of the variable
      shape: list of ints
      stddev: standard deviation of a truncated Gaussian
      wd: add L2 loss weight decay multiplied by this float. If None, weight
          decay is not added for this variable.

    Returns:
      Variable Tensor
    """
    dtype = tf.float16 if FLAGS.use_fp16 else tf.float32
    var = _variable_on_cpu(
        name,
        shape,
        tf.truncated_normal_initializer(stddev=stddev, dtype=dtype))
    if wd is not None:
        weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')
        tf.add_to_collection('losses', weight_decay)
    return var

tf.add_to_collection(name, value) wraps Graph.add_to_collection() on the default graph; that is, it stores value in the collection whose key is name.

Args:
  name: the key for the collection. For example, the GraphKeys class contains many standard names for collections.
  value: the value to add to the collection.

Afterwards, all the stored values can be retrieved with tf.get_collection(name).

var0 = tf.Variable(tf.constant(1.0))
value = 2 * var0
tf.add_to_collection('value', value)
value = 3 * var0
tf.add_to_collection('value', value)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

test = tf.get_collection('value')
print(sess.run(test))

The output is

[2.0, 3.0]

3. Sliding average
Some training algorithms, such as gradient descent and momentum, can achieve better results when a moving average of the variables is maintained during optimization.
TensorFlow provides tf.train.ExponentialMovingAverage for this. Each update moves the shadow variable toward the variable:

shadow_variable -= (1 - decay) * (shadow_variable - variable)

That is,

shadow_variable = decay * shadow_variable + (1 - decay) * variable

decay is usually chosen close to 1, e.g. 0.999 or 0.9999.

When a model is created with shadow variables, there are two ways to access the moving averages at evaluation time: the average() function returns the shadow variable itself, and the average_name() function returns the name under which the shadow value is stored, which can be used to load it from a checkpoint file.
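To make average() and average_name() concrete, here is a minimal sketch of my own (not from the tutorial; the variable name and decay value are made up):

import tensorflow as tf

v = tf.Variable(0.0, name='v')
ema = tf.train.ExponentialMovingAverage(decay=0.999)
maintain_averages_op = ema.apply([v])  # creates the shadow variable and an update op

shadow_v = ema.average(v)    # the shadow variable itself
print(ema.average_name(v))   # the name the shadow value is checkpointed under, e.g. 'v/ExponentialMovingAverage'

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.assign(v, 1.0))
    sess.run(maintain_averages_op)  # shadow moves toward v by (1 - decay)
    print(sess.run(shadow_v))       # about 0.001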

In the tutorial, the moving averages of the loss values are maintained directly:

def _add_loss_summaries(total_loss):
    """Add summaries for losses in CIFAR-10 model.

    Args:
      total_loss: total loss from loss().

    Returns:
      loss_averages_op: op for generating moving averages of losses.
    """
    # Compute the moving average of all individual losses and the total loss.
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    losses = tf.get_collection('losses')
    loss_averages_op = loss_averages.apply(losses + [total_loss])

    # Attach a scalar summary to all individual losses and the total loss;
    # do the same for the averaged version of the losses.
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    return loss_averages_op

4. Control dependencies
The code also has a function I had not seen, tf.control_dependencies(control_inputs). This function wraps Graph.control_dependencies() on the default graph: the ops in control_inputs are executed first, and only then do the operations created inside the with block run.

with tf.control_dependencies([loss_averages_op]):
    opt = tf.train.GradientDescentOptimizer(lr)
    grads = opt.compute_gradients(total_loss)
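As a standalone illustration (my own sketch, not tutorial code): the assignment below is guaranteed to run before the op created inside the with block, so the printed value is already incremented.

import tensorflow as tf

counter = tf.Variable(0)
increment = tf.assign_add(counter, 1)

# Ops must be *created* inside the context to pick up the dependency.
with tf.control_dependencies([increment]):
    out = tf.identity(counter)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(out))  # 1: the increment ran before out was evaluated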

5. Create Global Step
I used to create a global step variable directly to record the number of training steps:

global_step = tf.Variable(0, trainable=False)

But here a function from tf.contrib is used, which returns the existing global step variable or creates one:

tf.contrib.framework.get_or_create_global_step(graph=None)
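A short sketch of how I understand it being used (the loss here is made up): passing the returned variable to minimize() makes the optimizer increment it once per training step.

import tensorflow as tf

global_step = tf.contrib.framework.get_or_create_global_step()

x = tf.Variable(1.0)
loss = tf.square(x)  # illustrative loss
# global_step is incremented by one each time train_op runs.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
    loss, global_step=global_step)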

6. Create MonitoredSession

MonitoredTrainingSession() is used to create a MonitoredSession, which can be configured with hooks related to checkpoints and summaries. (Hooks are tools that plug into the training/evaluation loop.)
StopAtStepHook: requests a stop after a specific number of steps, i.e. training stops after FLAGS.max_steps steps.
NanTensorHook: monitors the loss and stops training when it becomes NaN.
checkpoint_dir: specifies the path where variables are stored.
config: an instance of tf.ConfigProto used to configure the session, e.g. options about device ("cpu"/"gpu") placement.

with tf.train.MonitoredTrainingSession(
        checkpoint_dir=FLAGS.train_dir,
        hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps),
               tf.train.NanTensorHook(loss),
               _LoggerHook()],
        config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement)) as mon_sess:
    while not mon_sess.should_stop():
        mon_sess.run(train_op)
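_LoggerHook here is a custom hook defined in the tutorial itself. A simplified sketch of such a hook (the logging format and interval are my own; it assumes tf is imported and loss is the loss tensor from the surrounding script) built on tf.train.SessionRunHook might look like this:

class _LoggerHook(tf.train.SessionRunHook):
    """Logs the loss value every 10 steps (simplified sketch)."""

    def begin(self):
        self._step = -1

    def before_run(self, run_context):
        self._step += 1
        # Ask the session to also fetch the loss tensor on this run() call.
        return tf.train.SessionRunArgs(loss)

    def after_run(self, run_context, run_values):
        if self._step % 10 == 0:
            print('step %d, loss = %.2f' % (self._step, run_values.results))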

Because training is slow, I set the number of steps here to max_steps = 100000.
After training comes the Evaluate stage.

1. Read Checkpoint
tf.train.get_checkpoint_state: returns the checkpoint state (the latest checkpoint path) from the checkpoint file.
This part is related to saving and restoring variables.

The class tf.train.Saver is used for saving and restoring variables.
A checkpoint is a mapping from variable names to tensors.

# Create a saver.
saver = tf.train.Saver(...variables...)
# Launch the graph and train, saving the model every 1000 steps.
sess = tf.Session()
for step in xrange(1000000):
    sess.run(..training_op..)
    if step % 1000 == 0:
        # Append the step number to the checkpoint name:
        saver.save(sess, 'my-model', global_step=step)
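On the reading side, the pattern is roughly the following sketch (continuing the snippet above; the directory path is illustrative):

# Restore the latest checkpoint if one exists.
ckpt = tf.train.get_checkpoint_state('/tmp/cifar10_train')  # illustrative path
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)
else:
    print('No checkpoint file found')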

2. Threads and Queues
Queues are an asynchronous computation mechanism in TensorFlow. A queue is also a node in the graph whose contents other nodes can change; nodes can perform enqueue and dequeue operations on it.

A session is a multithreaded object, so multiple threads can run ops in parallel using the same session. TensorFlow provides two classes for multi-threaded management, tf.train.Coordinator and tf.train.QueueRunner, and the two are meant to be used together. The Coordinator class lets multiple threads stop together and report exceptions; the QueueRunner class is used to create threads that cooperate to enqueue tensors into the same queue.

Create a Coordinator object first, then create some threads that use this object; the threads run in a loop and stop when should_stop() returns True.
Any thread can stop the computation by calling request_stop(), after which the other threads stop as soon as should_stop() returns True.

The QueueRunner class creates a number of threads that repeatedly run an enqueue operation, and these threads can use the same Coordinator to control stopping. In addition, a QueueRunner automatically shuts down its threads when an exception occurs, as shown in the sketch below.
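A minimal sketch of the two classes working together (my own code, not from the tutorial; the queue and its contents are made up):

import tensorflow as tf

# A queue fed by four threads, each repeatedly running the enqueue op.
queue = tf.FIFOQueue(capacity=100, dtypes=[tf.float32])
enqueue_op = queue.enqueue(tf.random_normal([]))
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)
tf.train.add_queue_runner(qr)
dequeued = queue.dequeue()

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for _ in range(5):
        print(sess.run(dequeued))
    coord.request_stop()   # ask all threads to stop
    coord.join(threads)    # wait for them to finish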

3. Assessment
The code for this section is as follows:

def evaluate():
    """Eval CIFAR-10 for a number of steps."""
    with tf.Graph().as_default() as g:
        eval_data = FLAGS.eval_data == 'test'
        images, labels = cifar10.inputs(eval_data=eval_data)

        # Build a Graph that computes the logits predictions from the
        # inference model.
        logits = cifar10.inference(images)
        top_k_op = tf.nn.in_top_k(logits, labels, 1)

        variable_averages = tf.train.ExponentialMovingAverage(
            cifar10.MOVING_AVERAGE_DECAY)
        variables_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variables_to_restore)

        summary_op = tf.summary.merge_all()
        summary_writer = tf.summary.FileWriter(FLAGS.eval_dir, g)

        while True:
            eval_once(saver, summary_writer, top_k_op, summary_op)
            if FLAGS.run_once:
                break
            time.sleep(FLAGS.eval_interval_secs)

The code uses the function tf.nn.in_top_k(predictions, targets, k, name=None) to determine whether each target is among the top k predictions.
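A small made-up example of in_top_k (the logits and labels are mine):

import tensorflow as tf

logits = tf.constant([[0.1, 0.8, 0.1],
                      [0.9, 0.05, 0.05]])
labels = tf.constant([1, 2])
correct = tf.nn.in_top_k(logits, labels, 1)

with tf.Session() as sess:
    print(sess.run(correct))  # [ True False]: only example 0's label is the top prediction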
