Constant network accuracy in TensorFlow, and the weight-initialization NaN problem


I recently got started with deep learning because a project involves some mobile development work. After listening to suggestions from a few friends, I finally settled on TensorFlow as my deep-learning platform. Over the last two days I implemented VGGNet in TensorFlow, following the VGGNet demo on the TFLearn official site, but when training it on the 17flowers dataset I found that no matter how many iterations ran, the accuracy and the loss stayed at roughly constant values; in other words, the network did not converge. At first this was very confusing: I had followed the official demo, so how could this happen? The first step was to print the intermediate values, like this:

for i in range(1000):  # iteration count was garbled in the source; 1000 is a typical value
    batch_xs, batch_ys = mnist.train.next_batch(100)  # batch size likewise garbled; 100 assumed
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})

    print("loss:", sess.run(cross_entropy, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5}))
    print(cross_entropy)  # note: this prints the Tensor object itself, not its value
    if i % 50 == 0:  # reporting interval also garbled in the source; 50 assumed
        print(compute_accuracy(mnist.test.images, mnist.test.labels))
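To also check whether the weights themselves have gone NaN, a quick sweep over all trainable variables works. This is only a sketch: it assumes the live tf.Session named sess from the loop above, with NumPy imported as np.

import numpy as np
import tensorflow as tf

# Scan every trainable variable for NaNs after a training step.
for var in tf.trainable_variables():
    value = sess.run(var)
    if np.isnan(value).any():
        print("NaN in", var.name, "shape", value.shape)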

The loss printed here is the value of the cross-entropy loss, but the result showed that it was NaN. So I traced backwards and printed the values of the weights and biases, and they were NaN as well. I then googled the question and found that many people have in fact run into this problem. On StackOverflow, one explanation goes like this:

Actually, it turned out to be something stupid. I'm posting this in case anyone else runs into a similar error.

cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))

is actually a horrible way of computing the cross-entropy. In some samples, certain classes can be excluded with certainty after a while, resulting in y_conv = 0 for that sample. That's normally not a problem, since you're not interested in those, but with cross_entropy written as above it yields 0 * log(0) for that particular sample/class. Hence the NaN.

Replacing it with

cross_entropy = -tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)))

solved all my problems.
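The failure mode described in that answer is easy to reproduce with plain NumPy. The values below are toy numbers, not the actual network outputs: a sample whose softmax output is exactly 0 for one class puts a 0 * log(0) term into the sum.

import numpy as np

y_ = np.array([0.0, 1.0])      # one-hot label
y_conv = np.array([0.0, 1.0])  # prediction: the excluded class got exactly 0

# 0 * log(0) is undefined, so the whole loss becomes NaN ...
print(-np.sum(y_ * np.log(y_conv)))                       # nan
# ... while clipping keeps every log argument strictly positive.
print(-np.sum(y_ * np.log(np.clip(y_conv, 1e-10, 1.0))))  # -0.0 (finite)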

So I tried it, and the NaN indeed went away. But I only partly understood this answer. It seems to mean that for some samples, the forward pass produces an output of exactly 0 at the final layer, so log(0) makes the result NaN. Another commenter then pointed out that the clipping approach is not ideal: once a value reaches the threshold, the gradient through it is blocked during backpropagation. Instead, a small constant can be added directly inside the log, so the predicted value can never be exactly 0:

cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv + 1e-10))
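The objection about gradients can be checked directly with tf.gradients. This is a small sketch with made-up values, not the original network: where the clip is active, the gradient through tf.clip_by_value is exactly zero, while the version with the added constant still propagates one.

import tensorflow as tf

y = tf.constant([1e-12, 0.5])  # the first entry lies below the 1e-10 clipping threshold

grad_clip = tf.gradients(tf.log(tf.clip_by_value(y, 1e-10, 1.0)), y)[0]
grad_eps = tf.gradients(tf.log(y + 1e-10), y)[0]

with tf.Session() as sess:
    print(sess.run(grad_clip))  # [0.0, 2.0]: zero gradient where clipping kicked in
    print(sess.run(grad_eps))   # [~1e10, 2.0]: a gradient survives everywhere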

Although the problem is solved, my understanding of it is not thorough. If any reader understands it thoroughly, please take the trouble to point it out in the comments. Thank you very much.

---------------------------------------- Split Line ----------------------------------------
After I brought this problem up, my advisor was also very interested; he found it remarkable that adding such a tiny value is enough to let the network keep training. Ultimately, though, the cause is that some of the predicted values become mathematically meaningless. My advisor's guess was that, because computing the cross-entropy loss involves exponentials of e (in the softmax), it can cause an arithmetic overflow. The question remains open.
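The overflow suspicion is easy to illustrate with NumPy. The logits below are made up, not the actual VGGNet values: exponentiating a large logit overflows float32 to inf, and a naive softmax then produces NaN, whereas the usual max-subtraction trick stays finite.

import numpy as np

logits = np.array([1000.0, 10.0], dtype=np.float32)

# Naive softmax: np.exp(1000.0) overflows to inf, so the nan comes from inf / inf.
print(np.exp(logits) / np.sum(np.exp(logits)))    # [nan, 0.]

# Stable softmax: subtracting the max keeps every exponent <= 0.
shifted = logits - np.max(logits)
print(np.exp(shifted) / np.sum(np.exp(shifted)))  # approximately [1., 0.]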


The StackOverflow answer is here: http://stackoverflow.com/questions/33712178/tensorflow-nan-bug
