Keras Deep Learning Training 7: Constant val_acc

Source: Internet
Author: User
Tags: constant, keras

Keras: acc and val_acc are constant over epochs, is this normal?

https://stats.stackexchange.com/questions/259418/keras-acc-and-val-acc-are-constant-over-300-epochs-is-this-normal

It seems that your model is not able to make sensible adjustments to its weights. The log loss decreases a tiny bit and then gets stuck; the model is just randomly guessing.
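(For reference, in binary classification random guessing corresponds to a log loss of about ln 2 ≈ 0.693 and an accuracy near 0.5, so a loss curve flat near that value is a sign the model has learned nothing.)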

I think the root of the problem is that you have sparse positive inputs, positive initial weights, and a ReLU activation. I suspect that this combination does not lead to nonzero weight adjustments (however, I do not have any literature background for this).

There are a few things that you could try (a minimal Keras sketch follows the list below):

Change the initialization to normal.
Use sigmoid layers everywhere.
Normalize your input, e.g. use StandardScaler from scikit-learn.
Increase the initial learning rate and/or choose a different optimizer.
For debugging purposes, decrease the size of the hidden layer or even remove it.
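
As a rough illustration of these suggestions, here is a minimal Keras sketch (the arrays X and y are hypothetical stand-ins for the asker's features and labels): the input is standardized with StandardScaler, the dense layers use normal weight initialization and sigmoid activations, and the initial learning rate is raised via the Adam optimizer.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from tensorflow import keras

    # Hypothetical stand-ins for the real features and 0/1 labels.
    X = np.random.rand(1000, 20)
    y = np.random.randint(0, 2, size=1000)

    # Suggestion: normalize the input with StandardScaler.
    X = StandardScaler().fit_transform(X)

    model = keras.Sequential([
        # Suggestion: normal (Gaussian) initialization and sigmoid activations.
        keras.layers.Dense(16, activation="sigmoid",
                           kernel_initializer="random_normal",
                           input_shape=(X.shape[1],)),
        keras.layers.Dense(1, activation="sigmoid",
                           kernel_initializer="random_normal"),
    ])

    # Suggestion: a larger initial learning rate and/or a different optimizer.
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-2),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])

    model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)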

Loss Curve Oscillation:

Analysis Reason 1: The training batch_size is too small

When the amount of data is large, it is reasonable to reduce batch_size, because with too much data the memory cannot hold it all. But blindly reducing batch_size can lead to a failure to converge; the extreme case of batch_size=1 is online learning.

The choice of batch primarily determines the descent direction. If the data set is relatively small, you can simply use the full data set as one batch. Doing this has two benefits:

1) The direction computed from the whole data set better represents the sample population and therefore points more accurately toward the extremum.

2) Because the gradient values of different weights vary enormously, it is difficult to choose a single global learning rate; using the full data set makes this easier to manage.

Increasing batch_size has three benefits:

1) Memory utilization improves, and large matrix multiplications parallelize more efficiently.

2) Fewer iterations are needed to complete an epoch (one full pass over the data set), so the same amount of data is processed faster.

3) Within a certain range, the larger the batch_size, the more accurately it estimates the descent direction, so training oscillates less.

The downsides of blindly increasing batch_size:

1) When the batch is too large, memory may no longer hold it.

2) Beyond a certain point, the descent direction determined by the batch basically stops changing, so further increases bring little benefit.

Summary:

1) If the batch size is too small and there are many categories, the loss function may oscillate without converging, especially when the network is complex.

2) As batch_size increases, the same amount of data can be processed faster.

3) As batch_size increases, more epochs are required to reach the same accuracy.

4) Because of the trade-off between the two points above, there is an optimal batch_size that minimizes the overall time to reach a given accuracy.

5) Too large a batch_size makes it easy for the network to converge to poor local optima; too small a batch also has problems, such as very slow training and difficulty converging.

6) The specific batch_size should be chosen in relation to the number of samples in the training set; the sketch below shows a simple way to compare a few values.
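
As a rough way to apply these points, here is a minimal sketch (it assumes a build_model() factory function and X, y arrays like those in the earlier sketch, all hypothetical names): it compares a few batch_size values, where too small a batch typically shows a noisy, oscillating loss and a very large one mainly costs memory for little gain.

    # Compare several batch_size values; build_model() is an assumed factory
    # that returns a freshly initialized, compiled Keras model.
    for batch_size in (8, 32, 128, 512):
        model = build_model()  # fresh weights for a fair comparison
        history = model.fit(X, y, epochs=5, batch_size=batch_size,
                            validation_split=0.2, verbose=0)
        print("batch_size =", batch_size,
              "final loss =", history.history["loss"][-1])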

Analysis Reason 2: The data input is not correct

1: If the format of the input data is not the format specified by the network model, the network does not learn from the intended data during training; this produces oscillation in the loss curve.

Workaround: Check the data input format and the data input path; a minimal sanity-check sketch follows.
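
A pre-training sanity check along these lines might look like the sketch below (X and y are hypothetical names for the input arrays): confirm the shape, dtype, label set, and value range before starting a long run.

    import numpy as np

    # Basic checks that the input matches what the first layer expects.
    print("X shape:", X.shape, "dtype:", X.dtype)
    print("y shape:", y.shape, "unique labels:", np.unique(y))
    print("X value range:", X.min(), "to", X.max())
    assert not np.isnan(X).any(), "input contains NaN values"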

Analysis Reason 3: The paths in the training script are not configured correctly

1: If the path to Train.bin or the path to the model parameter file in the script is not configured correctly, the resulting trained model is incorrect.

Workaround: Check that the paths in the script are configured correctly (see the sketch below).
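
One simple way to catch this early is to verify the configured paths before training starts. In this sketch the file names are placeholders based on the text above, not actual paths from the script.

    import os

    # Placeholder paths; substitute the ones from your own training script.
    for path in ("Train.bin", "model_params.bin"):
        if not os.path.exists(path):
            raise FileNotFoundError("Configured path not found: " + path)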

TensorBoard error: TensorBoard attempted to bind to port 6006, but it is already in use
https://blog.csdn.net/weixin_35654926/article/details/75577515
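
If port 6006 is taken by an earlier TensorBoard instance, one common workaround is to start TensorBoard on another port. The sketch below does this from Python via the standard --logdir and --port CLI flags; "./logs" is an assumed log directory.

    import subprocess

    # Launch TensorBoard on port 6007 instead of the default 6006.
    subprocess.run(["tensorboard", "--logdir", "./logs", "--port", "6007"])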

Understanding TensorBoard
https://blog.csdn.net/u010099080/article/details/77426577
