Using Keras's EarlyStopping callback: experience and tips

Source: Internet
Author: User
Tags: keras

This article describes the author's experience using EarlyStopping. Much of it is the author's own reflection, and discussion and suggestions are welcome.
Please refer to the official documentation and source code for the specifics of using EarlyStopping.

What is EarlyStopping?

EarlyStopping is a callback. Callbacks specify actions to perform at the beginning and end of each epoch, and they can monitor a set of built-in quantities such as 'acc', 'val_acc', 'loss', and 'val_loss'.
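To illustrate the mechanism, here is a minimal sketch of a custom callback (the class name EpochLogger is hypothetical; with tf.keras the import path is tensorflow.keras.callbacks instead of keras.callbacks):

    from keras.callbacks import Callback

    # Minimal illustration of the callback hooks; EarlyStopping is
    # built on this same mechanism.
    class EpochLogger(Callback):
        def on_epoch_begin(self, epoch, logs=None):
            print(f"Starting epoch {epoch}")

        def on_epoch_end(self, epoch, logs=None):
            # logs holds the monitored quantities, e.g. 'loss' and 'acc',
            # plus 'val_loss'/'val_acc' when validation data is provided.
            print(f"Finished epoch {epoch}: {logs}")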
EarlyStopping is the callback used to stop training early. Specifically, it can stop training when the loss on the training set is no longer decreasing (that is, when the decrease is smaller than a given threshold).

Why use EarlyStopping?

The root cause is that continuing to train can reduce accuracy on the test set.
The author's guesses as to why continued training causes test accuracy to decline: 1. overfitting; 2. a learning rate too large for the model to converge; 3. when regularization terms are used, a falling loss may reflect shrinking weight magnitudes rather than rising accuracy.

Of course, EarlyStopping can also speed up training and make hyperparameter tuning more efficient.

Usage and tips for EarlyStopping

In general, the Model.fit function has a parameter named callbacks. Note that it takes a list, so when EarlyStopping is the only callback it is passed as [EarlyStopping()].
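As a minimal, self-contained sketch (the toy data and the tiny two-layer model here are purely illustrative assumptions, not the author's setup):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.callbacks import EarlyStopping

    # Toy data, purely for illustration.
    x_train = np.random.rand(100, 8)
    y_train = np.random.randint(0, 2, size=(100, 1))

    model = Sequential([Dense(16, activation='relu', input_shape=(8,)),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['acc'])

    # callbacks must be a list, even with a single callback.
    # monitor='loss' is used here because no validation data is passed.
    model.fit(x_train, y_train, epochs=100,
              callbacks=[EarlyStopping(monitor='loss', patience=3)])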

EarlyStopping's parameters:

monitor: the quantity to monitor, such as 'acc', 'val_acc', 'loss', or 'val_loss'. Normally, if there is a validation set, use 'val_acc' or 'val_loss'. Because the author uses 5-fold cross-validation, there is no single fixed validation set, so only 'acc' can be used.

min_delta: the threshold for an increase or decrease to count as improvement; only changes larger than this count. Its size depends on the monitored quantity and reflects how tolerant you are. For example, the author's monitor is 'acc', which varies between 70% and 90%, so changes smaller than 0.01% are of no interest. After also observing jitter in the training process (accuracy first dropping, then rising), the tolerance was adjusted and finally set to 0.003%.

patience: how many epochs without improvement to tolerate. This setting is a tradeoff between riding out jitter and reacting to a real drop in accuracy. If patience is large, the final accuracy will be slightly below the maximum the model could reach, because training stops late. If patience is small, the model is likely to stop during early jitter, while it is still in the coarse search phase, and the resulting accuracy is generally poor. The right patience is closely tied to the learning rate: with a fixed learning rate, train a few times first, observe how many epochs the jitter lasts, and set patience slightly larger than that; with a changing learning rate, a value slightly smaller than the maximum jitter length is recommended. Since the author already had acceptable results before introducing EarlyStopping (so it was icing on the cake), patience was set relatively high, to the maximum observed jitter length.

mode: one of 'auto', 'min', or 'max'. If you know whether the monitored quantity should go up or down, it is best to set this explicitly. The author's monitor is 'acc', so mode='max'.
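Putting these parameters together, a sketch of a configuration along the lines the author describes (the patience value is illustrative, since the text only says it was set to the longest observed jitter; 0.003% is written as a fraction):

    from keras.callbacks import EarlyStopping

    early_stopping = EarlyStopping(
        monitor='acc',      # training accuracy; no fixed validation set here
        min_delta=0.00003,  # 0.003%: smaller changes don't count as improvement
        patience=10,        # illustrative: longest jitter seen in trial runs
        mode='max')         # accuracy should increase

Note that the monitor string must match the metric name Keras logs for the model; newer Keras versions log 'accuracy' rather than 'acc'.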

min_delta and patience both guard against stopping while the model is still jittering, so they need to be tuned together. Generally, if min_delta is reduced, patience can be reduced as well; if min_delta is increased, patience should be extended accordingly.
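For example, the two settings might be coordinated like this (all values illustrative):

    from keras.callbacks import EarlyStopping

    # Small min_delta: tiny gains still count as improvement,
    # so patience can be shorter.
    sensitive = EarlyStopping(monitor='acc', min_delta=1e-4,
                              patience=5, mode='max')

    # Large min_delta: only sizeable gains count as improvement,
    # so patience should be longer to ride out jitter.
    tolerant = EarlyStopping(monitor='acc', min_delta=1e-2,
                             patience=15, mode='max')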
