Keras Deep Training 2: Training Analysis

Source: Internet
Author: User
Tags: json, shuffle, keras
3. Frequently Asked Questions

3.1 The val_loss or val_acc curve oscillates and is not smooth

The possible reasons are as follows:

  • The learning rate is too large
  • The batch size is too small
  • The sample distribution is uneven
  • No regularization is applied
  • The dataset is too small

A minimal sketch addressing the first two causes is shown below.
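As a rough illustration of the first two fixes (model, x_train and y_train are assumed to exist and are not from the original post; older standalone Keras uses lr, newer versions use learning_rate):

from keras.optimizers import Adam

# A smaller learning rate and a larger batch size both tend to smooth the validation curves.
model.compile(optimizer=Adam(lr=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=128,          # larger batches give less noisy gradient estimates
          epochs=50,
          validation_split=0.2)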

3.2 val_acc is almost 0

One important reason is that the data is not shuffled before it is split into training and validation sets.

import numpy as np

index = np.arange(data.shape[0])
np.random.seed(1024)
np.random.shuffle(index)
data = data[index]
labels = labels[index]
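After shuffling, a split such as Keras's validation_split (which always takes the last fraction of the arrays, without shuffling) no longer ends up with only a few classes in the validation set. A minimal usage sketch, assuming model is an already-compiled Keras model (not shown in the original):

model.fit(data, labels,
          validation_split=0.2,   # validation_split slices off the *last* 20%, so shuffle first
          epochs=10,
          batch_size=32)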
3.3 The loss value during training is negative

Cause: the input training data is not normalized.
Workaround: pass the input values through the following function to normalize them to the [0, 1] range.

# Normalize the data to the [0, 1] range
def data_in_one(inputdata):
    inputdata = (inputdata - inputdata.min()) / (inputdata.max() - inputdata.min())
    return inputdata
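A hypothetical usage example (x_train and x_test are illustrative names, not from the original post):

x_train = data_in_one(x_train)   # scale training inputs to [0, 1]
x_test = data_in_one(x_test)     # simplest form; reusing the training min/max is usually preferable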
3.4 How to interpret changes in loss and acc (and what to do when the loss stops changing for several epochs)

  • Train loss keeps decreasing, test loss keeps decreasing: the network is still learning.
  • Train loss keeps decreasing, test loss stays flat: the network is overfitting.
  • Train loss stays flat, test loss keeps decreasing: the dataset almost certainly has a problem.
  • Train loss stays flat, test loss stays flat: learning has hit a bottleneck; reduce the learning rate or the batch size.
  • Train loss keeps rising, test loss keeps rising: the network structure is poorly designed, the training hyperparameters are set improperly, or the dataset was not cleaned properly.

3.5 The loss value during training is NaN

Possible causes:

  • The learning rate is too high.
  • If you defined your own loss function, the problem may lie in the design of that loss function.

A hedged mitigation sketch is shown below.
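For the NaN case, a common mitigation is to lower the learning rate and clip gradients; if the loss is custom, also guard divisions and logarithms (for example with K.epsilon()). A minimal sketch, assuming an existing model (the optimizer settings are illustrative, not from the original post):

from keras.optimizers import Adam

# A smaller learning rate plus gradient clipping often keeps the loss from exploding to NaN.
optimizer = Adam(lr=1e-4, clipnorm=1.0)   # clipnorm caps the L2 norm of each gradient update
model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['accuracy'])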

Generally speaking, a higher acc corresponds to a lower loss, but this is not absolute; they are, after all, two different quantities, so in practice we can fine-tune with respect to both.

3.6 Number of epochs / BN / dropout

For setting the number of epochs, we can use a callback and keep the model with the highest validation accuracy as the best model, as sketched below.
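A minimal sketch with ModelCheckpoint (the file name, data arrays and the 'val_acc' metric name are illustrative; newer Keras versions report 'val_accuracy' instead):

from keras.callbacks import ModelCheckpoint

# Keep only the weights of the epoch with the best validation accuracy.
checkpoint = ModelCheckpoint('best_model.hdf5',
                             monitor='val_acc',
                             save_best_only=True,
                             mode='max')
model.fit(x_train, y_train,
          epochs=100,
          validation_data=(x_val, y_val),
          callbacks=[checkpoint])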

Regarding BN and dropout: these are actually two completely different things. BN works on the data distribution, while dropout optimizes from the model-structure side, so they can be used together. BN not only helps prevent overfitting but also mitigates vanishing gradients and similar problems, and it can speed up model convergence; however, with BN, each training step tends to become slower. A minimal sketch of using both together follows.
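The layer sizes and input shape below are illustrative only:

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Dropout

# BatchNormalization acts on the distribution of activations; Dropout randomly disables units.
model = Sequential([
    Dense(256, activation='relu', input_shape=(100,)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])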

3.7 Discussion of overfitting in deep networks

3.7.1 Adding a dropout layer

Code sketch:

...
from keras.layers import Concatenate, Dropout
...
concatenate = Concatenate(axis=2)([blstm, embedding_layer])

concatenate = Dropout(rate=0.1)(concatenate)
3.7.2 Check if the dataset is too small (data augmentation)

The following code is the augmentation I did on my own experimental data and can serve as a reference. First, the structure of my dataset is as follows:

The essays table in my database has 7 columns; each row is one data sample. The first column, AUTHID, is the sample identifier, text is the text content, and the remaining columns are the text's labels. For text, a reasonable way to enlarge the dataset is to rotate the order of the sentences in each text, which keeps the text as a whole stable. The following code reads the sample information from the essays table, rotates the sentences of each text, and writes the results into the table_augment table.

Code sketch:

#!/usr/bin/python
# -*- coding: utf8 -*-
from sqlalchemy import create_engine  # MySQL ORM interface, better than MySQLdb
import pandas as pd
import spacy   # an NLP library like nltk, but more industrial
import json

to_sql = 'table_augment'
read_sql_table = 'essays'


def cut_sentences(df):
    all_text_name = df["AUTHID"]    # type pandas.Series: all text names (the "#AUTHID" column in essays)
    all_text = df["text"]           # type pandas.Series: all texts (the "text" column in essays)
    all_label_cext = df["cEXT"]
    all_label_cneu = df["cNEU"]
    all_label_cagr = df["cAGR"]
    all_label_ccon = df["cCON"]
    all_label_copn = df["cOPN"]
    all_number = all_text_name.index[-1]   # from 0 to len(all_text_name) - 1

    for i in xrange(0, all_number + 1, 1):
        print("start to deal with text", i, "...")
        text = all_text[i]              # type str: one of the texts in all_text
        text_name = all_text_name[i]    # type str: one of the text names in all_text_name
        nlp = spacy.load('en')
        test_doc = nlp(text)  # .decode('utf8'))

        cut_sentence = []
        for sent in test_doc.sents:     # get every sentence in the text
            cut_sentence.append(sent.text)
            """
            sent is a spacy.tokens.span.Span, not a string,
            so we call Span.text to get its unicode form
            """

        line_number = len(cut_sentence)
        for itertor in range(line_number):
            if itertor != 0:
                cut_sentence = cut_sentence[1:] + cut_sentence[:1]   # rotate the sentence list by one
            cut_sentence_json = json.dumps(cut_sentence)
            input_data_dic = {'text_name': str(itertor) + "_" + text_name,
                              'line_number': line_number,
                              'line_text': cut_sentence_json,
                              'cEXT': all_label_cext[i],
                              'cNEU': all_label_cneu[i],
                              'cAGR': all_label_cagr[i],
                              'cCON': all_label_ccon[i],
                              'cOPN': all_label_copn[i]}
            input_data = pd.DataFrame(input_data_dic, index=[i],
                                      columns=['text_name', 'line_number', 'line_text',
                                               'cEXT', 'cNEU', 'cAGR', 'cCON', 'cOPN'])
            input_data.to_sql(to_sql, engine, if_exists='append', index=False, chunksize=100)
            """
            DataFrame.index will be inserted into the table by default.
            We don't want that, so we set index=False (the default is True).
            """
        print("text", i, "finished")


if __name__ == '__main__':
    engine = create_engine('mysql+pymysql://root:root@localhost:3306/personality_1?charset=utf8',
                           echo=True, convert_unicode=True)
    df = pd.read_sql_table(read_sql_table, engine, chunksize=5)   # read essays
    for df_iter in df:
        cut_sentences(df_iter)
3.7.3 Using the idea of transfer learning

Specifically, use model.load_weights to load the previously trained weight.hdf5 and then continue training from there; see the earlier post on resuming training from a checkpoint for details. A minimal sketch follows.
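The sketch assumes the same architecture is rebuilt and weight.hdf5 is the checkpoint from the previous run (build_model, x_train, x_val, etc. are hypothetical names, not from the original post):

model = build_model()                 # hypothetical helper that recreates the original architecture
model.load_weights('weight.hdf5')     # weights saved by the earlier training run
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20, validation_data=(x_val, y_val))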

3.7.4 Other small tricks

  • Use a small learning rate (covered earlier, not repeated here).
  • Increase the batch_size appropriately (covered earlier).
  • Try a different optimizer (covered earlier).
  • Use Keras's EarlyStopping callback (covered earlier); a minimal sketch follows this list.
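The patience value and data arrays below are illustrative; restore_best_weights requires a reasonably recent Keras version:

from keras.callbacks import EarlyStopping

# Stop training once the validation loss has not improved for 5 consecutive epochs.
early_stopping = EarlyStopping(monitor='val_loss',
                               patience=5,
                               restore_best_weights=True)
model.fit(x_train, y_train,
          epochs=200,
          validation_data=(x_val, y_val),
          callbacks=[early_stopping])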

3.7.5 Regularization

Regularization means adding a regular (penalty) term to the objective or cost function when it is optimized; L1 and L2 regularization are the most common.

Code sketch:

from keras import regularizers
...
out = TimeDistributed(Dense(hidden_dim_2,
                            activation="relu",
                            kernel_regularizer=regularizers.l1_l2(0.01, 0.01),
                            activity_regularizer=regularizers.l1_l2(0.01, 0.01)
                            )
                      )(concatenate)

...
dense = Dense(
            activation="relu",
            kernel_regularizer=regularizers.l1_l2(0.01, 0.01),
            activity_regularizer=regularizers.l1_l2(0.01, 0.01)
            )(dense)

More reference information:
https://blog.csdn.net/mrgiovanni/article/details/52167016
