EESEN source: https://github.com/yajiemiao/eesen. This post is a brief walk through the source implementation of the core training algorithm.
The variables used when training a single sentence (training multiple sentences in parallel is similar):
| variable | meaning |
| --- | --- |
| `phones_num` | number of output nodes in the last layer, equal to \|phones\| + 1 (one extra node for blank) |
| `labels_num` | length of the sentence's label sequence after inserting blanks, e.g. "123" expands to "B1B2B3B" |
| `frames_num` | total number of frames in the sentence, indexed by time t |
| y_k^t | output of the last (softmax) layer |
| a_k^t | input of the softmax layer |
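The blank expansion described in the table can be sketched as follows (a minimal illustration; the blank id `0` and the helper name are my own choices, not taken from EESEN):

```python
BLANK = 0  # assumed id for the blank symbol ("B" above)

def expand_with_blanks(labels):
    """Insert a blank before each label and one after the last:
    [1, 2, 3] -> [B, 1, B, 2, B, 3, B]."""
    expanded = [BLANK]
    for l in labels:
        expanded.extend([l, BLANK])
    return expanded

print(expand_with_blanks([1, 2, 3]))  # [0, 1, 0, 2, 0, 3, 0]
```

Note that `labels_num` is therefore always 2 × (number of labels) + 1.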
CTC Error
```cpp
ctc.Eval(net_out, targets, &obj_diff);
```
Dimensions of the variables involved:

| variable | dimension |
| --- | --- |
| `net_out` | frames_num × phones_num |
| `alpha` / `beta` | frames_num × labels_num |
| `ctc_err_` | frames_num × phones_num |
The error with respect to a_k^t could be computed in one step from the final formula, but the code splits the computation in two, perhaps to mirror the backpropagation process logically; in practice this is not strictly necessary. First, compute the error with respect to y_k^t:
```cpp
ctc_err_.ComputeCtcError(alpha_, beta_, net_out, label_expand_, pzx);
```
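A NumPy sketch of what this step computes, assuming the loss is L = -ln p(z\|x) and the unscaled forward/backward convention of Graves (2006), where α_t(s)β_t(s) contains y twice; EESEN's actual kernel works on the GPU and may scale α/β differently, and all names here are mine:

```python
import numpy as np

def compute_ctc_error(alpha, beta, net_out, label_expand, pzx):
    """Gradient of L = -ln p(z|x) w.r.t. the softmax outputs y_k^t.
    alpha, beta:  frames_num x labels_num forward/backward variables
    net_out:      frames_num x phones_num softmax outputs y
    label_expand: blank-expanded label sequence (length labels_num)
    pzx:          total probability p(z|x)"""
    frames_num, phones_num = net_out.shape
    err = np.zeros((frames_num, phones_num))
    for t in range(frames_num):
        for s, k in enumerate(label_expand):
            # accumulate alpha*beta over all states s that emit phone k
            err[t, k] += alpha[t, s] * beta[t, s]
        # dL/dy_k^t = -(1/p(z|x)) * (1/(y_k^t)^2) * sum_{s in lab(k)} a_t(s) b_t(s)
        err[t, :] /= -(pzx * net_out[t, :] ** 2)
    return err
```

The double loop is written for clarity; the real implementation parallelizes over frames and states on the GPU.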
Then, following the formula given in reference [1], compute the error with respect to a_k^t:
```cpp
ctc_err_.MulElements(net_out);
CuVector<BaseFloat> row_sum(num_frames, kSetZero);
row_sum.AddColSumMat(1.0, ctc_err_, 0.0);
CuMatrix<BaseFloat> net_out_tmp(net_out);
net_out_tmp.MulRowsVec(row_sum);
diff->CopyFromMat(ctc_err_);
diff->AddMat(-1.0, net_out_tmp);
```
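The matrix operations above can be condensed into a short NumPy sketch (names are mine; `err` plays the role of `ctc_err_`, i.e. ∂L/∂y, and `y` is `net_out`):

```python
import numpy as np

def ctc_diff(err, y):
    """Push dL/dy through the softmax to get dL/da.
    err, y: frames_num x phones_num matrices."""
    prod = err * y                       # ctc_err_.MulElements(net_out)
    row_sum = prod.sum(axis=1)           # row_sum.AddColSumMat(1.0, ctc_err_, 0.0)
    return prod - y * row_sum[:, None]   # diff = ctc_err_ - net_out_tmp
```

Each frame (row) is handled independently, which is why a per-row sum followed by a row-wise rescale of `net_out` is all that is needed.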
The key ingredient is the derivative of y_k^t with respect to a_k^t (the softmax Jacobian); the derivation is in a previous blog post, and the conclusion is:

$$\frac{\partial L}{\partial a_k^t}=\sum_{k'}\frac{\partial L}{\partial y_{k'}^t}\,y_{k'}^t\,\delta_{kk'}-\sum_{k'}\frac{\partial L}{\partial y_{k'}^t}\,y_{k'}^t\,y_k^t$$
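This conclusion (writing e_k for ∂L/∂y_k^t at one frame, it reads dL/da_k = e_k y_k − y_k Σ_k' e_k' y_k') can be verified numerically with finite differences; this is a self-contained sanity check, not EESEN code:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

rng = np.random.default_rng(0)
a = rng.normal(size=5)   # softmax inputs a_k at one frame
e = rng.normal(size=5)   # stand-in for dL/dy_k at that frame
y = softmax(a)

# analytic form: dL/da_k = e_k * y_k - y_k * sum_k' e_k' * y_k'
analytic = e * y - y * np.dot(e, y)

# central finite differences on L(a) = sum_k e_k * y_k(a)
eps = 1e-6
numeric = np.zeros_like(a)
for k in range(len(a)):
    ap, am = a.copy(), a.copy()
    ap[k] += eps
    am[k] -= eps
    numeric[k] = (np.dot(e, softmax(ap)) - np.dot(e, softmax(am))) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-8))
```

Because L is linear in y here (e is held fixed), the chain rule through the softmax is exactly the formula above.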