Writing back-propagation neural networks using Java (III)
Confucius said, "I examine myself three times a day." When it comes to programs, we should examine our code three times a day as well. Can it be simpler, easier to understand, easier to extend, more general? Can the algorithm be optimized? Can the structure be abstracted? Code becomes more refined through continuous refactoring, and those who keep practicing this way master the craft. As the saying goes: renew yourself one day, renew yourself every day, and keep renewing.

This article refactors the code of the previous two articles, mainly the function interface hierarchy and the encapsulation of the weight matrices.

The simplest kind of function is a mathematical function. A mathematical function has an independent variable x (the input) and a corresponding value y = f(x) (the output). x can be a number, a vector, a matrix, and so on. The generic definition is:

    public interface Function<I, O> { O valueAt(I x); }

I is the input type and O is the output type.

Some functions are differentiable, such as the activation functions of a neural network. Besides its value, a differentiable function can also return the derivative or gradient at a given x. The gradient has the same type as the independent variable. The generic differentiable function is defined as follows:

    public interface DifferentiableFunction<I, O> extends Function<I, O> { I derivativeAt(I x); }

For some functions, computing the value and computing the derivative share intermediate variables, or the second computation can reuse the result of the first. For this case we define the PreCaculate interface. When a function implements PreCaculate, we first call its preCaculate method so that it can compute the useful intermediate variables in advance, and then call valueAt and derivativeAt to obtain the actual values, which saves some computation.
The PreCaculate interface is defined as follows:

    public interface PreCaculate<I> { void preCaculate(I x); }

Based on the definitions above, we define the activation function type of the neural network:

    public interface ActivationFunction extends DifferentiableFunction<DoubleMatrix, DoubleMatrix> { }

The activation function is a differentiable function whose input is a matrix (netResult) and whose output is a matrix (finalResult).

Some functions take parameters: besides the independent variable, they have other coefficients or parameters, which are called hyperparameters. The error function is an example: the target value is a parameter and the output value is the independent variable. This kind of function interface is defined as follows:

    public interface ParamFunction<I, O, P> { O valueAt(I x, P param); }

Similarly, its differentiable counterpart is defined as follows:

    public interface DifferentiableParamFunction<I, O, P> extends ParamFunction<I, O, P> { I derivativeAt(I x, P param); }

Our error function is then defined as follows:

    public interface CostFunction extends DifferentiableParamFunction<DoubleMatrix, DoubleMatrix, DoubleMatrix> { }

Its input, output, and parameter are all matrices.

Next, the composite matrix. Conceptually, a neural network has a weight matrix and a bias matrix between every two adjacent layers, and also a dictionary matrix if the input word vectors need to be tuned. All of these matrices are updated during the iterations in order to minimize the value of the error function. Broadly speaking, the training samples are hyperparameters, all of these matrices are the independent variables, and the error function is the objective function being optimized. In essence, when the weights are adjusted, the independent variables (that is, these matrices) can be unrolled and concatenated into one very long vector, and their internal structure is irrelevant. In the source code of jare, the values of these weight matrices are stored in one long double[]. After the computation, the structure of each matrix is restored from that double[].
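The unroll-and-restore idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of jare or of this series: the class name FlattenSketch and its two methods are hypothetical, plain double[][] arrays stand in for DoubleMatrix, and a row-by-row layout is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: plain double[][] arrays stand in for DoubleMatrix.
final class FlattenSketch {

    /** Concatenates all matrices, row by row, into one long vector. */
    static double[] flatten(List<double[][]> mats) {
        int total = 0;
        for (double[][] m : mats) {
            for (double[] row : m) total += row.length;
        }
        double[] flat = new double[total];
        int pos = 0;
        for (double[][] m : mats) {
            for (double[] row : m) {
                System.arraycopy(row, 0, flat, pos, row.length);
                pos += row.length;
            }
        }
        return flat;
    }

    /** Rebuilds matrices shaped like the templates from the flat vector. */
    static List<double[][]> restore(double[] flat, List<double[][]> templates) {
        List<double[][]> result = new ArrayList<>();
        int pos = 0;
        for (double[][] t : templates) {
            double[][] m = new double[t.length][];
            for (int r = 0; r < t.length; r++) {
                m[r] = Arrays.copyOfRange(flat, pos, pos + t[r].length);
                pos += t[r].length;
            }
            result.add(m);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] weights = {{1, 2}, {3, 4}}; // a 2x2 "weight matrix"
        double[][] bias = {{5, 6}};            // a 1x2 "bias matrix"
        List<double[][]> mats = Arrays.asList(weights, bias);

        double[] flat = flatten(mats);
        // The optimizer only sees a vector; here every entry is simply halved.
        for (int i = 0; i < flat.length; i++) flat[i] *= 0.5;

        List<double[][]> restored = restore(flat, mats);
        System.out.println(Arrays.toString(flat));
        System.out.println(Arrays.deepToString(restored.get(1)));
    }
}
```

The bookkeeping of offsets and shapes in restore is exactly the part that becomes tedious once there are many matrices, which is one reason to look for a higher-level wrapper.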
Here we define a CompactDoubleMatrix class, a "super matrix" that wraps all these matrix variables at a higher level so that together they look like a single matrix. Internally, CompactDoubleMatrix maintains an ordered List<DoubleMatrix> and applies operations such as addition, subtraction, and multiplication to every matrix in the list in one batch. As we will see, this encapsulation simplifies a lot of code. Here is the complete definition first.

    import java.util.ArrayList;
    import java.util.List;

    // DoubleMatrix is assumed here to be the jblas matrix class.
    import org.jblas.DoubleMatrix;

    public class CompactDoubleMatrix {
        List<DoubleMatrix> mats = new ArrayList<DoubleMatrix>();

        @SafeVarargs
        public CompactDoubleMatrix(List<DoubleMatrix>... matListArray) {
            super();
            this.append(matListArray);
        }

        public CompactDoubleMatrix(DoubleMatrix... matArray) {
            super();
            this.append(matArray);
        }

        public CompactDoubleMatrix() {
            super();
        }

        // In-place addition, matrix by matrix.
        public CompactDoubleMatrix addi(CompactDoubleMatrix other) {
            this.assertSize(other);
            for (int i = 0; i < this.length(); i++)
                this.get(i).addi(other.get(i));
            return this;
        }

        // In-place subtraction, matrix by matrix.
        public void subi(CompactDoubleMatrix other) {
            this.assertSize(other);
            for (int i = 0; i < this.length(); i++)
                this.get(i).subi(other.get(i));
        }

        public CompactDoubleMatrix add(CompactDoubleMatrix other) {
            this.assertSize(other);
            CompactDoubleMatrix result = new CompactDoubleMatrix();
            for (int i = 0; i < this.length(); i++) {
                result.append(this.get(i).add(other.get(i)));
            }
            return result;
        }

        public CompactDoubleMatrix sub(CompactDoubleMatrix other) {
            this.assertSize(other);
            CompactDoubleMatrix result = new CompactDoubleMatrix();
            for (int i = 0; i < this.length(); i++) {
                result.append(this.get(i).sub(other.get(i)));
            }
            return result;
        }

        public CompactDoubleMatrix mul(CompactDoubleMatrix other) {
            this.assertSize(other);
            CompactDoubleMatrix result = new CompactDoubleMatrix();
            for (int i = 0; i < this.length(); i++) {
                result.append(this.get(i).mul(other.get(i)));
            }
            return result;
        }

        // In-place multiplication by a scalar.
        public CompactDoubleMatrix muli(double d) {
            for (int i = 0; i < this.length(); i++) {
                this.get(i).muli(d);
            }
            return this;
        }

        public CompactDoubleMatrix mul(double d) {
            CompactDoubleMatrix result = new CompactDoubleMatrix();
            for (int i = 0; i < this.length(); i++) {
                result.append(this.get(i).mul(d));
            }
            return result;
        }

        public CompactDoubleMatrix dup() {
            CompactDoubleMatrix result = new CompactDoubleMatrix();
            for (int i = 0; i < this.length(); i++) {
                result.append(this.get(i).dup());
            }
            return result;
        }

        // Dot product over all matrices, as if they were one long vector.
        public double dot(CompactDoubleMatrix other) {
            double sum = 0;
            for (int i = 0; i < this.length(); i++) {
                sum += this.get(i).dot(other.get(i));
            }
            return sum;
        }

        // Euclidean norm over all matrices, as if they were one long vector.
        public double norm() {
            double sum = 0;
            for (int i = 0; i < this.length(); i++) {
                double subNorm = this.get(i).norm2();
                sum += subNorm * subNorm;
            }
            return Math.sqrt(sum);
        }

        public void assertSize(CompactDoubleMatrix other) {
            assert (other != null && this.length() == other.length());
            for (int i = 0; i < this.length(); i++) {
                assert (this.get(i).sameSize(other.get(i)));
            }
        }

        @SuppressWarnings("unchecked")
        public void append(List<DoubleMatrix>... matListArray) {
            for (List<DoubleMatrix> list : matListArray) {
                this.mats.addAll(list);
            }
        }

        public void append(DoubleMatrix... matArray) {
            for (DoubleMatrix mat : matArray)
                this.mats.add(mat);
        }

        public int length() {
            return mats.size();
        }

        public DoubleMatrix get(int index) {
            return this.mats.get(index);
        }

        public DoubleMatrix getLast() {
            return this.mats.get(this.length() - 1);
        }
    }

This article described the encapsulation of several abstract concepts. The next article shows how these encapsulations simplify our code.
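To give a feel for how a wrapper of this kind is used, here is a minimal sketch of one gradient descent step written as two batch calls. It is an assumption about usage, not necessarily what the next article does: MatrixBundle is a hypothetical stand-in built on plain double[][] arrays so that the example runs without a matrix library, its subi returns this for chaining (unlike the void version above), and the learning rate and values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for CompactDoubleMatrix, built on plain arrays. */
final class MatrixBundle {
    private final List<double[][]> mats = new ArrayList<>();

    MatrixBundle(double[][]... matArray) {
        for (double[][] m : matArray) mats.add(m);
    }

    int length() { return mats.size(); }

    double[][] get(int index) { return mats.get(index); }

    /** Returns a new bundle with every element multiplied by d. */
    MatrixBundle mul(double d) {
        MatrixBundle result = new MatrixBundle();
        for (double[][] m : mats) {
            double[][] copy = new double[m.length][];
            for (int r = 0; r < m.length; r++) {
                copy[r] = new double[m[r].length];
                for (int c = 0; c < m[r].length; c++) copy[r][c] = m[r][c] * d;
            }
            result.mats.add(copy);
        }
        return result;
    }

    /** In-place subtraction, matrix by matrix; element shapes are assumed to match. */
    MatrixBundle subi(MatrixBundle other) {
        if (other.length() != length()) {
            throw new IllegalArgumentException("size mismatch");
        }
        for (int i = 0; i < length(); i++) {
            double[][] a = get(i);
            double[][] b = other.get(i);
            for (int r = 0; r < a.length; r++) {
                for (int c = 0; c < a[r].length; c++) a[r][c] -= b[r][c];
            }
        }
        return this;
    }

    public static void main(String[] args) {
        // Made-up parameters and gradients: one "weight" and one "bias" matrix each.
        MatrixBundle params = new MatrixBundle(new double[][]{{1.0, 2.0}}, new double[][]{{0.5}});
        MatrixBundle grads = new MatrixBundle(new double[][]{{0.5, -1.0}}, new double[][]{{1.0}});
        double learningRate = 0.5;

        // params = params - learningRate * grads, for every matrix at once.
        params.subi(grads.mul(learningRate));

        System.out.println(Arrays.deepToString(params.get(0)));
        System.out.println(Arrays.deepToString(params.get(1)));
    }
}
```

The update line does not need to know how many matrices there are or what shape each one has, which is the point of treating them as one composite object.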