Write BP neural network in Java (iv)


Continuing the series: in parts (i) and (ii), the system consisted of five components: Net, Propagation, Trainer, Learner, and DataProvider. This article refactors that system.

Net

First, Net. After the activation function and the cost function were redefined in the previous article, the class now looks roughly like this:

List<DoubleMatrix> weights = new ArrayList<DoubleMatrix>();
List<DoubleMatrix> bs = new ArrayList<>();
List<ActivationFunction> activations = new ArrayList<>();
CostFunction costFunc;
CostFunction accuracyFunc;
int[] nodesNum;
int layersNum;

public CompactDoubleMatrix getCompact() {
    return new CompactDoubleMatrix(this.weights, this.bs);
}

The method getCompact() packs the weights and biases into the corresponding super-matrix.
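CompactDoubleMatrix itself comes from the previous article; as a refresher, here is a minimal sketch of the idea, written to be consistent with how the class is used below (it is an illustration, not the original source). Note that the in-place operations (addi/subi) write straight through to the underlying net matrices, which is exactly what the learners below rely on:

import org.jblas.DoubleMatrix;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: a "super-matrix" over a list of DoubleMatrix blocks.
// Every operation is applied block-wise, so callers can ignore the
// internal structure of weights and biases.
public class CompactDoubleMatrix {
    final List<DoubleMatrix> matrices = new ArrayList<>();

    public CompactDoubleMatrix(List<DoubleMatrix> weights, List<DoubleMatrix> bs) {
        matrices.addAll(weights);
        matrices.addAll(bs);
    }

    private CompactDoubleMatrix() {}

    // Scalar multiply, returning a new super-matrix.
    public CompactDoubleMatrix mul(double scalar) {
        CompactDoubleMatrix result = new CompactDoubleMatrix();
        for (DoubleMatrix m : matrices)
            result.matrices.add(m.mul(scalar));
        return result;
    }

    // In-place add: modifies the wrapped matrices themselves.
    public CompactDoubleMatrix addi(CompactDoubleMatrix other) {
        for (int i = 0; i < matrices.size(); i++)
            matrices.get(i).addi(other.matrices.get(i));
        return this;
    }

    // In-place subtract: this is how a learner updates the net's weights.
    public CompactDoubleMatrix subi(CompactDoubleMatrix other) {
        for (int i = 0; i < matrices.size(); i++)
            matrices.get(i).subi(other.matrices.get(i));
        return this;
    }

    // Deep copy of all blocks.
    public CompactDoubleMatrix dup() {
        CompactDoubleMatrix result = new CompactDoubleMatrix();
        for (DoubleMatrix m : matrices)
            result.matrices.add(m.dup());
        return result;
    }

    // Append one more block to be adjusted uniformly with the rest.
    public void append(DoubleMatrix m) {
        matrices.add(m);
    }
}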

DataProvider

DataProvider supplies the training data.

public interface DataProvider {
    DoubleMatrix getInput();
    DoubleMatrix getTarget();
}

If the inputs are vectors drawn from a dictionary, the provider also carries that dictionary:

public interface DictDataProvider extends DataProvider {
    public DoubleMatrix getIndexs();
    public DoubleMatrix getDict();
}

Each column is a sample. getIndexs() returns the indexes of the input vectors in the dictionary.
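For illustration only (this helper is not part of the original code), resolving the indexes against the dictionary might look like this, using jblas:

import org.jblas.DoubleMatrix;

public class DictLookup {
    // Builds the actual input matrix: one dictionary column per sample.
    // Assumes getIndexs() is 1*sampleLength and getDict() holds one
    // vector per column.
    public static DoubleMatrix buildInput(DictDataProvider provider) {
        DoubleMatrix indexes = provider.getIndexs();
        DoubleMatrix dict = provider.getDict();
        int[] cols = new int[indexes.columns];
        for (int i = 0; i < indexes.columns; i++) {
            cols[i] = (int) indexes.get(0, i);
        }
        return dict.getColumns(cols);
    }
}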

I wrote a utility class, BatchDataProviderFactory, to split the samples into mini-batches.

int batchSize;
int dataLen;
DataProvider originalProvider;
List<Integer> endPositions;
List<DataProvider> providers;

public BatchDataProviderFactory(int batchSize, DataProvider originalProvider) {
    super();
    this.batchSize = batchSize;
    this.originalProvider = originalProvider;
    this.dataLen = this.originalProvider.getTarget().columns;
    this.initEndPositions();
    this.initProviders();
}

public BatchDataProviderFactory(DataProvider originalProvider) {
    this(4, originalProvider);
}

public List<DataProvider> getProviders() {
    return providers;
}

batchSize indicates how many batches to split the data into, getProviders() returns the generated mini-batches, and originalProvider is the raw data being split.
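A minimal usage sketch (the anonymous provider and the random data are made up for the example):

import org.jblas.DoubleMatrix;

public class BatchSplitDemo {
    public static void main(String[] args) {
        // A toy provider: 2-dimensional inputs, 1-dimensional targets, 8 samples.
        final DoubleMatrix input = DoubleMatrix.rand(2, 8);
        final DoubleMatrix target = DoubleMatrix.rand(1, 8);
        DataProvider all = new DataProvider() {
            public DoubleMatrix getInput() { return input; }
            public DoubleMatrix getTarget() { return target; }
        };
        // Split into 4 mini-batches; each one is itself a DataProvider.
        BatchDataProviderFactory factory = new BatchDataProviderFactory(4, all);
        for (DataProvider batch : factory.getProviders()) {
            System.out.println("batch of " + batch.getTarget().columns + " samples");
        }
    }
}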

Propagation

Propagation is responsible for the forward and backward passes of the neural network. The interface is defined as follows:

public interface Propagation {
    public PropagationResult propagate(Net net, DataProvider provider);
}

The propagate method runs the specified network on the specified data and returns the result.

BasePropagation implements this interface with straightforward backpropagation:

public class BasePropagation implements Propagation {

    // Forward pass over multiple samples (one sample per column).
    protected ForwardResult forward(Net net, DoubleMatrix input) {
        ForwardResult result = new ForwardResult();
        result.input = input;
        DoubleMatrix currentResult = input;
        int index = -1;
        for (DoubleMatrix weight : net.weights) {
            index++;
            DoubleMatrix b = net.bs.get(index);
            final ActivationFunction activation = net.activations.get(index);
            currentResult = weight.mmul(currentResult).addColumnVector(b);
            result.netResult.add(currentResult);
            // Derivative of the activation at the net input.
            DoubleMatrix derivative = activation.derivativeAt(currentResult);
            result.derivativeResult.add(derivative);
            currentResult = activation.valueAt(currentResult);
            result.finalResult.add(currentResult);
        }
        result.netResult = null; // no longer needed
        return result;
    }

    // Backward pass; gradients are averaged over the samples.
    protected BackwardResult backward(Net net, DoubleMatrix target, ForwardResult forwardResult) {
        BackwardResult result = new BackwardResult();
        DoubleMatrix output = forwardResult.getOutput();
        DoubleMatrix outputDerivative = forwardResult.getOutputDerivative();
        result.cost = net.costFunc.valueAt(output, target);
        DoubleMatrix outputDelta = net.costFunc.derivativeAt(output, target).muli(outputDerivative);
        if (net.accuracyFunc != null) {
            result.accuracy = net.accuracyFunc.valueAt(output, target);
        }
        result.deltas.add(outputDelta);
        for (int i = net.layersNum - 1; i >= 0; i--) {
            DoubleMatrix pDelta = result.deltas.get(result.deltas.size() - 1);
            // Weight gradient: averaged over all samples.
            DoubleMatrix layerInput = i == 0 ? forwardResult.input : forwardResult.finalResult.get(i - 1);
            DoubleMatrix gradient = pDelta.mmul(layerInput.transpose()).div(target.columns);
            result.gradients.add(gradient);
            // Bias gradient.
            result.biasGradients.add(pDelta.rowMeans());
            // Delta of the previous layer. When i == 0 this is the input-layer
            // error, i.e. the gradient for adjusting the input itself, and is
            // not multiplied by an activation derivative.
            DoubleMatrix delta = net.weights.get(i).transpose().mmul(pDelta);
            if (i > 0)
                delta = delta.muli(forwardResult.derivativeResult.get(i - 1));
            result.deltas.add(delta);
        }
        Collections.reverse(result.gradients);
        Collections.reverse(result.biasGradients);
        // Only the input deltas are needed; drop the rest.
        DoubleMatrix inputDeltas = result.deltas.get(result.deltas.size() - 1);
        result.deltas.clear();
        result.deltas.add(inputDeltas);
        return result;
    }

    @Override
    public PropagationResult propagate(Net net, DataProvider provider) {
        ForwardResult forwardResult = this.forward(net, provider.getInput());
        BackwardResult backwardResult = this.backward(net, provider.getTarget(), forwardResult);
        PropagationResult result = new PropagationResult(backwardResult);
        result.output = forwardResult.getOutput();
        return result;
    }
}

PropagationResult is defined briefly as:

public class PropagationResult {
    DoubleMatrix output;      // output matrix: outputLen * sampleLength
    DoubleMatrix cost;        // cost matrix: 1 * sampleLength
    DoubleMatrix accuracy;    // accuracy matrix: 1 * sampleLength
    private List<DoubleMatrix> gradients;      // weight gradient matrices
    private List<DoubleMatrix> biasGradients;  // bias gradient matrices
    DoubleMatrix inputDeltas; // input-layer delta matrix: inputLen * sampleLength

    public CompactDoubleMatrix getCompact() {
        return new CompactDoubleMatrix(gradients, biasGradients);
    }
}

Another class implementing the interface is MiniBatchPropagation. It propagates the mini-batches in parallel and then merges the per-batch results; internally it uses BatchDataProviderFactory and BasePropagation.
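The article does not show MiniBatchPropagation's source, so the following is only a sketch of the idea using Java parallel streams; the merge step is a hypothetical placeholder:

import java.util.List;
import java.util.stream.Collectors;

// Rough sketch only: not the original implementation.
public class MiniBatchPropagation implements Propagation {
    private final Propagation base = new BasePropagation();

    @Override
    public PropagationResult propagate(Net net, DataProvider provider) {
        // Split the data and propagate each mini-batch in parallel.
        List<DataProvider> batches = new BatchDataProviderFactory(provider).getProviders();
        List<PropagationResult> partials = batches.parallelStream()
                .map(batch -> base.propagate(net, batch))
                .collect(Collectors.toList());
        return merge(partials);
    }

    // Hypothetical merge step: it would average the per-batch gradients and
    // concatenate the per-sample outputs, costs, and accuracies.
    private PropagationResult merge(List<PropagationResult> partials) {
        throw new UnsupportedOperationException("sketch only");
    }
}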

Trainer

The Trainer interface is defined as:

public interface Trainer {
    public void train(Net net, DataProvider provider);
}

A simple implementation is:

public class CommonTrainer implements Trainer {
    int epochs; // number of training iterations
    Learner learner;
    Propagation propagation;
    List<Double> costs = new ArrayList<>();
    List<Double> accuracys = new ArrayList<>();

    public void trainOne(Net net, DataProvider provider) {
        PropagationResult propResult = this.propagation.propagate(net, provider);
        learner.learn(net, propResult, provider);
        Double cost = propResult.getMeanCost();
        Double accuracy = propResult.getMeanAccuracy();
        if (cost != null)
            costs.add(cost);
        if (accuracy != null)
            accuracys.add(accuracy);
    }

    @Override
    public void train(Net net, DataProvider provider) {
        for (int i = 0; i < this.epochs; i++) {
            System.out.println("Epoch: " + i);
            this.trainOne(net, provider);
        }
    }
}

It simply iterates for a fixed number of epochs, with no smart stopping criterion; on each iteration the Learner adjusts the weights.
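For example, wiring the pieces together might look like this (a hypothetical fragment: it assumes all classes share a package, so the trainer's fields can be set directly):

public class TrainingDemo {
    public static void run(Net net, DataProvider provider) {
        CommonTrainer trainer = new CommonTrainer();
        trainer.epochs = 100;                          // fixed iteration count
        trainer.learner = new MomentAdaptLearner<Net, DataProvider>(0.7, 0.01);
        trainer.propagation = new MiniBatchPropagation();
        trainer.train(net, provider);                  // costs/accuracys fill up as it runs
    }
}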

Learner

Learner adjusts the network weights based on each propagation result, and the interface is defined as follows:

public interface Learner<N extends Net, P extends DataProvider> {
    public void learn(N net, PropagationResult propResult, P provider);
}

A simple implementation that uses a momentum factor and an adaptive learning rate is:

public class MomentAdaptLearner<N extends Net, P extends DataProvider>
        implements Learner<N, P> {

    double moment = 0.7;
    double lmd = 1.05;
    double preCost = 0;
    double eta = 0.01;
    double currentEta = eta;
    double currentMoment = moment;
    CompactDoubleMatrix preGradient;

    public MomentAdaptLearner(double moment, double eta) {
        super();
        this.moment = moment;
        this.eta = eta;
        this.currentEta = eta;
        this.currentMoment = moment;
    }

    public MomentAdaptLearner() {
    }

    @Override
    public void learn(N net, PropagationResult propResult, P provider) {
        if (this.preGradient == null)
            init(net, propResult, provider);
        double cost = propResult.getMeanCost();
        this.modifyParameter(cost);
        System.out.println("Current eta: " + this.currentEta);
        System.out.println("Current moment: " + this.currentMoment);
        this.updateGradient(net, propResult, provider);
    }

    public void updateGradient(N net, PropagationResult propResult, P provider) {
        CompactDoubleMatrix netCompact = this.getNetCompact(net, propResult, provider);
        CompactDoubleMatrix gradCompact = this.getGradientCompact(net, propResult, provider);
        gradCompact = gradCompact.mul(currentEta * (1 - currentMoment))
                .addi(preGradient.mul(currentMoment));
        netCompact.subi(gradCompact);
        this.preGradient = gradCompact;
    }

    public CompactDoubleMatrix getNetCompact(N net, PropagationResult propResult, P provider) {
        return net.getCompact();
    }

    public CompactDoubleMatrix getGradientCompact(N net, PropagationResult propResult, P provider) {
        return propResult.getCompact();
    }

    public void modifyParameter(double cost) {
        if (this.currentEta > 10) {
            this.currentEta = 10;
        } else if (this.currentEta < 0.0001) {
            this.currentEta = 0.0001;
        } else if (cost < this.preCost) {
            this.currentEta *= 1.05;
            this.currentMoment = moment;
        } else if (cost < 1.04 * this.preCost) {
            this.currentEta *= 0.7;
            this.currentMoment *= 0.7;
        } else {
            this.currentEta = eta;
            this.currentMoment = 0.1;
        }
        this.preCost = cost;
    }

    public void init(N net, PropagationResult propResult, P provider) {
        PropagationResult pResult = new PropagationResult(net);
        preGradient = pResult.getCompact().dup();
    }
}

In the code above, the CompactDoubleMatrix class wraps the weights so that the code stays concise: it behaves as a single super-matrix (super-vector), completely hiding the internal structure.
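Written out, the update rule that updateGradient() applies to the packed super-matrix is

g(t) = currentEta * (1 - currentMoment) * gradient(t) + currentMoment * g(t-1)
weights(t) = weights(t-1) - g(t)

where g(t) is kept in preGradient for the next step, and the whole update is applied once across all weights and biases.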

At the same time, a subclass implements synchronized dictionary updating very concisely: it simply appends the matrices that need adjusting to the super-matrix, and the parent class adjusts them uniformly:

public class DictMomentLearner extends MomentAdaptLearner<Net, DictDataProvider> {

    public DictMomentLearner(double moment, double eta) {
        super(moment, eta);
    }

    public DictMomentLearner() {
        super();
    }

    @Override
    public CompactDoubleMatrix getNetCompact(Net net, PropagationResult propResult,
            DictDataProvider provider) {
        CompactDoubleMatrix result = super.getNetCompact(net, propResult, provider);
        result.append(provider.getDict());
        return result;
    }

    @Override
    public CompactDoubleMatrix getGradientCompact(Net net, PropagationResult propResult,
            DictDataProvider provider) {
        CompactDoubleMatrix result = super.getGradientCompact(net, propResult, provider);
        result.append(DictUtil.getDictGradient(provider, propResult));
        return result;
    }

    @Override
    public void init(Net net, PropagationResult propResult, DictDataProvider provider) {
        DoubleMatrix preDictGradient = DoubleMatrix.zeros(
                provider.getDict().rows, provider.getDict().columns);
        super.init(net, propResult, provider);
        this.preGradient.append(preDictGradient);
    }
}

