From Neural Networks to the BP Algorithm (a Pure Theory Derivation)

Source: Internet
Author: User

The author says: I studied this material once before, but after some time many of the details had become blurry. I recently worked through the derivation again, and in order to preserve the line of reasoning as much as possible, I wrote this blog post: partly as a reference for my future self, and partly to share and discuss it with readers.
A few notes about this post:
1. I cannot guarantee that the derivation is entirely correct; if you find a problem, please point it out.
2. You are welcome to repost this article; my only request is that you cite the source.

Starting from the basic structure of a neural network, this post derives, step by step, the complete procedure for training a neural network with the BP (backpropagation) algorithm, together with the formulas used in the intermediate steps of the derivation.

Neural Network

The structure of the neural network is shown in the figure below. It consists of three parts: an input layer, hidden layers (for convenience the figure shows a single hidden layer, but there can be several), and an output layer. Each layer is made up of several units (neurons). Neurons in adjacent layers are fully connected, while there are no connections between neurons within the same layer.

The notation is as follows. $x=\left[(x^{(1)})^T,(x^{(2)})^T,\ldots,(x^{(m)})^T\right]^T$ is the original input data set. A single input sample is $x^{(i)}=\left[x^{(i)}_1,x^{(i)}_2,\ldots,x^{(i)}_n\right]^T$, i.e. each sample has $n$ features, which correspond to the $n$ neurons of the input layer. Training the network usually requires a labeled training set $\left\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\ldots,(x^{(m)},y^{(m)})\right\}$, with $m$ samples in total. The parameters of the network are $\theta=(W,b)$, where $W$ denotes the connection weights between layers and $b$ denotes the biases; for example, $W^{(l)}_{ij}$ denotes the weight on the connection from unit $j$ in layer $l$ to unit $i$ in layer $l+1$.
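To make the notation concrete, here is a minimal NumPy sketch of the forward pass through such a fully connected network. This is not the author's code; the layer sizes, the sigmoid activation, and the function names are illustrative assumptions. `W[l]` and `b[l]` play the roles of $W^{(l)}$ and $b^{(l)}$ above.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, applied element-wise (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W, b):
    """Propagate one sample x through every layer of theta = (W, b).

    W[l] has shape (units in layer l+1, units in layer l), so that
    W[l][i, j] corresponds to the weight W^{(l)}_{ij} in the text.
    """
    a = x
    for W_l, b_l in zip(W, b):
        a = sigmoid(W_l @ a + b_l)  # weighted input plus bias, then activation
    return a

# Illustrative sizes: n = 3 input features, 4 hidden units, 2 output units.
rng = np.random.default_rng(0)
W = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
b = [np.zeros(4), np.zeros(2)]
x = np.array([0.5, -1.0, 2.0])   # one sample x^{(i)} with n = 3 features
y_hat = forward(x, W, b)
print(y_hat.shape)  # (2,)
```

With a sigmoid output layer every component of `y_hat` lies strictly between 0 and 1; training (the subject of the derivation below) then adjusts $W$ and $b$ so that `y_hat` approaches the label $y^{(i)}$.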
