Deep Learning Practice (II) -- Multilayer Neural Network

I. Preparation

To understand neural networks more deeply, the author implements everything in pure C++ by hand, calling OpenCV only for matrix operations; the dataset is the public A1A dataset.
Experimental environment: Visual Studio 2017, OpenCV 3.2.0, the A1A dataset.
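A1A is distributed in LIBSVM's sparse text format: each line holds a label followed by index:value pairs for the nonzero features. Below is a minimal sketch of loading it into an OpenCV matrix; the name loadA1a and the details are illustrative assumptions, not the author's actual code. Labels are mapped from {+1, -1} to {1, 0} to match a sigmoid output.

#include <opencv2/core.hpp>
#include <fstream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not the author's code): load a LIBSVM-format file
// such as A1A into a dense (123, m) feature matrix X, one sample per
// column, and a (1, m) label row Y. Each input line looks like:
//   -1 3:1 11:1 14:1 ...
bool loadA1a(const std::string& path, cv::Mat& X, cv::Mat& Y,
             int nFeatures = 123)
{
    std::ifstream in(path);
    if (!in) return false;

    std::vector<double> labels;
    std::vector<std::vector<std::pair<int, double>>> samples;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        double label;
        if (!(ss >> label)) continue;             // skip blank lines
        labels.push_back(label > 0 ? 1.0 : 0.0);  // {+1,-1} -> {1,0}
        samples.emplace_back();
        std::string tok;
        while (ss >> tok) {                       // parse "index:value"
            std::size_t colon = tok.find(':');
            int idx = std::stoi(tok.substr(0, colon));    // 1-based index
            double val = std::stod(tok.substr(colon + 1));
            samples.back().emplace_back(idx - 1, val);
        }
    }

    const int m = static_cast<int>(labels.size());
    X = cv::Mat::zeros(nFeatures, m, CV_64F);
    Y = cv::Mat::zeros(1, m, CV_64F);
    for (int j = 0; j < m; ++j) {
        Y.at<double>(0, j) = labels[j];
        for (const auto& p : samples[j])
            X.at<double>(p.first, j) = p.second;
    }
    return true;
}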

This article follows directly from the previous article, Deep Learning Practice (I) -- Logistic Regression.

II. Neural Network Basics

The structure of a standard neural network is shown in the diagram below; it is essentially an enhanced version of the logistic regression model above (that is, with a few hidden layers added), and the basic idea is unchanged. For a more detailed introduction to the principles, Andrew Ng's deep learning course series is recommended.

The following walks through building a three-layer neural network (pictured above) on the A1A dataset. The general steps:

1. Initialize the parameters W1, W2, W3 and b1, b2, b3. The A1A dataset has 123 features, so the input layer has dimensions (123, m), where m is the number of samples (for example, 1065 in the training set). The three layers of our network have (64, 16, 1) neurons respectively, so we initialize weight matrices W1 of shape (64, 123), W2 of shape (16, 64), and W3 of shape (1, 16), so that Z = W a is well-formed, together with the real-valued biases b1, b2, b3 (a code sketch of this step follows the list).
2. Multiply W by x (matrix multiplication; x is the previous layer's output, initially the sample input) and add the bias b (a real number) to get Z.
3. Activate Z. The hidden layers use the ReLU activation function (it mitigates the vanishing-gradient problem and gives good results); the output layer uses sigmoid to bound the output to (0, 1). [Figure: graphs of the ReLU and sigmoid activation functions]
4. After the forward propagation above, define the loss function; the cross-entropy cost is used here: J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log \hat{y}^{(i)} + (1-y^{(i)})\log(1-\hat{y}^{(i)})\right].
5. Backpropagate and update the parameters.
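As a concrete illustration of step 1, here is a minimal C++/OpenCV sketch of the parameter initialization, assuming the 123 input features and the (64, 16, 1) layer sizes above; the Layer struct and initLayer function are illustrative names, not the author's actual code. The weights get small random values so that hidden units break symmetry, while the scalar biases start at zero.

#include <opencv2/core.hpp>
#include <vector>

// Illustrative parameter container: weights as an OpenCV matrix and, as in
// the article, a single real-valued bias per layer.
struct Layer {
    cv::Mat W;   // shape (n_out, n_in), so that Z = W * A + b is well-formed
    double b;    // scalar bias
};

// Hypothetical initializer: small Gaussian weights, zero bias.
Layer initLayer(int nIn, int nOut) {
    Layer layer;
    layer.W.create(nOut, nIn, CV_64F);
    cv::randn(layer.W, cv::Scalar::all(0.0), cv::Scalar::all(0.01));
    layer.b = 0.0;
    return layer;
}

int main() {
    // A1A: 123 input features; hidden layers of 64 and 16; one output unit.
    std::vector<Layer> net = { initLayer(123, 64),
                               initLayer(64, 16),
                               initLayer(16, 1) };
    return 0;
}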

Basic forward-propagation formulas:
Here the superscript [l] denotes the layer, and the superscript (i) denotes the i-th sample (corresponding to the i-th line of the A1A dataset); for example, a^{[0]} denotes the layer-0 activation, i.e., the sample input.

Z^{[1]} = W^{[1]} a^{[0]} + b^{[1]} \tag{1}
a^{[1]} = \mathrm{ReLU}(Z^{[1]}) \tag{2}
Z^{[2]} = W^{[2]} a^{[1]} + b^{[2]} \tag{3}
a^{[2]} = \mathrm{ReLU}(Z^{[2]}) \tag{4}
Z^{[3]} = W^{[3]} a^{[2]} + b^{[3]} \tag{5}
a^{[3]} = \sigma(Z^{[3]}) = \hat{y} \tag{6}
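To make equations (1)-(6) concrete, here is a minimal vectorized forward pass in C++/OpenCV over all m samples at once; the names relu, sigmoid, and forward are illustrative assumptions, not the author's actual implementation. W1, W2, W3 have the shapes from the step list above, and the biases are scalars as in the article.

#include <opencv2/core.hpp>

// Element-wise ReLU: max(z, 0).
static cv::Mat relu(const cv::Mat& Z) {
    return cv::max(Z, 0.0);
}

// Element-wise sigmoid: 1 / (1 + e^{-z}).
static cv::Mat sigmoid(const cv::Mat& Z) {
    cv::Mat negZ = -Z;
    cv::Mat expNegZ;
    cv::exp(negZ, expNegZ);
    cv::Mat denom = 1.0 + expNegZ;
    return 1.0 / denom;
}

// Hypothetical forward pass following equations (1)-(6); A0 is the
// (123, m) input, and the returned matrix is y-hat with shape (1, m).
cv::Mat forward(const cv::Mat& A0,
                const cv::Mat& W1, double b1,
                const cv::Mat& W2, double b2,
                const cv::Mat& W3, double b3)
{
    cv::Mat Z1 = W1 * A0; Z1 += b1;   // eq. (1)
    cv::Mat A1 = relu(Z1);            // eq. (2)
    cv::Mat Z2 = W2 * A1; Z2 += b2;   // eq. (3)
    cv::Mat A2 = relu(Z2);            // eq. (4)
    cv::Mat Z3 = W3 * A2; Z3 += b3;   // eq. (5)
    return sigmoid(Z3);               // eq. (6)
}

With the loader and initializer sketched earlier, a call such as forward(X, net[0].W, net[0].b, net[1].W, net[1].b, net[2].W, net[2].b) would produce predictions for the whole training set in one pass.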
