"Deep Learning Primer - 2015 MLDS" 2. Neural Network (Basic Ideas)

Source: Internet
Author: User
Foundation of Neural Network

(Warning: from this section on, the material uses mathematical notation and requires some calculus and linear algebra.)

Overview of this section

As mentioned in the previous lecture, "learning" means getting the computer to automatically find a complex function that maps an input $x$ to an output $y$. The basic framework of machine learning is shown in the following illustration.

This section will apply this framework from the perspective of neural networks.

First, define a set of candidate functions (the function hypothesis set), also called a model. Then use the training data to find the "best" function $f^*$ in that set. Finally, apply $f^*$ to the test data.

Three questions to answer:
1. What is the model? (What does the hypothesis set look like?)
2. What is the "best" function? (How is "best" defined?)
3. How do we pick the "best" function? (How do we search for it?)
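The three questions above can be made concrete with a deliberately tiny sketch. Everything in it is a hypothetical stand-in rather than the lecture's method: the model is a handful of one-dimensional threshold classifiers, "best" means the fewest misclassified training examples, and the search is brute force over the candidates. The data points are made up for illustration.

```python
# Toy illustration of the three questions; all names and data are hypothetical.

# 1. The model: a hypothesis set of threshold classifiers f_t(x) = 1 if x > t else 0.
def make_classifier(threshold):
    def f(x):
        return 1 if x > threshold else 0
    return f

candidate_thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]

# 2. What "best" means here: the fewest misclassified examples.
def error_count(f, data):
    return sum(1 for x, y in data if f(x) != y)

# 3. How to pick it: brute-force search over the whole hypothesis set.
def pick_best(thresholds, train_data):
    return min(thresholds, key=lambda t: error_count(make_classifier(t), train_data))

# Made-up (input, label) pairs.
train_data = [(0.5, 0), (1.5, 0), (2.5, 1), (3.5, 1)]
test_data = [(1.2, 0), (2.8, 1)]

best_threshold = pick_best(candidate_thresholds, train_data)
f_star = make_classifier(best_threshold)     # the "best" function f*
test_errors = error_count(f_star, test_data)  # evaluate f* on unseen data
```

A neural network changes all three answers (a far larger hypothesis set, a differentiable notion of "best", and a search that is not brute force), but the overall shape of the procedure stays the same.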

The task used in this section is classification: given an input, predict which of a set of known categories it belongs to. The output variable is therefore discrete.

Binary classification problems: spam filtering (spam or not spam); recommender systems (whether or not to recommend a product to the user); malware detection (whether or not the software is malicious); stock forecasting (whether the stock goes up or down).

Multi-class classification problems: handwritten digit recognition (which digit the image shows); image recognition (which object is in the picture).

1. What is the model (function hypothesis set)?

In other words, what the network structure looks like.

For a classification problem, we want to find a function $y = f(x)$, where $x$ is the object to be classified and $y$ is its class. We assume that $x$ and $y$ are both fixed-size vectors, that is, $x \in \mathbb{R}^N$ and $y \in \mathbb{R}^M$.

Taking handwritten digit recognition as an example, each binary image can be rescaled to a uniform 16×16 size. Each dimension of the input vector then records whether the corresponding pixel is 0 or 1, giving 256 dimensions in total. The output vector uses one-hot encoding: the dimension of the correct class is set to 1 and all other dimensions are set to 0.
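As a minimal sketch of this encoding, the snippet below flattens a 16×16 binary image into a 256-dimensional input vector and builds a one-hot output vector. The helper names, the row-by-row flattening order, the crude "vertical stroke" image, and the choice of 10 classes (digits 0 to 9) are assumptions made for illustration.

```python
def flatten_image(image):
    """Turn a 2-D list of 0/1 pixels into one flat input vector, row by row."""
    return [pixel for row in image for pixel in row]

def one_hot(label, num_classes):
    """Return a vector with 1 at index `label` and 0 everywhere else."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec

SIZE = 16
# Hypothetical image: blank except for a vertical stroke, standing in for a "1".
image = [[0] * SIZE for _ in range(SIZE)]
for r in range(2, 14):
    image[r][8] = 1

x = flatten_image(image)   # input vector, one entry per pixel
y = one_hot(1, 10)         # target vector for the class "digit 1"
```

With these choices, `x` has 16 × 16 = 256 entries and `y` has 10, so both are fixed-size vectors as the model requires.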

Back to the problem at hand. With only a single layer of neurons, even a simple logical operation such as XOR cannot be represented (the proof is omitted here), which is why neural networks with multiple hidden layers are introduced.

Neural network as a model
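To illustrate why hidden layers help with XOR, here is a sketch of a network with one hidden layer whose weights are chosen by hand rather than learned. The step activation and the specific weights and thresholds are assumptions picked for readability; many other choices work equally well.

```python
def step(z):
    """Threshold activation: fire (1) only when the weighted input is positive."""
    return 1 if z > 0 else 0

def xor_network(x1, x2):
    # Hidden layer: one neuron behaves like OR, the other like AND.
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    # Output neuron: fires when OR is on but AND is off, which is XOR.
    return step(h_or - h_and - 0.5)

truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```

Each hidden neuron on its own is a single-layer unit computing something linearly separable; it is the combination in the second layer that produces XOR.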

Fully connected feedforward network

The most commonly used neural network architecture is the fully connected feedforward network, shown above. It is commonly called a deep neural network (DNN); the "deep" refers to its large number of hidden layers.

Each node in a layer takes the outputs of all nodes in the previous layer as its input, computes their weighted sum, passes the result through its own activation function, and sends the output to the next layer.

Mathematical notation for neural networks
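The layer-by-layer computation of a fully connected feedforward network (weighted sum of all previous outputs, then an activation function) can be sketched in plain Python as follows. The sigmoid activation, the added bias term, the function names, and the small 2-3-1 network with hand-written weights are all assumptions for illustration; the text does not fix any of them at this point.

```python
import math

def sigmoid(z):
    """One common choice of activation function (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases, activation=sigmoid):
    """Compute one layer; weights[i][j] scales inputs[j] on its way into node i."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * a for w, a in zip(w_row, inputs)) + b  # weighted sum plus bias
        outputs.append(activation(z))
    return outputs

def feedforward(x, layers):
    """Pass x through each (weights, biases) pair in order."""
    a = x
    for weights, biases in layers:
        a = dense_layer(a, weights, biases)
    return a

# Hypothetical 2-3-1 network with made-up parameters.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [0.0, 0.0]], [0.0, 0.0, 0.0]),  # 2 inputs -> 3 hidden nodes
    ([[1.0, 1.0, 1.0]], [0.0]),                                # 3 hidden nodes -> 1 output
]
output = feedforward([1.0, 1.0], layers)
```

Because every node reads every output of the previous layer, each row of `weights` has exactly as many entries as the previous layer has nodes.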

The following notation is used to describe a DNN (very important):

$N^l$: the number of neurons in layer $l$.

Superscripts and subscripts: except for weights, a superscript $l$ denotes the layer a quantity belongs to, and a subscript $i$ denotes the $i$-th neuron of that layer.

$W^l_{ij}$: the weight between two neurons. The subscript $ij$ means the connection goes to the $i$-th neuron of layer $l$ from the $j$-th neuron of layer $l-1$, and the superscript $l$ is the layer of the receiving neuron. This ordering looks odd at first, but it is exactly what makes the product of the weight matrix $\mathbf{W}^l$ and a vector $\mathbf{x}$ well defined.

$b^l_i$: the bias of the $i$-th neuron of layer $l$. The biases of all neurons in layer $l$ form the vector $\mathbf{b}^l$.

$z^l_i$: the input of the $i$-th neuron of layer $l$. The inputs of all neurons in layer $l$ form the vector $\mathbf{z}^l$.

Definition
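The remark about the index order of $W^l_{ij}$ can be checked by writing the weighted sum out. This is a sketch under two assumptions: $\mathbf{x}$ stands for the vector of values arriving from layer $l-1$, and a neuron's input is the weighted sum plus its bias, as described earlier.

```latex
z^l_i = \sum_{j=1}^{N^{l-1}} W^l_{ij}\, x_j + b^l_i
\quad\Longleftrightarrow\quad
\mathbf{z}^l = \mathbf{W}^l \mathbf{x} + \mathbf{b}^l,
\qquad \mathbf{W}^l \in \mathbb{R}^{N^l \times N^{l-1}}
```

With the receiving neuron $i$ as the row index and the sending neuron $j$ as the column index, row $i$ of $\mathbf{W}^l$ holds exactly the weights feeding neuron $i$, so the whole layer is an ordinary matrix-vector product with no transpose needed.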
