NLP Stanford 4: NLP with DL
Contents: starting from a neuron; feedforward computation of a single-layer neural network; the maximum margin objective function; backpropagation.
1. Start with a neuron
A neuron is the most basic computational unit of a neural network: it receives n inputs and produces a single output. Different neurons have different parameters (weights), but they all perform essentially the same computation, applying a fixed formula to their input. The neuron's computation (its activation function) is most commonly the sigmoid function, which receives an n-dimensional vector x and produces a scalar activation a.
Note: w is likewise an n-dimensional vector (the weight vector), and b is the bias.
$a = \frac{1}{1 + \exp\left(-(w^T x + b)\right)}$
[Diagram: a single sigmoid neuron]
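For concreteness, here is a minimal NumPy sketch of a single sigmoid neuron (the input, weight, and bias values below are made-up toy numbers, not from the text):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single sigmoid neuron: a = sigmoid(w^T x + b)."""
    return sigmoid(np.dot(w, x) + b)

# Toy example with n = 3 inputs.
x = np.array([0.5, -1.0, 2.0])   # n-dimensional input
w = np.array([0.1, 0.4, -0.2])   # n-dimensional weight vector
b = 0.3                          # scalar bias
print(neuron(x, w, b))           # a single scalar activation in (0, 1)
```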
2. Single-Layer neural network
A single-layer neural network is an arrangement of multiple neurons. Each neuron receives the same input (note: not all models feed every neuron the same input; we assume so here for simplicity), but may produce a completely different output, because each neuron has its own weight vector and bias. Each neuron can therefore be thought of as attending to different characteristics of the input vector.
For convenience, we define the following:
the layer's output $a = f(z)$, where $z = Wx + b$; here the m neurons' weight vectors are stacked as the rows of the matrix $W$, and their biases are collected into the vector $b$.
[Diagram: a single-layer neural network]
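As a sketch, the whole layer can be computed in a couple of lines of NumPy (the sizes and random parameter values here are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer(x, W, b):
    """A single layer of m sigmoid neurons: a = f(Wx + b) is an m-vector."""
    z = W @ x + b        # one pre-activation per neuron, z in R^m
    return sigmoid(z)    # element-wise activation

rng = np.random.default_rng(0)
n, m = 5, 3                      # 5 inputs, 3 neurons (arbitrary choice)
x = rng.normal(size=n)           # shared input vector
W = rng.normal(size=(m, n))      # row i is neuron i's weight vector
b = rng.normal(size=m)           # one bias per neuron
print(layer(x, W, b).shape)      # (3,) -- one activation per neuron
```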
3. Feedforward calculation
From the previous section we see that every neuron in a single-layer network produces an output, so a layer of m neurons outputs an m-dimensional vector. But if we need to perform a classification, such an output is not directly usable, because we want a single value. So we can use another vector $u \in \mathbb{R}^{m \times 1}$ to produce an (unnormalized) score:
$s = u^T a = u^T f(Wx + b)$
where $f$ is the activation function, applied element-wise.
Note: if the input $x \in \mathbb{R}^{20}$ and the layer has 8 neurons, then $W \in \mathbb{R}^{8 \times 20}$, $b \in \mathbb{R}^{8}$, $u \in \mathbb{R}^{8 \times 1}$, and $s \in \mathbb{R}$.
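Continuing the sketch, the full feedforward score with the dimensions from the note above (the parameter values themselves are random placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=20)        # x in R^20
W = rng.normal(size=(8, 20))   # W in R^{8 x 20}
b = rng.normal(size=8)         # b in R^8
u = rng.normal(size=(8, 1))    # u in R^{8 x 1}

a = sigmoid(W @ x + b)         # a in R^8: the layer's activations
s = (u.T @ a).item()           # s in R: the unnormalized score
print(s)
```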
4. Maximum Margin Objective Function
Like most machine learning models, neural networks need an objective function to optimize. The maximum margin objective is one of the most popular choices, and the idea behind it is simple: make sure that a sample with the "true" label receives a higher score than a sample with the "false" label.
For example, we denote the score of the "true" labeled sentence "Museums in Paris are amazing" by $s$, and the score of the "false" labeled sentence "Not all museums in Paris" by $s_c$. When s
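The notes break off here, but the idea they are building toward can be written as a hinge-style loss: minimize $\max(0, \Delta + s_c - s)$. A minimal sketch of that formulation (the margin $\Delta = 1$ and the toy scores are my own assumptions):

```python
def max_margin_loss(s_true, s_corrupt, delta=1.0):
    """Hinge-style max-margin loss: zero once the 'true' score beats the
    'false' score by at least the margin delta; linear penalty otherwise."""
    return max(0.0, delta + s_corrupt - s_true)

print(max_margin_loss(s_true=2.0, s_corrupt=0.5))   # 0.0: margin satisfied
print(max_margin_loss(s_true=0.3, s_corrupt=0.8))   # 1.5: incurs a penalty
```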