**************************************
Note: This blog series records the author's notes on the "Machine Learning" course taught by Professor Andrew Ng of Stanford University. Having studied the course in depth, the author found that material is easily forgotten without summarizing it, so this series supplements the course content with the author's own notes on points that were initially unclear. The series covers linear regression, logistic regression, neural networks, machine learning application and system design, support vector machines, clustering, dimensionality reduction, anomaly detection, recommender systems, and large-scale machine learning.
**************************************
Neural Networks: Representation
Nonlinear Hypotheses
Adding nonlinear polynomial terms can help us build a better classification model. Suppose we have a very large number of features, say 100 variables, and we want to use these 100 features to construct a nonlinear polynomial model. The result is an enormous feature set: even if we only include the pairwise combinations (x1x2 + x1x3 + x1x4 + ... + x2x3 + x2x4 + ... + x99x100), we end up with close to 5,000 combined features. That is far too many features for ordinary logistic regression to compute.
Many problems are like this. For example, in the car-detection problem in computer vision, a person can easily recognize a picture of a car, but to the "eye" of a computer or camera the picture is just a matrix of pixel values:
So, for the car-detection problem, we need a set of car pictures and a set of non-car pictures as the training set, and we train a classifier to detect cars.
If we use small 50x50-pixel pictures and treat every pixel as a feature, we have 2,500 features; if we go further and include the pairwise feature combinations in a polynomial model, we get about 2500²/2 (nearly 3 million) features. An ordinary logistic regression model cannot handle so many features effectively, which is why we need neural networks.
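The feature counts quoted above can be checked directly. A minimal sketch (the numbers 100, 2,500, and 50x50 come from the text; the formulas are standard combinatorics):

```python
from math import comb

# Number of pairwise products x_i * x_j (i < j) of 100 features: n choose 2.
print(comb(100, 2))        # 4950 -- the "close to 5,000 combinations" in the text

# For a 50x50 image, every pixel is a feature.
n_pixels = 50 * 50         # 2500 features

# Pairwise combinations grow roughly quadratically: about n^2 / 2.
print(n_pixels ** 2 // 2)  # 3125000 -- the "nearly 3 million" in the text
```

The exact count is comb(2500, 2) = 3,123,750; the course uses n²/2 as a convenient approximation.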
We mimic the structure of neurons in the brain to build a simple model, the logistic unit:
We design neural networks modeled on neurons, with the following structure:
Here x1, x2, x3 are the input units, into which we feed the raw data. a1, a2, a3 are intermediate units that process the data and pass the result on to the next layer. Finally, the output unit is responsible for computing h(x). A neural network model is a network of many logistic units organized into layers, where the output variables of each layer are the input variables of the next layer. In this 3-layer neural network, the first layer is called the input layer, the last layer is called the output layer, and the layers in between are called hidden layers. We also add a bias unit to each layer.
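The layer-by-layer computation described above can be sketched as a forward pass. This is a minimal illustration with made-up weight matrices (Theta1 and Theta2 here are arbitrary example values, not weights from the course):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights for a 3-layer network: 3 inputs -> 3 hidden units -> 1 output.
# Theta1 maps the input layer (plus its bias unit) to the hidden layer;
# Theta2 maps the hidden layer (plus its bias unit) to the output unit.
Theta1 = np.array([[ 0.1,  0.3, -0.2,  0.5],
                   [-0.4,  0.2,  0.1,  0.3],
                   [ 0.2, -0.1,  0.4, -0.3]])
Theta2 = np.array([[ 0.3, -0.2,  0.5,  0.1]])

def forward(x):
    a1 = np.insert(x, 0, 1.0)   # input layer with bias unit x0 = 1
    a2 = sigmoid(Theta1 @ a1)   # hidden layer activations
    a2 = np.insert(a2, 0, 1.0)  # add bias unit a0 = 1
    return sigmoid(Theta2 @ a2) # output h(x)

print(forward(np.array([1.0, 0.0, 1.0])))
```

Each layer's output feeds the next layer as input, exactly as the text describes, and the bias unit is prepended at every layer.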
Model Representation II
Compared with coding the computation as loops, vectorization makes the calculation much simpler.
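To see the equivalence, we can compute one layer's pre-activations z = Θa both ways, with a loop over units and as a single matrix-vector product (the weights and activations here are arbitrary example values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
Theta = rng.normal(size=(3, 4))          # hypothetical layer weights: 3 units, 4 inputs
a = np.array([1.0, 0.5, -0.2, 0.8])      # previous-layer activations (bias included)

# Loop version: one weighted sum per unit.
z_loop = np.array([sum(Theta[j, k] * a[k] for k in range(4)) for j in range(3)])

# Vectorized version: a single matrix-vector product z = Theta @ a.
z_vec = Theta @ a

assert np.allclose(z_loop, z_vec)        # identical results, far less code
print(sigmoid(z_vec))
```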
In fact, a neural network works like logistic regression, except that instead of feeding the raw input vector [x1 ~ x3] into the logistic regression, we feed in the middle-layer values [a(2)1 ~ a(2)3]. We can regard a0, a1, a2, a3 as more advanced feature values, i.e., evolved versions of x0, x1, x2, x3. They are determined by x, and because gradient descent updates the weights, the values of a change and become increasingly powerful. These higher-level features are far more expressive than the raw x features alone and predict new data better. This is the advantage of neural networks over logistic regression and linear regression.
Features and Intuitive Understanding
Our goal is to use a neural network to implement the XNOR operation from logical algebra, so we first introduce the XNOR and XOR operations:
XNOR and XOR are logical functions of just two logical variables. If the logical function F equals 1 when the two logical variables A and B are the same, and 0 otherwise, this relationship is called XNOR. Conversely, if F equals 1 when A and B are different, and 0 otherwise, this relationship is called XOR.
This corresponds to a nonlinear classifier, as shown:
First, we introduce three basic logical operations (AND, OR, and NOT) and their corresponding neural network implementations, and finally we combine these basic operations to build the final neural network for the XNOR operation.
The neuron on the left (with three weights -30, 20, 20) can be regarded as performing logical AND; the neuron in the middle of the figure (with three weights -10, 20, 20) as logical OR; and the neuron on the right (with two weights 10, -20) as logical NOT:
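Each of these single sigmoid units can be checked against its truth table. A minimal sketch using the weights quoted above (the sigmoid saturates, so its output is essentially 0 or 1 for these inputs):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Single sigmoid units with the weights from the figure.
def AND(x1, x2):
    return sigmoid(-30 + 20 * x1 + 20 * x2) > 0.5

def OR(x1, x2):
    return sigmoid(-10 + 20 * x1 + 20 * x2) > 0.5

def NOT(x1):
    return sigmoid(10 - 20 * x1) > 0.5

# Check the truth tables.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2))
print(NOT(0), NOT(1))
```

For AND, the weighted sum only exceeds 0 (so the sigmoid only exceeds 0.5) when both inputs are 1: -30 + 20 + 20 = 10; for OR, a single 1 is enough: -10 + 20 = 10.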
We can combine neurons into more complex neural networks that perform more complex operations. For example, to implement the XNOR function (which outputs 1 only when its two inputs are the same, both 1 or both 0), we use XNOR = (x1 AND x2) OR ((NOT x1) AND (NOT x2)).
In the second (hidden) layer, a1 and a2 represent x1 AND x2 and (NOT x1) AND (NOT x2), respectively; applying an OR operation to a1 and a2 then yields the XNOR operation. The output of this neural network matches the truth table of XNOR.
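The composition above can be sketched as a two-layer computation. The AND and OR weights are those quoted earlier; the weights (10, -20, -20) for the (NOT x1) AND (NOT x2) unit are an assumption consistent with the single-unit examples, since they fire only when both inputs are 0:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xnor(x1, x2):
    # Hidden layer: a1 = x1 AND x2, a2 = (NOT x1) AND (NOT x2).
    a1 = sigmoid(-30 + 20 * x1 + 20 * x2)
    a2 = sigmoid(10 - 20 * x1 - 20 * x2)   # exceeds 0.5 only when x1 = x2 = 0
    # Output layer: a1 OR a2.
    return sigmoid(-10 + 20 * a1 + 20 * a2) > 0.5

# Check against the XNOR truth table: 1 when inputs match, 0 otherwise.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xnor(x1, x2))
```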
Multi-class Classification
When we have more than two classes (i.e., y = 1, 2, 3, ...), for example when we want to train a neural network to recognize pedestrians, cars, motorcycles, and trucks, the output layer should have 4 values. For instance, the first value (1 or 0) predicts whether the image is a pedestrian, the second whether it is a car, and so on. The input vector x has three dimensions, there are two intermediate layers, and the 4 neurons in the output layer represent the 4 classes; that is, each training example produces an output-layer vector [a b c d]^T in which exactly one of a, b, c, d is 1, indicating the current class. The following is one possible structure for this neural network:
When one element of the vector is 1 and the others are 0, the classification result is the category corresponding to the 1. This differs from the multi-class representation used earlier with logistic regression, where the output y was a single value from {1, 2, 3, 4} rather than a vector. So, to train a neural network model for a multi-class classification problem, the training set is:
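The vector labels described above are one-hot encodings of the class values. A minimal sketch (the class-to-index mapping here is a hypothetical example following the pedestrian/car/motorcycle/truck ordering in the text):

```python
import numpy as np

# Hypothetical class labels: 1 = pedestrian, 2 = car, 3 = motorcycle, 4 = truck.
def one_hot(y, num_classes=4):
    """Convert a class label y in {1, ..., K} into the target vector for the output layer."""
    v = np.zeros(num_classes)
    v[y - 1] = 1.0
    return v

print(one_hot(2))  # [0. 1. 0. 0.] -- the "car" class
```

During training, each scalar label y(i) is replaced by its one-hot vector, so the network's 4-unit output can be compared against it directly.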
******************
hao_09
Date: 2015/8/13
Article Address: http://blog.csdn.net/lsh_2013/article/details/47454079
******************
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Machine Learning: Neural Network Representation