This post introduces a neural network package in R, neuralnet: we simulate a small data set, show how the package is used in R, and how a network is trained and used for prediction. Before introducing neuralnet, let us briefly review the neural network algorithm itself.
An artificial neural network (ANN), or simply neural network, is a mathematical or computational model that simulates the structure and function of biological neural networks. A neural network computes with a large number of interconnected artificial neurons. In most cases it can change its internal structure in response to external information, making it an adaptive system. Modern neural networks are non-linear statistical data-modeling tools, often used to model complex relationships between inputs and outputs, or to discover patterns in data.
The artificial neural network simulates intelligent human behavior in the following four respects:
Physical structure: artificial neurons simulate the function of biological neurons.
Computation: neurons in the human brain have local computing and storage functions and are connected to form a system. An artificial neural network likewise contains a large number of neurons with local processing capability, so information can be processed massively in parallel.
Storage and operation: both the human brain and the artificial neural network store memory in the connection strengths between neurons, which provides strong support for generalization and analogy.
Training: like the human brain, an artificial neural network uses training and learning processes suited to its own structure to acquire relevant knowledge automatically.
A neural network is an operational model consisting of a large number of interconnected nodes (or "neurons", or "units"). Each node represents a specific output function, called the excitation (activation) function. Each connection between two nodes carries a weighted value for the signal passing through it, called the weight; the weights are equivalent to the memory of the artificial neural network. The output of the network depends on its connection topology, its weight values and its excitation function. The network itself is usually an approximation of some algorithm or function found in nature, or the expression of a logical strategy.
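As a concrete illustration of a single node, the sketch below (plain R, no packages; the input values, weights, and the choice of a sign-type excitation function are invented for illustration) computes one neuron's output as the excitation function applied to a weighted sum of its inputs:

```r
# A single artificial neuron: a weighted sum of the inputs passed through
# an excitation (activation) function. The weights here are arbitrary
# illustrative values, not learned ones.
neuron <- function(x, w, b, f = sign) {
  f(sum(w * x) + b)
}

x <- c(0.5, -1.2, 3.0)   # input signals
w <- c(0.8, 0.4, 0.3)    # connection weights (the network's "memory")
b <- -0.5                # bias term
neuron(x, w, b)          # sign-type excitation: output is -1, 0 or 1
```

Swapping in a different `f` (for example a smooth sigmoid) is all it takes to turn this threshold unit into the kind of neuron neuralnet trains later in this post.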
One. Perceptron
The perceptron is equivalent to a single layer of a neural network: a linear combination followed by a binary threshold. A single-layer perceptron forms the simplest ANN system:
The perceptron computes a linear combination of its inputs with a real-valued weight vector, outputs 1 if the result is greater than a certain threshold, and −1 otherwise.
The perceptron function can therefore be written as sign(w·x); a bias b is sometimes added, giving sign(w·x + b).
Learning a perceptron means choosing suitable values w0, ..., wn. The hypothesis space H that perceptron learning considers is therefore the set of all possible real-valued weight vectors.
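The perceptron function sign(w·x + b) is a one-liner in R. In the sketch below the weights and bias are hypothetical hand-picked values (not learned) that happen to implement a logical AND of two binary inputs:

```r
# Perceptron output: sign(w . x + b), returning +1 or -1.
perceptron <- function(x, w, b) {
  ifelse(sum(w * x) + b > 0, 1, -1)
}

w <- c(1, 1)                # hand-picked weights (illustrative, not learned)
b <- -1.5                   # threshold placed between 1 and 2
perceptron(c(1, 1), w, b)   # both inputs on  -> 1
perceptron(c(1, 0), w, b)   # one input off   -> -1
```

Training, described next, is the process of finding such a w and b automatically instead of by hand.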
Training steps of the algorithm:
1. Define the variables and parameters: x (input vector), w (weight vector), b (bias), y (actual output), d (desired output), a (learning-rate parameter).
2. Initialize: n = 0, w = 0.
3. Feed in a training sample and specify its desired output: class A is recorded as 1, class B as −1.
4. Compute the actual output y = sign(w·x + b).
5. Update the weight vector: w(n+1) = w(n) + a[d − y(n)]·x(n).
6. Check convergence: if the convergence condition is satisfied the algorithm ends; otherwise return to step 3.
Note that the learning rate a should not be too large, for the sake of weight stability, nor too small, so that errors are still reflected in the weight corrections; in the end, choosing it is an empirical matter.
As the description above suggests, the perceptron is limited to linearly separable cases; it cannot classify non-separable problems correctly. The idea is very similar to that of support vector machines, but the way the separating line is determined differs: for a linearly separable case, the support vector machine finds the "best" separating line, while the single-layer perceptron merely finds a feasible one.
We take the iris data set as an example (the first 10 of its 150 rows are shown):
ID  sepal.length  sepal.width  petal.length  petal.width  species
1   5.1           3.5          1.4           0.2          setosa
2   4.9           3.0          1.4           0.2          setosa
3   4.7           3.2          1.3           0.2          setosa
4   4.6           3.1          1.5           0.2          setosa
5   5.0           3.6          1.4           0.2          setosa
6   5.4           3.9          1.7           0.4          setosa
7   4.6           3.4          1.4           0.3          setosa
8   5.0           3.4          1.5           0.2          setosa
9   4.4           2.9          1.4           0.2          setosa
10  4.9           3.1          1.5           0.1          setosa
Since the single-layer perceptron is a binary classifier, we divide the iris data into two categories: "setosa" versus the rest ("versicolor" and "virginica" together form the second class). We then classify the data by two features: petal length and petal width.
Run the following code:
# Perceptron training code:
a <- 0.2
w <- rep(0, 3)
iris1 <- t(as.matrix(iris[, 3:4]))
d <- c(rep(0, 50), rep(1, 100))       # setosa = 0, the other two species = 1
e <- rep(0, 150)
p <- rbind(rep(1, 150), iris1)        # leading row of 1s absorbs the bias
max <- 100000
eps <- rep(0, 100000)
i <- 0
repeat {
  v <- w %*% p
  y <- ifelse(sign(v) >= 0, 1, 0)
  e <- d - y
  eps[i + 1] <- sum(abs(e)) / length(e)   # misclassification rate
  if (eps[i + 1] < 0.01) {
    print("finish:")
    print(w)
    break
  }
  w <- w + a * (d - y) %*% t(p)           # batch weight update
  i <- i + 1
  if (i > max) {
    print("max time loop")
    print(eps[i])
    print(y)
    break
  }
}
# Plotting code:
plot(Petal.Length ~ Petal.Width, xlim = c(0, 3), ylim = c(0, 8),
     data = iris[iris$Species == "virginica", ])
data1 <- iris[iris$Species == "versicolor", ]
points(data1$Petal.Width, data1$Petal.Length, col = 2)
data2 <- iris[iris$Species == "setosa", ]
points(data2$Petal.Width, data2$Petal.Length, col = 3)
x <- seq(0, 3, 0.01)
# decision boundary w1 + w2*length + w3*width = 0, solved for length
y <- x * (-w[3] / w[2]) - w[1] / w[2]
lines(x, y, col = 4)
Two. The neural network package in R: neuralnet
In this section we output a neural network topology diagram via neuralnet. We simulate a very simple data set: the output variable consists of independently distributed random numbers, and the input variable is the square of the output variable (so the network has to learn the square root). A network with 10 hidden neurons is trained.
The input, the expected output and the neural network's predictions are as follows:
Input  Expected Output  Neural Net Output
1      1                0.9623402772
4      2                2.0083461217
9      3                2.9958221776
16     4                4.0009548085
25     5                5.0028838579
36     6                5.9975810435
49     7                6.9968278722
64     8                8.0070028670
81     9                9.0019220736
100    10               9.9222007864
The training code is as follows:
# Install and load the neuralnet package (the grid and MASS packages are dependencies)
install.packages("neuralnet")
library(neuralnet)

# Construct 50 random numbers, independently uniformly distributed between 0 and 100,
# and save them as a data frame (data.frame)
traininginput <- as.data.frame(runif(50, min = 0, max = 100))
trainingoutput <- sqrt(traininginput)

# Bind the input and output vectors into a single data set with cbind
trainingdata <- cbind(traininginput, trainingoutput)
colnames(trainingdata) <- c("Input", "Output")

# Train a neural network with 10 hidden neurons
net.sqrt <- neuralnet(Output ~ Input, trainingdata, hidden = 10, threshold = 0.01)
print(net.sqrt)

# Draw the neural network topology diagram
plot(net.sqrt)

# Test the trained network on the squares 1^2, ..., 10^2
testdata <- as.data.frame((1:10)^2)
net.results <- compute(net.sqrt, testdata)
ls(net.results)

# View the results
print(net.results$net.result)

# Make the results more intuitive
cleanoutput <- cbind(testdata, sqrt(testdata),
                     as.data.frame(net.results$net.result))
colnames(cleanoutput) <- c("Input", "Expected Output", "Neural Net Output")
print(cleanoutput)
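To summarize how close the network gets to the true square roots, we can compute a root-mean-square error over the test points. The predictions below are copied from the table earlier in this section; a fresh training run is stochastic and will give slightly different values:

```r
# Network predictions for inputs (1:10)^2, taken from the table above
net_out <- c(0.9623402772, 2.0083461217, 2.9958221776, 4.0009548085,
             5.0028838579, 5.9975810435, 6.9968278722, 8.0070028670,
             9.0019220736, 9.9222007864)
expected <- 1:10   # true square roots of the test inputs

# Root-mean-square error between prediction and truth
rmse <- sqrt(mean((net_out - expected)^2))
round(rmse, 4)   # about 0.0276 for this particular run
```

An RMSE below 0.03 across outputs ranging from 1 to 10 shows the 10-hidden-neuron network has approximated the square-root function quite closely on this test grid.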