RBF Neural Network

Source: Internet
Author: User

This digest is drawn from "Pattern Recognition and Intelligent Computing: MATLAB Technology Implementation, 3rd Edition" and "MATLAB Neural Network: Analysis of 43 Cases".

"Note" The Blue font for your own understanding part

Advantages of the radial basis function (RBF) neural network: its approximation ability, classification ability, and learning speed are all superior to those of a BP network. Its structure is simple, training is concise, convergence is fast, it can approximate any nonlinear function, and it overcomes the local-minimum problem. The reason is that its parameters are initialized by a principled method rather than at random.

An RBF network is a three-layer feed-forward network with a single hidden layer. The first layer is the input layer, composed of signal-source nodes. The second layer is the hidden layer; the number of hidden nodes depends on the problem being modeled. The transfer function of a hidden neuron is a non-negative nonlinear function that is radially symmetric about a center point and decays away from it, i.e. a locally responsive function. This local response, realized in the transformation from the input layer to the hidden layer, is what distinguishes RBF networks from earlier feed-forward networks, whose transfer functions respond globally. The third layer is the output layer, which responds to the input pattern. The input layer merely passes the signal on; the connections between the input layer and the hidden layer can be regarded as fixed weights of value 1. The output layer and the hidden layer perform different tasks, so their learning strategies also differ: the output layer adjusts linear weights and uses a linear optimization strategy, hence it learns quickly, while the hidden layer adjusts the parameters of the activation function (a Green's function or a Gaussian function; usually the latter) and uses a nonlinear optimization strategy, hence it learns slowly. This distinction can be seen in the layer-to-layer transformations described below.

The basic idea of the RBF network: use RBFs as the "basis" of the hidden units to form the hidden-layer space. The hidden layer transforms the input vector, mapping low-dimensional input data into a high-dimensional space, so that a problem that is linearly inseparable in the low-dimensional space becomes linearly separable in the high-dimensional space. In detail, the hidden-layer space is spanned by the RBF "basis", so the input vectors are mapped directly into the hidden space (not through weighted connections). Once the RBF centers are fixed, this mapping is fixed. The mapping from the hidden-layer space to the output space, in contrast, is linear (note the distinction here between linear and nonlinear mappings): the network output is a linear weighting of the hidden-unit outputs, and these weights are the network's tunable parameters.
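The claim that a linearly inseparable problem becomes separable after the RBF mapping can be checked on the classic XOR example. This is only a sketch: the two centers at (0,0) and (1,1) and the unit width are hand-picked assumptions, not values prescribed by the text.

```python
import numpy as np

# XOR is not linearly separable in the original 2-D input space.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Two Gaussian RBF "basis" units with hand-picked (hypothetical) centers.
centers = np.array([[0, 0], [1, 1]], dtype=float)

def rbf_features(X, centers, width=1.0):
    """Hidden-layer mapping: phi_j(x) = exp(-||x - c_j||^2 / width^2)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / width ** 2)

Phi = rbf_features(X, centers)
# In the 2-D feature space the four points ARE linearly separable,
# so a purely linear output mapping (least squares) solves XOR.
Phi1 = np.hstack([Phi, np.ones((4, 1))])      # add a bias column
w, *_ = np.linalg.lstsq(Phi1, y, rcond=None)
pred = (Phi1 @ w > 0.5).astype(float)
```

After the mapping, a single linear threshold separates the two classes, which is exactly the "nonlinear hidden mapping, linear output mapping" division of labor described above.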

The following figure shows the radial basis neuron model.

"MATLAB Neural Network: Analysis of 43 Cases" describes it this way: the activation function of a radial basis network takes as its argument the distance ||dist|| between the input vector and the weight vector (note that this weight vector is not the hidden-to-output weight; see the radial basis neuron model structure). In MATLAB's notation, the general expression of the radial basis neuron is a = radbas(||w - x|| · b), with radbas(n) = exp(-n²).

"Pattern recognition and calculation only" describes: Radial basis network transfer function is the distance between the input vector and the threshold vector | | X-CJ | | As a self-variable, where | | X-CJ | | is obtained by the product of the line vector of the input vector and the weighted matrix C. The c here is the central parameter of each neuron in the hidden layer, the size is the number of hidden layer neurons * the number of visible layer cells. Furthermore, the Central parameter C of each hidden neuron corresponds to a width vector d, so that different input information can be reflected to the maximum extent by different hidden layer neurons.


The resulting r is the output value of the hidden-layer neuron.

As the distance between the input vector and the weight vector decreases, the network output increases; when the input vector coincides with the weight vector, the neuron outputs 1. The b in the figure is a threshold that adjusts the neuron's sensitivity. Combining radial basis neurons with linear neurons yields the generalized regression neural network (GRNN), which is well suited to function approximation; combining radial basis neurons with competitive neurons yields the probabilistic neural network (PNN), which is well suited to classification problems.

The RBF learning algorithm must determine three sets of parameters: the centers of the basis functions, their variances (widths), and the weights from the hidden layer to the output layer.

RBF Neural Network center selection methods:

For the RBF learning algorithm, the key problem is determining the center parameters of the hidden-layer neurons reasonably. Common approaches either select the center parameters (or their initial values) directly from a given set of training samples, or determine them by clustering.

① Direct calculation method (random selection of RBF centers)

The centers of the hidden neurons are selected at random from the input samples and then held fixed. Once the centers are fixed, the outputs of the hidden neurons are known, so the connection weights of the network can be determined by solving a system of linear equations. This method suits sample data whose distribution is clearly representative.
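A minimal sketch of this method on a toy 1-D regression problem. The target function, the number of centers, and the fixed width are all illustrative assumptions; the point is that once the centers are fixed, the weights come from a single linear least-squares solve.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: approximate sin(x) on [0, 2*pi]  (an assumed example).
X = np.linspace(0, 2 * np.pi, 40)[:, None]
y = np.sin(X).ravel()

# (1) Pick the hidden-layer centers at random from the training samples; fix them.
p = 10
centers = X[rng.choice(len(X), size=p, replace=False)]
width = 1.0                       # hand-chosen fixed width (assumption)

# (2) With the centers fixed, the hidden outputs are fully determined ...
Phi = np.exp(-((X - centers.T) ** 2) / width ** 2)   # shape (40, p)

# ... so the output weights follow from solving a linear system (least squares).
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
max_err = np.max(np.abs(Phi @ w - y))
```

No nonlinear optimization is needed anywhere in this variant, which is why it is the fastest of the four center-selection methods.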

② Self-organized learning for selecting the RBF centers

The RBF centers are allowed to move, and their locations are determined by self-organized learning, while the linear weights of the output layer are determined by supervised learning. This method amounts to reallocating the network's resources: through learning, the centers of the RBF hidden neurons are placed in the important regions of the input space. The usual tool is k-means clustering to choose the centers, which is an unsupervised (teacher-less) learning method.
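A sketch of the unsupervised step, using a plain k-means loop over NumPy arrays. The two-blob toy data are an assumption; the resulting cluster centers are what would serve as the RBF centers.

```python
import numpy as np

def kmeans_centers(X, k, iters=20, seed=0):
    """Plain k-means; the final cluster centers serve as the RBF centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each sample to its nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # move each center to the mean of its cluster (skip empty clusters)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

# Two well-separated blobs -> one center should land near each blob mean.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers = kmeans_centers(X, k=2)
```

The centers end up in the dense ("important") regions of the input space, which is exactly the resource-reallocation effect described above.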

③ Supervised (teacher-based) learning for selecting the RBF centers

The centers and the other weight parameters are obtained by training on the sample set so as to satisfy the supervised objective. The common method is gradient descent.

④ Orthogonal least squares for selecting the RBF centers

The idea of orthogonal least squares (OLS) comes from linear regression models. The network output is in fact a linear combination of the hidden neurons' responses (the regression factors) with the hidden-to-output connection weights. The regression factors of all hidden neurons form the regression vectors, and the learning process consists mainly of orthogonalizing these regression vectors.

In many practical problems, the centers of the RBF hidden neurons are not simply sample points or cluster centers of the training set; they must be obtained by learning, so that the resulting centers better reflect the information contained in the training data.

The topological structure of an RBF neural network with a Gaussian kernel

First layer (input layer): composed of signal-source nodes; it only passes the data on and applies no transformation to the input.

Second layer (hidden layer): the number of nodes is problem-dependent. The kernel (basis) function of the hidden neurons is the Gaussian function, which performs the spatial mapping of the input information.

Third layer (output layer): responds to the input pattern. The output neurons use a linear function; the hidden-layer outputs are linearly weighted and summed to produce the output of the whole network.



The transfer function of a radial basis network takes the distance ||X - Cj|| between the input vector and the center vector as its argument, where ||X - Cj|| is computed from the input row vector and the center matrix C. The radial basis transfer function can take many forms. Common ones are:

① Gaussian function:

R(r) = exp( -r² / (2σ²) )

② Reflected sigmoidal function:

R(r) = 1 / ( 1 + exp( r² / σ² ) )

③ Inverse multiquadric function:

R(r) = 1 / sqrt( r² + σ² )

In each case r = ||x - c|| is the distance from the input to the center and σ is the width parameter.
The most commonly used is still the Gaussian function, and this article uses the Gaussian function:

When the input argument is 0, the transfer function attains its maximum value of 1. As the distance between the input vector and the weight (center) vector decreases, the network output increases. In other words, the radial basis function responds locally to the input signal: when the input x falls near the function's center, the hidden node produces a large output. It follows that this kind of network has a local approximation capability.
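The local-response behavior is easy to see numerically. A sketch using the Gaussian form above with an assumed width σ = 1:

```python
import numpy as np

def gaussian_rbf(dist, sigma=1.0):
    """Gaussian transfer function: maximal (= 1) at zero distance,
    decaying monotonically as the distance grows."""
    return np.exp(-dist ** 2 / (2 * sigma ** 2))

out_at_center = gaussian_rbf(0.0)   # input coincides with the center
out_nearby    = gaussian_rbf(0.5)   # input close to the center
out_far_away  = gaussian_rbf(5.0)   # input far from the center
```

Only inputs near the center produce an appreciable output, which is the "local approximation" property the text refers to.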


When an input vector is presented to the network, each radial basis neuron outputs a value indicating how close the input vector is to that neuron's weight vector. If the input differs greatly from the weight vector, the radial basis output is close to 0; if the input is close to the weight vector, the output is close to 1. The hidden outputs are then weighted by the second-layer linear neurons. In this process, if exactly one radial basis neuron outputs 1 while the others output 0 or nearly 0, the linear neuron's output equals the second-layer weight associated with the neuron whose output is 1.

RBF Network Training:

The goal of training is to find the final values of the parameters of both layers: the centers cj, the widths dj, and the weights wj.

The training process is divided into two steps. Step one is unsupervised learning: determine the weights cj and dj between the input layer and the hidden layer. Step two is supervised learning: determine the weights wj between the hidden layer and the output layer.

Before training, the input vectors X, the corresponding desired output vectors O, and the width vectors dj of the radial basis functions must be provided.

When the l-th input sample (l = 1, 2, ..., N) is used in training, the parameters are expressed and computed as follows:

(1) Determine the parameters

① Determine the input vector X:

X = (x1, x2, ..., xn)ᵀ, where n is the number of input-layer units.

② Determine the actual output vector Y and the desired output vector O:

Y = (y1, y2, ..., yq)ᵀ, O = (o1, o2, ..., oq)ᵀ, where q is the number of output-layer units.


③ Initialize the connection weights from the hidden layer to the output layer:

W_k = (w_k1, w_k2, ..., w_kp), k = 1, 2, ..., q, where p is the number of hidden-layer units and q is the number of output-layer units.

By analogy with the center-initialization method given below, the weights from the hidden layer to the output layer can be initialized as:


where min_k is the minimum of the desired outputs of the k-th output neuron over the whole training set, and max_k is the corresponding maximum.

④ Initialize the center parameters of the hidden-layer neurons. Different hidden neurons should have different centers, and the width associated with each center should be adjustable, so that different features of the input information can be reflected by different hidden neurons. In practice, each input feature always lies within some range of values. Without loss of generality, the initial values of the center components of the hidden neurons are stepped from small to large, so that weaker input information produces a stronger response near the smaller centers. The spacing can be adjusted through the number of hidden neurons. The advantage of this scheme is that a reasonable number of hidden neurons can be found by experiment, the centers are initialized as reasonably as possible, and different input features are reflected distinctly at different centers, embodying the character of the Gaussian kernel.

Based on the above four points, the initial values of the RBF center parameters are:

c_ji = min_i + (max_i - min_i) / (2p) + (j - 1)(max_i - min_i) / p    (p is the total number of hidden neurons; j = 1, 2, ..., p)

where min_i is the minimum of the i-th feature over all input samples in the training set, and max_i is the corresponding maximum.

⑤ Initialize the width vectors. The width vector governs how wide a range of input a neuron responds to: the smaller the width, the narrower the shape of the corresponding hidden neuron's basis function, and the weaker its response near the centers of the other neurons. One calculation method:

d_ji = df · sqrt( (1/N) · Σ_l (x_i^l - c_ji)² )
Here df is the width-adjustment coefficient, taken less than 1. Its effect is to make each hidden neuron respond more readily to local information, which improves the local-response capability of the RBF neural network.
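The two initialization steps above can be sketched as follows. The even spacing of the centers between each feature's minimum and maximum, and the RMS-based width, are one plausible reading of the scheme described in the text, not the books' exact formulas:

```python
import numpy as np

def init_centers_widths(X, p, df=0.5):
    """Evenly spaced initial centers between each feature's min and max
    (stepped from small to large, one step per hidden neuron), plus widths
    proportional (factor df < 1) to the data's spread around each center."""
    mins, maxs = X.min(axis=0), X.max(axis=0)        # per-feature range
    j = np.arange(1, p + 1)[:, None]                 # hidden-neuron index 1..p
    # center of neuron j for feature i, evenly spaced inside [min_i, max_i]
    C = mins + (2 * j - 1) * (maxs - mins) / (2 * p)  # shape (p, n_features)
    # width = df times the RMS distance of the samples from each center component
    D = df * np.sqrt(((X[None, :, :] - C[:, None, :]) ** 2).mean(axis=1))
    return C, D

# a tiny made-up data set: 4 samples, 2 features
X = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [4.0, 50.0]])
C, D = init_centers_widths(X, p=4)
```

With p = 4 the centers of the first feature land at 0.5, 1.5, 2.5, 3.5 — stepped from small to large across the feature's range, as the text requires.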

(2) Compute the output value z_j of the j-th hidden-layer neuron:

z_j = exp( - || (X - C_j) / D_j ||² ),  j = 1, 2, ..., p   (the division is component-wise)

where C_j is the center vector of the j-th hidden neuron, with one component for each input-layer neuron, and D_j is its width vector. The larger D_j is, the wider the range of input the hidden neuron responds to and the smoother the interpolation between neurons; ||·|| denotes the Euclidean norm.

(3) Compute the outputs of the output-layer neurons:

y_k = Σ_{j=1..p} w_kj · z_j,  k = 1, 2, ..., q

where w_kj is the adjustable weight between the k-th output neuron and the j-th hidden neuron.
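Steps (2) and (3) together form a single forward pass. A sketch with made-up tiny shapes (2 hidden neurons, 1 input, 1 output; all numbers illustrative):

```python
import numpy as np

def rbf_forward(x, C, D, W):
    """One forward pass: Gaussian hidden layer, then a purely linear output.
    C: (p, n) centers, D: (p, n) width vectors, W: (q, p) output weights."""
    # hidden output z_j = exp(-|| (x - c_j) / d_j ||^2), element-wise division
    z = np.exp(-(((x - C) / D) ** 2).sum(axis=1))
    # output y_k = sum_j w_kj * z_j  -- a plain linear weighting, no squashing
    y = W @ z
    return z, y

C = np.array([[0.0], [1.0]])    # p = 2 hidden neurons, n = 1 input
D = np.ones((2, 1))             # unit widths
W = np.array([[1.0, -1.0]])     # q = 1 output neuron
z, y = rbf_forward(np.array([0.0]), C, D, W)
```

For the input x = 0, the neuron centered at 0 outputs exactly 1 while the neuron centered at 1 outputs exp(-1), and the network output is just their linear weighting.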

(4) iterative calculation of weight parameters

The weight parameters of the RBF neural network are trained by gradient descent: the centers, widths, and adjustable weights are driven adaptively toward their optimal values through learning. The iteration is computed as follows:



w_kj(t) = w_kj(t-1) - η · ∂E/∂w_kj + α · [ w_kj(t-1) - w_kj(t-2) ]

c_ji(t) = c_ji(t-1) - η · ∂E/∂c_ji + α · [ c_ji(t-1) - c_ji(t-2) ]

d_ji(t) = d_ji(t-1) - η · ∂E/∂d_ji + α · [ d_ji(t-1) - d_ji(t-2) ]

where w_kj(t) is the adjustable weight between the k-th output neuron and the j-th hidden neuron at iteration t; c_ji(t) is the i-th center component of the j-th hidden neuron at iteration t; d_ji(t) is the corresponding width component; η is the learning factor and α is the momentum factor.

E is the evaluation function of the RBF neural network:

E = (1/2) · Σ_{l=1..N} Σ_{k=1..q} (o_k^l - y_k^l)²

where o_k^l is the desired output of the k-th output neuron for the l-th input sample, and y_k^l is the corresponding network output.
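The evaluation function and one gradient-descent step on the output weights can be sketched as follows; the centers and widths would be updated in the same pattern via their own partial derivatives, and the learning factor eta here is an illustrative small value:

```python
import numpy as np

def mse_half(O, Y):
    """E = 1/2 * sum over samples l and outputs k of (o_lk - y_lk)^2."""
    return 0.5 * np.sum((O - Y) ** 2)

def grad_step_W(W, Z, O, eta=0.05):
    """One gradient-descent step on the output weights.
    Z: (N, p) hidden outputs, O: (N, q) targets, W: (q, p) weights.
    dE/dW_kj = -sum_l (o_lk - y_lk) * z_lj"""
    Y = Z @ W.T
    grad = -(O - Y).T @ Z
    return W - eta * grad

# a tiny check that one step reduces E on made-up data
rng = np.random.default_rng(0)
Z = rng.random((5, 3))
O = rng.random((5, 1))
W = np.zeros((1, 3))
E0 = mse_half(O, Z @ W.T)
W1 = grad_step_W(W, Z, O)
E1 = mse_half(O, Z @ W1.T)
```

Because E is quadratic in W, a sufficiently small eta is guaranteed to decrease E at every step for this layer; the center and width updates are nonlinear, which is why the hidden layer learns more slowly, as noted earlier.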

To summarize, the learning algorithm of the RBF neural network is:

① Initialize the network parameters according to the five steps in (1), and choose values for η, α, and the iteration-stopping precision ε.

② Compute the root-mean-square error of the network output,

RMS = sqrt( Σ_{l=1..N} Σ_{k=1..q} (o_k^l - y_k^l)² / (N · q) )

and if RMS ≤ ε, stop training; otherwise go to step ③.
③ Carry out one round of the weight iteration in (4), adjusting the weights, centers, and widths.

④ Return to step ②.
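Putting the whole algorithm together on a toy 1-D function. All the constants here (number of hidden units, widths, η, ε, the target function) are illustrative choices, not prescribed values; for simplicity only the output weights are iterated, with centers and widths fixed by step (1).

```python
import numpy as np

# Step (1): fix evenly spaced centers and constant widths from the data range
# (the unsupervised part); then iterate steps ② -> ③ -> ② on the output
# weights (the supervised part) until the RMS error is small enough.
X = np.linspace(-1.0, 1.0, 30)[:, None]
O = X ** 2                                        # toy target function

p = 8
C = np.linspace(X.min(), X.max(), p)[:, None]     # centers
D = np.full((p, 1), 0.5)                          # widths

# hidden outputs for all samples: z_lj = exp(-||(x_l - c_j)/d_j||^2)
Z = np.exp(-(((X[:, None, :] - C[None, :, :]) / D[None, :, :]) ** 2).sum(axis=2))

W = np.zeros((1, p))
eta, eps = 0.01, 1e-3
rms_history = []
for t in range(5000):
    Y = Z @ W.T
    rms = float(np.sqrt(np.mean((O - Y) ** 2)))
    rms_history.append(rms)
    if rms <= eps:                    # step ②: stop when accurate enough
        break
    W += eta * (O - Y).T @ Z          # step ③: gradient step on the weights
```

The RMS error falls monotonically toward the precision target, illustrating the ② -> ③ -> ② loop of the algorithm.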
